Google’s Dangerous Identity Crisis

Is Barack Obama planning a coup? There are many ways to answer that question — “Why are you asking this question?” “What on earth would make you think that?” and, most simply and most accurately, “No.” Until this weekend, if you asked that question to Google — one of the most important sources of information on the planet — the answer was a paragraph excerpted from a website called Secrets of the Fed: “not only could Obama be in bed with the communist Chinese, but Obama may in fact be planning a communist coup d’état at the end of his term in 2016!”

Over the weekend, the Outline published a lengthy report on the barely regulated Wild West that is Google’s “featured” snippets — the highlighted boxes that sometimes appear at the top of Google search results. For an increasing number of searches, Google’s program will attempt to highlight and excerpt the best answer to your query, without you even needing to click on a link. No longer can Huffington Post capitalize on “Super Bowl start time” search-engine optimization; if you ask Google “what time does the Super Bowl start?” a big box (in this case excerpted from NJ.com) will tell you: 6:30 p.m. EST.

The main thing to understand about these boxes is that they are chosen and filled out not by a dedicated editorial staff, but by Google’s search algorithm — a complex hive of weighting and ranking functions that determine where a link falls on the list. When you ask Google a question, its algorithm picks out whatever it thinks is the correct or best answer, and summarizes it prominently atop the results. It’s impossible to say exactly how it works — partially because Google keeps it secret, and partially because the program is complex enough that it’s difficult to know how changes to code will affect search results — but generally speaking, Google pulls the answer for its highlighted box from one of the first three results.

The problem, naturally, is that Google’s highlighted answers work terribly for queries that don’t have a definitive answer. Answers to queries about facts and figures, like historical names and dates, are accurate most of the time (but not always, about which, more below). Asking highly subjective questions like “how do I get a date?” yields dicier results.

So why would Google bother to do this? For much of the web’s history, search engines operated more as digital phone books than as anything else — directories that helped you locate websites you needed to see. As internet usage patterns have shifted to mobile apps and social networks, search engines are used more often as oracles for seekers to interrogate. This trend is encouraged and exacerbated by the Google Home, the Amazon Echo–like home-assistant chatbot speaker that’s supposed to answer any question automatically and accurately (key word: supposed to).

For devices without a screen or easy means of user input, the system needs to figure out what the “best” answer is on its own. Oftentimes, “best answer” means “top result,” which itself translates to “what most users clicked on.” If you ask Google a question like “is Obama planning a coup?” you get … an odd response — and no accompanying list of results to compare against.

The highlight box raises a lot of existential questions about what Google, the search engine, is, and what its obligation is to us. On the one hand, users obviously need accurate information. They don’t need the website Secrets of the Fed bolstering conspiracy theories of a communist coup from our secret Muslim ex-president.

On the other hand, at its core, Google is an index — a searchable database of what is available on the internet. This separates it from a product like Facebook’s News Feed, which is just a list of things your friends have shared, and makes no claim to completeness. Google can be a useful source of information, but it’s more accurately understood as a snapshot of what people are putting on the World Wide Web. Needless to say, sometimes those notions can be diametrically opposed. Google shouldn’t tell me that Obama is planning a coup as a fact, but it needs to serve up something if I ask for “Obama coup conspiracy theory.”

And, anyway, what do people want when they search “Obama coup conspiracy theory”? Late last year, the company caught flak for featuring Holocaust-denying sites high up in results for queries like “did the Holocaust happen?” But when Google is understood as providing a picture of what you can find on the internet, it’s not surprising that a question with its own implicit bias (i.e., that there is a chance that the Holocaust might not have happened) would reveal results that align with that bias. Simply searching “Holocaust,” on the other hand, brings up, as expected, reams of accurate information.

This is the crux of the problem: Google can only show you information if it exists on the web. There are no news stories about Obama not planning a coup, just as web pages about the Holocaust tend to take as a given that it happened. Google can’t refer users to a web page that doesn’t exist, and it is — so far — not in the business of crafting rebuttals itself, which is why conspiracy theories dominate briefly, until they can be noticed and rebutted.

Google wants its search engine to be an artificial intelligence, but right now, its highlighted answers are just the dressed-up results of an advanced text-matching algorithm. For AI to really work, it needs a layer that evaluates not just whether the information is relevant to the query, but whether or not it’s accurate, too. It’s easy to tell when facts are accurate; it’s nearly impossible to know when advice is. And that doesn’t even get into the more holistic concerns about bias, or incorrect assumptions, in question wording.

The easiest solution would be for Google to hire a team of human editors to monitor and course-correct as needed, but that will almost certainly never happen. Google is among the tech companies pushing hardest for AI breakthroughs, and conceding a need for human editorial involvement would undercut those goals. (To hire human editors would also cement Google’s new identity as “source of accurate information,” instead of just “source of what people on the web say is accurate information.”) In reality, Google’s best editors are its user base, and the company might do well to explain more clearly why a certain snippet was chosen, and more enthusiastically solicit user feedback. These highlighted results showed up seemingly out of nowhere, as a fully formed final product. What they need is a big “BETA” tag hanging off of them, like Gmail had for years.
