Bing’s importance in the information landscape of the U.S. shouldn’t be overlooked. While its share of the search market in the U.S. might be dwarfed by that of Google, it has steadily increased over the past ten years. Bing’s partnerships with Yahoo, AOL, DuckDuckGo, and Apple mean that even users who don’t use Bing as their default search engine or go directly to its home page might get information from Bing. What’s more, Bing’s position as the default search engine for Microsoft’s Internet Explorer and Edge, which account for roughly 17.7% of the browser market in the U.S., gives it a special foothold among Windows users. Microsoft itself boasts that Bing enjoys a 33% market share in the U.S. and serves five billion searches per month.
It is something of a problem, then, that Bing appears to be returning an alarming amount of disinformation and misinformation in response to user queries, far more than Google does, for instance. Bing’s somewhat irregular results and hands-off approach to topics like suicide have attracted users’ attention before, even earning the distinction of becoming a meme. And while researchers have written about Bing’s troubled record on abusive content, specifically with regard to how it has handled autocomplete suggestions, there have been no broader studies of the prevalence of disinformation and misinformation in Bing’s top search results. (Google, for what it’s worth, has also struggled to rein in autocomplete’s tendency to turn up objectionable speech.)
Google, which has been the focus of intense media criticism for its failures to combat abuse on its platforms, including the gaming of its algorithms by Holocaust deniers, has made it clear that it will take new steps to “tackle the propagation of false or misleading information.” As a result, Google is, for better or worse, the closest thing we have to a standard point of comparison for how search engines handle disinformation and misinformation. Thus, when we say Bing shows its users a lot of disinformation, we mean specifically that it shows a lot more disinformation than Google.
We programmatically ran 13 queries on Bing and Google and compared the top 50 results for each query across the two search engines. Here are some key takeaways from our analysis:
The rapidly growing importance of search engines to the way people receive information and understand the world around them has been met with increasing scrutiny of the way they function. The public has a growing understanding that, when it comes to search engines, there is no neutral: “the algorithm is already deciding what you see” when you begin typing a query. A recent investigation by the Wall Street Journal revealed that Google intervenes in its search results more than was previously believed or acknowledged. Some of these interventions, according to the report, are made out of commercial motivations; Google might decide to tweak its algorithms to rank the sites of large corporations (and buyers of Google ads) over those of smaller companies, or to downrank known piracy sites. But others are related to difficult political and cultural issues that Google’s algorithm, when left unchecked, has handled unsatisfactorily, such as racist, pornographic, and abhorrent content. The fact that such things do not remove themselves from Google’s various services (including Google search) means that humans have to intervene, even if doing so comes at great cost.
If Google’s commercially motivated interventions are unsurprising, if unsightly, maneuvers for one of the world’s most valuable corporations, its political and cultural interventions are a response to a fundamental issue resulting from the way search engines are constructed: what Michael Golebiewski and danah boyd, two Microsoft researchers, call “data voids.” “These voids,” Golebiewski and boyd write, “occur when obscure search queries have few results associated with them, making them ripe for exploitation by media manipulators with ideological, economic, or political agendas.” The examples of abuse related to data voids given by Golebiewski and boyd range from the devious (alt-right activists taking advantage of the data void in the wake of the Sutherland Springs shooting to send the media on a wild goose chase looking for connections to Antifa) to the grim: the query “did the Holocaust happen?” tends to turn up Holocaust-denial content because the only people posing the question in this way are Holocaust deniers. Other researchers have pointed out how data voids have corrupted the way we receive information about health issues.
Google alters the way its algorithm handles such queries because it thinks data voids must be combated. (In our next blog post, we will explore how it could be the case that one of the most heavily researched events of 20th-century history could be at the center of a data void.) Few would disagree that there is a problem here. The potential for such data voids to cause harm—by radicalizing 13-year-olds, for example—has been well documented. But our analysis shows that data voids are only part of the broader disinformation and misinformation problem on Bing. It turns out that Bing doesn’t wait for a user to search for a data-void-related phrase such as Sandy Hook actors to show them content claiming Sandy Hook was a hoax. It will show them such content in response to a search for Sandy Hook shooting.
Golebiewski and boyd rightly point out that “Data voids are a byproduct of cultural prejudice and a site of significant manipulation by individuals and organizations with nefarious intentions.” They go on to argue that “Without high-quality content to replace removed content, new malicious content can easily surface.” But it is inaccurate to say that this is the core of the problem. There is, after all, plenty of high-quality content about the Sandy Hook shooting, or about George Soros. It is precisely the gratuitous manner in which Bing directs users who submit “neutral” queries such as these to disinformation that alerts us to the larger problem, which is the way Bing ranks sources of disinformation and misinformation. (This is true of Google too, but to a much lesser extent.)
Like Google, Bing claims that its mission is to “empower people through knowledge.” But as we detail below, we find indications that Bing consistently elevates sources of disinformation and otherwise untrustworthy sites into its first page of search results, even in response to neutral queries. At times, conspiracy theories and extremist sites make up a large share of Bing’s top results for innocuous searches. As Golebiewski and boyd point out, improving search-engine algorithms is not just a matter of detecting quality, but of combating bad actors who seek to spread disinformation by abusing search engine optimization techniques. Bing can do more to address the spread of disinformation by re-evaluating the way that it chooses to rank untrustworthy sources.
A word about content evaluation and censorship: this analysis does not suggest that Bing should hide or otherwise censor certain sites, such as those that are involved in disinformation campaigns or contain white-supremacist content. If users want to see these sites, it is Bing’s policy to help users find them, just as it is Google’s policy. The point is that Bing shows users such sites gratuitously, that is, in response to innocuous queries containing no indication that the user behind them is interested in seeing conspiracy-related or white-supremacist content. This is not only a dubious way to help users explore the world around them; it also goes against Bing’s own avowals of how it ranks content.
We tested disinformation and misinformation frequency on Bing and Google using the following methods. After devising a common set of queries, we used APIs for both Bing and Google to surface the top 50 search results for each query on each search engine. Eleven queries were entered as-is, without quotation marks. Two queries, “white helmets” and “who was behind 9/11,” were entered with quotes to bring up results for a specific phrase. It is important to note that each search originated from a Stanford University IP address, and there could conceivably be location-based bias for each query. (Because these algorithms are opaque to researchers, it is difficult to judge this with any certainty.) For Google, each search was conducted programmatically without ties to a particular user token; we therefore considered the Google searches independent of one another. The Bing searches we conducted were tied to a particular Microsoft account and thus cannot be considered independent of one another. This could have had an effect on the results.
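The collection step described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code used in the analysis: `build_query_strings`, `collect_top_results`, and the `search_fn` callable are hypothetical names, and `fake_search` is a stand-in for whichever API client is actually used. The only details taken from the text are the 50-result cap and the quoting of the two phrase queries.

```python
from typing import Callable, Dict, List

TOP_N = 50  # the analysis compares the top 50 results per query

# A few of the queries named in the text (not the full set of 13).
PLAIN_QUERIES = ["fluoride", "Rothschild family", "Comet Ping Pong"]
PHRASE_QUERIES = ["white helmets", "who was behind 9/11"]


def build_query_strings(plain: List[str], phrases: List[str]) -> List[str]:
    """Plain queries are passed as-is; phrase queries are wrapped in double quotes."""
    return list(plain) + [f'"{p}"' for p in phrases]


def collect_top_results(
    search_fn: Callable[[str], List[str]],
    queries: List[str],
    top_n: int = TOP_N,
) -> Dict[str, List[str]]:
    """Run each query through search_fn and keep at most top_n result URLs."""
    return {q: search_fn(q)[:top_n] for q in queries}


def fake_search(query: str) -> List[str]:
    """Hypothetical stand-in for a real API call; returns 60 placeholder URLs."""
    return [f"https://example.org/result/{i}" for i in range(60)]


queries = build_query_strings(PLAIN_QUERIES, PHRASE_QUERIES)
results = collect_top_results(fake_search, queries)
```

Passing the search function in as a parameter keeps the truncation and quoting logic identical for both engines, so any difference in the output comes from the engines themselves rather than from the harness.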
For the results surfaced by the above APIs, we manually rated each result according to the following process. In a first pass, we rated each result according to a modified version of the “go/bad” system described in the WSJ investigation: we designated unproblematic results “go,” problematic results “bad,” and unusual or out-of-place results (such as spam or student-essay sites) “untrustworthy.” Next, we sorted queries into areas of “information disorder”: disinformation, misinformation, conspiracy-theory sites, RT and Sputnik, student-essay sites, and extremist content. Finally, we re-categorized results based on which area of information disorder they related to — for example, a “bad” result for a query related to a disinformation operation could be categorized as “disinformation” or “untrustworthy,” while a “bad” result for a query related to far-right content could be categorized as “far-right” or “alt-right.”
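One way to picture the rating passes is as a small record per result plus a tally of everything not rated “go.” The sketch below is an assumption-laden illustration rather than the tooling used for the analysis: the `RatedResult` class, its field names, and `tally_flagged` are hypothetical, and the sample rows are placeholders, not data from the study.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Optional

# First-pass labels as described in the text.
FIRST_PASS_LABELS = {"go", "bad", "untrustworthy"}


@dataclass(frozen=True)
class RatedResult:
    engine: str                     # "bing" or "google"
    query: str
    rank: int                       # 1-based position in the result list
    first_pass: str                 # one of FIRST_PASS_LABELS
    category: Optional[str] = None  # later-pass label, e.g. "disinformation"

    def __post_init__(self) -> None:
        if self.first_pass not in FIRST_PASS_LABELS:
            raise ValueError(f"unknown first-pass label: {self.first_pass!r}")


def tally_flagged(results: Iterable[RatedResult]) -> Counter:
    """Count non-"go" results per (engine, query, category).

    Falls back to the first-pass label when no finer category was assigned.
    """
    counts: Counter = Counter()
    for r in results:
        if r.first_pass == "go":
            continue
        counts[(r.engine, r.query, r.category or r.first_pass)] += 1
    return counts


# Placeholder rows for illustration only; these are not the study's data.
sample = [
    RatedResult("bing", "example query", 1, "go"),
    RatedResult("bing", "example query", 2, "bad", "disinformation"),
    RatedResult("bing", "example query", 3, "untrustworthy"),
    RatedResult("google", "example query", 1, "go"),
]
tally = tally_flagged(sample)
```

Keeping the first-pass label and the later category as separate fields preserves the original judgment even after a result has been re-categorized.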
We intentionally chose queries that were adjacent to common misinformation and disinformation topics but were in themselves “neutral”: fluoride, for example, and not fluoride lowers IQ. While such queries do not, at face value, indicate a user’s desire to see a certain kind of (conspiracy-related) content—one might expect fluoride to produce results related to chemistry or toothpaste—their proximity to data voids increases the likelihood that users could be shown bad information. Indeed, this is exactly what we found, particularly in Bing’s case: “neutral” queries are often interpreted as a request for bad information.
NB: Because Microsoft’s Bing API did not always return 50 results for a given query, some of our results show fewer than 50 results.
One of the most well-documented disinformation campaigns of recent years is one directed at the White Helmets, the popular name for the Syrian Civil Defense, a volunteer humanitarian organization operating in opposition-controlled areas in Syria. As a result of their opposition to the Assad regime and its allies, the White Helmets have been the target of an extensive disinformation campaign intended to delegitimize the organization, and in particular to cast it as a terrorist organization. As researchers have shown, the English-language vector of this disinformation campaign consists largely of a set of alternative news sites, such as globalresearch.ca and 21stcenturywire, that amplify disinformation concocted by the Russian state media sites RT and Sputnik. This is a well-known form of narrative laundering, one the Stanford Internet Observatory covered in its recent paper on GRU influence operations.
Whether or not it’s because Google has undertaken measures to combat disinformation campaigns like this one, only two of Google’s top 50 results for the query “White Helmets” direct users to sites tied to this operation. Bing, on the other hand, shows users 11 such results out of 46, including an RT article and a suspended Twitter handle, whitehelmetsEXP, that appears extensively in Twitter’s datasets related to Russian inauthentic coordinated activity. Bing’s 24th result directs users to an article claiming that the White Helmets are an Al-Qaeda sleeper cell with ties to ISIS.
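Because Bing returned only 46 results for this query, the raw counts are not measured against the same denominator. A short calculation, using only the figures reported above, puts both engines on the same footing; the helper name `flagged_share` is our own and is purely illustrative.

```python
def flagged_share(flagged: int, returned: int) -> float:
    """Share of returned results that were flagged, as a fraction of 1."""
    if returned <= 0:
        raise ValueError("returned must be positive")
    return flagged / returned


# Figures from the text: Bing 11 of 46, Google 2 of 50.
bing_share = flagged_share(11, 46)
google_share = flagged_share(2, 50)

print(f"Bing: {bing_share:.1%}, Google: {google_share:.1%}")
# Bing: 23.9%, Google: 4.0%
```

Expressed as shares, roughly one in four of Bing’s returned results for this query is tied to the campaign, against one in twenty-five for Google.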
Bing doesn’t just show users more sites that are connected to known disinformation campaigns than Google. It also shows users more misinformation than Google. Health-related misinformation is a particularly thorny problem for search engines, and, due in part to the public-health risk of bad information dominating the discourse, the American Medical Association has petitioned Facebook, Google, and Amazon to do more to curb its spread. This appears to have had some effect, for if you type the query vaccines autism into Google, no anti-vaccination sites appear in the top 50 results. Bing, on the other hand, turns up six — including one website that claims that “The U.S. healthcare system is a leading cause of death.”
Likewise, in response to the innocuous query Rothschild family, Bing returns 11 sites and articles advancing the conspiracy theory that the Rothschild family secretly controls the world in its top 50. Many of these are related to the QAnon conspiracy theory and have anti-Semitic overtones. One result, from a site called Hang the Bankers, purports to show readers photos of a “satanic” ritual in which the Rothschilds were supposedly involved.
Comet Ping Pong
As we saw in connection with the query Rothschild family analyzed above, Bing ranks sources of conspiracy-theory-related content higher than Google does. In response to the simple query Comet Ping Pong, for instance, Bing returns seven sites pushing the Pizzagate conspiracy theory in its top 50 results, including two YouTube videos purporting to show evidence of satanic rituals and pedophilia in the restaurant. Google does not show any Pizzagate-related content in its top 50 results.
Likewise, Bing’s results for the simple query fluoride abound in sources claiming that the U.S. government is systematically poisoning its population. Bing’s 11th result, for example, is an Angelfire site claiming that “a multi-billion dollar industry… has been poisoning our water supplies, our toothpaste, and our bodies.” Google, on the other hand, interprets this query more neutrally and returns results from sites like the CDC and WebMD.
Despite the fact that the most prominent provocateur behind the Sandy Hook misinformation campaign has retracted his claims, Bing continues to show users content positing that the Sandy Hook shooting was a hoax. Four of the top 50 results direct users to sites with such content, including the 11th result, which presents a letter to President Trump, the first sentence of which argues that “The nation over which you currently preside has been subjected to an on-going series of staged shootings, most of which — for maximal emotional effect — have involved the reported deaths of children, none more dramatically than the Sandy Hook Elementary School shooting in Newtown CT.” Two additional results lead users to essays on student-essay sites. These essays, which do not list an author, also claim that the shooting was a hoax. None of Google’s top-50 results are related to this conspiracy theory.
Above we saw that, in response to the query “white helmets,” Bing returned a result from RT, despite the fact that RT was at the center of a known disinformation campaign against the White Helmets. Further tests show that Bing returns results from RT and Sputnik in response to other queries related to Russian disinformation campaigns.
Skripal & novichok
In March 2018 Sergei and Yulia Skripal were poisoned by a nerve agent in Salisbury, southwest of London. While the U.S., the U.K., and other European countries were unanimous in their claims that the attack was a Russian operation, the Russian government vehemently denied involvement. But it went even further than denial, conducting a substantial disinformation campaign with the aim of muddying the waters and confusing anyone trying to keep up with news of the event. As with the disinformation campaign directed at the White Helmets, a network of English-language “alternative news” sites helped RT and Sputnik push the various narratives developed for this operation.
We used two neutral queries to test the frequency with which Bing and Google turned up sites involved in this disinformation campaign: Skripal and novichok. (Novichok is the name of the nerve agent found at the scene of the attack.) In response to Skripal, Bing returns four RT and Sputnik articles and 11 “amplifier” sites in its top 50, including the fifth result, an article on New Eastern Outlook, a Russia-sponsored site that is a known “active partner” in Russian influence operations and has been involved in coordinated inauthentic behavior on Facebook. Google returns one RT article in its top 50, the 23rd result. For the query novichok, Bing produces three RT and Sputnik articles and two untrustworthy sources in its top 50 results, while Google returns neither RT or Sputnik articles nor “amplifiers.”
Perhaps the most highly publicized Russian disinformation campaign is that related to the MH17 incident. Since July 17, 2014, when Malaysia Airlines Flight 17 was shot down over Ukraine, the Russian government has spun a wide web of self-contradicting claims. Again, Bing tends to return more sites involved in this disinformation operation than Google. Among Bing’s top 50 results are four amplifier sites, four untrustworthy sources, and one RT article. Bing’s twelfth result is an article titled “The story about MH17 stinks tremendously !!!!!” [sic] on a Dutch site whose URL translates to “People Are Waking Up.” Other articles on this site claim that the Earth is flat and that the Bilderberg Meeting is out to enslave the world’s peoples. Google, on the other hand, returns zero results connected to this disinformation operation in its top 50.
Sandy Hook shooting & Who killed JFK
One of the more peculiar aspects of Bing’s algorithm is its tendency to retrieve student-essay sites in response to innocuous queries. Above we saw that Bing returns two student-essay sites in response to the query Sandy Hook shooting, both of which claim that the shooting was a hoax. This is not the only query that Bing returns student-essay sites for; it also does so for the query Who killed JFK. The 25th result for this query is an essay on the site Bartleby Research, the second sentence of which states “I believe the John F. Kennedy’s assassination was an inside job, the only problem is they’re so many variables, and so much controversy in that particular fragment of history; Politics, The Mafia, The Soviet Union, Possibly the CIA, our own government could all have had a hand in this tragic, confusing situation, hell you mine as well throw in the possibly of the Free masons committing assassination” [sic]. In addition to this result, Bing returns 11 fringe sites alleging various versions of a grand conspiracy, including two fringe sites in the top 5 results. In contrast, Google returns three such sites, all on the third page or after.
Alt-right and Neo-Nazi Sites
Of all the types of bad and untrustworthy content that Bing shows its users, the most troubling (and gratuitous) is alt-right and neo-Nazi content. Our analysis showed that Bing returns significantly more far-right and alt-right content than Google, and for certain neutral queries it shows users anti-Semitic extremist content.
Users who venture to search Bing for “George Soros” are treated to an alarming amount of far-right vitriol. Bing’s top 50 results for this query include seven alt-right sites, one neo-Nazi site, and one RT article. Bing’s twelfth result is an “encyclopedia” article (tagline: “George Soros was a Nazi collaborator who does not believe in God”) claiming that George Soros created the ebola virus as a bio-weapon and is planning a “race war” in order to destroy America. Bing’s 36th result is even worse: a neo-Nazi site devoted to “documenting anti-White traitors, subversives, and highlighting Jewish influence.” Soros’s picture appears on this site with a yellow Star of David next to it. Google shows users three far-right sites in its top 50 in response to this query.
Who planned 9/11 & “Who was behind 9/11”
9/11 conspiracy theories flourish online, and Bing readily directs users to them. For the neutral query who planned 9/11, Bing produces 14 far-right sites and two YouTube videos of dubious provenance, including four such sites in the top 10. Bing’s sixth result (appearing right below the fold) is a Lebanese blog claiming that the Israelis were responsible for 9/11. Google shows users no far-right or untrustworthy results for this query.
If a user goes further and searches for the specific phrase “Who was behind 9/11,” things get even worse. This phrase, in contrast to who planned 9/11, corresponds to a true data void that has been leveraged by far-right ideologues. Bing’s response to this query is to show users 27 far-right sites, five untrustworthy sites (including two from Iranian state TV), and one RT result; the 35th result directs users to a post on Stormfront, a website created by a KKK grand wizard whose seal features the words “White Pride World Wide.” Six of the top 10 results are from the far right, and the third result is a YouTube video by known Holocaust denier David Icke. Google, torn between its competing impulses to return results related to the exact phrase entered in quotes and to downrank sources of bad information, does only slightly better, returning 19 far-right results (including one white-supremacist site “Representing Europeans in all countries across the globe”), seven untrustworthy sites, and one RT article.
In 2000, Lucas Introna and Helen Nissenbaum published a paper called “Shaping the Web: Why the Politics of Search Engines Matters.” Examining how the internet had developed to that point and where it was likely to go next, Introna and Nissenbaum identified a specific threat facing the public: search engines, they argued, could conceivably be “colonized by specialized interests at the expense of the public good” and cease to be reliable, more or less transparent sources of information. If the authors’ fears of rampant commercialism affecting the way search engines operate were prophetic, it has also become clear that commercial interests are only part of the problem. If Google became a public utility tomorrow, societies would still have to come up with ethical standards for how to deal with harmful content and the vectors, such as data voids, by which it reaches users.
This will not be a simple task, and deciding what kinds of speech are theoretically permissible and which are not is only a small part of it. When is it okay for search engines to show users disinformation and misinformation? When is it not okay? These are ethical dilemmas that are not amenable to algorithmic solution and will require constant attention. As one observer put it, “Platforms can decide to allow Pizzagate content to exist on their site while simultaneously deciding not to algorithmically amplify or proactively proffer it to users.” As we have seen, this is a dilemma Bing illustrates well—it does proffer Pizzagate content to users. In general, it shows users disinformation and misinformation in a way that seems gratuitous. (In our next blog post we will explore why this might be the case.) It would not be fair to expect either Bing or Google to have solved the harmful-content problem. But the alarming frequency with which Bing shows users bad information should lead Microsoft to re-evaluate the way it ranks information sources.