big bad data

Geeks Cry Foul at the New York Times’ ‘Big Data’ Series

These server farms are eeeeevil. Photo: Bob Sacha/Corbis

Writing in a general-interest publication for a mass audience, you get used to complaints from specialists. Every time the New York Times, The Wall Street Journal, or another national newspaper or magazine takes on a technical topic with atomized details, you can bet that its reporters’ mailboxes are filling with complaints from industry insiders who accuse them of oversimplification and excessive glossing. It’s part of the game.

Even by those standards, though, the tech sector’s response to the Times’ latest big heave — a multipart series on data centers and their use by the likes of Facebook, Amazon, and Microsoft, based on a year’s worth of research — has been particularly noisy, a surround-sound chorus of opposition and outrage.

The first article in the series, “Power, Pollution, and the Internet,” reports that always-on data-center needs are driven by excessive fear of outages and puts forth the controversial statistic that “only 6 percent to 12 percent” of the power used to keep data centers humming is actually being used to perform work at any given time.

The Times faced some mockery for its dumbed-down, explaining-computers-to-your-bubbe tone, which included defining a “server” as “a sort of bulked-up desktop computer, minus a screen and keyboard, that contains chips to process data.”

But when it came to the actual thesis of the piece — namely, that Silicon Valley’s wasteful power usage is a vast industry secret that belies its ecofriendly reputation — the gloves came off.

Wired’s Robert McMillan wrote, “Largely lost amid the Times’ multiple-page analysis is the fact that the data centers built by a handful of internet giants are looking less and less like the 40- by 60-foot rental space that was home to Facebook back in 2006 — and that these advances are just beginning to trickle down to the rest of the industry.”

Diego Doval, a popular tech blogger and former CTO at Ning, called the Times story “a mix of half-guesses, contradictions, and flat-out incorrect information that creates all the wrong impressions, misinforms, and misrepresents the efforts and challenges that the people running these systems face everyday.”

The Verge’s Tim Carmody added:

“The report also presents a distorted and outdated view of the internet and cloud computing. It focuses on frivolous media and entertainment, or ‘fantasy points and league rankings, snapshots from nearly forgotten vacations kept forever in storage devices.’ It doesn’t really grapple with the cloud as an increasingly-essential element of infrastructure, powering industry, government, finance, and commerce, as well as personal communication and data storage.”

Silicon Angle’s John Furrier concluded, “Bottom line: the entire New York Times’ article and current cloud series … is not only irrelevant to the direction of modern society, but their entire article is filled with inaccuracies and old data.”

The second article in the series, which came out yesterday, has inspired no kinder reaction.

Even Michael Manos, a data-center guru who was quoted in the article, had bones to pick with how it turned out: “After just two articles, reading the feedback in comments, and seeing some of the reaction in the blogosphere it is very clear that there is more than a significant amount of misunderstanding, over-simplification, and a lack of detail I think is probably important.” (Still, Manos praised Times reporter James Glanz, who he said was “incredibly deeply engaged and armed with tons of facts.”)

The fact that data centers use an unholy amount of electricity is a fair and fine point, and it could have stood on its own. The second article, about the messy tangle of local politics and environmental regulation involved in locating big data centers in rural areas, had a genuine news angle: a story about a ridiculous game of chicken between Microsoft and a utility board in Washington state that had fined it for not using enough power. But the hook, which at its heart is a story of badly written regulation and misaligned incentives, was used in the service of a larger, more insidious-seeming narrative about tech companies burning through fuel for no reason, with no regard for political or environmental consequences. (Microsoft has responded to the story, saying that it “uses what we believe are a few isolated situations to paint a negative picture of the relationship between big data center operators, including Microsoft, and the local community.”)

The problem with the Times article isn’t that it’s inaccurate. In a prize-bait piece like this, you can be sure that every fact has been checked a dozen times. Rather, the issue is a sort of willful commingling of issues — the kind that’s usually shaped by too many editors wrangling too many ancillary sub-narratives into a finished story.

Even with the extraordinary amount of space the Times has been giving the big-data series (7,500 words and counting), turning a year’s worth of research into a narrative with thrust and arc that is accessible to a general audience requires major compression and conclusion-drawing. And so, what may have started in James Glanz’s mind as a focused, thesis-driven investigation (“Big data centers use a lot of power and maybe sometimes they don’t have to use as much”) became, over the course of a thousand editors’ tweaks, a grand allegory about The Internet in Our Time, complete with broad-brush commentary meant to bring the issue home to the average Times reader.

The tech press’s (legitimate) beef isn’t that arcane details are being left out of the Times series. It’s that the overarching narrative that will stick in readers’ minds — that Facebook and Microsoft and other companies are destroying the environment in order to give you 24–7, no-downtime access to stupid cat videos — is simply too pat.

Keeping our Internet services fully operational does require a gigantic amount of power. Nobody’s disputing that. But the connection from observation A (“Look how much power these things use!”) to conclusion B (“It’s all going toward playing cat videos!”) doesn’t do Glanz’s considerable research justice. As Carmody put it, “It’s only when we recognize that the internet isn’t a pointless distraction, but is becoming as fundamental to our lives as roads, plumbing, and petroleum, that we understand why data usage and energy costs continue to grow and grow.”

It’s not the Times’ job to make the tech press happy — indeed, giving generalist glosses to stories that have been playing out for years in industry publications is what big, national newspapers do. Unfortunately for tech bloggers, no amount of informed complaining is likely to make a difference. The big, moral takeaways from deep-dive Times investigations tend to stick, as G.E. learned last year, even if they’re not exactly right.
