Math Is Biased Against Women and the Poor, According to a Former Math Professor

Photo: Image Source/Getty Images

When former Wall Streeter and data scientist Cathy O’Neil says, “We’re living in the middle of an arms race,” she isn’t referring to nuclear threats from North Korea. Rather, she is hinting at what she calls “weapons of math destruction,” biased mathematical models that are built on fallible, and often incorrect, assumptions about people.

“People assume big data is always better,” she says from underneath a mop of blue hair at a Think Coffee in lower Manhattan. “But there’s a lack of accuracy and with big data, we’re just throwing it at everything and assuming that it works.”

The former tenure-track math professor at Barnard College quit in 2007 to join renowned hedge fund D.E. Shaw, once part-owned by Lehman Brothers, and apply her skills to finally make some cash. Within the year, the subprime lending industry imploded, unemployment rose, and math experts “wielding magic formulas” were blamed. Though she wasn’t working on issues directly connected to the crisis, O’Neil felt like she had played a small part in what happened. “I wasn’t a rich person,” she says. “I was still worried about our money. That isolation, that got to me.”

O’Neil eventually left finance, rebranded herself a data scientist, joined Occupy Wall Street, and started her blog mathbabe.org to reveal the way biased math models hurt the poorest of the poor. Now she’s written a book, Weapons of Math Destruction, calling out the worst offenders.

The prime example of her thesis is recidivism models, which are used across the country in sentencing convicts. “People are being labeled high risk by the models because they live in poor neighborhoods and therefore, they’re being sentenced longer,” she says. “That exacerbates that cycle. People are like, ‘Damn, there are some racist practices going on.’ What they don’t understand is that that’s never going to change, because policemen do not want to examine their own practices. What they want, in fact, is the cover of scientific objectivity against any kind of case for condemning their practices.”

Perhaps the starkest example of how big data contributes to inequality, though, is how it affects women in the workplace. O’Neil references San Francisco–based start-up Gild, which attempts to make hiring easier by quantifying an applicant’s social capital through their engagement with influential industry contacts. Because a subset of talented engineers all frequented a particular Japanese manga website, visiting it nudged up an applicant’s hiring score. Few women visited the site, however, because of its sexual tone, which meant they didn’t get the score bump. “If you Google for high-paying jobs, the web just doesn’t think of women as successful, and that translates into every machine-learning algorithm that has to do with résumés,” O’Neil says. “An engineering firm wants to hire an engineer, but in order to build an algorithm to help it, it needs to define success. It defines success with historical data as someone who has been there for two years and has been promoted at least once. The historical data says no woman has ever been here for two years and been promoted, so then the algorithm learns that women will never succeed.”
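The feedback loop O’Neil describes can be sketched in a few lines of Python. To be clear, this is a toy illustration, not Gild’s system or any real hiring model: the `train_success_rates` function and the employee records are entirely hypothetical, made up only to show how a model that defines “success” from biased history ends up reproducing that history.

```python
# Toy illustration of O'Neil's point, not any real hiring model.
# "Success" is defined purely from historical data: an employee who
# stayed at least two years and was promoted. If no woman in the
# history fits that label, the model "learns" that women never succeed.
from collections import defaultdict

def train_success_rates(history):
    """Estimate P(success | gender) from hypothetical employee records."""
    counts = defaultdict(lambda: [0, 0])  # gender -> [successes, total]
    for record in history:
        success = record["years"] >= 2 and record["promoted"]
        counts[record["gender"]][0] += int(success)
        counts[record["gender"]][1] += 1
    return {g: s / n for g, (s, n) in counts.items()}

# Hypothetical records reflecting past bias, not merit:
history = [
    {"gender": "M", "years": 3, "promoted": True},
    {"gender": "M", "years": 2, "promoted": True},
    {"gender": "M", "years": 1, "promoted": False},
    {"gender": "F", "years": 3, "promoted": False},  # stayed, never promoted
    {"gender": "F", "years": 1, "promoted": False},  # left early
]

rates = train_success_rates(history)
print(rates)  # women's "success rate" is 0.0 — history repeated as prediction
```

Nothing in the code mentions merit; the zero score for women comes entirely from the historical labels, which is exactly the mechanism O’Neil warns about.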

The big-data monster also affects students trying to pay college tuition. “There is an arms race for prestige among the elite” because of the U.S. News college rankings, she says, which discriminate against talented but poorer students. O’Neil, who supported Bernie Sanders in his presidential run, advocates making public colleges and universities free. “It could be a very simple world if we had state colleges that were either free or very affordable,” she says. “It’s something that we thought about with high-school education 100 years ago. We didn’t start out in this country with free high school.”

Of course, data is not inherently bad, O’Neil says, but individuals and society as a whole need to be more careful with so-called “natural conclusions.” “If we hand over our decision-making processes to computers that use historical data, it will just repeat history,” she says. “And that simply is not okay.”