When Kira Radinsky began studying at Technion-Israel Institute of Technology as a 15-year-old, she had interests uncommon for a teenager. “My main passion at the time was trying to combine biology and computer science to prevent aging and prevent disease,” she says.
While the fountain of youth has remained elusive, Radinsky did settle on a method for slowing the spread of disease. It hinged on an algorithm that analyzed historical news events to make predictions about the future. That same algorithm can make predictions about social unrest, environmental catastrophes, and more. Today, Radinsky is using her methods to help businesses synthesize reams of data and make better decisions with what they learn. We recently spoke with her to find out her strategies for looking into the future and learning from the past.
“Predictive data-mining algorithms” are your specialty. Can you explain what that means?
When I began, I was taking all the news since 1880 and making a computer read it like a human being, looking for patterns that repeat and using all of that to predict future news events. Now I’m using similar algorithms to help businesses find which customers are going to buy and which are going to churn. Think of it as microeconomic modeling using data mining.
How did you get started with this?
When I first started my Ph.D., my adviser was very interested in semantic analysis. I was looking for something that would show me patterns of words and how they behave over time.
At the same time, there was news about how the world was going to end because of the Mayan calendar. There were also thousands of dead birds falling from the sky in Beebe, Arkansas. And there were thousands of dead fish washing ashore. The first thing I did was try to find those elements that cause birds and fish to die.
What I found was that six months before people were searching for the terms “fish death” and “bird death,” people were searching for “oil spill.” It turns out there’s oxygen depletion in water after an oil spill, and this process takes about six months. This got me thinking about how we can make predictions. If you looked at the news, nobody knew the cause [of the bird and fish deaths], but by analysis, we could find some potential answers.
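The lead-and-lag idea in that answer can be illustrated with a small sketch. This is not Radinsky's actual algorithm; it is a plain lagged-correlation check, assuming monthly search volumes for two terms, and the numbers, the function names `pearson` and `best_lag`, and the 12-month search window are all invented for illustration.

```python
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5

def best_lag(leading, trailing, max_lag):
    """Return (lag, correlation) for the lag at which leading[t] best matches trailing[t + lag]."""
    best = (0, float("-inf"))
    for lag in range(max_lag + 1):
        xs = leading[: len(leading) - lag]  # drop the tail that has no later partner
        ys = trailing[lag:]                 # shift the trailing series back by `lag`
        r = pearson(xs, ys)
        if r > best[1]:
            best = (lag, r)
    return best

# Made-up monthly search volumes, for illustration only.
months = 24
oil_spill = [1] * months
fish_death = [1] * months
for month, volume in [(2, 9), (3, 2), (12, 8)]:
    oil_spill[month] = volume
    fish_death[month + 6] = volume  # the same bump, six months later

lag, r = best_lag(oil_spill, fish_death, max_lag=12)
print(f"best lag: {lag} months, correlation: {r:.2f}")
```

A strong correlation at some lag is only a hint about where to look for a cause, which is how the answer above frames it: the analysis surfaces potential answers, not proof.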
Predicting the 2013 cholera epidemic in Cuba, its first in over a century, is one of your big successes. How did you arrive at that prediction?
By looking at all this news, we found that if you have a drought and a year and a half later there are floods, the probability of cholera is much higher. But that only happens in countries with low GDPs and low concentrations of clean water, because it turns out that cholera can be treated very easily if you send clean water. Everybody knows that cholera is a waterborne disease, so it was no surprise that it comes after floods. It was surprising that it comes after drought, though.
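One way to picture a pattern like this is as a conditional probability: how often did cholera follow when all the conditions held, compared with how often it occurred overall? The sketch below is a hypothetical illustration, not the method used for the Cuba prediction. The `Case` record, its field names, and every data row are invented.

```python
from dataclasses import dataclass

@dataclass
class Case:
    drought_then_flood: bool  # drought followed roughly 18 months later by floods
    low_gdp: bool
    low_clean_water: bool
    cholera: bool             # whether a cholera outbreak followed

def outbreak_rate(cases):
    """Fraction of cases with a cholera outbreak; None if there are no cases."""
    if not cases:
        return None
    return sum(c.cholera for c in cases) / len(cases)

def matches_pattern(c):
    """True when all three conditions from the interview answer hold."""
    return c.drought_then_flood and c.low_gdp and c.low_clean_water

# Invented records for illustration only; not real epidemiological data.
history = [
    Case(True, True, True, True),
    Case(True, True, True, True),
    Case(True, True, True, False),
    Case(True, False, False, False),
    Case(False, True, True, False),
    Case(False, True, True, True),
    Case(False, False, False, False),
    Case(False, False, False, False),
]

pattern_rate = outbreak_rate([c for c in history if matches_pattern(c)])
baseline_rate = outbreak_rate(history)
print(f"with pattern: {pattern_rate:.2f}, overall: {baseline_rate:.2f}")
```

In this toy data the rate is higher when the pattern holds than it is overall, which is the sense in which such a rule raises a probability rather than guaranteeing an outcome.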
Have you had any other notable hits?
We were able to predict the 2011 Sudan riots, and afterwards, the rise in oil prices. The pattern there was, if you have a subsidized product and you stop subsidizing it, you’re going to have student riots. And if, after the student riots, a policeman kills a student, you’re going to have much bigger riots. The pattern was based on events in Egypt.
What about misses? When does the algorithm go wrong?
This is a probabilistic system. We don’t say whether something is going to happen; we give it a probability.
What are some of the other predictions you’ve made?
We’ve seen that if you have areas with a lot of diversity and a minority is killed by the police and the minority wasn’t armed, you’re going to have riots. It turns out there are many different times for those riots. One is the first news outbreak. The second is the funeral. Another one is the trial — usually the police are not found guilty, so there are big riots then.
But the biggest one is if the mayor or another high-ranking politician says he agrees with the verdict. If someone powerful says he agrees with the court, there are going to be big riots. What was interesting about Ferguson is that Obama didn’t say much about his opinion.
What are your long-term goals for your prediction algorithms?
My biggest passion is making decisions somewhat more scientific. I’m not saying we can predict the future. Sometimes we don’t have the data. But it kinda feels weird that doctors are making decisions based on a gut feeling and not based on all the data we already have.