Q&A

Why Effective Altruists Fear the AI Apocalypse

A conversation with the philosopher William MacAskill.

Photo-Illustration: Intelligencer; Photos: Getty Images

Humanity is a wayward teenager. Our species has its whole life ahead of it, but the decisions we make now will irrevocably shape the course of our adulthood. We could recognize the stakes of this critical moment, buckle down, do our homework, drink responsibly, eat sustainably, prepare for pandemics, avert robot apocalypses, realize our full potential, and live a long, prosperous, meaningful life before dying peacefully in a supernova at the ripe old age of 1 trillion. Or we could party all the time, get into fights, start a nuclear war, create doomsday bioweapons, tremble before our new robot overlords, live fast, die young, and leave an irradiated corpse. We owe it to our future selves — which is to say, to the hundreds of billions of potential future humans — to choose wisely.

So argues the philosopher William MacAskill in his new book, What We Owe the Future. MacAskill is a professor at Oxford and leader of the “effective altruism” movement. In recent years, his concern for maximizing his positive impact on the world has led him to champion “longtermism,” a philosophy that insists on the moral worth of future people and, thus, our moral obligation to protect their interests. Longtermists argue that humanity should be investing far more resources into mitigating the risk of future catastrophes in general and extinction events in particular. They are especially concerned with the possibility that humanity will one day develop an artificial general intelligence, or AGI, that could abet a global totalitarian dictatorship or decide to treat humanity like obsolete software — and delete us from the planet.

Longtermism has attracted some criticism both from within the effective-altruism community and outside it. I recently spoke with MacAskill about those critiques, Elon Musk’s affection for his philosophy, whether humanity is a net-negative phenomenon, and why he ditched left-wing activism for effective altruism.

How did you end up a longtermist? What sequence of moral epiphanies led you to dedicate yourself to reducing the probability of a robot apocalypse?
So longtermism is a part of effective altruism, and effective altruism is about asking the question “How can we do the most good? How can we use our time and resources as effectively as possible for the purpose of making the world better?” and then actually following through on the answer — devoting a significant share of our resources, whether by committing 10 percent of our income to highly effective philanthropies or by shifting our careers to fields where we can have more impact.

And when you say “making the world better,” you mean improving the lives of all sentient beings, right?
Yes. A core idea of effective altruism is that all people matter even if they live thousands of miles away. Longtermism is an extension of that idea: If people matter morally regardless of their distance from us in space, then they matter regardless of their distance in time. So we should care about future generations. And right now, we as a society are not doing nearly enough to protect the interests of future people.

None of this means we can’t show any partiality to specific people. Maybe we owe some additional obligations to those who are close to us in space and time, who are near and dear, who have benefited us. Nonetheless, we should still give very substantial moral consideration to people who are not yet with us.

The second premise is just that, in expectation, there are enormous numbers of future people. We really might be at the beginning of a long and flourishing civilization. And when you take the sheer number of future people into account, then we shouldn’t just think about future generations a little bit but really quite a lot. How much precisely is open to debate. All longtermists think we need to be doing more to benefit future generations. “Strong longtermism” holds that the well-being of future generations is actually the most important thing to consider, at least when we are making our most consequential decisions.

And it’s most important because there are just so many future people?
Yeah, exactly. If you can save one person from drowning or ten people from burning in a house fire, I think you have a moral obligation to save the ten. And that’s because everyone counts equally. Therefore, the interests of ten outweigh the interests of one. Similarly, if you can do something that impacts only the present generation or something that has a positive impact on many future generations, then you want to do the thing that helps more people.

And there are various ways in which we can improve the world that future generations inhabit. One concerns trajectory changes — ways of improving the future even in scenarios where our long-term survival is assured. One focus there would be improving society’s values and avoiding what I call “bad-value lock-in.” There are moments in history when contingent value systems become entrenched and shape the lives of many generations. So it’s really important to mitigate the risk of a future — stable — totalitarian dictatorship.

In other words, preventing an actually successful “thousand-year reich”?
Yes, exactly. Then the other focus is to make sure that we have a future at all, such as by reducing the risk of worst-case pandemics.

One of the most common objections to longtermism is that it is a fundamentally hazardous form of moral reasoning. After all, some of the most destructive ideologies of the 20th century were premised on the notion that the well-being of existing people was less important than building a better world for the generations to come (hence Hitler’s prophesied “thousand-year reich”). And while nothing in longtermism would justify mass murder, it does seem to suggest that an action’s consequences for future generations are almost infinitely more important than its consequences for existing people. Add to this the concern for preventing the end of the human race, and, some argue, you have a recipe for justifying atrocity.
I want to say two things. First, there is a long history of moral ideas being used for abominable ends. I think that’s really not a longtermist-specific thing. I’m a big fan of liberalism, of liberal democracy. But liberalism was also used to justify colonial atrocities. If you look at the long history of religious violence, you see that religious ideals were long used to justify atrocities. So, certainly, for anyone promoting a new set of moral ideas, there’s always moral hazard. Peter Singer’s advocacy for animal well-being inspired terrorist attacks against members of Parliament in the U.K.

I want to oppose, in the strongest possible terms, the use of extremely naïve consequentialist reasoning where it’s like, Oh, the ends will justify the means. Because longtermism is not, in any way, about “ends justify the means” reasoning. Certainly, I oppose violating people’s rights in order to bring about a greater good.

But then, also, naïve consequentialism doesn’t actually deliver better consequences. It’s not like we look back at history and are like, Oh, yeah, Stalin and Mao, they really got it right. More generally, I do think we should be skeptical and worried about totalizing visions of the good. We should not believe that we’ve come to the correct view about what is good in the long term. We should think we’re very far away from the moral truth. What we need to do is build this liberal society with lots of diverse moral perspectives so that we can actually make moral progress over the long run and adjust our values in response to new insights, information, and challenges. We don’t want to force any particular ideology on the whole world.

Longtermists may believe that human rights are inviolable. But doesn’t strong longtermism lend itself to rationalizations of more mundane violations of the interests of existing people? For example, a lot of people on the left view longtermism suspiciously in no small part because Silicon Valley industrialists, like Elon Musk, seem to view it favorably.
I will say, some articles critical of longtermism make some flagrantly false claims about this stuff, like the idea that Peter Thiel is a big longtermist. I’ve never met Thiel. I don’t like him. It’s just this unfounded claim that is meant to taint these ideas by association.

But Musk is sympathetic. And the suspicion is that longtermism provides techno-optimist CEOs like Musk with a rationale for ruthless business practices. For example, Tesla is notorious for union busting. And yet, while union busting may lower the well-being of workers in the present, it could reduce Tesla’s labor costs — thereby allowing Tesla to lower its prices, thereby expediting the transition to electric vehicles, thereby reducing carbon emissions, thereby yielding a more hospitable planet for the hundreds of billions of humans yet to come.
I really don’t know about Tesla and their working conditions in particular, but, again, I just really want to oppose “You’ve got to break a few eggs” reasoning. Strong longtermism isn’t a claim about ends justifying the means; it’s just saying that it’s better to prioritize the consequences of your actions for the long term in circumstances where you’re not violating anyone’s rights.

But the question is which interests qualify as inviolable rights. Longtermism may not justify doing active harm for the sake of benefiting future people, but it does endorse doing passive harm, doesn’t it? Which is to say, it does argue for reallocating some amount of finite resources away from the global poor and toward preventing hypothetical future threats, right?
It does do that.

So if it is permissible to effectively take potentially lifesaving funds away from the global poor today in order to benefit future people, why wouldn’t union busting be permissible? 
I guess the key question there is just what you think about union busting.

Well, regardless of what one thinks about that specific issue, I think most of us believe that there are actions that are both somewhat harmful and don’t rise to the level of human-rights violations. So why would perpetrating minor harms be impermissible if shifting money away from the poor is acceptable?
So, on the one hand, you’ve got one person drowning, ten people in a burning building. Who do you save? It’s like, Okay, you save the ten. Okay. Now a new thought experiment: You can save the ten from burning, but you will have to drown someone — you’ve got to step on their head or something in order to make that happen. Intuitively, that’s not permissible.

Why not?
Because you’re using someone as a means.

I see. And so you have a diversified moral portfolio where you subscribe to utilitarianism to a degree but also to a rights-based deontological ethics that prohibits you from stepping on people’s heads.
Yes, exactly. But the situation facing a philanthropist is quite different. There’s a million problems in the world, and you have to choose which to prioritize. Every dollar I put towards saving people from dying of malaria is one I’m not spending on saving people from tuberculosis or AIDS. And as a result, there are often identifiable people who have died because you chose to prioritize one over the other.

If we’re spending resources to prevent the next pandemic, that means we’re not using those resources to give out bed nets. The money that Joe Biden just put toward canceling student debt could have bought a lot of bed nets and saved loads of lives. I’m not claiming that this is good or bad. I’m just saying there is always an opportunity cost. It’s just a horrible fact about the way the world is, that we have to make those decisions.

Some within the EA community argue that longtermists have miscalculated the opportunity costs of their endeavors. In evaluating the efficacy of different charities, effective altruists often calculate the “expected value” of a given donation — basically, they take the probability of an initiative’s success and an estimate of the total good that the initiative would do if it did succeed and then multiply them together.

Skeptics say that longtermists have essentially gamed this formula: Since the number of future people is so massive, any intervention — even those with a very small chance of success — ends up having a huge “expected value.” Indeed, in a recent paper, you wrote that donating $10,000 toward initiatives that would reduce the probability of an AI apocalypse by just “0.001%” would do orders of magnitude more “expected” good than donating a similar sum to anti-malaria initiatives.
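For readers who want to see that arithmetic spelled out, here is a minimal, purely illustrative sketch of the expected-value comparison at issue. The $10,000 donation and the 0.001 percent probability reduction are the figures cited above; the cost-per-life baseline for bed nets and the number of future lives at stake are hypothetical placeholders, not numbers drawn from MacAskill’s paper.

```python
# A rough sketch of the expected-value comparison described above.
# All inputs other than the $10,000 donation and the 0.001% probability
# shift are illustrative assumptions, not figures from MacAskill's work.

donation = 10_000  # dollars

# Hypothetical anti-malaria baseline: assume roughly $5,000 per life saved.
cost_per_life_malaria = 5_000
ev_malaria = donation / cost_per_life_malaria  # expected lives saved

# Hypothetical longtermist intervention: assume 10**12 future lives are at
# stake and the donation reduces extinction probability by 0.001% (1e-5).
future_lives_at_stake = 10**12
probability_reduction = 1e-5
ev_longtermist = future_lives_at_stake * probability_reduction

print(f"Expected lives saved (bed nets):    {ev_malaria:,.0f}")
print(f"Expected lives saved (longtermist): {ev_longtermist:,.0f}")
```

With these made-up inputs, the longtermist donation comes out millions of times better in expectation, which is exactly the formula-gaming worry that skeptics describe.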

And yet when EAs evaluate a charity that helps existing people in the present, their probability estimates generally derive from hard data, from randomized controlled trials measuring a given charity’s outcomes in the past. By contrast, the probability estimates you’ve generated — both for the risk of a superintelligent AI ending civilization and for the likelihood that any given initiative will prevent such an outcome — are arguably based on nothing but subjective intuition. Those intuitions may come from people working in relevant fields, but it’s not obvious that working in AI renders one capable of objectively gauging its hypothetical future importance. You could just as easily imagine this experience biasing one toward an overestimate; people like to believe that the work they do is very consequential.
I would say three things to that. First, I think “People are inclined to overestimate the importance of the kind of work that they do” is not a strong objection in this context, mainly because so many of the EAs who are concerned about AI — maybe even the majority — have no background in AI. They just got convinced by the arguments. That’s true for me. Secondly, it’s not just a matter of intuition. To some extent, we do have to rely on expert assessments. Machine-learning experts say that there is a greater than 50 percent chance of human-level artificial intelligence in 37 years and that the chance of catastrophe as bad as extinction would be like 5 percent conditional on that. That’s one thing. You can also look at the trend in computing power over time, where the largest machine-learning models at the moment have the computing power of about an insect brain. Given what we know from neuroscience, it seems like in about ten years AI models will have computing power on the order of a human brain. So the intuition is not totally out there.

And we simply cannot rely on hard data for all of the important decisions we need to make. When we’re trying to decide how to respond to Russia’s invasion of Ukraine, we can’t turn to randomized controlled trials. We can’t say, “Okay, well, we’ve seen Russia invade Ukraine 100 times in the past, and 5 percent of those times …” Really, I see the probability estimates we’re making as a way of making our language more precise. If we just say, “Oh, there’s a fair chance of this happening” or “a significant chance,” that’s extremely vague. Whereas if I say, “Look, I give it 5 percent credence,” then at least I’m making my beliefs about the world clear. It’s easier for us to have a conversation.

To play devil’s advocate on AI: Rodney Brooks, a former director of the Computer Science and Artificial Intelligence Laboratory at MIT, has argued that we have no idea that artificial general intelligence can even exist. In a piece published in late 2017, he wrote, “modern-day AGI research is not doing well at all on either being general or supporting an independent entity with an ongoing existence. It mostly seems stuck on the same issues in reasoning and common sense that AI has had problems with for at least 50 years. All the evidence that I see says we have no real idea yet how to build one. Its properties are completely unknown.” So what gives you confidence that you aren’t diverting resources away from needy people in the present for the sake of an imaginary threat?
There’s two things I want to say. The first one is just that uncertainty cuts both ways. Rodney Brooks says, “Oh, we just don’t know how hard this is” — but that can mean that it’s much easier to achieve AGI or much harder. If you look at the history of technological forecasting, you see errors made in both directions. J.B.S. Haldane was one of the leading scientists of his time. In the 1920s, he said that it would be millions of years before a round-trip mission to the moon was successful.

Second, the main labs are trying to build AGI. And when you ask the people who are building it, “When do you think it might happen?” and they say, “Well, 50-50 in the next 40 years,” it seems kind of confident to say, “No, I just think it’s basically close to zero.”

And the quote you read is actually quite funny. This often happens with people in AI, where they’re like, “Well, it can’t even do X,” and then it does X. So, in 2017, he says AI has shown no evidence of generality. Well, now we have Gato, this decent model from DeepMind, which is able to do pretty well at roughly 600 very different tasks. It’s playing various games, manipulating robot hands, engaging in chat as well. We’re already starting to see the first signs of more general systems. Similarly, the recent language models can do a lot of different things: basic math, basic programming, basic question-and-answer conversation.

And that quote also said AI can’t do commonsense reasoning and so on. Again, look at the PaLM model. You can give it the following word puzzle: “A man goes to the most famous museum in the largest city in France. While there, he looks at the most famous painting there, and the painting reminds him of a cartoon he used to watch as a child. Regarding the cartoon character that he’s thinking of, what is the country of origin of the item that this character is holding in his hand?” And the AI responded, “The man visited the Louvre in Paris. When visiting the Louvre, he looked at the Mona Lisa, which was painted by Leonardo da Vinci. Leonardo is also the name of a Teenage Mutant Ninja Turtle. In his hand, Leonardo, the Ninja Turtle, carries a katana, which comes from Japan. The answer is Japan.”

As for the second part — “Okay, maybe AGI happens, but how do we know it will be a big deal?” — again, I’m like, Maybe, maybe not. The main thing I want to say is “Man, we should at least be thinking about this.” We need to have more than a handful of people working on the potential problems, which is currently what we have.

Longtermists aren’t concerned with prolonging humanity’s survival because they see the propagation of the human species as an end in itself. Rather, the philosophy’s measure of value is the welfare of sentient beings. Things that increase the sum total of all subjective happiness — or reduce the sum total of all subjective suffering — are good. For this reason, prolonging humanity’s existence is only good, from a longtermist point of view, if we posit that most people’s lives feature more well-being than suffering or, at least, that most people’s lives are preferable to nonexistence. That strikes me as something that can’t be proven. After all, none of us actually knows with certainty what it is like to not be alive. For all we know, nonexistence might be a state of blissful, egoless oneness with all creation — from which we are tragically, temporarily exiled for the length of our human lives. 
Am I 100 percent certain that when I die, I won’t go to Heaven? No. I’m 100 percent certain of almost nothing. But how surprised would I be? I’d be completely surprised. Do I think it’s more than 90 percent likely that there’s no Heaven? Yes. I think the arguments against Heaven are very strong. The utterly dominant secular view is to assume that being dead is like being unconscious. So we experience nonexistence every night.

All right, so let’s stipulate that most human lives are preferable to nonexistence. By your own tentative calculations, as judged by survey responses and other data, about 10 to 15 percent of humans on the planet would have been better off had they never been born. If that’s true, it’s not obvious to me that we should want humanity to persist as long as possible. 

One popular moral intuition is that our actions should be guided by concern for the least fortunate. Over the course of human history, countless babies have been birthed and then abandoned such that their sole experience of life on Earth was confusion, terror, and then starvation. Thus, by your own utilitarian calculus, those babies would have been better off if the human species had never come into existence. So where is the justice in asking them to endure unbearable suffering just so the bulk of us can enjoy our net-positive lives? If we wanted to create a world that was optimal from the standpoint of the most unfortunate human, wouldn’t that be one where humanity does not exist?
So two things: One is just that I reject the Rawlsian view of justice. Rawls has the most extreme view you could have on this. He thinks if you could make the worst-off person in the world the tiniest little bit happier at the cost of making everyone else in the world go from peaks of bliss to a really, really bad situation, that’s justified. That’s the literal interpretation of his view. I reject that.

Imagine an asteroid is coming towards Earth and it will kill everyone on the planet in 200 years’ time. Let’s say that almost everyone in 200 years will be living happy, flourishing lives. But one person will have a slightly bad life where they think, Yep, on balance, I actually would prefer to have never been born. Should we let the asteroid destroy the world? I don’t think so.

But on the broader question — Should we be particularly concerned about lives with suffering and with worst-case outcomes? — I think the answer is yes. I think we should give more weight to the avoidance of suffering than to the promotion of good things.

Longtermists aren’t just concerned with human well-being. And at one point in your book, you concede that it is possible that the suffering of factory-farmed animals is so profound that it may, on net, overwhelm the positive well-being of all humans. If that’s the case, then it seems unclear whether maximizing the longevity of the human species is a net-positive endeavor. After all, human consumption of factory-farmed meat has increased steadily with our species’ enrichment. If we posit that humanity is likely to get wealthier in the future, then there’s a good chance that the ratio of miserable factory-farmed animals to happy human beings is going to grow in future decades. What’s more, if we create sentient AIs — and figure out how to retain dominance over them — we could create billions of enslaved digital minds living in nigh-eternal torment. So if we aren’t even sure that prolonging humanity’s existence isn’t net negative, why should we divert resources from reducing the suffering of existing people to prevent our hypothetical future extinction?
Yeah, this really concerns me. And I think it’s a good argument for longtermists to prioritize trajectory change over the mitigation of extinction risk. Among people I know, there’s a wide diversity of views on the expected value of the future. Some people are real optimists — they think we’re going to just converge on the best state of affairs over the long term. And I do think the future’s good in expectation. People optimize for good things. Sadists and psychopaths notwithstanding, they don’t generally optimize for doing harm. More people want to reduce the suffering of factory-farm animals than actively want to perpetuate their suffering. People just want to eat meat; the suffering is just a negative side effect. So if we can grow meat in a lab in the future, those of us really concerned about animal welfare will push for that to be the only meat. And other people probably won’t care particularly.

So that’s the mechanism that makes me think the future is biased in a positive direction. But I do think this is one of the big, hairy philosophical questions I’d like to see a lot more work done on, because it does change how you prioritize things.

Effective altruists share many ideological commitments with progressive political activists. Like left internationalists, EAs reject moral particularism and insist that all human beings are of equal moral concern. Like many racial-justice advocates, they worry a lot about being on “the wrong side of history.” Like socialists, they aim to leverage humanity’s burgeoning technological prowess to build a future of universal human flourishing. Critically, they also share a recruitment pool with today’s left-wing social movements: Both EAs and progressive organizations draw members from the population of idealistic college graduates.

You yourself gave left-wing political activism a try before turning to effective altruism. Why did you make that transition? And, relatedly, why would you advise a young person concerned about the inequality and injustices of the global order to dedicate their finite funds and free time to EA instead of, say, the DSA? 
First, I’ll say EA is not a monolith, and there’s a diversity of political views within the movement. But speaking for myself: Yeah, definitely, I came to effective altruism through left-wing concerns. That was my start. When I was younger and getting morally concerned, the first place I started was voting for the Greens. And I fit that whole profile: I thought The Guardian was too right wing for me, got involved in left-wing politics. And I really see what I’m doing now, and effective altruism, as a continuation of progressive ideals, which is just, look, you take very seriously equal concern for everyone with special concern for the disempowered and disenfranchised. That naturally takes you to concern for the global poor, for nonhuman animals, for people who are born in generations to come.

And then secondly, okay, well what do we do about that? Well, we need to just make some really hard decisions and engage in some really hard trade-offs. And that’s what I see effective altruism being about. It combines this moral view that emphasizes impartiality — “All sentient beings matter” — with an insistence on grappling with the reality that we cannot do everything. If I just pick my favorite cause, then that’s just unlikely to be the very best way of helping.

In fact, if a cause is mainstream at the moment — if it’s the thing that everyone in left-wing politics is focusing on — then it probably isn’t the place where I can make the most difference. So climate change is this enormous problem, this huge challenge for the world. But thankfully, we’ve had 50 years of activism on it. And now hundreds of billions of dollars per year are being spent on addressing it. With worst-case pandemics or AI, we’re where climate change was in the ’60s.

Separately, a lot of social activism has a focus on feeling bad or boggling at the problems in the world. Whereas, from my perspective, it’s like, Okay, I want to leave the world better than when I found it. That’s the underlying motivation. And no matter how bad the world is, you can always do that.

Left-wing critics of effective altruism have argued that asking how you, as an individual, can do the most good — in a really direct and immediate sense — bakes in a kind of political pessimism. In a context where organized labor is declining and the global South is mounting no significant challenge to the inequities of the global order, it may be the case that the best thing any individual middle-class Westerner can do for the world’s poor is to donate to effective nonprofits. But this approach to change will only ever mitigate the symptoms of global inequality rather than tackle it at the root. At any given point in history, an individual seeking to do the most good by themselves will necessarily have to seek change within the existing institutional structure since no one person can hope to change that structure. But truly effective altruism requires change on a scale that philanthropy cannot hope to deliver. Making that kind of change requires a leap of faith that if I, as an individual, commit myself to a radical movement for systemic change, others will follow — a leap that cannot be justified with reference to hard data.
There’s something I find funny in the objections to EA: On the global-health and development side, the complaint is “You’re too focused on what you can measure. You need to make this leap of faith.” Then on the long-term side, it’s like, “What? You can’t justify any of this. What about the evidence?” And I’m like, “No. Look, there’s this spectrum of EA thought. There’s the people who want hard evidence, and then they end up in bed nets. Then there are the people who want to go with whatever the biggest value is, and they gravitate towards longtermism.”

Right. To be clear, there are distinct, mutually contradictory critiques coming from different sets of EA skeptics. But if longtermists stipulate that we can’t measure the most important things — and that there’s huge value to even low-probability interventions that have a huge payoff — then why couldn’t that reasoning justify trying to foment, like, a global egalitarian revolution? 
I think the longtermist logic does apply there in principle. It’s just that people don’t particularly buy it. I don’t represent everyone in the movement. But in my own case, I think if you want to create a socialist utopia, the most plausible route to that world runs through AI for sure. If anything in my lifetime is going to completely reshape civilization and allow for radically new economic systems, it’s going to be artificial general intelligence. That would be my guess.

And very few people are paying attention to it right now. That means that a marginal person can make this enormous difference. In contrast, if it’s like, Okay, I’m going to aim for a global communist revolution, well, a lot of people are already aiming for that. And you’re going against an extremely entrenched political order. So I’m just more pessimistic on that panning out. (Of course, there are also separate questions about whether global communist rule would actually go well. The track record is not great.)

But EA is not anti-political. It has this agnostic starting point. We just find that, in different areas, different mechanisms of change work better. We all think that promoting economic growth in poor countries, if we could make it happen, would be better than giving bed nets. But concretely, what do we do? One option is greater migration. And that’s something that has been funded a lot.

But take animal welfare. You might think that the best way to promote it is to encourage people to go vegetarian. But it turns out that doesn’t have that big an impact. So, okay, maybe the best thing is lobbying governments. Well, it turns out that the agricultural lobby has just got such a stranglehold on the system that governments won’t do much. So, in that case, it turned out that corporate campaigning was the best thing to do.

By contrast, in the case of pandemic preparedness, my best guess is that policy will be the best solution. The best things you can be funding are programs that are providing more technical expertise to governments and lobbying those governments to take the issue more seriously. I mean, forget longtermism. Forget valuing the lives of anyone outside the United States. Even if we only care about risks to U.S. citizens in the next 30 years, the U.S. government should be putting way more resources toward pandemic preparedness.

So the issue at the moment is political. There’s no one currently in Congress who’s championing pandemic prevention. After 9/11, there was like a trillion dollars’ worth of foreign interventions, a new Department of Homeland Security. After COVID, it’s crickets. Just absolutely nothing. There was this bill that would have allocated $70 billion of spending, which is still less than a tenth of U.S. spending on the war on terror. No one wanted to take it up.

So even just getting the U.S. government to represent the interests of its own people has this enormous benefit in terms of pandemics. On that issue, influencing governments is going to far outweigh any individual actions. So that’s the answer: There is no ideological presumption in favor of any one means of change.

This interview has been edited and condensed for clarity.
