In early July, the Associated Press made a deal with OpenAI, maker of ChatGPT, to license “part of AP’s text archive” and get access to “OpenAI’s technology and product expertise.” A few days later, OpenAI announced a $5 million grant, accompanied by $5 million in software use “credits,” to the American Journalism Project, an organization that supports nonprofit newsrooms. Meanwhile, Google has reportedly been presenting major news organizations, including the New York Times, the Washington Post, and the Wall Street Journal, with a new software “personal assistant” for journalists, code-named Genesis, which promises to “take in information — details of current events, for example — and generate news content,” with a pitch described by some in attendance as unsettling. A number of news organizations, including G/O Media, which owns Gizmodo, Jezebel, and The Onion, are experimenting with blog-style content generated from scratch, and plenty of others, with varying degrees of transparency, have started to dabble.
Last week, Semafor reported that the next significant meeting between news organizations and AI firms might occur in court: Barry Diller’s IAC, along with “a handful of key publishers,” including the Times, News Corp, and Axel Springer, is reportedly “formalizing a coalition that could lead a lawsuit as well as press for legislative action.” They’re not looking for small grants or exploratory collaborations. In their view, AI companies are systematically stealing content in order to train software models to copy it. They’re looking for compensation that could “run into the billions.”
These are, it is fair to say, the inconsistent actions of a mixed-up industry confronting speculative disruption from a position of weakness. This is not ideal if you’re the sort of person who places much stock in a functional Fourth Estate, but it’s also not unique: In conference rooms around the world, white-collar workers are stumbling through mind-numbing conversations about incoherent presentations on the imminent approach of AI with the assignment or intention of making some — any! — sort of plan. It’s also understandable. It’s easier to get the leadership at OpenAI and Google to talk about the apocalypse than it is to get a clear sense of even their own plans for making money with large language models, much less how those plans might affect the reporting and distribution of the news. The media industry’s particular expressions of panic are a result of a comprehensive sense of exposure to these new forms of automation — which is arguably the best way to think about artificial intelligence — combined with a sense of profound confusion about what the challenges are and for whom.
The industry’s scattered early responses to AI do, however, seem to contain some assumptions, and from those assumptions we can extrapolate some possible futures — if not the likely ones, then at least ones that people in charge of the news business are most excited about or of which they are most afraid. The news media’s flailing early responses to AI are, in their own ways, predictions. There are, so far, a few dominant schools of thought about this.
Theory 1: AI replaces journalism
At one extreme, you have online-first news organizations that are ready to start generating more content now in the most straightforward way possible: asking tools based on new large language models to compose stories about x or y for direct or lightly edited publication. CNET, the tech-news site, adopted this strategy early but pulled back after its little-read content was found to be full of egregious errors; undeterred, G/O is giving a similar strategy a shot with an added dash of antagonism toward its unionized employees.
It’s tempting to get stuck on the question of whether AI tools are (or soon will be) capable of producing plausible versions of much of the content already published by these publications. As a strategy, however, the all-in-on-AI approach renders that question irrelevant. If a bot can’t convincingly mass-produce content people want to read, or at least content against which views can be somehow harvested by publishers for ads, then the plan fails. If it can — that is, if G/O can replace a bunch of its content with AI-generated topical blog posts and retain some sort of profitable readership — then it would seem the plan still fails, because if G/O can, then anyone can and will. The cost of a skimmed-and-discarded blog post will approach zero and, like other forms of content that have already been substantially automated (quarterly-earnings summaries, weather reports, basic sports results), will cease to produce much value on its own.
Another issue with this strategy is that it pits small publishers using tech companies’ tools against the big tech companies themselves. Google is testing out a feature that provides AI-generated answers at the top of its search results. I’ve been using it for a few months, and its prospects are interesting and complicated. The tasks for which it seems most competent — providing some quick background on a broad subject, news event, or concept; suggesting products or content; reciting objectivish facts about a person or thing — are the ones that make it seem the most like a plagiarism machine. It’s better than I expected it to be in a narrow sense. But it’s also been surprising in a psychological and behavioral sense in that Google’s credible, relevant-seeming AI content has become, at least to this user, something like an ad: a second-tier block of content over which my eyes are learning to skim. I bring this up not to make a prediction about the limits of AI-generated mass-media content but instead to emphasize the hopelessness of competing with it as a traditional publication that hopes to harvest search traffic with definitionally worse and less-relevant AI-generated mass-media content. Again, the answers to the most obvious question — what sorts of content can a machine generate while maintaining the interest of human readers — become irrelevant.
The theoretical fully automated G/O blog doesn’t win, in other words. More likely, it faces relegation, in the eyes of digital platforms, to the status of spam. It’s unlikely that any existing news organization will end up exploring such a possibility to its full expression despite its superficial appeal to management. If it did, it — or at least we — would face a final problem: a fully automated news product, produced by incurious synthesis machines, based on already available information, and without a shred of oppositional agency, would amount to, at best, low-value aggregation, and, at worst, something like advertising. Or propaganda! It would be automating the parts of “the news” that are already cheap.
Theory 2: AI improves journalism
Much more common, so far, are tentative plans to figure out how text-generation tools might be useful in human-centric news production and distribution. In April, Insider editor Nicholas Carlson gave his newsroom permission, or direction, to use tools like ChatGPT. “My takeaway after a fair amount of experimentation with ChatGPT is that generative AI can make all of you better editors, reporters, and producers, too,” he wrote in a memo to staff, at the same time cautioning that such tools can provide incorrect information and effectively plagiarize. “We know it can help you solve writing problems. But your stories must be completely written by you,” he wrote, leaving staffers with either a number of follow-up questions or perhaps just a bit of leeway. (In an interview with Axios, he framed generative AI as a tsunami: “We can either ride it or get wiped out by it.”) Farhad Manjoo, a columnist at the Times, shared how he was already using ChatGPT for work in minor assistive ways that felt, to him, helpful and ethical: brainstorming transitions; remembering words that had escaped him; getting “unstuck.”
Some version of this strategy will occur naturally in newsrooms, as staffers like Manjoo — an opinion columnist, not an investigative reporter — experiment with new tools that make their jobs easier, stepping on a few rakes along the way. Google’s “Genesis” story-generation tool represents a fuller expression of this approach. As a piece of software that will attempt to synthesize chosen and at least theoretically novel information into a newslike form, it promises to replace or streamline an important step in the production of stories. (It wouldn’t be the first time Google partially automated the practice of journalism: Lots of stories, big and small, carelessly and carefully reported and composed, start with a Google search.)
An optimistic theory about tools like this is that they’ll simply make certain journalistic operations more productive, partially automating friction-filled processes — typing, transcription, maybe even puzzling over structure — and freeing up resources to pursue, say, actual reporting. A great deal of what news organizations publish, including the incremental revelations of new or rare information most commonly associated with “the news,” is presented in fairly standardized forms and styles, the thinking goes. The news is made novel, and valuable, by what it reveals, not the particular style of its revelation or distribution.
This school of thought places tools like ChatGPT on a spectrum with earlier forms of automation in and around the news business, from word processors and spell-checkers to internet publishing and social media; it also takes an idealistic view of journalism and its function in the world. It addresses concerns about accuracy, bias, and plagiarism by emphasizing the final role of the human writer, who becomes, I guess, a sort of editor, or auditor, for content produced not by but with the help of AI tools.
This optimism depends on a few different assumptions. One is that ChatGPT-esque tools will prove genuinely useful to reporters in the long term, providing meaningful help to people working to report, contextualize, and analyze new information about the world — an assumption that seems to be, at this point, helped along by brief experiences testing out ChatGPT and an unfortunate susceptibility to narratives of inevitable tech. Another is that, in the actual workplaces in which journalism is produced, the time and resources freed up by easier content production would be automatically reallocated to, say, accountability journalism, in-the-field reporting, or whatever else news executives like to say they want more of. But why not … decreased staffing levels? Higher expectations for total volume of content produced? In a moment where many newsrooms are suffering crises of distribution and revenue — problems for which generative AI offers no obvious help — the most obvious use for any form of automation is cost reduction.
When Manjoo describes using generative tools as like “wearing a jet-pack,” he’s describing the process of using them on his own terms in a professional context where most other people — managers, co-workers, peers — don’t. It’s interesting! Early-adopter office workers in various industries have been sharing similar experiences since ChatGPT came out about how the ability to outsource rote email writing or basic programming tasks has freed up time or energy. It’s also maybe fleeting. In the longer term, it won’t be workers who decide how new productivity tools are deployed or how potential gains in productivity are absorbed — it’ll be people in charge of the news business, who have their own priorities that, to put it gently, don’t always align with those of their staff.
Theory 3: AI swallows journalism
The automation-curious newsroom will provide fertile ground for labor disputes, as jobs are subtly — or maybe rapidly — altered with the deployment of new machinery and by changing expectations from management. As both an actual phenomenon and a threatening discourse, automation has tended to give leverage to owners and management, and it’s not clear why we should expect this time to be any different. AI wasn’t the primary motivator for the WGA and SAG-AFTRA strikes, but it has become, in part owing to open fantasizing from studio executives about replacing writers and actors, a core subject in the dispute.
In journalism, however, it seems like an early fight over leverage and AI might unfold between news organizations and tech companies — between owners and owners — in the form of lawsuits and content-sharing deals. Semafor’s reporting suggesting that a consortium of news organizations wants billions of dollars in compensation from AI firms implies some interesting predictions on its part. Let’s say the consortium gets it — a legal question without clear precedent — and Google and Microsoft-funded OpenAI end up paying hefty fees to use news organizations’ content for training or as the source of fresh information to keep their products current. What has become of the news industry in this scenario? Are news organizations still producing content for people to read and just allowing voracious tech companies to ingest and learn from it as well? Or is their provision of content to AI companies more specific and deliberate, with reporters providing models with high-value information — quotes from sources, information gleaned from real-world reporting, documents not otherwise publicly disclosed — for repackaging elsewhere?
This would turn news organizations into diminished wire services without the need to actually write whole stories for distribution — they would just be filling in blanks for some yet-to-be-determined AI-powered news-production apparatus. Functionally, it would also make them a bit like the web scrapers AI firms already use to build their training sets and their reporters’ jobs a bit more like those of the armies of contractors whom AI companies already rely on to keep their products working.
An outcome along these lines — news organizations becoming training assistants and data providers for AI giants — seems, despite or maybe because of the big ask up front, to anticipate total domination by tech companies and the obliteration of the media as a distinct industry. Media owners seem to be trying for a sort of fantasy do-over of the social-media era, where, sure, their businesses still get marginalized, but at least they get paid. Understood as a legal gambit, it might overestimate news organizations’ value to companies that are selling all-purpose text generation and have plenty of other material to draw from. If one possible objective is an end to unauthorized story scraping, well, great, but chances are Google, which is probably the most advanced web-crawling operation in the world, will find a way to get what it needs to spit out competent search blurbs.
Like many discussions about AI, these approaches offer speculative solutions to speculative problems. If not exactly nice to think about, they’re still less grim than the situation immediately in front of newsrooms in 2023. Also absent from these approaches is the public — the readership, the people who will be theoretically served or pleased or compelled to action by journalism in the coming years. How much automation do they want in their journalism? How much do they think they want? How much do they want to want? People consume news and engage with news organizations for all sorts of reasons, some more conscious than others: boredom; a sense of responsibility; ideological commitments; hope; fear; vindictiveness; feeling smart; getting mad. The success of the Times can surely be credited in part to its incomparable package of quality news content, but it can’t be fully understood without, basically, a sociological analysis of American liberals and their consumption habits. Could ChatGPT come up with a plausible Thomas Friedman column? Absolutely. Would his many fans want to read it, and pay for it, knowing that its carefully mixed metaphors were generated by a piece of software trained on his archives? I sincerely doubt it! (The fundamental unknowability of audience motivations is just as complicated in the context of Hollywood. Is copying and automating an actor’s likeness predatory? Obviously. Will it produce anything people want to see, hear, or talk about? Less clear. Will Hollywood executives give it a shot anyway? You bet.)
Some narrow aspects of news production probably will be easier to automate than skeptics suspect or admit. Bloomberg has been using AI tools to help streamline basic financial reporting for years, and wire services have been using story generators to write sports recaps and summarize earnings reports for more than a decade. Simple summarization isn’t glamorous, but it’s useful and accounts for a lot of what the news-consuming public actually reads.
Inevitable attempts to automate the rest, however, are likely to collide with the fundamental messiness of journalism as a concept and as a practice — that is, the publication and contested contextualization of information that isn’t already out there on the internet waiting to be rolled into a training set or automatically synthesized into an aggregated stub of news. This collision is not an easy scenario to predict or control. An underrated risk is that publishers kill themselves trying.