Writing, automation and creative leverage
Language systems as leverage for writing and operations
What GPT-2 made visible
In February 2019, OpenAI announced GPT-2, a language model it declined to release in full. The stated reason was concern about misuse: the system could generate long-form, coherent, stylistically consistent text from a short prompt. Whatever one thinks of the decision, the more interesting thing was the demonstration itself. A short passage of text was fed into the model, and the system continued it. Not perfectly. Not reliably. But well enough that the result required careful reading to distinguish from something written by a person in a hurry.
This was not the first language model and it will not be the last. But GPT-2 landed differently from its predecessors because it operated at a scale and fluency that moved the capability from the technical press into the general conversation. The question being asked in March 2019, in various forms across technology, media and creative industries, was: what does this mean for writing?
It is a reasonable question. It is also, if approached carelessly, a question that produces bad answers.
The wrong frame
The instinctive frame is substitution. Can a model write what a person would have written? And if it can, what happens to the person?
This frame produces anxiety, and anxiety produces bad reasoning. It causes people to focus on the output (the paragraph, the sentence, the tone) rather than on the process that makes the output worth producing in the first place.
Writing is not primarily about producing text. It is about thinking in public. The discipline of forming a sentence forces the formation of a thought. The act of structuring an argument requires the argument to exist. Much of what makes writing difficult is not the mechanical production of words: it is the prior work of knowing what you are trying to say, and why it matters, and to whom.
No language model that exists at the time of writing does that prior work. The model continues from wherever you start. It has no stake in whether the argument is true, no capacity to notice when the reasoning is weak, no sense of whether the piece is worth writing at all. It can match a voice. It cannot choose a position.
The substitution frame misses this because it is focused on surface outputs rather than cognitive process. It asks whether the model can produce something that looks like writing. The more productive question is what writing is actually for, and what changes when certain parts of it can be accelerated.
Creative leverage versus creative replacement
The distinction that matters here is between creative leverage and creative replacement.
Creative leverage is using a system, any system and not only AI, to explore more territory in less time. A writer who can generate twenty rough variations of an opening paragraph in the time it would previously have taken to produce two has more material to choose from. They can see more options, reject more of them, and arrive at something better than they might have reached by labouring over the first attempt. The direction, the judgement, the final selection: those remain human. The system expanded the possibility space; the writer navigated it.
Creative replacement is something different. It is using a system to avoid the cognitive work of deciding what to say. The prompt is vague, the output is accepted without scrutiny, the result is published without the author truly owning it. The system did not expand a possibility space. It filled a gap that the writer did not want to occupy. The work that gets published is nobody's work in particular.
The difference matters for quality, obviously. But it matters for something else too: voice. A writer's voice is not simply a stylistic pattern. It is the accumulated record of choices made under pressure: this word, not that one; this argument, not a smoother-sounding alternative. Voice is what happens when someone thinks clearly enough and consistently enough that their reasoning becomes recognisable. It cannot be inherited from a model, because the model has no reasoning of its own to give.
This does not mean assistance corrupts voice. A sharp editor makes a writer's voice more itself, not less. The question is who is doing the thinking. Assistance that sharpens thinking is leverage. Assistance that replaces thinking is substitution, regardless of how fluent the output looks.
Writing as an operational surface
There is a second dimension to this that gets less attention than the creative one. Most writing that organisations produce is not creative in the artistic sense: it is operational. Briefs, proposals, updates, summaries, documentation, follow-ups. The volume is large, the quality requirements are specific, and the cognitive cost of producing it is real but often underestimated because it is spread across many people in small doses.
If language models make operational writing faster and more consistent, the gain is not aesthetic. It is structural. Organisations that spend less cognitive energy on mechanical writing have more available for the decisions the writing is meant to support. The brief gets produced without a week of internal back-and-forth. The summary of a meeting is accurate and available immediately rather than reconstructed from notes three days later. The follow-up happens the same day instead of getting deprioritised because someone had to find the words.
This is not a trivial efficiency. It is a change in the operating tempo of an institution. And it is more immediately achievable in 2019 than the creative applications, because operational writing has clearer success criteria. The brief is good if it accurately captures the requirements. The summary is good if it matches what was decided. There is less ambiguity about what the output should be, which means there is less ambiguity about when the assistance is working.
At MSG, this distinction shapes how we think about language capabilities in the products we are building. The creative surface, where voice and reasoning and argument matter, requires a fundamentally different approach from the operational surface, where accuracy, consistency and speed are the relevant measures. Conflating them produces systems that are unsatisfying in both directions: not expressive enough for creative work, not precise enough for operational work.
Where direction comes from
There is a quiet assumption embedded in the anxiety about writing automation, and it is worth naming: the assumption that direction is obvious. That once a system can produce text, the only remaining problem is the production itself.
In practice, direction is the hard part. What angle on this topic is worth taking? What does the reader actually need to understand, and what is already clear to them? What does this piece need to do: convince, inform, document, inspire? What would make it worth reading rather than merely competent?
None of these questions are answered by a language model. They are answered by the person who understands the purpose of the work: the writer, the founder, the strategist, whoever is responsible for the thinking that the text is meant to represent. The model can execute on direction very capably once direction exists. It cannot provide direction from nothing.
This is encouraging rather than alarming, if you sit with it for a moment. It means that the people who think clearly about what they want to communicate retain full control over what actually gets said. The leverage goes to those who know what they are trying to do. The people most at risk of displacement are those who were primarily producing text without a clear sense of why, which is a reasonable description of a significant volume of content that exists in the world without making much contribution to anything.
What this changes in practice
For anyone building with language capabilities in this period, the practical implication is to invest more in the front end of the writing process, not less. Clearer briefs. More disciplined thinking about audience and purpose before any word is written. Stronger positions stated at the outset, so that assistance can execute against something specific rather than filling a void with fluency.
The models will improve. The fluency will become harder to distinguish from considered writing. The volume of text that can be produced without effort will increase substantially. What will remain scarce is exactly what has always been scarce: someone with a genuine point of view, expressed clearly, in service of something that matters to the people reading it.
That is what voice is. It was never about the mechanics of sentence construction. It was always about whether anyone was really in there, thinking.
The model layer in March 2019 is a tool. A capable one, with implications that are still becoming clear. But it is a tool in the hands of whoever decides what to say. That decision remains, as it has always been, the most important part of the work.