Research

Stable Diffusion and open creative models

Open generative image models change who can experiment with visual systems.

Before the wall came down

For most of 2021 and into 2022, generating images with a model meant queuing for DALL-E access, applying to a waitlist, or working inside a closed beta. The results were remarkable. Text-to-image generation had crossed some threshold where the outputs were genuinely surprising, even to people who had spent time thinking about this. But access was narrow. If you wanted to experiment at any depth, you needed permission, and permission was rationed.

That arrangement created a particular kind of creative class: people inside labs, people with API keys, people who could see what these tools actually did when you pushed them. Everyone else was reading about it. The gap between what was known inside and what could be practised outside was large, and widening.

Stable Diffusion changed that.

Released publicly by Stability AI in August 2022, Stable Diffusion was the first major open-weight image generation model that a developer, a designer, or a curious person with a reasonably modern machine could actually run. Not through an API call that returned a result: locally. On their own hardware. With full access to the weights, the architecture, and the community of people who immediately began adapting, fine-tuning, and building on top of it.

The practical effect was immediate. Within weeks, tooling multiplied. Fine-tuned variants appeared. Artists trained custom models on specific styles. Developers built inference pipelines. The research community, which had been working at one speed inside its institutions, now had an enormous number of additional pairs of hands.

The difference between access and capability

There is a recurring mistake in how new technologies get described, and it applies cleanly here. When a capable tool becomes broadly accessible, commentary often frames this as capability becoming democratised, as if everyone who can now reach a tool suddenly shares the capability of the people who built or mastered it. That is not quite right, and conflating the two leads to confused conclusions.

What Stable Diffusion democratised was access. That matters enormously, but it is different from democratising capability.

Capability, the ability to do something meaningful with a tool, still requires taste, context, and the ability to ask interesting questions. Stable Diffusion running on your laptop does not tell you what to make or why. It produces outputs at scale. What you do with that scale, which direction you point it, how you assess and discard and refine: that remains a human problem, and it is not a trivially solved one.

What the open model moment actually changed was the size of the cohort who could begin accumulating that capability. Previously, you could read about these systems without being able to use them in any deep way. The lag between reading and doing is enormous, because real understanding of a tool comes from operating it, from discovering where it fails, from developing intuitions that are not available in documentation. Stable Diffusion compressed that lag significantly. Suddenly, the circle of people who could build genuine intuitions about generative visual systems grew by orders of magnitude.

That is not the same as those systems being equally capable in everyone's hands. But it is genuinely significant, and it matters for how quickly the field as a whole would now move.

What open weights make possible

The release of an open-weight model is structurally different from the release of an API. When you use an API, you can build on a capability, but you cannot meaningfully change what that capability is. You query a black box and you use what comes back. The model itself, its parameters, its learned representations, the weights that encode what it knows, sits behind a wall.

Open weights remove that wall. The model becomes a substrate. Researchers and practitioners can inspect it, modify it, extend it, fine-tune it on specific domains, merge it with other models, extract behaviours, probe failure modes, and build tools that embed the model itself rather than merely calling it.
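To make "merge it with other models" concrete, here is a minimal sketch of one simple way to blend two sets of weights: a linear interpolation between matching parameters. It assumes both checkpoints share identical parameter names and sizes. The name `merge_weights`, the `alpha` parameter, and the toy dictionaries are hypothetical and used only for illustration. Real checkpoints hold tensors rather than short lists, and this is not presented as the method any particular tool uses.

```python
from typing import Dict, List

# Assumption: a checkpoint is modelled as parameter name -> flattened values.
Weights = Dict[str, List[float]]


def merge_weights(a: Weights, b: Weights, alpha: float) -> Weights:
    """Return (1 - alpha) * a + alpha * b for every parameter."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    if a.keys() != b.keys():
        raise ValueError("checkpoints must have the same parameter names")

    merged: Weights = {}
    for name, values_a in a.items():
        values_b = b[name]
        if len(values_a) != len(values_b):
            raise ValueError(f"size mismatch for {name!r}")
        # Blend each pair of values; alpha=0 keeps a, alpha=1 keeps b.
        merged[name] = [
            (1.0 - alpha) * x + alpha * y for x, y in zip(values_a, values_b)
        ]
    return merged


# Toy stand-ins for a base model and a style-tuned variant (hypothetical data).
base = {"layer.weight": [0.0, 1.0], "layer.bias": [2.0]}
styled = {"layer.weight": [1.0, 3.0], "layer.bias": [4.0]}

merged = merge_weights(base, styled, alpha=0.5)
print(merged)
```

None of this is possible against an API, which returns only outputs. The blend operates on the parameters themselves, and parameters are exactly what an open-weight release exposes.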

This produces a different kind of creative infrastructure. Within months of Stable Diffusion's release, the open-source community had produced capabilities that would likely have taken a closed lab significantly longer to ship: fine-tuning pipelines for specific artistic styles, tools for preserving identity across generated images, methods for generating consistently composed scenes, integrations with other generative systems. The surface area of what was possible grew very fast.

Some of this would raise legitimate questions: about training data provenance, about the ability to generate misleading images without gatekeeping, about what it means for professional creative work when the barrier to generating visual material drops this far. Those are real concerns, and worth thinking about carefully rather than dismissing as anti-progress sentiment. Open access carries externalities that closed access does not.

But the creative and research upside was equally real. This was not a marginal speed increase. It was a structural change in how many people could participate in developing the field.

Taste is not automated

If there is a productive frame for thinking about what this moment meant for people building products, it is this: when generation becomes cheap, curation becomes the constraint.

The bottleneck in creative work shifts. Before Stable Diffusion, if you wanted a custom illustration, a piece of concept art, a mood board for a brand direction, you needed time, skill, or budget to produce the images themselves. That production cost determined the scope of what you explored. You narrowed your search before you started because you could not afford to search broadly.

With open generative models, the production cost drops dramatically. You can generate a hundred variations of a visual direction in an afternoon. You can explore brand aesthetics at a speed that was not practically available before. The question becomes less "can we make this?" and more "which of these is actually any good, and why?"

That second question is not solved by the model. The model is genuinely indifferent to whether its outputs are interesting, coherent with your intent, appropriate for the context, or worth keeping. It will produce rubbish at exactly the same speed it produces something remarkable. Taste, the capacity to tell the difference and to have a view about what "better" means in a given context, becomes more important, not less, as generation gets cheaper.

This might seem counterintuitive. But consider what it implies for anyone building a practice around these tools. The competitive advantage is not access to generation, which is now widespread. The advantage lies in having standards, in being able to make decisions about output, in understanding what you are actually trying to achieve well enough to recognise when you have arrived.

What this changes in practice

For TUXX, the practical shift was in how quickly visual exploration could become part of early product and client work. Brand direction, interface aesthetics, the visual register of a product: these conversations had always been constrained by how quickly rough concepts could be produced. Open generative models changed that constraint materially. The exploration could happen earlier, faster, and at lower cost, provided the person operating the tool had a strong enough view about what they were looking for.

For Benediction Lab, the more interesting question was structural. What does it mean for research into generative systems when a major open-weight model is available to examine directly? The ability to run controlled experiments, to probe how training manifests in specific outputs, to trace the relationship between architecture decisions and creative behaviour: all of that becomes substantially more tractable when the weights are public. Closed APIs produce outputs you can study. Open models produce systems you can study.

For All Purpose and its consumer-facing products, the question was further out but still present. If generating visual material becomes nearly frictionless, the interesting work shifts to curation and voice: having a point of view about what you publish and why, rather than relying on scarcity of production to enforce quality. Consumer creative tools were going to change in ways that would take time to work out, but the direction of travel was legible.

The proprietary-open balance

The summer of 2022 felt like a turning point in a longer argument about how AI capability would be distributed. For the previous few years, the dominant pattern had been centralisation: large compute, large data, large organisations producing powerful models and releasing access on their own terms. There was nothing inherently wrong with that approach, and the results it produced were genuinely impressive.

Stable Diffusion introduced a different pattern: one where the model itself, not just its outputs, was part of what got released. That created a different kind of ecosystem, with different incentives, different failure modes, and different creative possibilities.

What followed was not an either-or situation. Proprietary and open models would continue developing in parallel, each with legitimate advantages. But the assumption that serious capability would only ever come from closed, well-resourced labs had been complicated. The open-weight model had arrived and was moving fast, with a community of contributors that no single lab could match for breadth of application.

That balance is still being worked out. What seems clear, from the vantage point of mid-2022, is that the question of who gets to experiment with powerful tools will look very different in a year's time from how it looked over the preceding few years. Access has shifted. The population of people developing genuine intuitions about generative systems has grown significantly and will keep growing.

What they do with that, what questions they bring to these tools, what standards they maintain, what they choose to build, remains the more interesting problem.

---

*Sources: Stability AI public release notes; Stanford HAI AI Index Report 2022.*