AI as operational infrastructure

AI moving from novelty to infrastructure

From demo to dependency

Something shifted in November 2022. Not gradually. Sharply. The launch of ChatGPT moved the public conversation about AI from a slow simmer to a rolling boil, and January 2023 finds us in the middle of a particular kind of cognitive hangover: everyone is impressed, most people are confused, and very few organisations have worked out what to actually do next.

The question "can AI be useful?" is largely answered. The answer is yes, demonstrably, across a broad range of tasks. That question no longer needs much ink. What requires serious thought now is the harder question beneath it: how do you make AI reliable enough to depend on? Because there is a meaningful gap, wider than it looks in the demos, between something that is impressive in a controlled environment and something that functions as infrastructure inside a real business.

That gap is where most of the work lives.

What infrastructure actually means

Infrastructure is not a compliment. It is a burden. Roads, electrical grids, water systems: they are essential precisely because they are depended upon, and that dependency means they must work consistently, predictably, and invisibly. Nobody marvels at the electricity when they turn on the lights. They only notice it when it fails.

That standard (invisibility under load, reliability under stress, recoverable failure modes) is not where most AI tooling sits as of early 2023. What exists is a landscape of genuinely impressive capabilities wrapped in interfaces that require careful prompting, produce inconsistent outputs, and offer limited ability to audit what happened or why. Impressive for demos. Fragile for operations.

The Stanford HAI AI Index has been tracking AI progress across multiple dimensions, and the technical capability trajectory is real and steep. But capability metrics and operational readiness are not the same measure. A model that can write a persuasive essay on demand is not the same thing as a model that can reliably process, summarise, and route a business document inside a workflow that expects consistent output every time. One is a parlour trick, a very good one. The other is infrastructure.

The question in January 2023 is not whether the underlying capability exists. It does. The question is what it takes to close the gap.

The trust problem is not a feelings problem

There is a tendency in AI discourse to frame trust as a perception issue, a matter of making users feel comfortable, of designing reassuring interfaces, of communicating AI limitations gracefully. That framing underestimates the problem.

Trust in a business context is not an emotion. It is a functional state. A workflow trusts a tool when that tool behaves within predictable tolerances, when its failure modes are known in advance, when the humans in the system can audit what it did and intervene when something has gone wrong. That kind of trust is built through architecture, not through copy.

For AI to earn its place as operational infrastructure rather than as an optional add-on floating above real work, several things have to be true simultaneously. The system needs to know what it is supposed to do: scope matters enormously, and models given vague mandates produce vague results. The system needs to know what it is not supposed to do, which is harder than it sounds, because the boundaries of AI behaviour under edge cases are not always obvious until you have hit them. And the humans in the workflow need genuine visibility: not just a surface showing outputs, but enough transparency to understand when an output should be trusted and when it should be checked.

Building those conditions is an architectural problem. It cannot be solved by selecting a better model. It requires deliberate decisions about scope, about handoff points, about what the system is authorised to do autonomously and what requires a human in the loop. Those decisions are, in a real sense, product decisions, and they need to be made as carefully as any other part of the product.
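As a rough illustration of what "deliberate decisions about scope" might look like in code, the sketch below encodes three outcomes for any requested action: act autonomously, hand off to a human, or refuse. The class name, the action names, and the three-way split are assumptions made for illustration, not a description of any particular product.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    EXECUTE = "execute"            # the system may act on its own
    NEEDS_REVIEW = "needs_review"  # hand off to a human before acting
    REFUSE = "refuse"              # outside the system's mandate


@dataclass(frozen=True)
class ScopePolicy:
    """Hypothetical policy: explicit lists of what the system may do."""
    autonomous: frozenset = field(default_factory=frozenset)
    review_required: frozenset = field(default_factory=frozenset)

    def decide(self, action: str) -> Decision:
        if action in self.autonomous:
            return Decision.EXECUTE
        if action in self.review_required:
            return Decision.NEEDS_REVIEW
        # Anything not explicitly listed is treated as out of scope.
        return Decision.REFUSE


# Illustrative action names only.
policy = ScopePolicy(
    autonomous=frozenset({"summarise_document", "tag_record"}),
    review_required=frozenset({"send_client_email"}),
)
```

The design choice worth noticing is the default: an action nobody thought to list is refused rather than attempted, so the edge cases surface as explicit scope decisions instead of surprises.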

Embedded versus adjacent

One frame that clarifies a great deal: the difference between AI that is embedded in a workflow and AI that is adjacent to it.

Adjacent AI is what most products shipped in 2022. A button appears somewhere in the interface: "ask AI", "generate with AI", "summarise this". The user takes a deliberate step out of their workflow, hands something off to the model, receives an output, and decides what to do with it. The model is a consultant sitting in the next room. Useful, occasionally excellent. But not infrastructure.

Embedded AI operates differently. It does not wait to be invoked. It participates in the workflow as a component, processing inputs, enriching records, flagging anomalies, generating drafts, surfacing context, without requiring the user to consciously engage a separate interface. The model disappears into the operating surface. The user experiences the product as being more capable, not as having AI added to it.

That distinction matters for product strategy. Adjacent AI can be bolted onto an existing product as a feature release. Embedded AI requires a different architecture from the start: data flows designed to feed it context, clear scope boundaries defining what it is authorised to act on, audit trails that let humans understand what it did, and feedback mechanisms that allow the system to improve over time.
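To make the audit-trail requirement concrete, here is a minimal sketch of an append-only log that records what an embedded component did, on what input, and with which model version. The field names and the in-memory list are assumptions for illustration; a real system would persist entries somewhere durable.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEntry:
    timestamp: str
    component: str
    action: str
    input_summary: str
    output_summary: str
    model_version: str


class AuditTrail:
    """Append-only, in-memory log (hypothetical; not durable storage)."""

    def __init__(self):
        self._entries = []

    def record(self, component, action, input_summary, output_summary, model_version):
        entry = AuditEntry(
            timestamp=datetime.now(timezone.utc).isoformat(),
            component=component,
            action=action,
            input_summary=input_summary,
            output_summary=output_summary,
            model_version=model_version,
        )
        self._entries.append(entry)
        return entry

    def for_action(self, action):
        # Lets a reviewer pull every instance of one kind of behaviour.
        return [e for e in self._entries if e.action == action]

    def export(self):
        # One JSON object per line, suitable for shipping to a log store.
        return "\n".join(json.dumps(asdict(e)) for e in self._entries)
```

Recording the model version alongside each action matters because the model is the part most likely to change; without it, a reviewer cannot tell whether a shift in behaviour came from the data or from the component itself.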

The bar for embedded AI is substantially higher. So is the payoff.

What this means for how we build

Orbit is conceived from the start as a system where the intelligence layer is embedded, not adjacent. The operating surface for commercial execution, the lead-to-launched-product workflow, is precisely the kind of environment where embedded AI can compound value over time, because it has access to the context that makes intelligence useful: history, relationships, status, patterns across accounts.

Orion, as the AI intelligence layer underpinning Orbit, is not a wrapper around a single model. The work happening here is about memory, context, reasoning across time. A business operating system that only knows what happened in the last conversation is not meaningfully smarter than a well-designed spreadsheet. Intelligence at the operating system level requires persistent context: the system needs to understand what happened last week, what was promised last month, what this client's pattern of behaviour has been over the relationship. That is an architectural requirement, not a model quality requirement.
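A minimal sketch of what persistent context could mean in practice: events are stored per account with a timestamp, and the system can recall what happened within a chosen window before it reasons about the present. This is an illustration under stated assumptions, not a description of how Orion is built; the class name, the table layout, and the use of SQLite are all hypothetical choices.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


class ContextStore:
    """Hypothetical per-account event memory backed by SQLite."""

    def __init__(self, path=":memory:"):
        # Pass a file path instead of ":memory:" to survive restarts.
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS events ("
            "account TEXT NOT NULL, occurred_at REAL NOT NULL, note TEXT NOT NULL)"
        )

    def remember(self, account, note, occurred_at):
        # Store the time as a Unix timestamp so range queries are numeric.
        self._db.execute(
            "INSERT INTO events (account, occurred_at, note) VALUES (?, ?, ?)",
            (account, occurred_at.timestamp(), note),
        )
        self._db.commit()

    def recall(self, account, since):
        # Return notes for one account from `since` onwards, oldest first.
        rows = self._db.execute(
            "SELECT note FROM events WHERE account = ? AND occurred_at >= ? "
            "ORDER BY occurred_at",
            (account, since.timestamp()),
        )
        return [row[0] for row in rows]
```

With a store like this, "what was promised last month" becomes a query rather than something the model is expected to have remembered, which is the sense in which the requirement is architectural.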

TUXX, as the services arm, encounters this problem from the client side. The gap between a persuasive demo and a system a business can actually depend on is one that custom AI work runs into constantly. The work there is less about building novel capabilities and more about building reliable ones: scoping tightly, designing failure modes, creating the audit surfaces that let people trust what the system is doing. Pattern Up, operating within that context, is a direct expression of that philosophy: systems that make capability tangible and trustworthy rather than impressive and opaque.

For CheekyGains and Naira, the embedded-versus-adjacent distinction maps onto a different set of stakes. A performance coach that only activates when the user explicitly asks it something is a feature. A coach that participates in the training cycle, noticing patterns, surfacing relevant context, prompting reflection at the right moment, is a product. The difference is architectural, and it requires thinking about what the system should know at every point in the user's journey, not just when they open a dialogue box.

The operational questions worth asking

At this point in 2023, the most useful framework for any organisation thinking seriously about AI is not "which model should we use?" That question is premature and will need to be revisited regardless. The more durable questions are:

What does the system need to know to be genuinely useful? This is a context architecture question. Models are only as intelligent as the context they are given. Designing the context pipeline (what information flows in, at what points, and with what structure) is as important as anything else.

What is the system authorised to do without asking? This is a trust boundary question. Clear scope is not a limitation on AI capability. It is the precondition for trusting it. A system with undefined scope is a system nobody will depend on.

What happens when it goes wrong? This is a failure mode question, and it should be asked early. Not because AI systems fail constantly (well-scoped ones do not), but because the answers shape the architecture. Recoverable failures require different designs than catastrophic ones. Knowing the failure modes in advance allows you to build in the right safeguards.
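One way to picture a recoverable failure mode is sketched below: a document-routing step that validates the model's answer against a known set of routes, retries a bounded number of times, and escalates to a human when the answer is still unusable. The route names, the `classify` callable, and the retry count are assumptions for illustration; `classify` stands in for whatever model call a real system would make.

```python
from dataclasses import dataclass
from typing import Callable, Optional

ALLOWED_ROUTES = {"sales", "support", "finance"}  # hypothetical queues


@dataclass
class RoutingResult:
    route: Optional[str]
    escalated: bool
    reason: str


def route_document(
    text: str,
    classify: Callable[[str], str],
    max_attempts: int = 2,
) -> RoutingResult:
    last_error = "no attempts made"
    for _ in range(max_attempts):
        try:
            label = classify(text).strip().lower()
        except Exception as exc:
            # A failing model call is treated like any other bad answer.
            last_error = f"classifier raised {type(exc).__name__}"
            continue
        if label in ALLOWED_ROUTES:
            return RoutingResult(route=label, escalated=False, reason="ok")
        last_error = f"unexpected label {label!r}"
    # Never guess: hand the document to a human with the reason attached.
    return RoutingResult(route=None, escalated=True, reason=last_error)
```

The workflow downstream only ever sees a valid route or an explicit escalation, so an inconsistent model output degrades into extra human work rather than a misrouted document.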

How does the system improve over time? Infrastructure should get better the more it is used. That requires feedback loops, which require deliberate design. What signals tell the system it did well? What signals tell it something went wrong? Who sees those signals, and what happens as a result?
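A feedback loop can start very small. The sketch below assumes three signals a user might give about an output (accepted as is, edited, or rejected) and tracks an acceptance rate per task so that someone can see which tasks are underperforming. The signal names, the threshold, and the minimum sample size are illustrative assumptions rather than recommendations.

```python
from collections import Counter, defaultdict

SIGNALS = ("accepted", "edited", "rejected")  # hypothetical signal set


class FeedbackLog:
    def __init__(self):
        self._counts = defaultdict(Counter)

    def record(self, task, signal):
        if signal not in SIGNALS:
            raise ValueError(f"unknown signal: {signal!r}")
        self._counts[task][signal] += 1

    def acceptance_rate(self, task):
        counts = self._counts.get(task)
        if not counts:
            return None  # no feedback recorded for this task yet
        return counts["accepted"] / sum(counts.values())

    def needs_attention(self, threshold=0.8, min_samples=5):
        # Tasks with enough feedback whose acceptance rate is below threshold.
        flagged = []
        for task, counts in self._counts.items():
            total = sum(counts.values())
            if total >= min_samples and counts["accepted"] / total < threshold:
                flagged.append(task)
        return sorted(flagged)
```

The list returned by `needs_attention` is the answer to "who sees those signals": it is meant to land in front of a person who can tighten the scope, change the context, or pull the task back to manual handling.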

These are not glamorous questions. They do not generate press releases. But they are the questions that separate AI as infrastructure from AI as an interesting experiment.

The longer bet

The organisations that will find themselves with a durable advantage from AI are not the ones that moved fastest in late 2022. They are the ones that started asking operational questions in early 2023 while everyone else was still distracted by the demos.

The capability is remarkable. Nobody should understate that. But remarkable capability plus poor architecture produces chaos at scale. The same capability, embedded deliberately inside well-designed systems, produces something that compounds over time.

That is the bet worth making: not on the model, but on the architecture around it. The model will change. The architecture, if it is well designed, is what persists.