How do I reduce unplanned downtime in my manufacturing plant?

Reducing unplanned downtime starts with understanding which equipment fails most often and whether there are early warning signs you are currently missing. The most common root causes are reactive maintenance (no planned PM schedule), missing equipment history (no CMMS), and no condition monitoring on critical assets. The 2026 Advanced Manufacturing Outlook survey of 114 Canadian manufacturers found that unplanned downtime is the top reported production loss driver. ForgeShift Compass diagnoses exactly where your maintenance practices sit today, and the Forge Sprint then executes the fix — mapping the specific interventions other manufacturers your size have used to recover 30–50% of downtime losses.

Where do I start with AI and digital transformation in my factory?

The right starting point depends on your biggest operational pain — not a generic framework. If equipment reliability is your problem, start with a CMMS and planned maintenance. If you cannot see what is happening on your floor, start with OEE tracking. If quality escapes are reaching customers, start with a process FMEA. The 2026 Advanced Manufacturing Outlook survey of 114 Canadian manufacturers found 31% say "too many choices, unsure where to start" is their biggest barrier. The free ForgeShift Compass diagnostic at forgeshiftadvisory.com/compass maps your specific operational problem to the practice and templates that solve it, benchmarked against Canadian peers.

How do I get Canadian government funding for manufacturing technology?

Several Canadian programmes support manufacturers investing in digital and operational technology. The most relevant for manufacturing SMEs are NRC IRAP (Industrial Research Assistance Program), NGen (Next Generation Manufacturing Canada) consortium project funding, and SR&ED (Scientific Research and Experimental Development) tax credits. FedDev Ontario and BDC technology financing round out the landscape for Ontario-based businesses. These programmes can often be combined — a well-structured project may recover a significant portion of implementation cost, though the right stack depends on the nature of the work and how the project is scoped.

Why do AI pilots in manufacturing succeed but fail to scale across the enterprise?

The cause is almost never the technology — it is the absence of a data strategy, operating model, and change management programme that can carry the pilot across business units and facilities. 82% of manufacturing executives cite data-related problems as the critical risk to their AI success, according to KPMG research. ForgeShift's enterprise transformation programme addresses all three barriers in sequence: Enablement (training, upskilling, operating model design), AI Product Development (data products, infrastructure, visualisation), and Domain-Driven Value Capture (domain roadmap, use case prioritisation, value realisation).

How do I get started with ForgeShift and what does an engagement look like?

Most engagements follow the same path. Start with ForgeShift Compass — describe your operational challenge in plain language and receive a gap analysis benchmarked against 114 Canadian manufacturers. No commitment required. Forge Map then routes you to peer-verified solutions matched to your specific gap, sequenced by maturity and prerequisite. From there, the path splits by challenge type. For SME operational problems — equipment downtime, OEE, quality, scaling — the Forge Sprint is a 30-day principal-led engagement: Assess (OEE benchmarking, root cause, digital readiness), Shape (roadmap, prioritisation, ROI modelling), Forge (48-hour on-site immersion, quick-win implementation), and Traction (90-day KPIs, weekly scorecards). For enterprise-scale digital and AI transformation, the Forge Transform programme follows a structured three-phase model: Enablement, AI Product Development, and Domain-Driven Value Capture. Both engagement tracks lead into Sustain — the ongoing performance rhythm that starts on day 31. All engagements are principal-led. You work directly with Hendrik Lojek.

Do you work with both small manufacturers and large enterprises?

Yes. For small and medium manufacturers, ForgeShift delivers pragmatic problem-solving: fixing the specific operational issue costing you the most, capturing critical knowledge before it walks out the door, and building the data foundation for future digital investment — solutions that fit your size and scale. For large enterprises, ForgeShift delivers structured AI and digital transformation: moving past pilot purgatory, building AI operating models, designing data products and infrastructure, and driving domain-driven value capture across a manufacturing network. Both engagements are principal-led by Hendrik Lojek.


How-To · 8 min read

Building a Deterministic AI Orchestrator.

How a deterministic orchestrator — explicit state, hooks, subagents — makes AI coding workflows reliable instead of unpredictable. Lessons from the build.

June 2, 2026 · By Hendrik Lojek

Key Takeaways

The model was never the bottleneck — the deterministic architecture around it was.
Rules that must hold go in hooks and scripts, not prose; config goes in YAML/JSON, not paragraphs.
Keep the agent thin, the skill thin, and the platform fat — and build the software process into explicit phases with gates.

I set out to make an AI coding agent reliable enough to trust with real work. What I actually learned, across 1,000 hours in Claude Code and five major rewrites of a system I call the orchestration platform, is that the model was never the problem. The architecture around it was.

If you are trying to get an agent to behave deterministically — same inputs, same disciplined process, every time — here is the path I took, the wrong turns included, so you can skip a few of mine.

Lesson 1: Context rot is the real enemy

Spend context like a budget, not a bucket. The instinct is to give the agent everything: read every file, keep every tool result, never throw anything away. But output quality starts degrading long before the window is full, not because the model runs out of room but because noise accumulates and attention thins. Chroma tested 18 frontier models and every one got worse as input grew. Facts buried in the middle of a long context also get a fraction of the attention they would get at the edges. For a coding agent this is the primary failure mode, not raw capability. The fix is curation: load skills just in time, throw away a worker's context the moment its job is done, and compact long sessions down to state files instead of dragging the whole transcript forward. Keep the window clean and the model stays sharp.

Lesson 2: Markdown is a suggestion, not an instruction

My first instinct was the obvious one: write all the rules in a CLAUDE.md file. Do this, never do that, follow these steps. It worked about seventy percent of the time, which is another way of saying it failed thirty percent of the time, unpredictably.

The reason is built into the tool. Claude Code injects your CLAUDE.md with a note that says, in effect, this context may or may not be relevant — ignore it if it is not. That is by design: it keeps the model from being derailed by stale instructions. But it means your carefully written rules are soft guidance. The model is free to decide a rule does not apply right now. And there is a hard ceiling underneath it — frontier models reliably follow somewhere around 150–200 instructions before adherence falls off a cliff. Pile more prose into the file and you do not get more compliance; you get less.

That was the first real lesson. If a rule absolutely must hold, it cannot live in a document the model is allowed to ignore.

Lesson 3: Put enforcement outside the model

So I stopped trying to persuade the model and started constraining it. Claude Code has hooks — scripts that fire deterministically before and after tool calls, outside the model's reasoning entirely. A hook does not ask the agent to behave. It blocks the action.

This is the single biggest shift in the whole system. Rules that must hold moved out of CLAUDE.md and into hook scripts: block writes to credential files, block destructive commands, block the orchestrator from editing code when it is supposed to be delegating. The platform now runs twenty-one of these enforcement layers. The model cannot talk its way past them because they are not part of the conversation.
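
To make the idea concrete, here is a minimal sketch of a pre-tool-call hook in Python. It is not the platform's actual code. It assumes the hook receives a JSON event with `tool_name` and `tool_input` fields and that exit code 2 signals "block this call"; check both against the current Claude Code hooks documentation before relying on them. The credential file names and command patterns are hypothetical examples.

```python
import json
import re
import sys
from pathlib import PurePosixPath

# Hypothetical policy for illustration; not the platform's actual rules.
CREDENTIAL_NAMES = {".env", "credentials.json", "id_rsa"}
DESTRUCTIVE_PATTERNS = [
    re.compile(r"\brm\s+-(rf|fr)\b"),
    re.compile(r"\bgit\s+push\b.*--force"),
    re.compile(r"\bgit\s+reset\s+--hard\b"),
]


def decide(event: dict) -> tuple[bool, str]:
    """Return (allowed, reason) for one pending tool call."""
    tool = event.get("tool_name", "")
    tool_input = event.get("tool_input", {})

    if tool in ("Write", "Edit"):
        path = tool_input.get("file_path", "")
        name = PurePosixPath(path).name
        if name in CREDENTIAL_NAMES or name.endswith(".pem"):
            return False, f"blocked write to credential file: {path}"

    if tool == "Bash":
        command = tool_input.get("command", "")
        if any(p.search(command) for p in DESTRUCTIVE_PATTERNS):
            return False, f"blocked destructive command: {command}"

    return True, "ok"


def run(raw_event: str) -> int:
    """Map a JSON event to an exit code; 2 is assumed to mean 'block'."""
    allowed, reason = decide(json.loads(raw_event))
    if not allowed:
        print(reason, file=sys.stderr)  # assumed to be fed back to the agent
        return 2
    return 0
```

Run as a standalone script, this would end with `sys.exit(run(sys.stdin.read()))`. The decision is an ordinary function over structured input, so it can be unit-tested like any other code and never depends on what the model thinks of the rule.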

The mental model that emerged: treat the LLM as a powerful but nondeterministic process, and wrap it in a deterministic runtime. The intelligence is probabilistic. The guardrails are not.

Lesson 4: Config belongs in YAML and JSON, not prose

The same lesson has a quieter second half. Even non-enforcement information — thresholds, phase definitions, limits — worked badly as prose. A line like "warn the user when context gets high" is the kind of soft guidance the model rounds off.

So configuration migrated out of markdown into structured files. The context-budget guide written in English became a context-budget.yaml with exact numbers: warn at 75%, soft-block at 80%, hard-block at 85%. Phase rules became JSON the hooks read directly. The difference is decisiveness. Prose invites interpretation; a YAML threshold is a number a script compares against. Markdown for humans to read; YAML and JSON for the machine to obey.
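
As an illustration of "a number a script compares against", the sketch below hard-codes the three thresholds from the text in a small Python structure standing in for the parsed context-budget.yaml. The names `ContextBudget` and `classify` are hypothetical, and treating each threshold as inclusive is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextBudget:
    """Thresholds as fractions of the context window (values from the text)."""
    warn: float = 0.75
    soft_block: float = 0.80
    hard_block: float = 0.85


def classify(used_tokens: int, window_tokens: int,
             budget: ContextBudget = ContextBudget()) -> str:
    """Return the action for the current context usage."""
    if window_tokens <= 0:
        raise ValueError("window_tokens must be positive")
    usage = used_tokens / window_tokens
    # Check the strictest threshold first so the highest level wins.
    if usage >= budget.hard_block:
        return "hard-block"
    if usage >= budget.soft_block:
        return "soft-block"
    if usage >= budget.warn:
        return "warn"
    return "ok"
```

There is nothing for the model to interpret here: a hook that calls `classify` gets one of four fixed strings back, and what happens next is decided by the script.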

Lesson 5: Build the software process into phases

A reliable developer follows a sequence — understand, plan, build, verify, ship — and does not skip the boring parts under pressure. An agent will absolutely skip them unless the sequence is structural.

So I built the software development lifecycle into an explicit phase machine. It started at sixteen phases, which was too granular; I consolidated to twelve, then to six: setup, discovery, design, build, verify, deliver. Each phase has entry conditions, exit gates, and a defined handoff to the next. A bug fix can skip phases it does not need; a new subsystem runs the whole sequence. The point is that "did you actually test it" is no longer a question the model answers honestly or not — it is a gate that has to be cleared.
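
One way to picture such a phase machine is the sketch below. It uses the six phase names from the text; everything else (the `PhaseMachine` class, the gate functions, the `tests_passed` evidence key) is a hypothetical illustration rather than the platform's implementation, and entry conditions and handoffs are left out for brevity.

```python
from typing import Callable

PHASES = ("setup", "discovery", "design", "build", "verify", "deliver")
Gate = Callable[[dict], bool]


class GateNotCleared(Exception):
    """Raised when a phase's exit gate rejects the evidence offered."""


class PhaseMachine:
    def __init__(self, gates: dict[str, Gate], skip: frozenset[str] = frozenset()):
        self.plan = [p for p in PHASES if p not in skip]
        missing = [p for p in self.plan if p not in gates]
        if missing:
            raise ValueError(f"no exit gate defined for: {missing}")
        self.gates = gates
        self.index = 0

    @property
    def current(self) -> str:
        return self.plan[self.index] if self.index < len(self.plan) else "done"

    def advance(self, evidence: dict) -> str:
        """Move to the next phase only if the current exit gate passes."""
        if self.current == "done":
            raise RuntimeError("workflow already finished")
        if not self.gates[self.current](evidence):
            raise GateNotCleared(f"exit gate for '{self.current}' not cleared")
        self.index += 1
        return self.current


# Hypothetical gates: every phase passes trivially except verify.
gates: dict[str, Gate] = {p: (lambda _evidence: True) for p in PHASES}
gates["verify"] = lambda evidence: evidence.get("tests_passed") is True

# A bug fix skips the phases it does not need.
bugfix = PhaseMachine(gates, skip=frozenset({"discovery", "design"}))
```

In this sketch, calling `bugfix.advance({})` while in the verify phase raises `GateNotCleared`. The machine only moves on to deliver when the evidence passed in contains `tests_passed: True`, which in practice a script would set after running the test suite.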

Lesson 6: Thin agent, thin skill, fat platform

The biggest architectural reversal was about where intelligence should live. The instinct is to make the agent smart — a long, detailed prompt that knows everything. That is exactly backwards. Long-context agents degrade: the more you stuff in, the more the model's attention thins out and the late instructions get ignored.

The pattern that worked is the opposite. Keep the agent thin and disposable — a short prompt, a narrow scope, no memory. Put all the durable intelligence in the platform: the skills it loads just in time, the hooks that constrain it, the state files that persist across sessions. A worker agent is spawned, does one bounded job, and is thrown away. State never lives in the agent's head; it lives in files the platform owns. Agents are cattle, not pets.
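
The "state lives in files" idea can be sketched in a few lines of Python. This is an assumption-laden illustration: the worker is modelled as a plain function that returns a short summary, and the state file layout (`completed` list, `task` and `summary` keys) is hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def load_state(path: Path) -> dict:
    """Read platform-owned state, or start fresh if none exists yet."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return {"completed": []}


def save_state(path: Path, state: dict) -> None:
    """Write to a temp file, then rename, so a crash cannot leave half a file."""
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(state, indent=2), encoding="utf-8")
    tmp.replace(path)


def run_worker(task: str, worker: Callable[[str], str], state_path: Path) -> dict:
    """Run one bounded job; keep only its summary, never its transcript."""
    state = load_state(state_path)
    summary = worker(task)  # whatever context the worker built up ends here
    state["completed"].append({"task": task, "summary": summary})
    save_state(state_path, state)
    return state


# Two disposable workers share nothing except the state file.
with tempfile.TemporaryDirectory() as tmp_dir:
    state_file = Path(tmp_dir) / "state.json"
    run_worker("fix failing test", lambda task: f"done: {task}", state_file)
    final = run_worker("update changelog", lambda task: f"done: {task}", state_file)
```

The second worker never sees the first worker's context. It only sees what the platform chose to persist, which is the property that keeps each agent thin.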

This also fixed the role problem. Early on, the system suggested that the coordinator should delegate and the workers should implement — and the model ignored it whenever convenient. An outside analysis of this exact class of system, Praetorian's write-up on deterministic AI orchestration, named the gap precisely: the architecture suggested role separation but did not enforce it. The fix was to make it physical. A coordinator literally cannot write code — a hook blocks it. A worker literally cannot spawn another worker. Role separation stopped being advice and became a property of the runtime.
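
Enforced role separation can be as small as a lookup table consulted by a hook. The sketch below is hypothetical: the role names come from the text, but the tool names (`Write`, `Edit`, `Task`) and the idea that the hook is told which role is calling are assumptions to verify against your own setup.

```python
# Hypothetical deny-list per role; tool names are assumptions.
ROLE_DENIED_TOOLS = {
    "coordinator": frozenset({"Write", "Edit"}),  # must delegate, not edit code
    "worker": frozenset({"Task"}),                # must not spawn other workers
}


def is_allowed(role: str, tool_name: str) -> bool:
    """Return True if this role may use this tool; unknown roles fail closed."""
    if role not in ROLE_DENIED_TOOLS:
        return False
    return tool_name not in ROLE_DENIED_TOOLS[role]
```

Failing closed for unknown roles is a design choice in this sketch: an agent the platform cannot identify gets no tools rather than all of them.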

What it adds up to

None of this made the model smarter. It made the system around the model trustworthy. The throughline is the same one I apply on a factory floor: you do not get reliability by asking people to be careful — you get it by designing a process where the careful path is the only path. Variation is the enemy; the structure removes it.

For anyone building in this space, the compressed version: rules that must hold go in hooks and scripts, not prose. Config goes in YAML and JSON, not paragraphs. The process goes into explicit phases with gates. The agent stays thin, the skill stays thin, and the platform stays fat. And the model gets treated for what it is — a brilliant, nondeterministic engine that becomes dependable only when you build a deterministic machine around it.

That last part is the whole job. The intelligence was never the hard part. The orchestration was.

Sources

Praetorian. Deterministic AI Orchestration: A Platform Architecture for Autonomous Development
Anthropic. Claude Code best practices (CLAUDE.md, instruction adherence)
Anthropic. Agent Skills overview