The risk nobody's talking about

Everyone's worried about AI hallucinating — making things up. That's a real risk, and it's getting smaller with every model update. The risk that isn't shrinking is the one nobody's watching: AI that gives you a perfectly accurate answer to 40% of the problem and presents it as the whole picture.

It doesn't simplify on purpose. It simplifies because that's what language models do — they pattern-match toward the most probable answer, not the most complete one. They drop requirements, ignore edge cases, and skip the parts that are hard to reason about. Then they hand you something polished and confident. Your team looks at it, thinks "this is good," and executes. Nobody checks what's missing because nothing looks missing.

That's the completeness gap. And it's the gap that hits your bottom line — not because the AI was wrong, but because it wasn't finished.

The gap doesn't close with better models. It closes with better judgment — specifically, the judgment to pause between AI output and business action, and check what was left out. That judgment is what I call the AI Instinct. Your team already has it for people — you can tell when an employee is winging it versus when they've done the work. You don't have it for AI yet. Nobody does. That's what I build.

Where most businesses sit right now

[Quadrant map: AI capability (horizontal) against human judgment calibration (vertical)]
Bottom-left: two years ago. Safe, but falling behind.
Bottom-right: most companies now. Powerful tools, no calibrated oversight.
Top-left: understands AI deeply, hasn't deployed. Rare.
Top-right, the target: AI deployed AND managed with organised instinct.

The journey from bottom-right to top-right isn't about deploying more AI. It's about organising the judgment your team already has. That's the instinct.

What "correct but incomplete" looks like

This is a real AI output for a common business task. Tab one is what your team would see. Tab two is what they'd miss.

Task given to AI: "Draft a risk assessment for onboarding a major new subcontractor for a mid-sized construction company."
Tab 1: AI output
Key Risks Identified:

1. Financial stability — Subcontractor may lack capital to sustain project through payment cycles. Recommend credit check and financial references before engagement.

2. Insurance coverage — Verify public liability, workers compensation, and professional indemnity coverage meets project minimums. Request certificates of currency.

3. Safety record — Review SWMS documentation and incident history. Require site-specific induction before commencement.

4. Capacity constraints — Assess whether subcontractor can resource the project without compromising existing commitments. Request current project list.

Recommendation: Proceed with onboarding subject to satisfactory completion of the above checks. Estimated due diligence timeline: 5–7 business days.
Tab 2: AI output, with completeness audit
Key Risks Identified:

1. Financial stability — Subcontractor may lack capital to sustain project through payment cycles. Recommend credit check and financial references before engagement.

2. Insurance coverage — Verify public liability, workers compensation, and professional indemnity coverage meets project minimums. Request certificates of currency.

Completeness Gap — Contractual liability chain

No assessment of back-to-back liability provisions. If the subcontractor's insurance doesn't mirror head contract obligations, the principal contractor carries the residual risk. This is the clause that costs $200K+ when it goes wrong.

3. Safety record — Review SWMS documentation and incident history. Require site-specific induction before commencement.

Completeness Gap — Industrial relations exposure

No assessment of subcontractor's enterprise agreement status, union coverage, or workforce classification. On a union site, an unaligned subcontractor can trigger a site-wide stop. On a non-union site, misclassified workers create Fair Work liability.

4. Capacity constraints — Assess whether subcontractor can resource the project without compromising existing commitments. Request current project list.

Completeness Gap — Existing client conflict

No check for whether the subcontractor is currently engaged by a competitor or has contractual non-compete/exclusivity obligations that conflict with your project.

Recommendation: Proceed with onboarding subject to satisfactory completion of the above checks. Estimated due diligence timeline: 5–7 business days.
Completeness Gap — Termination and step-in rights

No assessment of what happens if the subcontractor fails mid-project. No step-in provisions, no performance bond requirement, no staged payment structure tied to milestones. The recommendation to "proceed subject to checks" assumes the checks will catch everything. They won't — because the checks themselves are incomplete.

Accuracy: 100%. Nothing in the AI output is incorrect.
Completeness: 45%. Four critical risk categories missing.

Everything in Tab 1 is accurate. Nothing is invented. A lot of teams would read it, tick the boxes, and proceed. Tab 2 shows what that team would miss — not because the AI lied, but because it simplified their problem into a plan that covered less than half the actual risk surface. That's the completeness gap. That's what the instinct catches.
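The audit the second tab performs can be sketched mechanically: compare the categories a draft actually covers against a checklist the team maintains outside the model. A minimal sketch in Python; the category names and the checklist itself are illustrative, not a real client list.

```python
# A minimal completeness audit: compare the risk categories the AI draft
# actually covered against a house checklist maintained by the team.
# All category names here are illustrative, not a real client checklist.

REQUIRED_CATEGORIES = {
    "financial stability", "insurance coverage", "safety record",
    "capacity", "contractual liability chain", "industrial relations",
    "client conflict", "termination and step-in rights",
}

def completeness_audit(covered: set[str]) -> dict:
    """Return what the draft missed and a coverage ratio."""
    missing = REQUIRED_CATEGORIES - covered
    return {
        "missing": sorted(missing),
        "coverage": len(covered & REQUIRED_CATEGORIES) / len(REQUIRED_CATEGORIES),
    }

# The Tab 1 draft above covered four of the eight categories:
report = completeness_audit({
    "financial stability", "insurance coverage", "safety record", "capacity",
})
print(report["coverage"])     # 0.5
print(report["missing"][:2])  # ['client conflict', 'contractual liability chain']
```

The point is not the set arithmetic. It's that the checklist lives outside the model, so the model can't quietly shrink it.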

Three layers of organised judgment

This isn't about making AI smarter. It's about making your team's oversight systematic — so the gap between AI output and business reality gets caught before anyone acts on it.

1. Plan-Evaluate-Patch

Every AI output gets audited before execution. The system forces the AI to plan without acting, then runs a second evaluator layer against your actual business logic — the rules, constraints, and exceptions your team knows but the AI doesn't. Gaps get flagged. The AI fills them before anyone sees the output. Your team gets the result after the completeness check, not before.
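A minimal sketch of that loop, assuming a generic `call_model` function standing in for whatever chat-model API sits underneath. The rules shown are invented placeholders; the real ones come from the client's own source of truth.

```python
# Sketch of the Plan-Evaluate-Patch loop. `call_model` stands in for any
# chat-model API; the business rules would come from the client's single
# source of truth, not this hard-coded list.

BUSINESS_RULES = [
    "Every subcontractor assessment must cover back-to-back liability.",
    "Every recommendation must name a fallback if checks fail mid-project.",
]

def plan_evaluate_patch(task: str, call_model, max_rounds: int = 3) -> str:
    # 1. Plan: the model drafts without acting on anything.
    draft = call_model(f"Plan only, do not execute:\n{task}")
    for _ in range(max_rounds):
        # 2. Evaluate: a second pass audits the draft against house rules.
        gaps = call_model(
            "List every rule the draft fails to address, one per line. "
            f"Reply NONE if complete.\nRules: {BUSINESS_RULES}\nDraft: {draft}"
        )
        if gaps.strip() == "NONE":
            break
        # 3. Patch: the model fills the flagged gaps before anyone sees it.
        draft = call_model(
            f"Revise the draft to close these gaps:\n{gaps}\n\nDraft: {draft}"
        )
    return draft  # the team only ever sees the post-audit version
```

The evaluator pass is deliberately a separate call: a model auditing its own draft in the same breath tends to grade generously.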

2. Intelligence Routing

Different tasks need different AI models with different levels of oversight. A routine email summary doesn't need the same governance as a risk assessment or a client proposal. I build the routing layer that matches model capability to task complexity and consequence — commodity models for narrow, defined tasks; frontier models with full audit trails for anything that touches your bottom line.
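One way to sketch that routing layer: a lookup keyed on two questions about the task. Is it reversible, and does it touch money or legal exposure? The model names, keys, and oversight tiers below are placeholders, not a fixed taxonomy.

```python
# Sketch of an intelligence-routing table. Model names and tiers are
# placeholders; the real mapping is set per client by task consequence.

ROUTES = {
    # (reversible?, touches revenue/legal?) -> (model tier, oversight level)
    (True,  False): ("commodity-small", "spot-check"),
    (True,  True):  ("frontier",        "completeness-audit"),
    (False, False): ("frontier",        "completeness-audit"),
    (False, True):  ("frontier",        "audit-plus-human-signoff"),
}

def route(task: dict) -> tuple[str, str]:
    return ROUTES[(task["reversible"], task["high_consequence"])]

print(route({"reversible": True, "high_consequence": False}))
# ('commodity-small', 'spot-check') : a routine email summary
print(route({"reversible": False, "high_consequence": True}))
# ('frontier', 'audit-plus-human-signoff') : a risk assessment
```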

3. Single Source of Truth

The document your team approves is the same artefact that drives the AI. One file to update. When your team changes a process, the AI's behaviour updates with it. No version drift. No gap between what the documentation says and what the system does. This is what makes the architecture survive model changes — when the next AI ships, you update one file and the system adapts.
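As a sketch, the mechanism is simple: the system prompt is built from the approved file at call time, so there is no second copy to drift. The file name and wording below are illustrative.

```python
# Sketch: the approved process document is the same file that drives the AI.
# Humans read it; the system injects it verbatim into the model's
# instructions on every call. File name and wording are illustrative.

from pathlib import Path

def build_system_prompt(process_doc: str = "onboarding-process.md") -> str:
    policy = Path(process_doc).read_text()  # the single source of truth
    return (
        "Follow this approved process exactly. If the task conflicts with "
        "it, flag the conflict instead of improvising.\n\n" + policy
    )
```

Because the prompt is rebuilt from the file on each call, updating the document updates the system in the same step. There is nothing to redeploy.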

Before the system, I build the intent

Before building anything, I identify what your business needs the AI to prioritise — the trade-off hierarchy that governs every decision. When speed conflicts with thoroughness, which wins? When cost conflicts with quality, who decides? Human operators absorb these rules through years of experience. AI needs them spelled out. That's what I build first — and it's the reason the systems work on day one, not after months of tuning.
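Spelled out, a trade-off hierarchy can be as plain as a table the system consults before deciding. The orderings below are invented examples, not client rules; the real entries come out of the intent work.

```python
# Sketch of a spelled-out trade-off hierarchy. The orderings are invented
# examples; in practice they come out of the intent workshop with the client.

TRADE_OFFS = {
    # when the pair conflicts, the mapped value wins (or the call escalates)
    ("speed", "thoroughness"): "thoroughness",  # e.g. risk assessments
    ("cost", "quality"): "escalate-to-human",   # nobody decides this alone
}

def resolve(a: str, b: str) -> str:
    """Look up the pair in either order; default to escalating."""
    return TRADE_OFFS.get((a, b)) or TRADE_OFFS.get((b, a), "escalate-to-human")

print(resolve("thoroughness", "speed"))  # thoroughness
```

Note the default: any conflict the hierarchy doesn't cover goes to a human, which is the conservative failure mode for rules the AI was never given.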

What "AI" means here

Everything I build runs on frontier generative AI — the same models behind Claude and ChatGPT — orchestrated into operational systems with judgment layers. No machine learning pipelines. No training data. No six-month model development. Working systems, usually inside a few weeks.

What this looks like in practice

Hundreds of specialised AI workflows
BHP: former CIO / VP
2–4 weeks to first result
Zero vendor lock-in

Deep, not wide

I've built hundreds of specialised AI workflows — Custom GPTs, Gemini Gems, Claude Skills, structured knowledge bases, and full operational systems — concentrated across deep client engagements. This isn't a hundred surface-level deployments. It's what happens when you go deep: each engagement generates dozens of specialised workflows because every business process has sub-processes, edge cases, and exceptions that each need their own system. Depth produces volume. Volume without depth produces demos.

Crew Briefing Packs: from 3 hours to 15 minutes

Construction and field services teams were assembling briefing packs manually from multiple systems. Now the pack generates automatically — safety requirements, site conditions, compliance items, job specifications — all from one source of truth. Every pack runs through the same completeness audit, catching the safety items that were missed when assembly was manual and someone was under time pressure.

Compliance Verification: from skipped regularly to every job, automatically

Compliance checks that were being missed because nobody had time — they now run automatically on every single job. Same rules your team already follows, enforced consistently, zero additional workload. The system catches the gaps that used to slip through when experienced staff were on leave and less experienced people didn't know what to check.

Proposal Assembly: from a full day to under an hour

Proposals that required pulling data from five different systems and took a senior person off real work for a full day. Now assembled in under an hour — with better quality because the system runs completeness checks against the original brief, catching omissions that people miss under deadline pressure.

Board & Management Reporting: from 2+ days to same-day delivery

Monthly reports that consumed someone's entire week — pulling, consolidating, formatting. Now the data consolidates automatically and reports generate same-day with consistent quality. The judgment layer verifies that the narrative matches the data — no more reports where the commentary says "steady growth" but the numbers show a decline.

Background

Decades in industrial operations at BHP — CIO and VP roles covering systems integration, international business development, and sales across mining, rail, ports, and energy. Advised C-level executives and Government Boards on complex operational challenges. I know what operational pressure feels like from the inside, not from a textbook.

Three things that protect your investment

These aren't marketing lines. They're the rules I work by — and they exist because the alternative burns money.

I won't sell you technology you don't need

I start by asking what we should stop doing altogether. Most businesses trying to automate a workflow should first eliminate 30–40% of the steps entirely. Automating a bad process just makes it a faster bad process. If the honest answer is "you don't need AI for this" — I'll say that. A failed AI project doesn't just waste money — it vaccinates your entire team against trying again. I'd rather tell you to wait than sell you something that creates scar tissue.

I won't lock you into one AI vendor

Everything I build works across models. When the next model ships — and it will ship sooner than anyone expects — your systems keep running because the architecture is independent of any single AI. The judgment layer persists regardless of which model sits underneath it. You're not buying a dependency. You're buying a capability that survives the next model change.

I won't pretend AI output is trustworthy by default

AI produces output that is correct and incomplete — and it does this with total confidence. No hesitation, no "I'm not sure, boss." The systems I build are designed for that reality. Every output runs through a completeness audit before your team sees it. Every system has guardrails, fallbacks, and clear handoff points where a human makes the call. The systems that work are the ones designed for how AI actually performs — not how the marketing says it performs.

Find out where the instinct gaps are

A structured diagnostic where we identify where AI is creating value in your operations and where it's creating invisible risk. You'll see the completeness gaps in your own workflows — the same kind of gaps the demo above revealed. Concrete findings, not a slide deck. If the honest answer is "not yet," I'll tell you.

Book the AI Instinct Diagnostic