Chapter 3 AI as the Specification Engine
Chapter 3 — AI as the Specification Engine
Let me show you something that would have seemed like science fiction in 2015.
I open a conversation with an AI assistant. I type:
Design a FOSM lifecycle for an invoice in a consulting firm. Include states, events, guards, side-effects, and actors. The firm does retainer and project-based billing. Invoices sometimes require client approval before payment.
Forty seconds later, I have a complete specification. States I hadn’t thought of. Guards that encode business rules I would have missed in a first-pass design. Side-effects that correctly anticipate downstream consequences. A working first draft of a lifecycle that would have taken a senior domain expert two hours to sketch on a whiteboard.
This is not a demo. This is the current state of practice. And it changes everything about why FOSMs are now the right model for business software.
The Historical Barrier
Let me be honest about why FOSM-style modeling wasn’t the default before.
In Chapter 1, we established that CRUD won not because it was better, but because it was easier. The deepest reason it was easier: specification cost.
Designing a correct, complete state machine for a real business object is genuinely hard. You have to know the domain deeply. You have to anticipate edge cases that only emerge in production — what happens when a retainer invoice is partially paid but the retainer period has ended? What happens when a client disputes an invoice that’s already been marked partially paid? What happens when the company restructures and the counterparty entity changes mid-lifecycle?
These aren’t hypothetical edge cases. They’re real situations that any consulting firm’s finance team has navigated. And the only way to get them right in a state machine specification is to either have deep domain expertise yourself, or spend days interviewing the people who do.
Then the business changes. A new billing type is introduced. A new approval requirement is added. The state machine needs revision. Back to the whiteboard. Back to the interviews. Back to the expensive, slow process of translating business knowledge into formal specification.
This is why BPM tools always struggled. Not because workflow engines were technically wrong — the concept was sound. But the cost of maintaining accurate process specifications in a changing business was prohibitive for most organizations.
CRUD sidestepped the problem entirely: there’s nothing to specify upfront. Add a table, add some fields, build some forms. The business rules will live in someone’s head and in the application code. This is a terrible long-term trade, but the short-term savings are real.
That calculation has changed.
What AI Actually Does Well Here
Large language models are, at their core, next-token prediction systems trained on vast quantities of human text — including decades of business documentation, process manuals, software specifications, accounting standards, HR handbooks, legal agreements, and industry best practices.
What this means in practice: LLMs have internalized a remarkable amount of domain knowledge about how businesses work. They know what an invoice is in a consulting context versus a retail context. They know what “approval authority” means in finance. They know what guards should protect a contract execution. They know the difference between a vendor being “suspended” and “terminated,” and what each state implies.
When you ask an LLM to design a FOSM lifecycle, it does something that looks like reasoning but is fundamentally a synthesis of patterns it has seen in thousands of real business process documents. That synthesis is extraordinarily useful — not because it’s always right, but because it’s a strong first draft that a domain expert can evaluate and refine in minutes rather than hours.
This is the new collaboration: you provide the business context, AI provides the structured first draft, you refine it.
The AI is not replacing the domain expert. It’s doing the tedious synthesis work that previously made specification so expensive: drafting the states, naming the events, proposing the guards. The domain expert’s time is now spent reviewing and correcting, not producing from scratch. The ratio of expert time to specification quality has improved by roughly an order of magnitude.
Three Demonstration Lifecycles
Let me show you what AI-generated FOSM specifications look like in practice. These examples are representative of what a capable LLM produces when prompted well. They’re not perfect — we’ll discuss evaluation in a moment — but they’re shockingly good starting points.
Example 1: Invoice Lifecycle (Consulting Firm)
Prompt:
Design a complete FOSM lifecycle for an Invoice in a consulting firm. The firm does retainer and project-based billing. Some invoices require client approval before they can be paid. Include states, events, guards, side-effects, and actors. Format as a structured specification.
AI Output:
FOSM: Invoice (Consulting)
STATES
------
draft Invoice created, not yet sent to client
pending_approval Sent to client, awaiting explicit approval (approval-required invoices only)
sent Sent to client, awaiting payment (or approval waived)
viewed Client has opened/viewed the invoice
partially_paid Partial payment received; balance remains
paid Full payment received
overdue Past due date without full payment
disputed Client has formally disputed the invoice
cancelled Invoice cancelled before payment
EVENTS
------
invoice_sent Consultant sends invoice to client
approval_requested System routes invoice for client approval (if required)
client_approved Client approves invoice for payment
client_rejected Client rejects invoice (dispute before payment)
invoice_viewed Client opens the invoice (tracked via read receipt or portal)
payment_received Payment processor confirms incoming payment
payment_partial Payment processor confirms partial payment
due_date_passed Scheduled job fires when due date elapses
dispute_raised Client formally raises a dispute
dispute_resolved Both parties agree to resolution
cancelled Invoice explicitly cancelled
TRANSITIONS
-----------
draft → pending_approval ON approval_requested
Guard: invoice.requires_approval? && invoice.client_email.present?
Actors: system, billing_manager
Side-effects: Email client with approval link; log timestamp
draft → sent ON invoice_sent
Guard: !invoice.requires_approval? && invoice.total > 0
Actors: account_manager, billing_manager
Side-effects: Email invoice to client; set sent_at; start due date timer
pending_approval → sent ON client_approved
Guard: approver_is_authorized_client_contact?
Actors: client_contact
Side-effects: Notify billing team; log approval with approver identity
pending_approval → disputed ON client_rejected
Guard: rejection_reason.present?
Actors: client_contact
Side-effects: Alert account manager; pause due date timer
sent → viewed ON invoice_viewed
Guard: viewer_ip_not_internal?
Actors: system (via email tracking pixel or portal analytics)
Side-effects: Log view timestamp; notify account manager if invoice > $10,000
viewed → partially_paid ON payment_partial
Guard: amount_received > 0 && amount_received < invoice.total
Actors: system (Stripe webhook)
Side-effects: Create Payment record; send partial receipt; update outstanding balance
viewed → paid ON payment_received
Guard: amount_received >= invoice.total
Actors: system (Stripe webhook)
Side-effects: Create Payment record; send receipt; close associated project billing; update revenue recognition
sent → overdue ON due_date_passed
Guard: current_date > due_date && status != :paid
Actors: system (scheduled job)
Side-effects: Send overdue notice to client; alert account manager; escalate if > 30 days
partially_paid → paid ON payment_received
Guard: total_received >= invoice.total
Actors: system (Stripe webhook)
Side-effects: Create final Payment record; send final receipt; close invoice
partially_paid → overdue ON due_date_passed
Guard: current_date > due_date
Actors: system
Side-effects: Send overdue notice referencing partial payment; escalate
overdue → paid ON payment_received
Guard: total_received >= invoice.total
Actors: system
Side-effects: Create Payment record; send receipt; log overdue resolution
overdue → disputed ON dispute_raised
Guard: dispute_reason.present?
Actors: client_contact, account_manager
Side-effects: Pause all automated reminders; alert finance director; open dispute record
disputed → sent ON dispute_resolved
Guard: resolution_terms_agreed? && revised_amount_set?
Actors: finance_director, account_manager
Side-effects: Update invoice amount if revised; notify both parties; restart due date
[draft|sent|viewed|partially_paid|overdue] → cancelled ON cancelled
Guard: cancellation_authorized?
Actors: billing_manager, finance_director
Side-effects: Notify client; void in accounting system; log reason
What’s striking about this output:
- It correctly identifies
pending_approvalas a state that only some invoices need (retainer clients often have pre-approved billing, project clients often require explicit approval) - The
viewedstate is non-obvious — most developers would skip it — but it’s genuinely useful for account managers tracking engagement - The
dispute_resolvedtransition correctly moves back tosentrather than directly topaid, because a resolved dispute often involves a revised invoice amount - The guard
viewer_ip_not_internal?on theviewedtransition is smart — you don’t want internal testing to trigger view notifications
Could I have designed all of this myself? Yes, eventually. But “eventually” is the key word. The AI gave me a 90% solution in 40 seconds. I spend the next 10 minutes improving the 10%.
Example 2: Hiring Pipeline FOSM
Prompt:
Design a FOSM for a hiring pipeline at a mid-size tech company. The company uses structured interviews. Include an offer negotiation phase and a background check before the final hire. Include states, events, guards, side-effects, actors.
AI Output:
FOSM: Job Application (Hiring Pipeline)
STATES
------
applied Candidate submitted application
screening Recruiter actively reviewing application
phone_screen Scheduled phone screen in progress or completed
interview_panel Panel interview scheduled or in progress
offer_pending Hiring decision made; offer being prepared
offer_extended Formal offer sent to candidate
offer_negotiating Candidate has countered; negotiation in progress
background_check Offer accepted; background check initiated
hired Background check passed; candidate hired
rejected Candidate rejected at any stage
withdrawn Candidate withdrew their application
on_hold Pipeline paused (role put on hold, not cancelled)
EVENTS
------
application_received Application submitted through ATS
shortlisted Recruiter marks application for phone screen
phone_screen_scheduled Phone screen scheduled with candidate
phone_screen_passed Recruiter marks phone screen as pass
phone_screen_failed Recruiter marks phone screen as fail
panel_interview_scheduled Panel interview scheduled
interview_completed All interviewers have submitted scorecards
hiring_decision_made Hiring committee approves proceeding to offer
offer_drafted Recruiter finalizes offer letter
offer_sent Offer formally sent to candidate
candidate_countered Candidate submits counteroffer
negotiation_concluded Both parties agree to final terms
offer_accepted Candidate formally accepts offer
offer_declined Candidate declines offer
background_check_initiated Background check ordered
background_check_passed Background check returns clear
background_check_failed Background check returns disqualifying result
hire_confirmed HR confirms onboarding initiated
rejected_at_stage Any rejection at current stage
application_withdrawn Candidate withdraws
SELECTED TRANSITIONS
--------------------
screening → phone_screen ON phone_screen_scheduled
Guard: recruiter_confirmed_availability? && candidate_confirmed_availability?
Actors: recruiter, recruiting_coordinator
Side-effects: Calendar invite to candidate; notify hiring manager
interview_panel → offer_pending ON hiring_decision_made
Guard: all_scorecards_submitted? && quorum_recommends_hire? && headcount_approved?
Actors: hiring_manager, recruiting_director
Side-effects: Notify recruiter to prepare offer; log hiring decision with scorecard summary
offer_extended → offer_negotiating ON candidate_countered
Guard: counter_is_within_policy_bands? || escalation_approved?
Actors: candidate (via portal), recruiter
Side-effects: Alert hiring manager and HR; pause offer expiry timer
offer_negotiating → background_check ON offer_accepted
Guard: final_terms_documented? && offer_signed?
Actors: candidate
Side-effects: Order background check; notify IT for provisioning prep; notify payroll
background_check → hired ON hire_confirmed
Guard: background_check_passed? && start_date_confirmed? && offer_letter_signed?
Actors: hr_manager
Side-effects: Create Employee record; trigger onboarding workflow; notify all teams; send welcome email
background_check → rejected ON background_check_failed
Guard: disqualifying_result_reviewed_by_hr?
Actors: hr_manager
Side-effects: Notify candidate per legal requirements; rescind offer; update headcount forecast
[any non-terminal state] → on_hold ON role_put_on_hold
Guard: business_justification_documented?
Actors: hr_director, cfo
Side-effects: Notify candidate professionally; pause all pipeline timers; set review date
The AI correctly models offer_negotiating as a distinct state (most CRUD systems would just call this “offer_extended” and add a notes field). The guard on candidate_countered — checking whether the counter is within compensation bands — is something a senior HR professional would add but a developer might miss. The on_hold state, which can be reached from any non-terminal state, correctly handles the real-world situation where hiring freezes.
Example 3: Vendor Management Lifecycle
Prompt:
Design a FOSM for vendor lifecycle management at a mid-size company with procurement controls. Include the RFP/evaluation phase, contract management, and ongoing performance monitoring. Include suspension for compliance failures.
AI Output:
FOSM: Vendor (Lifecycle Management)
STATES
------
prospect Vendor identified but not yet engaged
rfp_issued Request for proposal sent to vendor
proposal_received Vendor has submitted proposal
under_evaluation Procurement committee is evaluating proposal
approved Vendor approved; legal preparing contract
contract_negotiation Contract terms under negotiation
active Contract signed; vendor is in active use
performance_review Scheduled or triggered performance review
suspended Vendor activity paused (compliance or performance issue)
terminated Vendor relationship ended
blacklisted Vendor barred from future engagement
SELECTED TRANSITIONS
--------------------
prospect → rfp_issued ON rfp_sent
Guard: rfp_approved_by_procurement? && budget_code_assigned?
Actors: procurement_manager
Side-effects: Email RFP to vendor contact; log RFP version; set response deadline
under_evaluation → approved ON vendor_approved
Guard: evaluation_score >= threshold? && no_disqualifying_flags? && required_certifications_verified?
Actors: procurement_committee (minimum 2 members)
Side-effects: Notify vendor; notify legal to begin contract prep; update vendor registry
active → suspended ON compliance_failure_flagged
Guard: compliance_issue_documented? && reviewed_by_legal?
Actors: compliance_officer, legal_team
Side-effects: Block new purchase orders for this vendor; notify finance; notify affected project managers; set resolution deadline
suspended → active ON compliance_resolved
Guard: compliance_officer_confirmed_resolution? && legal_sign_off?
Actors: compliance_officer
Side-effects: Re-enable purchase orders; notify affected teams; log resolution
suspended → terminated ON termination_decision_made
Guard: 30_day_notice_served? || material_breach_confirmed?
Actors: cfo, legal_director
Side-effects: Notify vendor formally; trigger contract termination clause; notify all open PO owners; schedule data deletion
active → blacklisted ON blacklist_decision_made
Guard: board_approved? && legal_risk_assessment_complete?
Actors: board_resolution
Side-effects: Update vendor registry; notify all business units; document grounds for blacklisting (legal hold)
The blacklisted state — which requires board approval — is a detail only someone who has dealt with real vendor governance issues would include. The guard requiring “minimum 2 members” for vendor approval reflects procurement control requirements. These are the kinds of domain-specific nuances that make the difference between a toy spec and a production-ready one.
Evaluating AI-Generated Lifecycles
AI output is a starting point, not a finished product. Here’s how to evaluate what you get.
What to accept without major changes:
- The core set of terminal states (the endpoints) — LLMs are good at identifying these
- The names of the happy-path transitions — these tend to be accurate
- Common side-effects like “send email” and “log event” — these are almost always right
What to refine:
- Guard conditions that are too simple (e.g.,
payment_received?when you needpayment_amount >= invoice.total && payment_cleared_fraud_check?) - States that conflate two distinct stages — watch for states like “under review” that might actually need to be split into “legal review” and “compliance review”
- Actor assignments that are too broad — “admin” as an actor is almost always a sign the AI didn’t think carefully about authorization
Red flags:
- A transition that goes backward in the lifecycle without a guard that requires an explicit override and documented reason (reversals should be rare and audited, not default behavior)
- Guards that reference fields that don’t exist on the object — the AI sometimes invents guards based on what should be true rather than what’s in your system
- Missing edge cases for time-based events — LLMs often underspecify the “what if this never happens” paths (the
deadline_passedevents, theon_holdescape hatches) - Actors that are too vague — “management” is not an actor; “finance_director” is
The evaluation process takes 10–20 minutes for a typical business object. Compare that to the 2–4 hours a traditional specification session would require. The productivity gain is real, and it compounds across every object in your system.
The Virtuous Cycle
Here is the argument from the FOSM paper that I want to spend a moment on, because it’s the most important insight in this whole framework:
AI makes FOSM practical. FOSM makes AI safe. Each enables the other.
We’ve established the first part: AI removes the specification bottleneck. But the second part — FOSM makes AI safe — is equally important and less obvious.
When you deploy an AI agent in a CRUD system, you’re giving it access to an arbitrary field-edit interface. The AI can change any field to any value. It can mark an invoice as paid without a payment. It can move a job application to hired without an interview. The action space is essentially unbounded, and the consequences of mistakes are hard to contain.
When you deploy an AI agent in a FOSM system, the picture is completely different. The AI can only trigger transitions that are defined in the state machine. Each transition has guards that must pass. The actor is logged. The current state constrains the available actions to a finite, well-defined set.
This is what the FOSM paper means by “bounded contexts for human-AI collaboration.” The FOSM is the guardrail. The AI operates inside a well-defined action space where mistakes are detectable, reversible, and auditable.
The practical implication: you can deploy AI agents to assist with invoice processing, hiring pipeline management, vendor evaluation, and dozens of other business workflows with far more confidence when those workflows are modeled as FOSMs. The AI isn’t operating on a raw database. It’s operating on a structured lifecycle with explicit rules.
AI generates FOSM specs
↓
FOSM provides bounded action space
↓
AI agents operate safely within bounded space
↓
AI agents generate better specs from real lifecycle experience
↓
(repeat)
This virtuous cycle is not theoretical. We’ll implement it directly in Part V when we build the AI bot layer for the Inloop Runway application.
How Software Development Has Changed
I want to step back and be honest about the context we’re working in, because it shapes how you should read the rest of this book.
Since 2024, software development has undergone a practical transformation that is only partly visible in industry headlines. The change is not that AI writes all the code — it’s that the economics of code production have fundamentally shifted.
Engineers who work with AI coding agents — Claude Code, OpenAI Codex, GitHub Copilot in its agentic forms — report a consistent experience: they spend most of their time writing prompts, reviewing AI-generated code, making architectural decisions, and providing domain context. They spend a fraction of their time writing code from scratch.
This is not exaggeration. A senior Rails engineer who would previously spend a week implementing a complete FOSM lifecycle for a new business object — model, state machine, transitions, actors, guards, side-effects, tests — can now do it in a day. The AI writes the boilerplate. The engineer writes the clarity: the FOSM specification that tells the AI what to build.
The most valuable skill in software development in 2026 is not syntax fluency. It is the ability to think clearly about what you want a system to do, at a level of precision that an AI can act on. FOSM specifications are exactly this: precise, structured, unambiguous descriptions of business behavior that AI coding agents can translate directly into working code.
This is why Part I of this book exists. You cannot tell an AI “build me an invoicing system.” You can tell an AI “here is a complete FOSM specification for an invoice lifecycle with 12 states and 28 transitions; implement this using the FOSM engine in our Rails 8 application.” The second instruction produces production-quality code. The first produces a demo.
What This Chapter Is Not
This is the last theory chapter. Chapter 4 begins the code.
I’ve deliberately kept this chapter free of implementation details, because the concepts matter independent of the implementation. The argument that FOSM + AI is a better paradigm for business software is not contingent on Rails, or Ruby, or any specific technology stack. It’s a modeling argument. It would be equally true if we were building in Python, Java, or TypeScript.
But from Chapter 4 onward, we’re going to build something concrete: a complete, production-capable business management platform using Rails 8, a hand-rolled FOSM engine, and AI integration. We’ll take the NDA lifecycle we designed in Chapter 2 and implement it first — states, transitions, guards, side-effects, the full picture in code.
By the time we reach Part V, we’ll have an AI bot that can discuss, trigger, and analyze FOSM transitions in natural language. The bot operates on the bounded action space of the FOSM — it can’t do what the state machine doesn’t allow, and everything it does is logged.
That’s the full picture. FOSM as paradigm. AI as specification engine. Rails 8 as implementation platform. Together, they make it possible for one developer to build what previously required an enterprise team.
Chapter Summary
The specification problem kept state machine modeling impractical for 30 years. Designing complete lifecycles was too expensive and too slow.
AI has solved the specification problem. LLMs can generate complete FOSM specs — states, events, guards, side-effects, actors — in minutes, for any business domain. The three examples in this chapter — invoices, hiring pipelines, vendor management — demonstrate the quality of AI-generated output and what skilled evaluation looks like.
The relationship between AI and FOSM is symbiotic: AI makes FOSM practical by removing the specification bottleneck; FOSM makes AI safe by providing bounded action spaces. The FOSM paper identifies this virtuous cycle as the central mechanism of the new paradigm.
Software development in 2026 is primarily a clarity and architecture discipline. Engineers write specifications. AI writes code. The developers who will build the most are those who can write the clearest specifications — and FOSM gives you the vocabulary to do that for any business object.
From here, we build.
In Chapter 4, we meet Rails 8 — the framework that makes the “one-person framework” vision real, and why convention-over-configuration is exactly what AI-assisted development needs.