From Probable to Provable: What Automated Reasoning Means for the Board

Published in Board and Emerging | 12 minute read
A figure in a dark suit, partially concealed behind a heavy charcoal velvet curtain, one hand gripping the curtain edge in sharp directional light against a black background — a visual metaphor for the unseen operator whose workings a director is expected to trust without seeing. (Image generated by ChatGPT 5.4)

In 1998, as part of my software engineering degree, I wrote formal specifications by hand in Z notation: a mathematical language for specifying precisely what a system must do and proving, step by step, that it does it. A single specification could take days. A single proof could occupy a week. What required weeks of expert human labour in the late 1990s can now be executed automatically, at scale, across systems making millions of decisions per second.

Automated reasoning is not a new idea. It is an old idea that has been industrialised. And that industrialisation is now arriving in boardrooms, whether directors are ready for it or not.

The firehose and the limits of the probable

Boards have always operated under conditions of incomplete information. That is not new. It is the constitutional condition of governance. What has changed is the volume and velocity of the information environment Boards must navigate to reach decisions they can defend. What was once a trickle of reports arriving in advance of quarterly meetings has become a torrent of real-time data, AI-generated insight, and regulatory obligation. According to Stanford HAI (2025), legislative mentions of AI increased by 21% across 75 countries since 2023. Deloitte (2025) found that only 33% of boards feel equipped to govern AI. The information is growing. The confidence is not keeping pace.

The scale of the governance challenge has shifted in kind, not just in degree. Organisations have moved from overseeing hundreds of decisions made per day to millions made per second. At human decision velocity, oversight was feasible. A director could inspect, challenge, and if necessary override. At machine speed, that model has broken. The Board is no longer governing individual decisions. It is governing the system that makes them.

In previous work on decision analytics, I drew the distinction between lagging, leading, and predictive indicators. Lagging indicators confirm what has already happened. Leading indicators signal directional movement based on patterns that tend to precede particular outcomes. Predictive indicators go further, modelling probable futures under varying conditions and giving a Board a structured view of what the data suggests lies ahead. Each type is more forward-looking than the last. But all three remain within the domain of probability. They report, signal, or estimate. None of them prove.

Probable is not the same as provable. A predictive model might tell a Board that the probability of a compliance breach is elevated. That is useful. It is not the same as being told that a compliance breach is mathematically impossible given the current configuration of the system, or that it is guaranteed unless a specific parameter changes. One statement expresses confidence. The other expresses certainty. For a Board making a fiduciary decision and signing their name to it, those are not equivalent.

This is where automated reasoning enters: not as an improvement on the probabilistic model, but as a different class of instrument entirely.

What automated reasoning actually is

Automated reasoning is the application of formal logic, constraint solving, and mathematical proof to determine what is provably true within a defined system. Where a predictive model works by pattern-matching against historical data to produce a probability estimate, an automated reasoning engine exhaustively analyses every possible state a system can occupy, proving which outcomes are reachable and which are not.

The practical difference matters. A predictive model can tell you that 94% of past cases complied with a rule. An automated reasoning engine can tell you that the rule cannot be violated under any combination of inputs, because it has proved that no such combination exists. These are not more or less accurate versions of the same thing. They are answers to different questions.
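The distinction can be made concrete in miniature. The sketch below is an invented toy: the payment rule, the limit, and the names are assumptions for illustration, not drawn from any real system. Instead of sampling past cases, it enumerates every state a small finite system can occupy and checks a Board-style invariant in each one. Zero violations here is not a statistic; it is a proof over that state space.

```python
from itertools import product

# A deliberately tiny model. The payment rule, limit, and names below are
# invented for illustration, not drawn from any real system.
LIMIT = 10_000

def system_decides(amount, approvers):
    # The implemented decision logic: execute if within the single-approver
    # limit, or if two distinct approvers have signed off.
    return amount <= LIMIT or len(set(approvers)) >= 2

def invariant_holds(amount, approvers, executed):
    # Board-approved invariant: no over-limit payment may execute without
    # two distinct approvers.
    if executed and amount > LIMIT:
        return len(set(approvers)) >= 2
    return True

# Exhaustively enumerate the full (finite) input space: every combination,
# not a sample of past cases.
amounts = range(0, 20_001, 2_500)
approver_sets = [("alice",), ("alice", "alice"), ("alice", "bob")]

violations = [
    (a, ap)
    for a, ap in product(amounts, approver_sets)
    if not invariant_holds(a, ap, system_decides(a, ap))
]

print("violations found:", len(violations))  # prints: violations found: 0
```

Real automated reasoning engines do this symbolically, with SAT and SMT solvers, rather than by brute-force enumeration. That symbolic step is what makes the approach tractable for systems whose state spaces are astronomically large.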

The underlying mathematics is not new. Formal specification languages, Z notation among them, were developed to enable precisely this kind of proof-writing: specifying system behaviour in terms of preconditions, postconditions, and invariants, then proving those properties hold. Writing those specifications by hand was expert, time-intensive work. Automated reasoning tools now handle the proof-generation mechanically, at the speed of computation.
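For readers who want a feel for what a precondition, postcondition, and invariant look like, here is an informal, executable analogue of a Z-style operation schema, sketched in Python with invented names. Z would state these properties declaratively and prove them; the assertions below merely check them at runtime, but the shape of the specification is the same.

```python
def withdraw(balance: int, amount: int) -> int:
    """An account withdrawal specified with an explicit precondition,
    postcondition, and invariant. The example is invented for illustration."""
    # Precondition: the requested amount must be non-negative and available.
    assert 0 <= amount <= balance, "precondition violated"

    new_balance = balance - amount

    # Postcondition: exactly the requested amount has been debited.
    assert new_balance == balance - amount, "postcondition violated"
    # Invariant: an account balance can never go negative.
    assert new_balance >= 0, "invariant violated"
    return new_balance

print(withdraw(100, 30))  # prints: 70
```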

Think of it this way. A predictive model is like a sophisticated weather forecast: it tells you there is a 94% chance of rain. Automated reasoning is the mathematical proof that, given the current atmospheric constraints, rain is impossible. One expresses confidence. The other expresses certainty. The Board that has lived with probability its entire working life will feel the difference immediately.

The Board does not need to understand the mathematics. They do need to understand the scope and limits of what has been proven: which constraints were formally specified, which were not, and what falls outside the boundary of the proof. Within the domain where the constraint has been formally specified, the outcome is not probable. It is provable. But a proof is only as good as its specification. A perfectly verified system built on an incomplete or incorrect specification is a perfectly verified wrong answer. That makes the quality of the specification itself a first-order governance concern, not a technical detail to be delegated.

And the boundary is real: automated reasoning operates only within domains that can be formally specified, and formalising constraints that currently exist as policy documents or management judgement is itself non-trivial work. It does not replace human judgement on strategy, culture, ethics in context, or stakeholder relationships. The Board remains the decision-maker. Automated reasoning upgrades the quality of the ground on which that decision is made.

Where it is already operating

Automated reasoning is not a theoretical possibility. It is already embedded in regulated industries precisely because the stakes of being wrong are too high to rely on probability alone.

Civil aviation has required formal verification of safety-critical flight control software for decades. The certification standard DO-178C, supplemented by DO-333 (the Formal Methods Supplement), mandates formal verification for Level A software: the classification reserved for functions whose failure would be catastrophic. This is not a pilot programme. It is a regulatory requirement that has produced a generation of formally verified systems aboard commercial aircraft worldwide. The same standard of assurance that governs whether an aircraft’s flight control software is safe to fly is beginning to shape expectations for enterprise systems operating in regulated environments. Extensive testing has not been sufficient in aviation for thirty years.

Major financial institutions are applying formal verification to prove that capital adequacy calculations comply with Basel III/IV regulatory requirements not merely across tested scenarios, but across the full state space of possible input combinations. A Board that can demonstrate mathematical proof of compliance across every possible input is in a categorically different position to one that can demonstrate extensive testing. The EU AI Act (2024) has accelerated this expectation: Articles 9–15 impose documentation, transparency, and human oversight obligations on high-risk AI systems, obligations that probabilistic assurance cannot fully satisfy but that automated reasoning is specifically designed to meet.

Hospital formulary systems have long embedded rule-based constraint engines for drug interaction checking. These evaluate combinations against formally encoded clinical rules and return a definitive result. When the consequence of error is patient harm, deterministic constraint checking has displaced probabilistic alert systems. This is not formal verification in the full mathematical sense of the aviation example, but it represents the same underlying logic: in high-consequence domains, certainty has superseded probability because the liability demands it.

Insurers encode underwriting criteria as formal constraint systems, enabling them to prove that a pricing decision is consistent with approved policy across all customer segments simultaneously. Pricing consistency is a regulatory, reputational, and fiduciary matter. A Board that can demonstrate provable consistency across its entire book is in a stronger position before a regulator than one relying on sampling-based review. Stanford CodeX (2023) has documented this emerging practice in the context of automating insurance contract reasoning.

The TLS 1.3 cryptographic protocol, which underpins secure communication across the internet, was formally verified prior to deployment. Bhargavan et al. (2017), published in the IEEE Symposium on Security and Privacy, documented the verified models and reference implementations that preceded the final standard. The secure communications infrastructure your organisation relies upon every day was considered too important to trust to probability. The same logic is migrating into enterprise governance.

In each of these domains, automated reasoning has been adopted not because probability was unavailable, but because the consequence of being wrong made probabilistic assurance insufficient. That logic now applies directly to Boards governing AI systems operating at machine speed.

Where automated reasoning could strengthen Board oversight

The examples above show what is possible when the cost of error is catastrophic. The same discipline is now tractable for the governance questions every Board already owns.

Consider contractual obligations. Boards approve standard terms, risk allocations, and compliance clauses that flow through thousands of contracts. Formal encoding of those approved terms as computable rules allows an organisation to prove, rather than sample, that every contract in the portfolio satisfies Board-approved requirements. Surden (2017) and the MIT Computational Law Report both document this emerging discipline. The same logic applies to remuneration: Boards approve pay frameworks, but consistency across all pay decisions is typically monitored through periodic review and sampling. Encoding the approved framework as formal constraints and verifying exhaustively across the entire population gives a remuneration committee a meaningfully stronger quality of assurance.

The pattern extends to internal controls. Boards set policies on delegated authority, segregation of duties, and conflicts of interest, currently enforced through process design and periodic audit. Automated reasoning allows those policies to be verified exhaustively, proving that no combination of inputs can produce a violation under the current system configuration. Under SOX 404 and equivalent international standards, SMT-based verification can prove that no user, role, or automated process can combine actions that create an unacceptable control weakness. At the strategic level, Boards review operating models that integrate financial, operational, risk, and compliance perspectives, where internal contradictions are rarely visible through conventional review. Academic work applying formal methods to enterprise modelling, including research by de Kinderen et al., has demonstrated that tools such as Alloy can prove the absence of such contradictions.
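The segregation-of-duties case can be sketched in miniature. The role names and toy policy below are invented for illustration; a real check runs over thousands of roles and permissions, typically with an SMT solver rather than enumeration. The point is the shape of the answer: exhaustive checking either proves the control cannot be broken, or returns the exact counterexample, here the assignment that combines payment creation with payment approval.

```python
from itertools import product

# Invented role model for illustration only.
ROLE_PERMISSIONS = {
    "clerk":    {"create_payment"},
    "approver": {"approve_payment"},
    "auditor":  {"view_ledger"},
}

# The Board-approved control: no single user may hold both of these.
CONFLICTING = {"create_payment", "approve_payment"}

def permissions(roles):
    # Union of the permissions granted by a set of roles.
    return set().union(*(ROLE_PERMISSIONS[r] for r in roles))

# Toy policy: each user holds at most two roles. Enumerate every possible
# assignment (the full state space) and check the control in each one.
assignments = {frozenset(c) for c in product(ROLE_PERMISSIONS, repeat=2)}

weaknesses = [sorted(a) for a in assignments if CONFLICTING <= permissions(a)]
print(weaknesses)  # prints: [['approver', 'clerk']], the exact counterexample
```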

These applications share a common pattern. They do not require the Board to become mathematicians. They require the organisation to formalise the constraints the Board has already approved, then let the mathematics do the exhaustive checking. The accountability for the decision remains with the Board. What changes is the quality of assurance on which that decision rests. In each case, the shift is the same: from sampling to completeness, from interpretation to constraint, from confidence to proof.

Each of these domains is a candidate source for a fourth type of indicator. Where automated reasoning is applied to Board-governed constraints, the output is not a probability estimate. It is a provable property.

Reasoned indicators: the fourth type

In previous work, the distinction between lagging, leading, and predictive indicators was established as a framework for Board-level decision-making. A fourth type is now emerging, and automated reasoning is its engine.

A reasoned indicator does not forecast. It proves. Where a predictive indicator might express that the probability of a compliance breach is elevated, a reasoned indicator expresses that a breach is mathematically impossible given the current constraint configuration, or that it will occur unless a specified condition changes. Those are not more accurate predictions. They are statements of a different order.

The four types can be stated cleanly. Lagging indicators measure what happened. Leading indicators signal directional movement based on patterns that tend to precede outcomes. Predictive indicators anticipate probable futures. Reasoned indicators prove what is possible, what is impossible, and what must hold true.

The governance implication is that Boards now have access to four qualities of input rather than three, and the fourth is categorically more powerful for the class of questions it can address. Not all questions can be formally specified. But for those that can, reasoned indicators change the nature of Board assurance from statistical confidence to mathematical certainty.

Accountability remains with the Board

Automated reasoning does not make the decision. It changes the epistemic quality of the ground on which the Board makes it.

A Board that previously had to decide on the basis of evidence and probability can now, in certain domains, decide on the basis of proof. That is a meaningful improvement in the conditions under which they exercise fiduciary judgement, and in their ability to defend that judgement to shareholders, regulators, and courts.

The accountability does not transfer to the system. It remains, as it always has, with the directors who sit around the table. What automated reasoning changes is not who is responsible. It is how much they can know, with certainty, before they accept that responsibility.

That distinction is worth stating plainly. Boards are not being asked to trust a machine’s decision. They are being offered a quality of assurance that makes their own decision better informed, more defensible, and more clearly theirs.

The discipline is the same; the access has changed

The arc from 1998 to the present is not a story about new mathematics. Z notation, satisfiability solvers, and model checkers rest on foundations laid decades ago. What has changed is who can access them, at what scale, and at what cost.

Boards that understand this are not adopting a technology. They are upgrading the quality of input available to them when they exercise the judgement that remains irreducibly theirs. At a moment when AI systems are making more decisions, faster, than any Board can observe, that upgrade is the foundation on which credible governance at machine speed is built.

Are the assurances your organisation provides to its Board based on probability, or proof? And do you know which parts of your decision-making environment are ready to move from one to the other?

Let's Continue the Conversation

Thank you for reading about how automated reasoning is changing the quality of assurance available to Boards. I would welcome hearing about your organisation's experience with the boundary between probability and proof in governance. Are there domains where your Board already relies on deterministic constraint checking, even if you have not called it automated reasoning? Are you navigating regulatory expectations that are moving beyond testing toward demonstrable guarantees? Or are you still working out which of your Board-approved policies and controls could be formally specified and exhaustively verified? These are the governance questions that reasoned indicators are designed to address, and I am keen to hear how they are showing up in practice.