Your Audit Log Knows What the Agent Did. Not Why.
Phil Bolton · June 12, 2026 · 3 min read
A founder I work with let an AP agent handle anything under $2,000. Capture the invoice, match it to the PO, post it, queue the payment. Eight months, no incidents, and his close dropped from nine days to four. Then his lender's annual review asked for support on an $1,800 payment to a vendor nobody on the team recognized. He pulled the agent's log. It showed him the exact second the payment fired. It couldn't tell him why.
His log was complete. It just answered the wrong question.
A list of actions is not a record of decisions
Walk through what a typical vendor "audit trail" actually captures. Invoice received at 14:02. Matched at 14:02. Posted at 14:03. Payment queued. Every event is there, stamped and ordered, and for the first eight months that felt like control.
It isn't. The log records that the agent acted. It doesn't record what the agent was looking at when it decided to. Was there a PO, or did it match on a fuzzy vendor name? Did a duplicate-payment check run? Was this vendor new that week? The events are real, but they're a receipt, not a reason. When your lender or your auditor asks "why did the system pay this," a column of timestamps is not an answer.
Regulators just moved the bar
On June 10 the Financial Stability Board published a consultation report on agentic AI in finance. The draft is non-binding, and a $9M company isn't its target. But read what it asks institutions to be able to show: what the agent was allowed to do, what data and tools it accessed, where human approval was required, and what record remains.
Notice that only the last item is in your vendor's log today. The first three are the ones that turn a payment into a decision you can defend. Big banks will get there because examiners will force it. The question for a growing company is who forces it for you, and the answer is usually nobody, until a lender review or a fraud case asks the question your log can't field.
An action log tells you the agent paid the invoice. A reasoning trace tells you the rule it fired, the inputs it read, and the check it ran first. One is a receipt. The other is a control.
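To make the difference concrete, here is a minimal sketch of the two record shapes in Python. Every field name and value is an assumption made for illustration. None of it comes from a real vendor's schema. The second record simply carries the items listed above: what the agent was allowed to do, what it read, which checks ran, and whether a human had to approve.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class ActionEvent:
    """What a typical action log keeps: what happened, and when."""
    timestamp: datetime
    action: str        # e.g. "payment_queued"
    invoice_id: str


@dataclass(frozen=True)
class DecisionTrace:
    """Hypothetical reasoning trace: the action plus the basis for it."""
    event: ActionEvent
    mandate: str               # what the agent was allowed to do
    rule_fired: str            # the rule that triggered the action
    inputs_read: dict          # the data the agent was looking at
    checks_run: tuple          # checks that ran before the action
    approval_required: bool    # whether a human had to sign off
    approver: Optional[str] = None


# Illustrative values only; the date and identifiers are invented.
event = ActionEvent(datetime(2026, 1, 15, 14, 3), "payment_queued", "INV-0001")

trace = DecisionTrace(
    event=event,
    mandate="auto-pay invoices under $2,000",
    rule_fired="po_match_exact",
    inputs_read={"po_number": "PO-0001", "vendor_is_new": False},
    checks_run=("duplicate_payment",),
    approval_required=False,
)

print(event)   # the receipt
print(trace)   # the receipt plus the reason
```

The first record can tell a lender when the payment fired. Only the second can say what the agent was looking at when it decided to.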
Ask before you widen the mandate
Pulling the agent isn't the fix. It bought back five days a month, and that's real. The fix is to demand the reasoning trace before you raise the threshold from $2,000 to $10,000.
Pick one payment from last month and ask your vendor to reconstruct it end to end: the rule that fired, the data it matched against, the duplicate check, the point where a human could have intervened and didn't. If they can rebuild a single payment cleanly, raise the limit. If they hand you a timestamp and call it an audit trail, you don't have a control. You have a faster way to lose the same argument.
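If your vendor can export a payment record, the same test can be run as a checklist. The sketch below assumes the export arrives as a plain dictionary and that the four pieces of the reconstruction live under the hypothetical keys shown. A real export will use its own names. It needs Python 3.9 or later.

```python
# Hypothetical keys for the four things the vendor should be able to rebuild.
REQUIRED = (
    "rule_fired",        # the rule that fired
    "matched_against",   # the data the invoice was matched against
    "duplicate_check",   # the duplicate-payment check and its result
    "human_checkpoint",  # where a human could have intervened
)


def missing_from_trace(record: dict) -> list[str]:
    """Return the required elements that are absent or empty in a record."""
    return [key for key in REQUIRED if record.get(key) is None]


# A record that holds only a timestamp and an action fails on all four.
timestamp_only = {"timestamp": "14:03", "action": "payment_queued"}

# A record that can be rebuilt end to end passes.
reconstructable = {
    "timestamp": "14:03",
    "action": "payment_queued",
    "rule_fired": "po_match_exact",
    "matched_against": {"po_number": "PO-0001"},
    "duplicate_check": {"ran": True, "duplicate_found": False},
    "human_checkpoint": {"required": False},
}

print(missing_from_trace(timestamp_only))
print(missing_from_trace(reconstructable))
```

An empty list means the payment can be reconstructed. Anything else names the gap to raise with the vendor before the limit goes up.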
The agent will keep working fine right up until someone asks why. Make sure the answer exists before the question does.

Phil Bolton
Founder & Principal at Manitou Advisory