Agentic AI in M&A Due Diligence: What Works, What Hallucinates, and What's Next
20% of 2025's megadeals were AI-themed. Multi-agent systems are transforming deal analysis. But the gap between demo and production is where deals die.
Over 20% of 2025’s $5B+ megadeals had an AI thesis at their core. DiligenceSquared just launched voice agents for M&A research. Deloitte has published extensively on multi-agent systems for deal analysis. Every advisory firm is adopting hybrid AI tech stacks – general-purpose LLMs for productivity, purpose-built models for mission-critical workflows.
The momentum is real. But so is the gap between what these systems can demo and what they can do in production, under time pressure, across jurisdictions, with a deal team that needs to trust the output enough to stake their reputation on it.
I have spent years working on cross-border M&A strategy across five ASEAN markets. I have sat in data rooms at 1 AM, cross-referencing contract schedules against financial statements, trying to reconcile numbers that refuse to reconcile. I have seen what due diligence actually looks like – and I have a clear view of where AI agents help, where they hallucinate, and where human judgment remains irreplaceable.
This is a practitioner’s assessment, not a vendor pitch.
Where AI Agents Genuinely Work
There are areas of due diligence where agentic AI is genuinely transformative – high-volume, pattern-driven tasks that benefit from exhaustive coverage human teams cannot achieve under deal timelines.
Document Ingestion and Extraction
A typical mid-market deal data room contains 8,000 to 15,000 pages. Corporate records, financial statements, contracts, regulatory filings, tax returns, HR records, IP documentation. A junior analyst team of four might take three weeks to catalogue and extract key terms from that volume.
An AI agent does it in hours. Not days. Hours.
The agent ingests every document, classifies it by type, extracts key clauses, financial terms, dates, parties, and risk flags, and produces a structured index that the deal team can query. I have seen this reduce analyst hours in the initial data room review phase by 60-70%. On a recent cross-border engagement, the extraction agent identified 23 contracts with change-of-control provisions that the manual review had initially catalogued as "standard vendor agreements." Three of those contracts had termination triggers that would have materially affected the target's revenue base post-acquisition.
That is not incremental improvement. That is a different category of coverage.
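To make the classify-and-index step concrete, here is a deliberately minimal sketch of the structure such an agent produces. The provision patterns, document types, and field names are hypothetical, and a production system would use trained classifiers and LLM extraction rather than keyword matching, but the shape of the queryable index looks roughly like this:

```python
import re
from dataclasses import dataclass, field

# Hypothetical provision patterns; real systems use LLM extraction,
# not regex, but the output index has the same shape.
PROVISION_PATTERNS = {
    "change_of_control": re.compile(r"change\s+of\s+control", re.I),
    "exclusivity": re.compile(r"exclusiv", re.I),
    "termination_trigger": re.compile(r"terminat\w+\s+upon", re.I),
}

@dataclass
class IndexEntry:
    doc_id: str
    doc_type: str
    flags: list = field(default_factory=list)

def classify(text: str) -> str:
    # Crude type heuristic, stands in for a trained classifier.
    if "agreement" in text.lower():
        return "contract"
    if "balance sheet" in text.lower():
        return "financial_statement"
    return "other"

def index_document(doc_id: str, text: str) -> IndexEntry:
    entry = IndexEntry(doc_id, classify(text))
    for flag, pattern in PROVISION_PATTERNS.items():
        if pattern.search(text):
            entry.flags.append(flag)
    return entry

entry = index_document(
    "vendor_007.pdf",
    "Master Services Agreement. Either party may terminate upon a "
    "change of control of the other party.",
)
print(entry.doc_type, entry.flags)
```

The point of the structure, not the matching logic, is what matters: every document gets a type, a set of risk flags, and an identity the rest of the stack can query against.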
Financial Pattern Detection
This is where AI agents earn their keep on the quantitative side. Cross-referencing revenue recognition across reporting periods, identifying inconsistencies in working capital treatment, flagging unusual related-party transaction patterns – these are tasks that require comparing thousands of data points across multiple documents, exactly the kind of work where human attention degrades and AI attention does not.
On one engagement, the financial agent flagged that the target’s accounts receivable aging had shifted significantly in the two quarters before the deal was announced. The average collection period had extended by 18 days, but the revenue growth narrative in management presentations remained unchanged. That disconnect – between what the numbers showed and what the story claimed – became a central negotiation point. A human analyst would have caught it eventually. But “eventually” in a deal with a 45-day exclusivity window is not good enough.
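The collection-period drift check described above is simple to state precisely. Here is an illustrative sketch, with invented figures, of the kind of rule a financial agent applies: compute days sales outstanding (DSO) per quarter and flag a material extension across the period:

```python
# Illustrative check: DSO drift across quarters. Figures and the
# 10-day threshold are invented for the example.

def dso(receivables: float, revenue: float, days: int = 90) -> float:
    """Days sales outstanding for one quarter."""
    return receivables / revenue * days

def flag_dso_drift(quarters, threshold_days=10.0):
    """quarters: list of (receivables, revenue) tuples, oldest first."""
    series = [dso(ar, rev) for ar, rev in quarters]
    drift = series[-1] - series[0]
    return {
        "dso_series": [round(d, 1) for d in series],
        "drift_days": round(drift, 1),
        "flag": drift > threshold_days,
    }

# Receivables grow faster than revenue in the later quarters.
result = flag_dso_drift([
    (52.0, 100.0),
    (55.0, 104.0),
    (66.0, 108.0),
    (75.0, 112.0),
])
print(result["drift_days"], result["flag"])
```

The value is not the arithmetic, which any analyst can do for one company. It is running this check across every receivables line, every quarter, every entity in the data room, without attention fatigue.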
Market and Competitive Analysis
Synthesizing public filings, news flow, industry reports, and regulatory announcements into a coherent competitive positioning analysis – this is a natural fit for LLM-based agents. The agent can process hundreds of sources in parallel, identify thematic patterns, and produce a market map that would take a research team days to assemble manually.
The output is not perfect. It needs editing, contextualizing, and sense-checking. But as a first draft that captures 80% of the relevant landscape, it saves the research team two to three days. On a compressed deal timeline, those days matter more than people outside the process realize.
Contract Analysis at Scale
When you are reviewing 400 contracts in a data room, you are looking for specific provisions: change-of-control clauses, exclusivity terms, IP assignment gaps, non-compete scope, indemnification caps, limitation of liability carve-outs. An AI agent can scan all 400 contracts and flag every instance of these provisions, cross-referenced against a checklist, in a fraction of the time a legal team would take.
On one deal involving a target with operations across three ASEAN jurisdictions, the contract analysis agent identified that 12 key supplier agreements had been executed under the laws of a jurisdiction different from where the supplier actually operated. That mismatch created enforcement risk that the target’s management had not flagged. The legal team confirmed the finding and it became a condition precedent in the SPA.
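The governing-law mismatch above reduces to a check that is trivial to express once the data room agent has extracted structured fields. A hypothetical sketch, with invented contract records and field names:

```python
# Hypothetical extracted contract records. Jurisdiction codes and
# field names are illustrative, not from any real engagement.
CONTRACTS = [
    {"id": "SUP-014", "counterparty_jurisdiction": "ID", "governing_law": "SG"},
    {"id": "SUP-022", "counterparty_jurisdiction": "VN", "governing_law": "VN"},
    {"id": "SUP-031", "counterparty_jurisdiction": "TH", "governing_law": "SG"},
]

def governing_law_mismatches(contracts):
    """Flag contracts whose governing law differs from where the
    counterparty actually operates, a potential enforcement risk."""
    return [
        c["id"]
        for c in contracts
        if c["governing_law"] != c["counterparty_jurisdiction"]
    ]

print(governing_law_mismatches(CONTRACTS))  # ['SUP-014', 'SUP-031']
```

A mismatch is not automatically a problem, parties choose Singapore law deliberately all the time. The agent's job is exhaustive flagging; deciding which mismatches create real enforcement risk remains a legal judgment.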
Where AI Agents Fail Dangerously
Now the harder part. There are areas where AI agents fail in ways that are actively dangerous because the failures look plausible.
Cross-Jurisdiction Regulatory Analysis
This is the failure mode I worry about most. An agent analyzing a deal structure across Singapore, Indonesia, and India might confidently state that a particular holding structure is compliant with MAS guidelines, when in fact it violates a specific provision of the Securities and Futures Act that applies to the acquirer’s licensing status. The hallucination is not random nonsense. It is a coherent, well-structured legal analysis that happens to be wrong on a critical point.
The danger is amplified because the output reads like it was written by someone who knows the regulatory framework. It uses the right terminology, cites real regulations, and follows a logical structure. A deal team member who is not a specialist in Singaporean securities law might not catch the error. I have seen this happen – not in a live deal, fortunately, but in a test scenario that was realistic enough to be alarming.
Multi-jurisdictional regulatory analysis requires not just knowledge of individual frameworks but understanding of how they interact – bilateral investment treaties, double taxation agreements, transfer pricing rules, foreign exchange controls. Current AI agents do not reliably navigate these interactions. They are good at analyzing one framework in isolation. They are unreliable at analyzing five frameworks simultaneously and identifying the conflicts between them.
Earnings Quality Judgment
AI agents can flag unusual financial patterns with impressive accuracy. But flagging is not the same as judging.
Consider a target company in the agricultural inputs sector with highly seasonal revenue – 60% of annual sales concentrated in two quarters aligned with the kharif and rabi sowing seasons. A pattern-matching system sees lumpy revenue and flags it as a potential earnings quality concern. But anyone who has worked with agri-businesses in India knows this is completely normal. The flag is technically accurate (revenue is lumpy) but contextually meaningless (it is supposed to be lumpy).
The inverse is equally dangerous. A company with artificially smooth revenue – where the smoothness itself is the red flag, suggesting channel stuffing or bill-and-hold arrangements – might pass the AI’s pattern check because smooth revenue looks “healthy” to a system trained on the general principle that consistency is good.
Earnings quality assessment requires judgment about business models, industry dynamics, management incentives, and accounting policy choices. AI can surface the data. Humans must make the call.
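The agri-inputs example shows why a context-free lumpiness check misleads. A toy illustration, with invented sector baselines: the same revenue dispersion that is alarming for enterprise SaaS is entirely normal for a seasonal agri business, so a flag is only meaningful relative to a sector baseline:

```python
from statistics import mean, stdev

# Toy illustration of context-free vs sector-aware flagging.
# Baselines and thresholds are invented for the example.

def cv(series):
    """Coefficient of variation: revenue dispersion relative to mean."""
    return stdev(series) / mean(series)

# Hypothetical sector-typical quarterly seasonality.
SECTOR_BASELINE_CV = {"agri_inputs": 0.55, "enterprise_saas": 0.08}

def lumpiness_flag(quarterly_revenue, sector):
    company_cv = cv(quarterly_revenue)
    baseline = SECTOR_BASELINE_CV[sector]
    return {
        "company_cv": round(company_cv, 2),
        "naive_flag": company_cv > 0.3,                  # context-free
        "contextual_flag": company_cv > baseline * 1.5,  # sector-aware
    }

# 60% of sales concentrated in two quarters: lumpy, but normal
# for agri inputs aligned with sowing seasons.
agri = lumpiness_flag([30, 10, 30, 10], "agri_inputs")
print(agri["naive_flag"], agri["contextual_flag"])
```

Even the sector-aware version only moves the problem one level up: someone still has to judge whether the baseline is right, whether this company should track its sector, and whether suspiciously smooth revenue is the real red flag. That judgment does not reduce to a threshold.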
Relationship and Cultural Due Diligence
In India-ASEAN cross-border deals, some of the most important information never makes it into the data room. The quality of the promoter’s relationships with key regulators. Family dynamics that determine succession and strategic direction. Informal understandings with suppliers and customers that predate and sometimes override written contracts. The reputation of the business in its local market – not what Google says, but what the industry actually thinks.
I have worked on deals where the promoter’s relationship with a single government official was more material to the target’s value than anything in the financial statements. No AI agent can assess that. No AI agent can read the room in a management presentation and notice that the CFO’s body language changes when certain revenue streams are discussed.
This is not a temporary limitation that better models will solve. It is a fundamental boundary of what pattern-matching systems can do versus what human judgment, built on years of operating in these markets, provides.
Valuation
AI agents can run a DCF model mechanically. They can populate comparable company tables, calculate trading multiples, and produce a waterfall chart. The arithmetic is flawless.
But valuation is not arithmetic. It is judgment. What terminal growth rate is appropriate for an Indian SaaS company selling into ASEAN markets? What discount rate reflects the actual risk of a business that derives 40% of revenue from a single government contract? How much synergy can the acquirer realistically extract, given integration challenges across five jurisdictions?
These are questions where reasonable professionals disagree, and where the disagreement itself is informative. An AI agent that produces a single-point valuation with apparent confidence is not helpful. It is dangerous. Valuation in M&A is a range, a negotiation tool, and a judgment call. Treating it as a computation misses the point entirely.
Multi-Agent Architecture for Deal Analysis
The most promising approach I have seen is not a single AI system trying to do everything, but a multi-agent architecture where specialized agents handle specific workstreams and an orchestrator manages the interactions between them.
Here is the architecture that works for production-grade M&A due diligence:
```mermaid
graph TB
    DR[Data Room Agent<br/>Ingest + Classify + Index] --> FA[Financial Agent<br/>Models + Anomalies]
    DR --> LA[Legal Agent<br/>Contracts + Regulatory]
    DR --> MA[Market Agent<br/>Competitive + Industry]
    FA --> OR[Orchestrator<br/>Cross-validation + Conflict Detection]
    LA --> OR
    MA --> OR
    OR --> |"High confidence"| AUTO[Auto-summarize<br/>for Deal Team]
    OR --> |"Conflicts found"| ESC[Escalate to<br/>Human Review]
    OR --> |"Low confidence"| HUM[Flag for<br/>Expert Assessment]
```
Data Room Agent. Ingests all data room documents, classifies them by type and relevance, extracts structured data, and maintains a queryable index. This is the foundation layer. Every other agent depends on it.
Financial Agent. Builds financial models from extracted data, identifies anomalies and inconsistencies, performs trend analysis, and generates financial due diligence summaries. Operates on structured numerical data and is the most reliable agent in the stack.
Legal Agent. Reviews contracts for specific provisions, maps regulatory requirements across jurisdictions, identifies compliance gaps, and flags risk areas. Reliable for extraction and pattern-matching. Unreliable for regulatory judgment – this is where human oversight is most critical.
Market Agent. Synthesizes public information into competitive positioning analysis, market sizing, and industry trend assessment. Produces the first draft of the commercial due diligence narrative.
Orchestrator. This is the critical component. The orchestrator routes queries to appropriate agents, manages context windows, and – most importantly – flags conflicts between agent findings.
The cross-checking is where the architecture becomes powerful. The Financial Agent flags a revenue anomaly in Q3. The Orchestrator routes this finding to the Legal Agent, which checks whether any contracts with revenue implications were modified in that period. Simultaneously, the Market Agent validates whether the revenue pattern is consistent with industry benchmarks. If the Legal Agent finds a contract amendment and the Market Agent confirms the pattern is unusual for the sector, the finding is escalated with high confidence. If the Market Agent shows the pattern is industry-wide, the finding is contextualized as a sector trend rather than a company-specific concern.
This cross-validation between agents is what distinguishes a production system from a demo. A single agent can hallucinate. Multiple agents checking each other’s work are far less likely to hallucinate in the same direction.
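The orchestrator's routing rule described above can be sketched in a few lines. This is a minimal illustration with stubbed agent outputs, not a real orchestration framework; the function names and the 0.8 threshold are assumptions for the example:

```python
from dataclasses import dataclass

# Minimal sketch of the cross-validation routing described in the
# text. Agent outputs are stubbed as booleans; in a real system
# these come from the Legal and Market agents' own analyses.

@dataclass
class Finding:
    description: str
    confidence: float  # 0.0 to 1.0, from the originating agent

def route(finding: Finding,
          legal_corroborates: bool,
          market_says_sector_wide: bool) -> str:
    # Industry-wide patterns are context, not company-specific risk.
    if market_says_sector_wide:
        return "contextualize_as_sector_trend"
    # Independent corroboration plus high confidence: escalate hard.
    if legal_corroborates and finding.confidence >= 0.8:
        return "escalate_high_confidence"
    # Weak, uncorroborated signals go to a specialist, not the pile.
    if finding.confidence < 0.5:
        return "flag_for_expert_assessment"
    return "escalate_to_human_review"

f = Finding("Q3 revenue anomaly", confidence=0.85)
print(route(f, legal_corroborates=True, market_says_sector_wide=False))
# -> escalate_high_confidence
```

The design choice worth noting is that no single agent decides the escalation path. The routing is a function of multiple agents' independent outputs, which is exactly why correlated hallucination becomes less likely.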
The India-ASEAN Lens
This is where my direct experience sits, and where the practical challenges are sharpest. The India-ASEAN financial corridor is one of the most active cross-border deal environments in the world.
India-ASEAN bilateral trade has crossed $130 billion. The deal flow across this corridor is accelerating – outbound Indian acquisitions into Southeast Asia, inbound ASEAN investment into India, and complex multi-jurisdiction structures that touch both regions.
The challenge for AI systems in this corridor is formidable. Five jurisdictions (at minimum – many deals touch Singapore, Indonesia, Vietnam, Thailand, and the Philippines). Five regulatory regimes with different disclosure requirements, foreign ownership restrictions, and approval processes. Five languages of legal documentation, often with concepts that do not translate cleanly.
Then add GIFT City IFSC into the mix. India's International Financial Services Centre operates under a unique regulatory environment – a blend of Indian law with international financial centre principles, administered by IFSCA rather than RBI or SEBI. AI models have not been trained on this framework in any meaningful way. The regulations are new, evolving, and often clarified through circulars and FAQs rather than through formal legislation. An AI agent that treats GIFT City as simply "India" or simply "offshore" will get the analysis wrong.
The deeper issue is that in these markets, local context matters more than AI sophistication. Knowing that a Vietnamese regulatory approval will take six months, not the two months that the regulation technically allows. Understanding that an Indonesian partner’s political connections are an asset in one administration and a liability in the next. Recognizing that an Indian promoter’s verbal commitment carries weight in a business culture where personal reputation is a form of contractual obligation.
No amount of training data teaches an AI system these things. They come from years of operating in these markets, from relationships built over dozens of deals, from pattern recognition that is human, not artificial.
The Governance Requirement
If you deploy AI agents in M&A due diligence without a governance framework, you are building a liability, not a tool. Here is what production-grade governance looks like:
Source traceability. Every AI-generated finding must trace back to specific source documents. Not “based on the data room” – based on this page of this document, uploaded on this date. When a deal team presents findings to a board or a regulator, they need to show their work. An AI assertion without a source citation is worthless in a professional context.
Confidence scoring. Every assertion needs a confidence indicator. “The target has 47 contracts with change-of-control provisions [high confidence, based on direct extraction]” is useful. “The deal structure is likely compliant with Indonesian foreign ownership rules [medium confidence, based on general regulatory knowledge]” tells the team where to focus human review. Without confidence scores, the team treats all AI output with equal weight, which is exactly wrong.
Human handoff points. The system must clearly define where AI analysis ends and human judgment begins. Document extraction and pattern flagging? AI-led with human spot-checks. Regulatory compliance assessment? AI-assisted but human-led. Valuation judgment calls? Human-only, with AI providing computational support. These boundaries need to be explicit and enforced.
Audit trail. If a deal goes wrong post-close – if an undisclosed liability surfaces, if a regulatory non-compliance emerges, if a financial irregularity was missed – the acquirer’s board and their regulators will ask what due diligence was performed and how. An AI system without a complete audit trail of what it analyzed, what it found, what it missed, and where humans overrode its recommendations is a governance failure waiting to happen.
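The four requirements above converge on a single data shape: a finding is not a sentence, it is a record. A sketch of what that record might look like, with hypothetical field names, covering source traceability, confidence, ownership boundaries, and the override trail:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Illustrative schema for a governed AI finding. Field names and
# values are hypothetical; the point is what a finding must carry.

@dataclass
class SourceCitation:
    doc_id: str
    page: int
    uploaded_at: str  # ISO date the document entered the data room

@dataclass
class GovernedFinding:
    assertion: str
    confidence: str   # "high" | "medium" | "low"
    sources: list     # SourceCitation objects: no source, no finding
    owner: str        # "ai_led" | "ai_assisted" | "human_only"
    audit_log: list = field(default_factory=list)

    def record_override(self, reviewer: str, note: str):
        # Human reviews and overrides are appended, never edited.
        self.audit_log.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "reviewer": reviewer,
            "note": note,
        })

finding = GovernedFinding(
    assertion="47 contracts contain change-of-control provisions",
    confidence="high",
    sources=[SourceCitation("contracts/msa_012.pdf", page=14,
                            uploaded_at="2025-03-02")],
    owner="ai_led",
)
finding.record_override("j.tan", "Confirmed via manual spot-check of 10 contracts")
print(finding.confidence, len(finding.audit_log))
```

If the board or a regulator asks what diligence was performed, this record is the answer: what was asserted, on what evidence, at what confidence, and who checked it.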
What Is Next
The trajectory is clear. AI agents will become standard infrastructure in M&A due diligence within the next two to three years. The firms that are building this capability now – not buying vendor demos, but building production systems with proper governance – will have a structural advantage.
But the trajectory also includes a reckoning. Somewhere in the next 18 months, a deal will go wrong because a team over-relied on AI due diligence without adequate human oversight. A hallucinated regulatory analysis that was not caught. A pattern-matching system that missed a fraud because the fraud was designed to look normal. That incident will define the regulatory response, and the firms that already have governance frameworks in place will be vindicated.
The right approach is neither AI maximalism nor AI skepticism. It is knowing, concretely, where these systems earn their keep and where they create liability.
The governance patterns here overlap heavily with broader agent governance challenges in enterprises – the same principles of bounded autonomy, reasoning capture, and audit trails apply. For the underlying model risk frameworks that should wrap around any AI system making financial assessments, see building AI-native risk frameworks for Indian banks.
One more blunt opinion: the firms that will win in AI-assisted M&A are not the ones with the most sophisticated models. They are the ones that know exactly where to stop trusting the model and start trusting the person who has done thirty deals in the same sector. That judgment – knowing the boundary – is itself a form of expertise that no system can replicate.