From Generative AI Assistants to Autonomous Agents: The Operational Shift in Indian Insurance
The first wave of generative AI adoption in Indian insurance, spanning roughly 2023 and 2024, was dominated by retrieval-augmented chatbots and drafting assistants. An underwriter asked a question and received a summary. A claims executive pasted a policy wording and received an interpretation. The AI produced text; a human took every consequential action. By late 2025, the conversation inside HDFC Ergo, ICICI Lombard, Go Digit, Bajaj Allianz, and Tata AIG had moved on. The new category of system being piloted is agentic AI: software that does not merely generate an answer but plans a sequence of steps, invokes tools, reads and writes into policy administration and claims systems, and completes multi-step workflows with minimal human intervention.
The functional difference is material. A generative assistant answering 'what documents does this marine claim require' produces a checklist that a claims executive must then act on. An agentic system receiving a first notification of loss on a marine open cover policy parses the loss notice, reconciles it against the declared shipment, queries the port authority API for vessel arrival data, drafts a document request letter to the insured, files it through the insurer's correspondence system, logs the claim in the claims management platform, and schedules a survey appointment with a panel surveyor. Seven discrete steps, none of which required a human to touch a keyboard.
The business case driving this shift is not speculative. Indian general insurers process approximately 4 to 5 crore claims annually across motor, health, and commercial lines. Claims handling costs average 8 to 12% of incurred claims. Even a 20% reduction in operational handling time on routine FNOL and document collection workflows represents annual savings in the range of INR 2,000 to 3,500 crore across the industry. Agentic systems promise this reduction by compressing the time between claim intimation and first substantive action from an industry average of 48 to 72 hours down to under 4 hours for straightforward cases.
Agent Architectures: ReAct, Plan-and-Execute, and Multi-Agent Orchestration
The technical patterns underpinning agentic insurance systems are not monolithic. Three architectures dominate current deployments, each with different tradeoffs for insurance workflows.
The ReAct pattern (reasoning and acting in an interleaved loop) is the simplest. The agent receives an instruction, thinks about the next step, calls a tool, observes the result, thinks again, and continues until the task is complete. ReAct agents are well suited to short, reactive workflows such as document classification, coverage lookup, or claim status queries, where the number of steps is small and each step's outcome can be evaluated before the next. Go Digit's claims triage assistant, deployed in early 2026 for motor own-damage claims below INR 1 lakh, operates on a ReAct loop with a maximum of eight tool invocations per claim.
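The interleaved loop can be sketched in a few lines. This is a minimal illustration, not any insurer's implementation: the model call is stubbed with a scripted policy, and the tool and claim names are invented. A production agent would pass the running transcript to an LLM and parse its chosen action.

```python
# Minimal ReAct-style loop: reason -> act -> observe, capped at a tool budget.
# stub_model stands in for the LLM; here it issues one claim-status lookup,
# then finishes using the observation it received.

MAX_TOOL_CALLS = 8  # per-task cap, as in the triage example above

def lookup_claim_status(claim_id):
    return "survey_scheduled"  # stubbed claims-system query

TOOLS = {"lookup_claim_status": lookup_claim_status}

def stub_model(transcript):
    # Scripted policy standing in for an LLM call over the transcript.
    if not any(kind == "act" for kind, *_ in transcript):
        return ("act", "lookup_claim_status", "CLM-001")
    return ("finish", "Claim CLM-001 is at stage: " + transcript[-1][1])

def react_loop(task):
    transcript = [("task", task)]
    for _ in range(MAX_TOOL_CALLS):
        step = stub_model(transcript)            # reason about the next step
        if step[0] == "finish":
            return step[1]
        _, tool_name, arg = step
        observation = TOOLS[tool_name](arg)      # act, then observe
        transcript.append(("act", observation))
    return "escalate: tool budget exhausted"     # hard stop at the cap
```

The hard cap matters as much as the loop itself: it bounds worst-case cost and stops an agent from thrashing indefinitely on a task it cannot complete.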
The plan-and-execute pattern separates reasoning from action. A planner module, typically a larger language model with deeper reasoning capability, produces a complete step-by-step plan before any tool is invoked. An executor module then carries out each step, with the plan serving as a contract that can be reviewed by a human before execution begins. This architecture is preferred for longer workflows such as SME underwriting file preparation, where an agent might need to pull the applicant's GST returns, MCA filings, CIBIL commercial score, prior loss history from the IIB database, and external news mentions before producing a risk summary. ICICI Lombard's SME underwriting co-pilot, piloted with a set of broker partners in Q4 2025, uses a plan-and-execute architecture and requires a human underwriter to approve the plan before data collection starts.
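The separation of planner from executor looks roughly like this. The planner and the data-source calls (GST, MCA, CIBIL, IIB) are stubs invented for illustration, and a boolean flag stands in for the underwriter's review of the plan.

```python
# Plan-and-execute sketch: the planner emits the full step list up front,
# and the executor runs nothing until a human has approved the plan.

def plan_sme_underwriting(applicant_id):
    # Planner: in production a reasoning model; here a fixed plan.
    return [("fetch_gst_returns", applicant_id),
            ("fetch_mca_filings", applicant_id),
            ("fetch_cibil_commercial", applicant_id),
            ("fetch_iib_loss_history", applicant_id),
            ("draft_risk_summary", applicant_id)]

# Stubbed results for each step, in place of real data-source integrations.
STUB_RESULTS = {name: f"{name}: ok" for name, _ in plan_sme_underwriting("X")}

def execute(plan, human_approved):
    if not human_approved:      # the plan is a contract; no tools run
        return None             # until the underwriter signs off
    return [STUB_RESULTS[step] for step, _ in plan]

plan = plan_sme_underwriting("APP-42")
results = execute(plan, human_approved=True)
```

The plan-as-contract design is what makes this architecture reviewable: the human sees every intended data pull before a single external call is made.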
Multi-agent orchestration is the most complex pattern. Multiple specialised agents with different roles (an intake agent, a coverage interpretation agent, a reserve recommendation agent, a reinsurance cession agent) coordinate through a shared memory and a supervisor agent. HDFC Ergo's internal experiments with multi-agent claims orchestration on large commercial property claims have shown promise but have also exposed the primary weakness of this pattern: coordination failures where agents work at cross purposes, or cascading hallucinations where one agent's incorrect output propagates through the chain. The 2025 pilots settled on architectures that limit autonomy to well-scoped sub-tasks, with a human supervisor reviewing hand-offs between agents.
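The constrained form these pilots converged on can be sketched as a supervisor that runs each specialised agent over a shared memory and pauses for human review at every hand-off. Agent logic, roles, and figures below are stubs for illustration only.

```python
# Supervisor hand-off sketch: specialised agents share a memory dict; the
# supervisor stops the chain whenever the human reviewer rejects a hand-off,
# limiting each agent's autonomy to its own well-scoped sub-task.

def intake_agent(memory):
    memory["claim"] = {"peril": "fire", "estimate": 5_000_000}

def coverage_agent(memory):
    memory["coverage_view"] = "fire is a covered peril"

def reserve_agent(memory):
    memory["reserve_reco"] = memory["claim"]["estimate"]

PIPELINE = [intake_agent, coverage_agent, reserve_agent]

def supervise(memory, human_review):
    for agent in PIPELINE:
        agent(memory)                                 # scoped sub-task
        if not human_review(agent.__name__, memory):  # hand-off checkpoint
            return f"halted after {agent.__name__}"
    return "completed"

audit = []
status = supervise({}, lambda name, mem: audit.append(name) or True)
```

Because each hand-off is a checkpoint, an incorrect output from one agent can be caught before it propagates to the next, which is precisely the cascading-hallucination failure mode described above.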
Tool use: what separates agents from chatbots
The common thread across these architectures is tool use. An agent without tools is just a chatbot. The productive capability of an agent comes from the set of tools it can invoke: database queries, API calls to external services, document retrieval from the insurer's content management system, write operations into the claims or policy administration platform, email and SMS dispatch to insureds and intermediaries. The safety of the agent is determined by which tools it can call, under what conditions, and with what pre-execution checks.
Action Safety Rails: The Hardest Engineering Problem in Production Agentic Systems
The step from a generative AI that produces text to an agentic AI that takes action is not incremental. Text can be reviewed before it reaches a customer. An action, once executed, cannot be unreviewed. A payment released, a policy endorsed, a claim denied, a reinsurance notification sent. Each of these can have downstream consequences that are expensive or impossible to reverse. The engineering effort required to make agents safe for production deployment in insurance operations is substantially larger than the effort to make them capable.
The first rail is scope restriction. Agents in Indian insurance production environments are constrained to a narrow set of permissible actions defined by the insurer's authority matrix. A claims triage agent may read claim data, request documents from insureds, schedule surveyor assignments, and update claim status to 'document pending' or 'survey scheduled.' It may not approve payment, change the sum insured, waive an exclusion, or close a claim. These constraints are enforced at the tool layer: the claims payment API is simply not exposed to the agent. Bajaj Allianz's deployment of a document request agent on health claims, live since November 2025, uses a tool registry that maps each agent role to a whitelist of API endpoints, with any attempt to call an endpoint outside the whitelist logged and blocked.
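A minimal version of such a registry looks like the following. The role and endpoint names are invented for illustration and are not any insurer's actual schema.

```python
# Tool-registry sketch: each agent role maps to an endpoint whitelist;
# off-list calls are logged and blocked before dispatch, enforcing the
# authority matrix at the tool layer rather than in the prompt.

WHITELIST = {
    "claims_triage": {"read_claim", "request_documents",
                      "schedule_survey", "update_claim_status"},
}

blocked_log = []

def invoke(role, endpoint, payload):
    allowed = WHITELIST.get(role, set())
    if endpoint not in allowed:
        blocked_log.append((role, endpoint))   # log, then block
        raise PermissionError(f"{role} may not call {endpoint}")
    return f"{endpoint} executed"              # stubbed API dispatch

result = invoke("claims_triage", "request_documents", {"claim": "CLM-001"})
try:
    invoke("claims_triage", "approve_payment", {})  # payment API not exposed
except PermissionError:
    pass
```

The key design choice is that the constraint lives outside the model: no prompt injection can grant access to an endpoint the registry never exposes.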
The second rail is pre-execution validation. Before an agent executes a tool call with material consequences, a validator layer checks the call against business rules. A document request agent about to send a letter demanding 12 documents from an insured might be blocked if the request list exceeds the insurer's standard document requirement for that claim type. A reserve recommendation agent setting a provisional reserve outside a specified range triggers a human approval workflow. The validator can be rule-based, model-based, or a combination; most Indian deployments use a layered approach with a rules engine as the first check and a classifier model as a secondary check for edge cases.
The third rail is confidence gating. Agentic actions are authorised only when the agent's confidence in its own plan exceeds a threshold. Confidence is measured through model log probabilities, through consistency across multiple sampled plans, or through a separate critic model that evaluates the primary agent's output. Low-confidence outputs are escalated to human review. The thresholds vary by action type: a document classification step might proceed at 80% confidence, while any action that affects policyholder communication or reserve levels might require 95% confidence or a human override.
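Using self-consistency, one of the measures named above, the gate can be sketched as follows: confidence is the share of sampled plans agreeing with the modal answer, checked against a per-action-type threshold. Thresholds mirror the examples in the text.

```python
# Confidence-gating sketch: low-agreement outputs escalate to human review;
# riskier action types carry higher thresholds.

from collections import Counter

THRESHOLDS = {"classify_document": 0.80,
              "send_policyholder_letter": 0.95}

def consistency_confidence(sampled_outputs):
    # Modal answer and the fraction of samples that agree with it.
    top, count = Counter(sampled_outputs).most_common(1)[0]
    return top, count / len(sampled_outputs)

def gate(action_type, sampled_outputs):
    answer, conf = consistency_confidence(sampled_outputs)
    if conf >= THRESHOLDS[action_type]:
        return ("execute", answer)
    return ("human_review", answer)       # low confidence escalates

# 9 of 10 samples agree (0.9): enough to classify a document,
# not enough to send a letter to a policyholder.
samples = ["surveyor_report"] * 9 + ["discharge_summary"]
```

The same output therefore clears one gate and fails another, which is the point: the threshold tracks the cost of being wrong, not the difficulty of the task.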
The fourth rail is the audit trail. Every tool call, every intermediate reasoning step, every input and output, is logged in an immutable store. This is a regulatory requirement under IRDAI's Information Security Guidelines 2023, which mandate detailed logging and retention of all system actions affecting policyholder data. It is also an operational necessity: when an agent makes a mistake, and it will, the ability to reconstruct exactly what happened is the difference between a fixable incident and a governance crisis.
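One common way to make such a store tamper-evident is hash chaining, sketched below; a production system would also replicate and externally time-stamp entries. This is an illustration of the principle, not any insurer's logging stack.

```python
# Append-only audit-trail sketch: each entry carries the hash of the
# previous one, so altering any earlier record breaks the chain.

import hashlib
import json

class AuditLog:
    def __init__(self):
        self.entries = []

    def append(self, record):
        prev = self.entries[-1]["hash"] if self.entries else "genesis"
        payload = json.dumps(record, sort_keys=True)
        digest = hashlib.sha256((prev + payload).encode()).hexdigest()
        self.entries.append({"record": record, "prev": prev, "hash": digest})

    def verify(self):
        prev = "genesis"
        for e in self.entries:
            payload = json.dumps(e["record"], sort_keys=True)
            expected = hashlib.sha256((prev + payload).encode()).hexdigest()
            if e["prev"] != prev or e["hash"] != expected:
                return False
            prev = e["hash"]
        return True

log = AuditLog()
log.append({"tool": "request_documents", "claim": "CLM-001"})
log.append({"tool": "schedule_survey", "claim": "CLM-001"})
assert log.verify()                              # intact chain verifies
log.entries[0]["record"]["claim"] = "CLM-999"    # tamper with history
```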
Production Use Cases: FNOL, Document Collection, Reserve Recommendation, and Coverage Interpretation
The use cases moving from pilot to production in Indian insurance in 2026 cluster around workflows that are high-volume, rule-intensive, and low-consequence per individual decision.
First notification of loss processing is the most advanced use case. When an insured reports a loss through a call centre, a broker, a self-service portal, or a messaging channel, an intake agent extracts the structured fields required to register a claim: policy number, loss date, loss location, peril, preliminary estimate, contact details. The agent validates the policy number against the administration system, confirms that coverage was in force at the loss date, matches the reported peril against the policy's covered perils, and creates the claim record in the claims management system. Tata AIG's FNOL agent, live for motor claims since December 2025, processes approximately 60% of incoming FNOL without human intervention beyond confirmation. The human handler receives a fully registered claim with a drafted initial response to the insured.
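The validation step between extraction and registration can be sketched as below. The policy data, field names, and rejection reasons are invented for illustration; anything that fails a check routes to the human handler.

```python
# FNOL registration sketch: check the extracted fields against policy data
# before creating the claim record.

from datetime import date

POLICIES = {"MOT-123": {"in_force": (date(2025, 4, 1), date(2026, 3, 31)),
                        "perils": {"accident", "theft", "fire"}}}

def register_fnol(fnol):
    policy = POLICIES.get(fnol["policy_number"])
    if policy is None:
        return ("human_review", "policy not found")
    start, end = policy["in_force"]
    if not start <= fnol["loss_date"] <= end:
        return ("human_review", "coverage not in force at loss date")
    if fnol["peril"] not in policy["perils"]:
        return ("human_review", "reported peril not covered")
    return ("registered", f"claim created for {fnol['policy_number']}")
```

Each rejection carries a reason string so the human handler receives not just an exception but the specific check that failed.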
Document request agents handle the collection phase. Once a claim is registered, the agent determines the document requirements based on claim type, peril, and amount. It drafts and sends the request to the insured through the insurer's standard channel (email, SMS, portal notification, WhatsApp via an IRDAI-compliant Business API integration), tracks responses, sends reminders, and escalates non-responses after a configured period. Bajaj Allianz reports that their deployment has reduced average document collection time on health claims from 11.2 days to 5.8 days.
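The reminder-and-escalation logic is essentially a small state machine; a sketch with illustrative intervals (not Bajaj Allianz's actual configuration):

```python
# Document-chase sketch: reminders at fixed intervals, escalation to a
# human handler after a configured non-response period.

REMINDER_DAYS = [3, 7]   # days after the initial request
ESCALATE_AFTER = 14      # days of silence before human escalation

def next_action(days_since_request, docs_received):
    if docs_received:
        return "close_followup"
    if days_since_request >= ESCALATE_AFTER:
        return "escalate_to_handler"
    if days_since_request in REMINDER_DAYS:
        return "send_reminder"
    return "wait"
```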
Use cases that still require human approval
Reserve recommendation is a more sensitive use case. An agent analyses the loss notice, surveyor report, historical loss patterns for similar claims, and current inflation factors to recommend an initial reserve. The recommendation is not binding. A human reserve committee or designated reserving actuary reviews and approves. Indian insurers have deployed this use case cautiously given its direct impact on technical reserves and financial statements, with most deployments limited to claims below a threshold (commonly INR 25 lakh) and with mandatory human sign-off.
Coverage interpretation agents answer the question 'is this covered' for complex claims. They retrieve the applicable policy wording, identify relevant clauses, apply endorsements, compare against the loss facts, and produce a reasoned interpretation. These agents have not yet been deployed for binding decisions; their output is advisory, and a qualified claims officer makes the final coverage determination. The deployment caution reflects the high reversal cost of a wrong coverage decision and the legal exposure under Section 45 of the Insurance Act 1938 and IRDAI claims adjudication guidelines.
Governance Risks: Unauthorised Actions, Audit Trail Gaps, Model Drift, and Hallucination Cost
The governance risks of agentic AI in insurance are materially different from those of generative AI. A generative model that hallucinates a policy clause produces text that a human reviewer can catch. An agent that hallucinates a policy clause and then acts on that hallucination produces an unauthorised action that may already have affected a policyholder.
Unauthorised actions are the most acute risk. An agent operating outside the insurer's authority matrix, whether due to a tool permissioning error, an adversarial prompt injection, or a model failure, can cause harm that scales with the agent's action scope. The case studies circulating in Indian insurer CIO forums during 2025 include an agent that sent a claim denial letter based on a misread policy wording (reversed within hours, but not before it reached the insured), and an agent that initiated a refund request for a premium dispute without proper authorisation (caught by a rules check, but only narrowly). Every production deployment must assume that unauthorised actions will occur and must have incident response procedures in place.
Audit trail gaps emerge when agents operate across systems that do not share a common logging infrastructure. An agent that reads from the policy administration system, writes to the claims system, and sends an email through a third-party service generates log entries in three separate systems. Reconstructing the full trail of an agent action requires correlating these logs through a shared correlation identifier. IRDAI's Information Security Guidelines 2023 require integrated audit capability for systems processing policyholder data, and the DPDP Act 2023 requires that data principals be able to request records of automated decisions affecting them. Insurers deploying agentic systems are investing in centralised audit infrastructure before they expand agent scope.
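The correlation mechanism is simple but must be disciplined: one identifier minted per agent action and stamped on every log entry, whichever system receives it. A sketch with invented system names:

```python
# Correlation-identifier sketch: one agent action generates entries in three
# systems; a shared correlation_id lets the full trail be rebuilt with a
# single filter over the merged logs.

import uuid

def run_agent_action():
    cid = str(uuid.uuid4())   # minted once per agent action
    logs = [
        {"system": "policy_admin",  "op": "read_policy",   "correlation_id": cid},
        {"system": "claims",        "op": "update_status", "correlation_id": cid},
        {"system": "email_gateway", "op": "send_letter",   "correlation_id": cid},
    ]
    return cid, logs

def reconstruct(all_logs, cid):
    return [e for e in all_logs if e["correlation_id"] == cid]

cid, scattered = run_agent_action()
trail = reconstruct(scattered, cid)
```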
Model drift is a quieter but equally consequential risk. An agent's behaviour depends on its underlying language model and the prompts and tools that shape its reasoning. Model updates from the provider (OpenAI, Anthropic, Google, or local providers using open-weight models) can change agent behaviour without visible warning. A regression in the base model's ability to parse Indian address formats, or a shift in its calibration on low-probability events, can cascade through the agent's decisions. Production deployments maintain regression test suites that run against every model update, with canary deployments that expose a small percentage of traffic to the new model before full rollout.
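The canary routing itself can be as simple as deterministic bucketing, sketched below with an illustrative traffic share: hashing the claim ID means the same claim always hits the same model version, so canary and stable outcomes stay comparable.

```python
# Canary-rollout sketch: route a fixed share of traffic to the candidate
# model, deterministically by claim ID.

import hashlib

CANARY_SHARE = 0.05   # 5% of traffic to the candidate model

def model_version(claim_id):
    bucket = int(hashlib.sha256(claim_id.encode()).hexdigest(), 16) % 100
    return "candidate" if bucket < CANARY_SHARE * 100 else "stable"

routed = [model_version(f"CLM-{i}") for i in range(10_000)]
share = routed.count("candidate") / len(routed)
```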
Hallucination cost, the business impact of model-generated fabrications reaching live actions, is the tail risk that governance committees focus on. Insurers quantify this risk by estimating the maximum possible exposure per action type: a hallucinated document request has a low cost (insured minor inconvenience); a hallucinated policy interpretation has a high cost (potential claim dispute, regulatory complaint, litigation). Agents are deployed for lower-cost actions first, with scope expansion conditional on observed reliability.
IRDAI Posture, DPDP Act Requirements, and Ethical-Use Principles for AI-Assisted Decisions
IRDAI's regulatory posture on AI in insurance has evolved rapidly between 2023 and 2026. The IRDAI Regulatory Sandbox framework, originally established in 2019 and updated in 2024, has become a preferred route for testing novel AI applications including agentic systems. Sandbox participants gain limited regulatory relaxation in exchange for structured reporting of outcomes. At least 14 AI-related applications entered the sandbox in 2025, with outcomes feeding into IRDAI's 2026 draft guidance on AI-assisted underwriting and claims decisions.
The core principles emerging from IRDAI's posture are explainability, accountability, fairness, and data protection. Explainability requires that any AI-assisted decision affecting a policyholder be explicable in plain language, a requirement that creates engineering work for agentic systems where the decision path spans multiple reasoning steps and tool calls. Accountability places responsibility on the insurer's board and senior management, not on the technology provider, meaning that vendor indemnifications do not absolve the insurer of regulatory consequences. Fairness requires that models not discriminate on protected attributes, with particular attention to how training data may encode geographic or occupational biases relevant to Indian underwriting. Data protection aligns with DPDP Act 2023 requirements on consent, purpose limitation, and data principal rights.
The DPDP Act 2023 creates specific obligations for agentic systems. Section 8 requires that processing be limited to the purpose for which consent was obtained; an agent that reads a policyholder's claim data for fraud detection cannot repurpose that data for marketing without fresh consent. Section 11 grants data principals the right to access and correct their data, which extends to data used by agents in automated decisions. The Data Protection Board of India can impose penalties up to INR 250 crore for breaches, providing material financial motivation for proper agentic system governance.
IRDAI's Information Security Guidelines 2023 are operationally more prescriptive. They require risk assessments for AI systems, detailed access controls, immutable audit logs, incident response procedures, and periodic third-party security reviews. Insurers deploying agentic AI must document the agent's scope, the tools it can invoke, the data it accesses, and the controls preventing misuse. The Guidelines also require that critical systems undergo a formal change management process, which has slowed agentic deployment timelines at conservative insurers but has also reduced the incidence of uncontrolled deployments reaching production.
Implementation Economics, Vendor Market, and Build versus Buy Decisions
The economics of agentic AI deployment in Indian insurance are shaped by three cost centres:
- the underlying language model infrastructure
- the integration work to connect the agent to the insurer's systems
- the governance overhead of operating an agent in production
Pilot costs for an agentic use case in a mid-sized Indian insurer range from INR 2 to 10 crore depending on scope. A narrow pilot covering a single workflow on a single line of business with off-the-shelf agent frameworks and API-based model access falls at the lower end. A broader pilot covering multiple workflows with custom agent logic, on-premises model hosting (required in certain cases to satisfy data residency expectations), and integration into legacy policy administration systems reaches the upper end. The dominant cost component is usually integration engineering, not model access.
Production deployment economics shift the cost mix. Annual operating costs include inference compute (for open-weight models hosted internally, roughly INR 2 to 5 per transaction; for API-based models, INR 5 to 20 per transaction depending on complexity), human-in-the-loop review effort, ongoing model evaluation and regression testing, and governance committee time. An agent processing 10,000 FNOLs per month at an average inference cost of INR 15 per transaction incurs roughly INR 18 lakh per year in model expense alone, against a manual handling cost of approximately INR 4 to 6 crore for the same volume.
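The inference line item works through as follows; the INR 15 figure is the mid-point of the API-based range quoted above, used purely for illustration.

```python
# Worked inference-cost arithmetic for the FNOL example; all figures are
# illustrative, not any insurer's actuals.

fnols_per_month = 10_000
inference_cost_per_txn_inr = 15        # mid-point of the INR 5-20 API range

annual_txns = fnols_per_month * 12
annual_inference_inr = annual_txns * inference_cost_per_txn_inr

assert annual_txns == 120_000
assert annual_inference_inr == 1_800_000   # INR 18 lakh per year
```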
The vendor market has bifurcated. Global providers (Microsoft with Azure AI Foundry, Google with Vertex AI Agent Builder, Amazon with Bedrock Agents, Anthropic with Claude and the Model Context Protocol, OpenAI with its Assistants and Agents APIs) offer general-purpose agentic frameworks that insurers adapt to their workflows. Indian providers (Gnani.ai, Sarvada Intelligence for commercial underwriting, Smartdocs, Riskcovry for distribution workflows, and several emerging specialists) offer insurance-specific agents with pre-built integrations for Indian policy administration platforms and compliance with IRDAI data residency expectations.
The build versus buy decision hinges on three factors:
- Strategic differentiation of the workflow: claims handling speed may be a competitive differentiator that justifies building, while document collection is a commodity workflow that justifies buying.
- Availability of internal engineering talent: Indian insurers with an established data science function and cloud engineering team can absorb agent development, while smaller carriers benefit from outsourcing.
- Regulatory risk appetite: buying from an established vendor with audited compliance may be preferable to building in-house where the insurer bears full compliance responsibility.
Most large Indian insurers in 2026 are pursuing a hybrid strategy, buying for horizontal workflows (email triage, document classification) and building for core underwriting and claims reasoning.
The 2026 Deployment Curve: Where Indian Insurers Are Actually Placing Agents in Production
By April 2026, agentic AI deployment in Indian insurance has moved past the proof-of-concept phase for a handful of use cases and remains experimental for others. A realistic snapshot of the deployment curve helps underwriters, risk managers, and brokers calibrate their expectations.
FNOL intake and registration agents are now in production at all top-10 general insurers for at least one line of business. The workflows are well-scoped, the decisions are reversible, and the savings are quantifiable. Document request and tracking agents are similarly well-established. These are the low-risk, high-volume entry points that have demonstrated the operational case for agentic AI.
Survey and investigation coordination agents are in expanded pilot at a subset of insurers, including HDFC Ergo and Go Digit. These agents assign surveyors based on geography, specialisation, and load, schedule inspections with the insured, and track report submission. The scope involves external parties (panel surveyors, investigators) which introduces coordination complexity but does not involve direct decisions on policyholder outcomes.
Reserve recommendation agents with human approval are in production at three or four large insurers for claims below defined thresholds. Coverage interpretation agents are in advisory production, meaning the agent's output is displayed to the claims officer but the officer makes the decision.
Underwriting agents, those that score risks, recommend pricing, or prepare quotes, are largely at pilot stage. ICICI Lombard's SME co-pilot and Sarvada Intelligence's commercial underwriting agents are among the most advanced deployments, operating with human underwriter approval on the final bind decision. The caution reflects both the commercial significance of underwriting decisions and the regulatory expectation that pricing decisions be explicable and defensible.
Binding and policy issuance agents are not in production for commercial insurance. The combination of regulatory scrutiny, commercial risk, and the complexity of policy wording production keeps these workflows in human hands for now. The trajectory of the next 18 to 24 months is likely to see agents take on progressively more of the upstream workflow, with humans retaining the authority to bind, to deny, and to settle, until confidence in autonomous systems and regulatory acceptance catch up with the underlying capability.