From Business Continuity to Operational Resilience: What Actually Changes
Most large Indian corporates have a business continuity management (BCM) programme. Many are certified to ISO 22301, run an annual business impact analysis, maintain recovery plans and test them periodically. Operational resilience is not a rebadging of that work; it is a different lens that changes what the firm measures, what it commits to, and how the board assures itself. The shift is from a plan-centric question (do we have a recovery plan for each disruption?) to an outcome-centric question (can the firm continue to deliver the services that matter, within tolerable limits, through severe but plausible disruption, however it arises?).
The distinction has three practical consequences. First, the unit of analysis changes. Classic BCM organises around assets, sites and processes; operational resilience organises around important business services, the outputs the firm delivers to customers and to the market that, if disrupted, cause intolerable harm. A manufacturer's important business service might be the ability to fulfil orders for a critical product line; a logistics firm's might be the ability to move temperature-controlled cargo; a payments-adjacent fintech's might be the ability to settle transactions. The firm reasons about resilience in terms of these services, then works back to the processes, people, technology, facilities and third parties they depend on.
Second, the firm sets impact tolerances. An impact tolerance is the maximum tolerable level of disruption to an important business service, expressed in terms a board can own: maximum acceptable downtime, maximum data loss, maximum number of customers affected, maximum financial loss, before the harm becomes intolerable. This is a commitment, not an aspiration. The firm is asserting that it can remain within the tolerance through a range of severe scenarios, and it must be able to evidence that assertion.
Third, the test is severe-but-plausible disruption, regardless of cause. Classic BCM often plans by cause (a fire plan, a flood plan, an IT plan). Operational resilience assumes that disruption will happen and asks whether the service stays within tolerance whatever the trigger. This cause-agnostic stance matters because the most damaging disruptions are frequently the ones the cause-specific plans did not anticipate.
The Indian regulatory backdrop reinforces this. The Reserve Bank of India and SEBI have progressively raised expectations on operational resilience and operational risk for regulated entities, and the digital-personal-data and cyber regimes have made data and system continuity a board-level concern. The Companies Act, 2013 risk-governance provisions and the SEBI requirement for a risk management committee at larger listed entities create an expectation that the board can speak to resilience, not merely to compliance. For unregulated corporates the driver is commercial: customers, lenders and rating agencies increasingly probe resilience, and a severe outage that breaches an unstated tolerance is a reputational and financial event.
The insurance connection runs throughout. Operational resilience identifies where the firm cannot stay within tolerance on its own; some of those gaps are addressed by hardening and redundancy, and some are addressed by risk transfer. Business interruption, contingent business interruption and cyber cover are the principal transfer instruments, and the resilience analysis is what tells the firm how much cover, on what trigger, for what indemnity period, it actually needs. This post sets out the framework: defining important business services, setting impact tolerances, mapping dependencies, scenario testing, board assurance, and the insurance mapping that closes the loop.
Defining Important Business Services
The foundational step, and the one most firms get wrong, is identifying important business services correctly. Done well it focuses the whole programme; done badly it produces either a list so long it is meaningless or a list so abstract it cannot be tested.
What qualifies as an important business service
An important business service is an outward-facing output: something the firm delivers to a customer, a counterparty or the market, whose disruption causes intolerable harm. The test is harm on disruption, not internal importance. Payroll is internally important, but a short disruption to it rarely causes intolerable external harm; by contrast, disruption to the ability to deliver a critical customer order on contractual terms may cause intolerable harm, because it triggers penalties, lost customers and reputational damage. The discipline is to phrase each service from the outside in: what does the customer or the market receive, and what happens to them and to the firm if they stop receiving it?
The common error is to confuse business services with business functions or systems. "The SAP ERP system" is not a business service; it is a dependency of several services. "The Chennai plant" is not a business service; it is a resource that several services rely on. Listing systems and sites as services collapses the resilience analysis back into asset-based BCM and loses the outcome lens.
Keeping the list short and material
A large diversified corporate will typically identify a small number of important business services per business unit, not dozens. The objective is materiality: the services whose disruption genuinely matters at the enterprise level. A useful filter is to ask whether disruption to the service would warrant board attention, draw external scrutiny, breach material contracts, or threaten the firm's standing. Services that clear that bar are in scope; services that do not are managed through ordinary BCM without the full resilience treatment.
Sector illustrations
For a chemicals or pharmaceuticals manufacturer, important business services often centre on the continuous supply of regulated or contracted products where customers have no easy substitute and where supply interruption triggers regulatory and contractual consequences. For a logistics or cold-chain operator, the service is the reliable movement and storage of cargo within specification, where a temperature excursion or a network outage destroys value and breaches customer commitments. For an IT or business-process services firm, the service is the delivery of contracted client processing within service levels, where an outage breaches SLAs and erodes client trust. For a power or infrastructure operator, the service is availability of the asset to the grid or the user within regulatory and contractual obligations. Each firm should derive its own list from its own contracts, customers and obligations rather than borrow another firm's.
Mapping each service to its resources
Once identified, each important business service is mapped to the resources that deliver it: the people and skills, the processes, the technology and data, the facilities, and the third parties and suppliers. This mapping is the bridge from the outcome lens to the operational reality and is the input to dependency mapping and scenario testing. A service that depends on a single plant, a single key supplier, a single data centre or a single critical application has concentration that the mapping must surface explicitly. The act of mapping frequently reveals dependencies the firm did not realise were single points of failure, which is precisely its value.
Setting Impact Tolerances the Board Can Own
The impact tolerance is the heart of the operational-resilience framework and the element that most distinguishes it from BCM. It converts a vague commitment to continuity into a specific, board-owned limit that can be tested and assured.
What an impact tolerance expresses
For each important business service, the impact tolerance states the maximum level of disruption the firm judges tolerable before the harm becomes unacceptable. It should be expressed in concrete metrics the board understands and can defend: maximum tolerable downtime (the service can be unavailable for no longer than a stated period), maximum data loss (no more than a stated recovery point), maximum customers or transactions affected, and maximum financial loss. Different services warrant different metrics; a settlement service is dominated by time and data-loss tolerance, while a manufacturing-fulfilment service may be dominated by duration and contractual-penalty exposure.
The tolerance is set from the outside in, anchored on the harm to customers and the market, not on what the firm's current recovery capability happens to be. This is a deliberate inversion. A firm that sets its tolerance equal to its current recovery time merely ratifies the status quo; a firm that sets its tolerance based on tolerable external harm and then discovers its recovery capability does not meet it has found a real gap to close. Setting tolerances honestly, ahead of capability, is what makes the framework useful.
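As a minimal sketch of how a tolerance and a capability assessment can be compared metric by metric, the Python below records a tolerance and an assessed outcome and lists the metrics on which the outcome exceeds the limit. The class names, field names and figures are illustrative assumptions, not a prescribed schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImpactTolerance:
    """Board-approved limits for one important business service (illustrative fields)."""
    service: str
    max_downtime_hours: float
    max_data_loss_minutes: float
    max_customers_affected: int
    max_financial_loss_inr: float


@dataclass(frozen=True)
class DisruptionOutcome:
    """What a test or capability assessment says would actually happen."""
    downtime_hours: float
    data_loss_minutes: float
    customers_affected: int
    financial_loss_inr: float


def breached_metrics(tolerance: ImpactTolerance, outcome: DisruptionOutcome) -> list[str]:
    """Return the names of the metrics on which the outcome exceeds the tolerance."""
    checks = {
        "downtime": outcome.downtime_hours > tolerance.max_downtime_hours,
        "data loss": outcome.data_loss_minutes > tolerance.max_data_loss_minutes,
        "customers affected": outcome.customers_affected > tolerance.max_customers_affected,
        "financial loss": outcome.financial_loss_inr > tolerance.max_financial_loss_inr,
    }
    return [name for name, breached in checks.items() if breached]


# Hypothetical figures: tolerance set from tolerable external harm,
# outcome taken from an assessment of current recovery capability.
settlement = ImpactTolerance("transaction settlement", 4, 15, 10_000, 50_000_000)
current_capability = DisruptionOutcome(9, 5, 2_500, 20_000_000)

print(breached_metrics(settlement, current_capability))
```

In this made-up case the only metric outside tolerance is downtime, which is the kind of gap that appears when the tolerance is set ahead of capability rather than copied from it.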
Severe but plausible, not worst case
The tolerance is tested against severe-but-plausible scenarios, not against the literal worst conceivable event. An impact tolerance is not a promise that the service will never breach under any circumstance; a sufficiently extreme catastrophe can defeat any firm. The commitment is that, across the range of severe-but-plausible disruptions, the firm remains within tolerance. Calibrating what is severe-but-plausible is a matter of judgement informed by the firm's history, its geography, its sector and the threat environment, and it should be documented so the board understands what it is and is not committing to.
Board ownership and governance
Impact tolerances must be approved at board or risk-committee level, because they are statements about how much harm the firm is prepared to risk to its customers and stakeholders, which is a governance judgement, not an operational one. The board approving a tolerance is accepting accountability for it. This is why the metrics must be ones the board can understand and own; a tolerance expressed only in technical recovery-time-objective language that the board cannot interpret has not really been owned.
Linking tolerances to risk appetite
Impact tolerances should sit inside the firm's broader risk-appetite framework, not float separately. The risk-appetite statement expresses how much risk the firm is willing to accept across categories; the impact tolerances express, for the most important services, the specific operational limits that operationalise that appetite. A firm with a low appetite for customer harm and reputational damage should have correspondingly tight tolerances on its customer-facing important business services, and the two should be visibly consistent. Where they are inconsistent, one of them is wrong, and surfacing that inconsistency is part of the value of the exercise.
Mapping Dependencies and Surfacing Concentration
An impact tolerance is only credible if the firm understands what the important business service actually depends on and where those dependencies are fragile. Dependency mapping is the analytical core that connects the outcome (the service) to the things that can break it.
The dependency dimensions
For each important business service the firm maps the full chain of resources required to deliver it. People and skills: the roles and the specific individuals whose absence would degrade the service, and the depth of cover behind them. Processes: the operational and control processes the service runs on. Technology and data: the applications, infrastructure, data stores and connectivity the service relies on, and the recovery characteristics of each. Facilities: the sites, plants, warehouses and offices the service occupies. Third parties: the suppliers, outsourced providers, cloud and connectivity vendors, utilities and logistics partners the service cannot run without. The map should trace each dimension end to end, including the dependencies of the dependencies, because a service's resilience is only as strong as the weakest link two or three layers down.
Surfacing single points of failure and concentration
The purpose of the map is to surface where the service has no redundancy. A single plant that fulfils a critical order line, a single supplier of a sole-sourced input, a single data centre hosting a critical application, a single cloud region, a single key person with undocumented knowledge: each is a single point of failure whose disruption pushes the service toward or past its impact tolerance. Concentration is the broader version of the same problem: several important business services depending on the same plant, the same supplier, the same provider or the same geography, so that one event disrupts many services at once. Indian corporates with operations clustered in particular industrial corridors or dependent on a small number of large suppliers frequently carry concentration that only surfaces when the map is drawn.
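The two checks described above can be sketched in a few lines of Python. The sketch assumes the dependency map is held as a simple nested dictionary (service, then need, then the resources able to meet that need). The structure, service names and resource names are hypothetical.

```python
from collections import defaultdict

# service -> need -> resources that can each satisfy the need (hypothetical data)
DEPENDENCY_MAP = {
    "critical order fulfilment": {
        "production site": ["Plant A"],
        "key input": ["Supplier X", "Supplier Y"],
        "order application": ["Data centre 1"],
    },
    "cold-chain delivery": {
        "warehouse": ["Warehouse North", "Warehouse South"],
        "tracking application": ["Data centre 1"],
        "refrigerant supply": ["Supplier X"],
    },
}


def single_points_of_failure(dep_map):
    """Needs that only one resource can satisfy, as (service, need, resource)."""
    return [
        (service, need, resources[0])
        for service, needs in dep_map.items()
        for need, resources in needs.items()
        if len(resources) == 1
    ]


def concentrations(dep_map):
    """Resources that more than one service depends on, mapped to those services."""
    users = defaultdict(set)
    for service, needs in dep_map.items():
        for resources in needs.values():
            for resource in resources:
                users[resource].add(service)
    return {res: sorted(svcs) for res, svcs in users.items() if len(svcs) > 1}


for item in single_points_of_failure(DEPENDENCY_MAP):
    print("single point of failure:", item)
print("concentration:", concentrations(DEPENDENCY_MAP))
```

Here "Data centre 1" shows up both as a single point of failure for each service and as a concentration across the two, which is the pattern that tends to stay hidden until the map is drawn. A real map would also need to follow dependencies of dependencies, which this flat structure does not attempt.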
Third-party and supply-chain dependency
Third-party dependency deserves specific attention because it is the dimension over which the firm has least direct control and the one most often underestimated. A service that depends on an outsourced provider inherits that provider's resilience, or lack of it, and the firm's impact tolerance is only meaningful if the provider can support it. The firm should understand the provider's own continuity capability, its concentration (does the provider in turn depend on a single sub-provider the firm shares with competitors?), and the contractual rights the firm has to continuity, recovery commitments and exit. Supply-chain mapping for physical inputs follows the same logic: a sole-sourced critical input, or a tier-two supplier that several tier-one suppliers all depend on, is a concentration the resilience analysis must surface and the procurement and risk functions must address.
From map to action
The dependency map drives two outputs. The first is remediation: where the firm can reasonably remove a single point of failure or reduce concentration (a second supplier, a second site, documented key-person cover, geographic diversification, a recoverable architecture), it should, and the resilience programme should carry those actions with owners and dates. The second is the residual: where a dependency cannot be economically removed, the firm is left with residual exposure that must be managed through recovery capability and, where appropriate, risk transfer. The map tells the firm precisely which residual exposures matter, which is the input to the scenario testing and the insurance mapping that follow.
Scenario Testing and Board Assurance
A framework of services, tolerances and maps is theoretical until it is tested against disruption. Scenario testing is how the firm proves, or disproves, that it can stay within its impact tolerances, and it is the evidence the board relies on to assure itself and external stakeholders.
Designing severe-but-plausible scenarios
Scenario testing exercises each important business service against severe-but-plausible disruptions, designed to stress the dependencies the map identified as fragile. Good scenarios are specific and uncomfortable: the loss of the single plant that fulfils a critical order line for an extended period; a ransomware event that encrypts the application hosting a critical service and forces a recovery from backups; the failure of a sole-sourced supplier; the loss of a primary data centre or cloud region; a flood or cyclone affecting a concentration of operations in one geography. The scenarios should be cause-agnostic in spirit (the firm cares about the disruption to the service, not the label of the cause) but concrete enough to test real recovery actions.
For each scenario the firm walks through the response: detection, decision-making, recovery actions, workarounds, communication, and the time and data-loss consequences for the service. The key question is whether the service stays within its impact tolerance through the scenario. Where it does, the firm has evidence of resilience for that scenario. Where it does not, the firm has found a resilience gap, which is the entire point.
Tabletop, simulation and live testing
Testing ranges in rigour. Tabletop exercises walk a team through a scenario in discussion, surfacing decision gaps and assumptions cheaply. Simulations exercise actual recovery mechanisms (failover, restore-from-backup, supplier substitution) to test whether they work as designed and how long they take. Live or near-live testing actually invokes recovery in a controlled way and is the most convincing but the most disruptive. A mature programme uses a mix, reserving live testing for the highest-impact services and the most critical recovery mechanisms, and running tabletops more frequently and more broadly. The cadence should ensure that every important business service is tested against relevant severe scenarios on a defined cycle, and that lessons feed back into remediation.
Closing the gaps
Testing produces a gap register: for each service, the scenarios under which it breaches tolerance and the reasons. Each gap is then dispositioned. Some are closed by remediation (faster recovery, added redundancy, removed single points of failure, better-documented procedures). Some are closed by risk transfer (where the residual financial impact of a within-scope disruption can be insured). Some are accepted, explicitly and at the right level, where neither remediation nor transfer is economic and the residual is judged tolerable. The discipline is that no gap is left undispositioned; every breach found in testing is either fixed, transferred or consciously accepted by an accountable owner.
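A gap register with the "no gap left undispositioned" rule can be sketched as follows. This is an illustrative Python structure under assumed field names, not a reference to any particular GRC tool; the entries and owners are invented.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Disposition(Enum):
    REMEDIATE = "remediate"
    TRANSFER = "transfer"
    ACCEPT = "accept"


@dataclass
class Gap:
    """One tolerance breach found in testing (illustrative fields)."""
    service: str
    scenario: str
    reason: str
    disposition: Optional[Disposition] = None
    owner: Optional[str] = None


def open_items(register):
    """Gaps that have no disposition yet, or no accountable owner."""
    return [g for g in register if g.disposition is None or not g.owner]


# Hypothetical register entries
register = [
    Gap("critical order fulfilment", "loss of Plant A", "no alternate site",
        Disposition.REMEDIATE, "Head of Operations"),
    Gap("critical order fulfilment", "sole supplier failure", "no second source",
        Disposition.TRANSFER, "Head of Risk"),
    Gap("transaction settlement", "ransomware", "restore exceeds tolerance"),
    Gap("cold-chain delivery", "regional flood", "sites clustered",
        Disposition.ACCEPT, None),
]

for gap in open_items(register):
    print("needs attention:", gap.service, "/", gap.scenario)
```

The check flags both the untouched ransomware gap and the accepted flood gap, because an acceptance with no named owner has not really been accepted by anyone.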
Board assurance
The board's assurance question is simple to state and hard to answer well: can the firm stay within its impact tolerances for its important business services through severe but plausible disruption, and where can it not? The resilience programme should report to the board, through the risk committee, a clear answer: the important business services, their impact tolerances, the testing performed, the services currently within and outside tolerance, the remediation in flight with timelines, and the residual exposures accepted or transferred. This reporting is qualitatively different from a BCM status update; it speaks in the language of outcomes and tolerances that the board owns. A board that receives this reporting can assure itself, its regulators, its lenders and its major customers with evidence rather than assertion.
Mapping Insurance to the Resilience Gaps
Operational resilience and insurance are usually run by different teams, which is why the connection between them is so often missed. The resilience analysis identifies exactly where the firm cannot stay within tolerance on its own, and some of those gaps are financial exposures that insurance is designed to absorb. Mapping the two together produces both better resilience and better-specified insurance.
Business interruption and the indemnity period
The primary instrument is business interruption (BI) cover, typically attached to the property or fire policy, which responds to the loss of gross profit and the increased cost of working following an insured physical damage event. The resilience analysis directly informs two BI parameters that firms routinely get wrong. The first is the indemnity period: the maximum period for which the BI policy will pay. The dependency map and scenario testing reveal how long it would actually take the important business service to recover from a severe physical-damage scenario, including the time to rebuild a plant, re-source equipment, re-qualify a product or rebuild a customer base. If the realistic recovery time exceeds the chosen indemnity period, the firm has a resilience-driven underinsurance: the policy stops paying while the business is still impaired. The second is the sum insured for gross profit, which must reflect the value the service actually generates and the trend over the indemnity period, not a stale figure. The resilience work gives the BI placement a defensible, evidence-based specification rather than a renewal-by-rote number.
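The indemnity-period check reduces to simple arithmetic. The sketch below is an illustration only: it assumes gross profit is lost at a flat monthly rate for the whole recovery, which real recoveries rarely follow, and the function name and figures are hypothetical.

```python
def indemnity_shortfall(recovery_months: float,
                        indemnity_months: float,
                        monthly_gross_profit: float) -> tuple[float, float]:
    """Return (months of impairment beyond the indemnity period,
    gross profit exposed in those months).

    Simplifying assumption: gross profit is lost at a flat monthly rate
    until recovery is complete.
    """
    shortfall_months = max(0.0, recovery_months - indemnity_months)
    return shortfall_months, shortfall_months * monthly_gross_profit


# Hypothetical case: testing suggests 18 months to rebuild and re-qualify,
# while the policy carries a 12-month indemnity period.
months, exposed = indemnity_shortfall(
    recovery_months=18, indemnity_months=12, monthly_gross_profit=40_000_000
)
print(f"{months} months uninsured, roughly INR {exposed:,.0f} exposed")
```

Even a rough figure like this turns the indemnity-period choice into a decision the firm can defend, rather than a carry-over from the previous renewal.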
Contingent business interruption for third-party dependency
The dependency map's most important insurance insight is usually about third parties. Standard BI responds to damage at the insured's own premises; it does not respond when the disruption originates at a supplier, a customer or a utility on which the important business service depends. Contingent business interruption (CBI), and the related extensions for supplier and customer premises, utilities and denial of access, are the instruments that close that gap. The resilience analysis identifies precisely which third-party dependencies are material and concentrated, and therefore which CBI extensions the firm needs, against which named suppliers or on what basis, and for what sub-limit. A firm that has mapped a sole-sourced critical supplier and quantified the impact of its loss can specify CBI cover that actually matches the exposure, rather than buying a generic extension with a sub-limit unrelated to the real risk.
Cyber cover for the technology and data dimension
The technology-and-data dependency dimension maps to cyber insurance, which can respond to business interruption from a cyber event (system outage, ransomware) where conventional BI, tied to physical damage, does not. The resilience scenarios in which ransomware encrypts a critical application, or a cloud-hosted service is lost, are precisely the ones cyber BI cover addresses. The resilience analysis informs the cyber cover's business-interruption sub-limit and waiting period by reference to the impact tolerance: if the important business service's tolerance is a short downtime but realistic recovery from a cyber event is longer, the financial gap during recovery is what the cyber BI cover should be sized to absorb. Cyber cover also responds to the data, liability and response-cost consequences that the resilience analysis surfaces as part of a cyber scenario.
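A rough way to see how waiting period and sub-limit interact with realistic recovery time is sketched below. It assumes, purely for illustration, that the waiting period works as a time deductible (loss during those hours is retained) and that loss accrues at a flat hourly rate. Actual policy wordings differ on both points, and the function name and numbers are hypothetical.

```python
def cyber_bi_position(recovery_hours: float,
                      waiting_period_hours: float,
                      hourly_loss: float,
                      sub_limit: float) -> dict:
    """Split an estimated cyber outage loss into insured and retained parts.

    Assumptions (illustrative only): the waiting period is a time deductible,
    and loss accrues at a flat hourly rate until the service is recovered.
    """
    total_loss = recovery_hours * hourly_loss
    insurable_hours = max(0, recovery_hours - waiting_period_hours)
    insured = min(insurable_hours * hourly_loss, sub_limit)
    return {"total": total_loss, "insured": insured, "retained": total_loss - insured}


# Hypothetical scenario: 72 hours to restore from backups, 12-hour waiting
# period, INR 500,000 lost per hour, INR 25,000,000 BI sub-limit.
position = cyber_bi_position(72, 12, 500_000, 25_000_000)
print(position)
```

In this example the sub-limit, not the waiting period, drives most of the retained loss, which is the kind of finding that should feed back into how the cover is sized.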
Closing the loop: from gap register to programme design
The mapping discipline is to take the gap register from scenario testing and ask, for each residual exposure that is financial in nature, whether and how insurance responds. Some gaps map cleanly to BI, CBI or cyber cover and inform the sum insured, indemnity period, sub-limit, trigger and waiting period for those covers. Some gaps reveal that the existing programme would not respond at all, because the trigger does not match the scenario (for example, a supply disruption with no physical damage, to which neither standard BI nor a programme without suitable extensions responds). Surfacing a non-responding exposure is as valuable as sizing a responding one, because it forces a conscious decision: extend the cover, find an alternative transfer, or accept the residual. The result is an insurance programme specified by the firm's actual resilience gaps rather than by last year's renewal.
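The trigger-matching question can be sketched as a small lookup. The rules below are a deliberate simplification of the distinctions drawn in this post (physical damage at own premises, physical damage at a third party, cyber event) and are an assumption for illustration; real wordings vary by insurer and need to be read clause by clause. The class, function and scenario names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Scenario:
    name: str
    physical_damage: bool
    at_third_party: bool
    cyber_event: bool


def responding_cover(scenario: Scenario, covers_held: set[str]) -> Optional[str]:
    """Return the cover whose trigger matches the scenario, or None if the
    programme as held would not respond. Simplified rules for illustration only."""
    if scenario.cyber_event:
        candidate = "cyber"
    elif scenario.physical_damage:
        candidate = "CBI" if scenario.at_third_party else "BI"
    else:
        candidate = None  # no physical damage and no cyber event: no trigger here
    return candidate if candidate in covers_held else None


covers_held = {"BI", "cyber"}  # hypothetical programme with no CBI extension
scenarios = [
    Scenario("fire at own plant", True, False, False),
    Scenario("fire at sole supplier", True, True, False),
    Scenario("ransomware on order application", False, False, True),
    Scenario("supplier insolvency", False, True, False),
]

for s in scenarios:
    print(f"{s.name}: {responding_cover(s, covers_held) or 'no cover responds'}")
```

The two scenarios with no responding cover fail for different reasons: one needs an extension the programme lacks, the other has no matching trigger at all. Each then needs its own extend, transfer elsewhere or accept decision.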
Doing this well requires the firm and its broker to compare what each insurer's wording actually grants against the resilience scenarios: which triggers apply, which extensions are present, what the sub-limits and indemnity periods are, and what is excluded. Sarvada gives commercial insurance brokers structured, searchable access to insurer policy wordings, so the broker can compare BI, contingent BI and cyber triggers, grants, sub-limits and exclusions across the market and place cover that matches the corporate's mapped resilience gaps rather than a generic specification. Request Access to evaluate how structured wording comparison supports resilience-driven programme design.

