What’s the chance that
- an earthquake at sea could knock out primary power and generate a tsunami which would also knock out backup generators for nuclear power plant emergency cooling equipment (1 in 40 yrs),
- an overextended speculative market segment would collapse and cause widespread ruin that would take down both equity and bond markets and force 100s of financial institutions to go under (1 in 77 yrs),
- a hurricane would destroy flood barriers, and the resulting flood would swamp your home, your office and the place you store your backups (?)
All of these represent correlated risks that, prior to the actual event, were deemed very improbable. But highly improbable doesn’t mean it will never happen.
Correlated risk defined
A correlated risk is the risk of a subsequent disaster or event occurring after a primary event or catastrophe has occurred. In the case of natural disasters, any event generated as a consequence of an originating event’s occurrence is a correlated event and, as such, has a correlated risk.
I once worked for a major company that kept its disaster recovery backups in a basement, underground, on the same campus as its headquarters. This seemed risky, as any event that took out the campus could also damage this basement and all the backup tapes in it.
How to understand your correlated risk
It seems to me to be pretty straightforward to understand correlated risk within the framework of a business continuity or disaster recovery plan (BC or DR plan). One lists in one column all possible primary accidents, calamities, disasters, etc., man-made or natural, and in another column the other possible accidents, calamities, disasters, etc. that could be generated by each primary event.
One then recurses on this process to generate all possible correlated events associated with the primary or previous correlated event until you exhaust all possible chains of catastrophes associated with the primary disaster. Then in a third column, list the potential scope (distance or area impacted) and outcomes (what damage could be expected) of all those activities in the first two columns. In a fourth column, one lists the best guess probability of the events and/or the correlated event(s) occurring.
In the end, you should have an exhaustive list of things you should be preparing for. Now one ranks the events by probability and tackles them from highest to lowest. There is some cutoff point that everyone reaches depending on their risk tolerance: at some point, dealing with the multiple disasters that could potentially occur becomes too costly. But it all depends on risk tolerance. For instance, a nuclear plant can probably tolerate far less risk than your average corporate environment, and so needs to prepare for much less probable events.
With that in place you have a start on a BC and/or DR plan. Now all you need to determine is your risk tolerance level and how to handle primary and correlated risks that fall within that level.
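The column-and-recursion exercise described above can be sketched in a few lines of Python. This is only a rough illustration: the `Event` class, the `chains` helper and every probability below are made-up placeholders, and it assumes that the chance of a whole chain is simply the product of each step’s probability.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    name: str
    # For a primary event: assumed probability per year.
    # For a correlated event: assumed probability that it follows its parent.
    # All numbers in this sketch are invented for illustration.
    probability: float
    scope: str
    consequences: list = field(default_factory=list)


def chains(event, prefix=(), joint=1.0):
    """Yield (chain of event names, scope of last event, joint probability)
    for the event itself and, recursively, every chain it can trigger."""
    path = prefix + (event.name,)
    p = joint * event.probability
    yield path, event.scope, p
    for follow_on in event.consequences:
        yield from chains(follow_on, path, p)


primaries = [
    Event("building fire", 0.02, "building"),
    Event("wildfire", 0.01, "several miles", [
        Event("loss of power/transport/comms", 0.5, "region"),
    ]),
    Event("flood", 0.01, "flood plain", [
        Event("loss of power/transport/comms", 0.8, "region"),
    ]),
]

# One row per chain, ranked from most to least probable.
rows = [row for primary in primaries for row in chains(primary)]
rows.sort(key=lambda row: row[2], reverse=True)

for path, scope, p in rows:
    print(f"{p:.4f}  {' -> '.join(path)}  [{scope}]")
```

Working down the printed list from the top and stopping at your cutoff gives the set of chains the plan has to cover.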
A correlated risk analysis
Take Silverton Consulting as an example. I take daily incremental backups stored on a local hard disk, take weekly “partial backups” (critical business files only) to removable media that is also stored locally in the office, and take monthly full backups stored in a safety deposit box located in a vault in the basement of a bank within five miles of the office.
If I just look at natural events:
- My first and most likely natural event is building fire – in this case the scope of the event would be limited to the building, which would take out both the local hard disk incrementals and weekly partial backups but the safety deposit box of monthly fulls would still be accessible.
- A possible correlated event, as well as another primary event, could be wildfire – in this case, potentially both the office and the bank could be consumed and all backups would be lost. The fact that the bank is 5 miles away, has its own fire suppression system, and has my backups located in its basement just reduces the probability of a wildfire impacting both locations but doesn’t eliminate it.
- Another possible correlated event to any wildfire would be loss of power, transport, and communication services – the fact that the bank is only 5 miles away indicates that if the primary office loses these services, it’s highly probable that the bank would lose them as well. Access to the bank vault backups under these circumstances would be delayed at best, at least until such services could be restored. Had I been using a cloud provider backup service (which I am considering), I couldn’t access my data until communication services were restored or until I had moved far enough away to regain access to these services. With the roads and other transport out, this would take some time.
- Next most likely natural event is flood. Our location is within a 100-year flood plain, so a serious flood that would take out the office can be expected about once every 100 yrs. I would like to say that our bank is outside our flood plain, but I just don’t know yet. But I promise to find out.
- A correlated event to a flood is a loss of power, transport and communication services. The scope and consequences of this catastrophe are similar to that discussed above.
- Next most likely natural event is tornado, …
- Next most likely natural event is earthquake, …
- Next most likely natural event is volcano eruption, …
… and the list goes on. Of course these are just natural disasters, one would need to consider man-made catastrophes as well.
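The scope reasoning in the list above (which copies survive an event of a given reach) can be checked mechanically. Here is a minimal sketch, assuming each disaster is a circle of damage centered on the office. The `BackupCopy` and `Disaster` names and both radii are hypothetical, and the bank is placed at the full 5 miles even though it is only known to be within that distance.

```python
from dataclasses import dataclass


@dataclass
class BackupCopy:
    name: str
    miles_from_office: float


@dataclass
class Disaster:
    name: str
    radius_miles: float  # assumed reach of the damage, centered on the office


COPIES = [
    BackupCopy("daily incrementals (local hard disk)", 0.0),
    BackupCopy("weekly partials (removable media in office)", 0.0),
    BackupCopy("monthly fulls (bank safety deposit box)", 5.0),
]


def surviving(copies, disaster):
    """Return the names of the copies that lie outside the disaster's reach."""
    return [c.name for c in copies if c.miles_from_office > disaster.radius_miles]


for disaster in (Disaster("building fire", 0.1), Disaster("wildfire", 10.0)):
    left = surviving(COPIES, disaster)
    print(f"{disaster.name}: {', '.join(left) if left else 'no backups survive'}")
```

A circle is a crude stand-in for a real fire or flood footprint, but even this much makes it obvious when every copy sits inside the same scope.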
In any event, all these have a distinct, non-zero probability. One can come up with some calculation of the probability of such primary and correlated events through research and/or other means.
For instance, I get a fortnightly email from the University of Colorado’s Natural Hazards Center which occasionally provides some insight into these probabilities. Your corporation’s insurance companies may also be able to provide some guidance on these probabilities.
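As one example of such a calculation, a “once every 100 yrs” figure like the flood above can be turned into the chance of seeing at least one such event over a planning horizon. The sketch below assumes every year is independent and carries the same 1-in-100 chance, which is a simplification. The `prob_at_least_once` helper and the 30-year horizon are just illustrative choices.

```python
def prob_at_least_once(annual_probability, years):
    """Chance of at least one occurrence in `years` years,
    assuming independent years with a constant annual probability."""
    return 1.0 - (1.0 - annual_probability) ** years


# A 1-in-100-year flood over an assumed 30-year planning horizon.
p_30 = prob_at_least_once(1 / 100, 30)
print(f"Chance of at least one flood in 30 years: {p_30:.1%}")
```

Under these assumptions the answer comes out at roughly one chance in four, which is a lot less comfortable than “once every 100 yrs” sounds.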
What is risk tolerance?
But at some point, only the company can determine its risk tolerance. I believe risk tolerance to be some combination of the money one is willing to invest and one’s ability to invest it in mitigating risks. For example, let’s say my company makes $10M a year in revenues. Given the importance of IT to my corporation’s activities, a reasonable risk tolerance in $ terms might be somewhere between 0.1% and 1.0% of revenues, or $10K to $100K. I must say I am probably spending more than that percentage of SCI revenues on my current DR activities, such as they are, but I include weekly and monthly backups in these costs (most would not include these activities in pure DR spending).
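The arithmetic behind that range is simple enough to keep in a small helper so it can be rerun as revenues change. This is only a sketch: the `dr_budget_range` name is made up, and the 0.1% and 1.0% bounds are the rough figures suggested above, not an industry standard.

```python
def dr_budget_range(annual_revenue, low_fraction=0.001, high_fraction=0.01):
    """Return the (low, high) annual DR spend implied by a revenue figure,
    using 0.1% and 1.0% of revenue as the assumed bounds."""
    return annual_revenue * low_fraction, annual_revenue * high_fraction


low, high = dr_budget_range(10_000_000)
print(f"DR budget range: ${low:,.0f} to ${high:,.0f}")
```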
So as the disaster in Japan continues, let us pray that it works out well in the end for all parties. But also let’s use this time to re-examine our risk tolerance and disaster recovery plans with respect to correlated risks. Hopefully, we will all do better next time.