You know what they say…

“If it can happen… it will happen.”

This is the idea behind the so-called “Murphy’s Law” of probabilities. We all know it is not a scientific theory, like string theory’s account of how the cosmos is bound together, yet we treat it as the truth. You may even debate Darwin’s theory of evolution, but regardless of what religion you are, what political persuasion you have, or what country you come from… we are all clear that Murphy’s Law is not a theory but THE TRUTH!

This classic cartoon, Milo Murphy’s Law on Disney XD, shows just how easily things can go wrong on an epic scale and how disastrous they can be. This episode embodies how we view Murphy’s Law, and for many, these types of things, although funny on the surface, happen in everyday life.

While we might have a gut-busting laugh at Milo Murphy’s antics, in real life Murphy’s Law can be quite disastrous and can culminate in big financial losses or even the failure of the business.

Throughout history, crazy and even unforeseen events have caused mass destruction, untold human suffering and, of course, the loss of business continuity. These cause the proverbial “cluster” of events that have a great impact on people. Since people are what drive business, there is an obvious effect on how we conduct business. Recovering from these types of catastrophes is called Disaster Recovery.

Disaster Recovery

What is disaster recovery? In terms of Information Technology (IT), it’s the process of recovering IT systems after the loss of data center operations caused by a natural disaster or man-made catastrophic event. What are typical DR events?

  1. Natural disasters such as an earthquake, tsunami or flooding, hurricane or tornado.
  2. Fire (natural or man-made)
  3. Failure of a regional power grid.
  4. A terrorist attack, which could include a conventional bombing or a chemical, biological or even nuclear (dirty bomb) event.
  5. War including civil unrest, ethnic cleansing, political upheaval.

The first decade of the 21st century was consumed by events on a global scale not seen before, and the second decade is shaping up to be as bad as, if not more disturbing than, the first.

In terms of disasters, these events affected hundreds of millions of people and took the lives of several million. What did it look like?

 

2000’s
  • World Trade Centers 2001
  • Northeast Blackout 2003
  • Sumatra Quake/Tsunami 2004
  • Hurricane Katrina 2005
  • Sumatra Quake 2005
  • China Quake 2008

2010’s
  • Chile Quake 2010
  • Haiti Quake 2010
  • Gulf Oil Disaster 2010
  • Australian Great Flood 2010
  • Japan Quake/Tsunami   2011
  • Japan Nuclear Disaster 2011
  • U.S. Tornado Outbreaks 2011
  • Civil War In Syria 2011
  • Superstorm-Hurricane Sandy 2012
  • Typhoon Bopha 2012
  • Indonesia Quake 2012
  • Super Typhoon Haiyan 2013
  • Great Mexican Flood 2013
  • Russian Meteor 2013
  • African Ebola Outbreak 2014
  • Super Typhoon Rammasun 2014
  • Ludian Quake 2014
  • Nepal Earthquake 2015
  • Indian Heatwave (118 degrees) 2015
  • California Wildfires 2016
  • Strongest Hurricane Ever Recorded 2016
  • Super Floods in Louisiana 2016
  • Fukushima Quake 2016
  • New Zealand Quake/Tsunami 2016
  • Tennessee Fires 2016
  • Hurricanes Harvey, Irma and Maria 2017
  • 2 Great California Wildfires 2017
  • 2 Mexico Earthquakes 2017


So you get the point! These were just the highlights of the last and current decade. The frequency and ferocity of these types of events are increasing. Therefore it’s not IF, it’s WHEN! We will all see a catastrophic event happen that will affect us directly during our lifetime. I was in NJ during Superstorm Sandy. I saw firsthand what nature can do to homes, businesses, and lives. I was also in New York City during the September 11th attacks, so I know what nefarious humans can do and the impact it had on lower Manhattan businesses.

How can you prepare for one of these events? Like business continuity, it’s all about preparation. A disaster recovery plan is the basis for ensuring recovery after a catastrophic human or natural event. As with any plan, it is very important to put time into it, and it must be tested for strength and accuracy.

It requires additional and sometimes expensive skillsets that you normally don’t need in traditional business continuity. Why? In a DR event, your IT systems must be brought back online at a different location (a different datacenter). This requires that an organization has:

  1. An operational recovery site
  2. IT systems replicated to this alternate location
  3. A command procedure to bring systems back online if there is a prolonged outage, preferably via automation

These require very skilled technical people to plan, test and perform the recovery process.
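To make the third requirement a little more concrete, here is a minimal Python sketch of what an automated command procedure might look like. Everything in it is an assumption for illustration: the four-hour threshold, the function names `should_fail_over` and `run_recovery`, and the step names are hypothetical, and a real procedure would call your own replication, server and network tooling instead of the placeholder actions.

```python
from datetime import timedelta
from typing import Callable, List, Tuple

# Hypothetical threshold: how long production may be down before
# the organization declares a disaster and fails over to the DR site.
FAILOVER_THRESHOLD = timedelta(hours=4)


def should_fail_over(outage: timedelta,
                     threshold: timedelta = FAILOVER_THRESHOLD) -> bool:
    """Return True once the outage has lasted at least the threshold."""
    return outage >= threshold


def run_recovery(steps: List[Tuple[str, Callable[[], bool]]]) -> List[str]:
    """Run recovery steps in order and stop at the first failure.

    Each step is a (name, action) pair; the action returns True on success.
    Returns the names of the steps that completed.
    """
    completed: List[str] = []
    for name, action in steps:
        if not action():
            break  # do not continue past a failed step
        completed.append(name)
    return completed


# Placeholder steps (illustrative only); each lambda stands in for real tooling.
example_steps = [
    ("promote replicated data", lambda: True),
    ("start application servers", lambda: True),
    ("redirect network traffic", lambda: True),
]

if should_fail_over(timedelta(hours=6)):
    finished = run_recovery(example_steps)
```

Keeping the steps as an ordered list makes the procedure easy to rehearse: a DR test can run the same list and report exactly which step failed.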

There are several means to restore critical systems at a DR site. The specific techniques an IT department will use to recover its systems are driven by two principles: the Recovery Point Objective (RPO) and the Recovery Time Objective (RTO). The RPO is simply how much data you can afford to lose in a DR event. The RTO is how long you are willing to wait until the system is back online. The two may or may not line up with each other.

For example, you might have a financial application that can be down for no more than one minute and can lose only a few transactions. So the RPO and RTO may be in line with each other at under one minute. However, you may have another app that can’t lose any data but whose users could live without the system for an hour once it is down. This would be a very low RPO but a higher RTO. In addition, you may have an old legacy application that does need to be restored in a DR event, but it is acceptable if it loses some data and takes a week to bring back online. Clearly, these requirements need to be understood for each application.
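The three examples above can be written down as data and checked against the results of a DR test. The following is only a sketch: the class name `RecoveryObjectives`, the function `meets_objectives` and the application names are hypothetical. Two numbers are also assumptions of mine, since the text only says "a few transactions" and "some data": a one-minute RPO for the financial application and a one-day RPO for the legacy application.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class RecoveryObjectives:
    """RPO/RTO targets for one application (values are illustrative)."""
    name: str
    rpo: timedelta  # maximum tolerable data loss, measured as a time window
    rto: timedelta  # maximum tolerable downtime


def meets_objectives(target: RecoveryObjectives,
                     data_lost: timedelta,
                     downtime: timedelta) -> bool:
    """Return True if a DR test stayed within both the RPO and the RTO."""
    return data_lost <= target.rpo and downtime <= target.rto


# Hypothetical applications mirroring the three examples in the text.
financial = RecoveryObjectives("financial app",
                               rpo=timedelta(minutes=1),  # assumed value
                               rto=timedelta(minutes=1))
no_loss = RecoveryObjectives("no-data-loss app",
                             rpo=timedelta(0),
                             rto=timedelta(hours=1))
legacy = RecoveryObjectives("legacy app",
                            rpo=timedelta(days=1),  # assumed value
                            rto=timedelta(weeks=1))

# A DR test that lost no data and took 45 minutes passes for the second app
# but would fail the financial app, which must be back within one minute.
second_app_ok = meets_objectives(no_loss, timedelta(0), timedelta(minutes=45))
financial_ok = meets_objectives(financial, timedelta(0), timedelta(minutes=45))
```

Writing the targets down per application like this makes it obvious which systems need the expensive, near-instant recovery techniques and which can wait.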

So preparing for Murphy’s Law is not a trivial task. It takes forethought, a commitment to protect the business and the political fortitude to push for investment against what is perceived to be a low-risk event. With so much potential loss on the table, this SHOULD BE a no-brainer, right? Wrong!

 

What’s at Issue?

APATHY. In many organizations, BC/DR is perceived as a high-cost solution to a very low-risk problem. However, the cost of doing nothing is greater than the expense of a tested plan.

In his 2014 article entitled “The five most common attitudes towards disaster recovery and most are incredible,” Jules Taplin talks about apathy in IT when it comes to protecting IT systems. He mentions that “Cost Minimisers recognize that they need to protect their business, but their focus remains very tightly not on their requirement, but on the amount of money that they’re prepared to spend.” So many executives, especially CFOs, don’t see the real dangers in not properly funding BC/DR.

Executives need to understand this is inevitable. It will happen! It’s just a question of when. Your organization will experience a DR event at some point, albeit at an unknown time. Many executives in the NY/NJ area had the perception that DR would never happen to them. Then came Superstorm Sandy! This took many executives by surprise in terms of preparedness. I spoke with business owners in the area who didn’t think the storm would be as bad as the meteorologists had forecast. Many didn’t take hurricane risk seriously, so many coastal businesses did not make proper investments in BC/DR, and for some it took years to fully recover.

In addition, many companies that did an adequate job of planning for events like Sandy didn’t take their own supply chains into consideration. These partners also need verifiable BC/DR plans to ensure that delivery of the products and services their customers depend on is not interrupted by a major disaster. The Economist highlighted a 2012 report by DHL, a logistics firm, that 23% of big companies did not include their entire supply chain in their business-continuity plan. The article further quotes Dutch Leonard, a risk expert at Harvard Business School, as saying that the best-prepared firms use a combination of planning for specific events and planning to cope with specific consequences, such as a loss of a building or supplier, regardless of the cause. So this too is an important area to consider in BC/DR planning.

SKILLS. As mentioned earlier, building a tested and fireproof disaster recovery site is not a trivial task. It takes very expensive and competent people who understand the different technical challenges of BC/DR: how to build a resilient datacenter, how to identify the critical systems that need to be recovered at the DR site, how to virtualize server inventory at the remote site, how to keep the production workloads synchronized to the remote site, and how to set up and manage the networking needed to support all of this.

This usually requires a team of people to build and maintain these systems. Companies that do not have this skillset will likely have to contract out to those who provide this competency to assist with the buildout and operations. Many are now looking at cloud service providers who offer Disaster Recovery as a Service (DRaaS) to handle these projects, as it’s much more cost-effective at the economy of scale that only a cloud can provide, especially for smaller IT organizations.

A PLAN. Without a plan, your business will not survive a major disaster! But a plan in and of itself is not enough. You need to exercise and refine the plan.

Exercise your plan regularly. Most organizations will do a “DR test” at least once a year. Many others do them twice a year. Still others do them monthly. Regardless of the interval, do them! This will help with preparedness and allow you to work out the kinks in the plan. A well-tested DR plan helps IT professionals sleep well at night, knowing that when that event happens they can concentrate on executing the plan instead of questioning whether it will be effective. Executives will be able to assure investors and shareholders that the company can survive a catastrophic event and that it won’t subject the business to negative visibility.

Plan also for different types of events with different outage durations. This issue again became evident with Superstorm Sandy. In a Wall Street Journal article in November 2012, David Sarabacha, a principal with Deloitte & Touche LLP who specializes in resilience and recovery planning, was quoted as saying, “Sandy brought to light the need for short-, medium-, and longer-term business continuity plans.” Companies will likely need different disaster recovery strategies for events of different durations.

So, going back to Milo Murphy’s experience: did you notice that although everything went wrong in his life, he always had his disaster recovery plan in his backpack? It might be a paper map so he could get to school when his GPS failed, lights if he fell into a stinky sewer, or a seatbelt just in case he was in an accidental train derailment. The key is he had a plan. How much more does it mean to have one for our businesses?

A future article will talk more in depth about how to build a successful BC/DR plan and ensure its effectiveness.
