When things go *Boom* in the night…
Heck, how about when things go *boom* during the day? It is every tech’s worst nightmare:
“Uh, the server won’t boot up.”
“What does this smoke rolling out of the back of the server mean?”
“We lost all power to our building last night, and when the power company turned it on this morning, we heard a ‘pop’ sound.”
As you might have already guessed, this quarter we’re going to talk about disaster recovery – specifically the financial implications that should drive your disaster recovery plan. Notice how I didn’t say anything about fires, floods, lightning, hurricanes, tornadoes, or earthquakes? Yes, depending on where you live, every one of those is a very real threat. I didn’t talk about them because, statistically speaking, you are more likely to be robbed and have everything stolen than to be hit by one of those. And because servers are electronics, an average, ordinary hardware failure is far more likely than theft.
First of all, I thought I would tackle some definitions:
Backup – the process of generating multiple copies of data to prevent data loss. At a minimum your backup should follow the 3-2-1 Rule of Backup.
Disaster Recovery Plan – The complete process for securing and protecting your data. Disaster Recovery (DR) includes backups, the testing of those backups, the security of those backups in a secure offsite storage location, and the plan for using those backups to recover from a disaster.
Business Continuity Plan – The complete process for resumption of business. Business Continuity is significantly more far-reaching in scope. It includes your Disaster Recovery plan, plus considerations for things like insurance policies, where your business will work from if something happens to your building, how you get hold of your employees, customers, and vendors, and how you get your phones working again. It is the “a [insert disaster here] happened and we lost everything – what does it take to get us back up and running” plan.
Now we have some basic definitions to work from. In November I talked about the 3-2-1 Rule of Backup, so I’m not going to spend much time on backups here.
Here is the reality of the situation: disaster recovery plans would be better named “data recovery insurance plans,” because that would make their importance easier to understand. We all understand that with insurance, the better the coverage you require, the more expensive the policy. DR is very much a risk vs. cost analysis that must be done. Before you can really start building your Disaster Recovery Plan, you need to know the answer to 3 questions:
1. If your server(s) were to go down, and you had no access to your data, what is the hourly cost to your business? This seems like it should be easy to find, but it’s normally not. There are several items that normally go into calculating this number.
a. What is your employee cost per hour if they aren’t working or can’t work? If you have 10 people who make $35k per year, by the time you figure in benefits and things like Social Security and Medicare, the cost would be somewhere around $210 per hour (that works out to roughly 25% on top of base salaries). Since we just finished out the year, you can take your 2010 payroll, add benefits and payroll taxes, and divide by the 2080 business hours in a year to get a pretty close number.
b. For every hour you are down, how many hours does it take you to catch up? A lot of our customers can function on a limited basis by hand-writing everything, but when their system comes up there is a ton of work to get done to get caught up. Is it a 1 to 1 ratio? Is it 30 minutes per hour down? Or is it 2 hours per hour down? Are you going to have to pay overtime to get caught up?
c. What sales/revenue did you not capture while your systems were down? Hopefully it is as easy as saying “I’m sorry, our system is down. Can I call you back when it comes up?” and everything is fine. But what if your time is also revenue? Attorneys can’t work on files, which means no billing. Accountants can’t access client files, which means no billing. A lot of physicians can’t check patients in/out or access their electronic charts, which means no billing. Financial institutions are time sensitive down to seconds! I’ve seen this number range from $10 to $10,000 per hour.
d. Are there other costs to your organization? This is for you to decide – try to put a realistic number on it and add it to your hourly cost total.
2. What is your Recovery Point Objective (RPO)?
a. Recovery Point Objective is an industry term for how much recent data you can afford to lose, measured in time. It determines how often we need to run backups. Simply put, if I walked into your office and said “We got everything back up and running, but I lost all of the data for the last hour,” what does that mean for your organization? What if it was everything for the last day? The last 2 days? The last week?
b. How long would it take someone to recreate lost data? Can it even be recreated? If it is gone forever, what happens?
3. What is your Recovery Time Objective (RTO)?
a. Recovery Time Objective is another industry term, this one referring to how long before the system is up and running again and regular business is resumed.
b. What if I told you we only lost 15 minutes’ worth of data, but it took 8 hours to get your people back to work?
c. Is that better or worse than losing 1 day’s worth of data but having everyone back to work in one hour?
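If it helps to see the three questions as arithmetic, here is a rough sketch in Python. It is only an illustration, not a standard formula. The function names are made up for this example, and the 25% benefits and payroll tax rate, the 1-to-1 catch-up and recreation ratios, and the $500 per hour of lost revenue are all assumptions you should replace with your own numbers.

```python
HOURS_PER_YEAR = 2080  # business hours in a year, as used in question 1a


def payroll_cost_per_hour(headcount, avg_salary, burden_rate):
    """Question 1a: loaded annual payroll divided by business hours.

    burden_rate is benefits plus payroll taxes as a fraction of salary
    (an assumed input; use your own figure).
    """
    return headcount * avg_salary * (1 + burden_rate) / HOURS_PER_YEAR


def downtime_cost(hours_down, payroll_per_hour, catch_up_ratio,
                  lost_revenue_per_hour, other_per_hour=0.0):
    """Questions 1a-1d: what the outage itself costs."""
    idle_staff = hours_down * payroll_per_hour
    # 1b: hours of catch-up work needed per hour of downtime
    catch_up = hours_down * catch_up_ratio * payroll_per_hour
    lost_revenue = hours_down * lost_revenue_per_hour
    other = hours_down * other_per_hour
    return idle_staff + catch_up + lost_revenue + other


def data_loss_cost(hours_of_data_lost, recreate_ratio, payroll_per_hour):
    """Question 2: staff time spent recreating lost data.

    Assumes the data CAN be recreated; data that is gone forever
    is not captured by this simple model.
    """
    return hours_of_data_lost * recreate_ratio * payroll_per_hour


def incident_cost(hours_of_data_lost, hours_down, payroll_per_hour,
                  catch_up_ratio=1.0, recreate_ratio=1.0,
                  lost_revenue_per_hour=0.0):
    """Total cost of one incident: the outage plus the lost data."""
    return (downtime_cost(hours_down, payroll_per_hour, catch_up_ratio,
                          lost_revenue_per_hour)
            + data_loss_cost(hours_of_data_lost, recreate_ratio,
                             payroll_per_hour))


if __name__ == "__main__":
    # 10 people at $35k with an assumed 25% for benefits and payroll taxes
    payroll = payroll_cost_per_hour(10, 35_000, burden_rate=0.25)
    print(f"Payroll cost per hour: ${payroll:,.0f}")

    # Question 3b: 15 minutes of data lost, 8 hours to get back to work
    scenario_b = incident_cost(0.25, 8, payroll, lost_revenue_per_hour=500)
    # Question 3c: 1 business day (8 hours) of data lost, 1 hour of downtime
    scenario_c = incident_cost(8, 1, payroll, lost_revenue_per_hour=500)
    print(f"15 min lost / 8 hours down: ${scenario_b:,.0f}")
    print(f"1 day lost / 1 hour down:   ${scenario_c:,.0f}")
```

With these assumed inputs, the long outage costs more than the lost day of data, but that can flip. If your data takes several hours to recreate for every hour lost, or cannot be recreated at all, the second scenario becomes the expensive one. That is exactly why you need your own numbers for all three questions.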
Earlier I said that your disaster recovery plan should be driven by your risk vs. cost analysis. Once we understand your risk tolerance, we can balance it with an appropriately priced solution to mitigate that risk. High risk tolerance usually means a lower priced DR solution, and of course the opposite is true: low risk tolerance usually means a higher priced solution. Understanding the financial implications of your disaster recovery plan is always the first step towards success!