Business continuity, disaster recovery, redundancy and uptime: these are no longer terms confined to the server room of an IT company. They have moved out of the technical infrastructure and begun to affect real-life, practical processes.
The question of whether or not you can survive a disaster is perhaps an incomplete one, considering that humans have the will and desire to survive the most challenging circumstances. The fact that survival happens is therefore somewhat beside the point. What people are becoming more aware of is how unprepared they are to cope with disaster. To maintain business continuity, it is essential to have the necessary backup, a secondary switch that you can turn on to keep going.
A few months ago, we covered LMKR and how they had managed the aftermath of the Marriott Bombing, something that people appreciated around the world. Other IT companies were also able to follow their DR plans and mitigate the aftereffects of the tragic incident, but here's the point: IT companies know about all of this because the largest chunk of their business depends on their intellectual property. In a world where technology helps manage business, you really don't have an excuse to get caught unprepared.
Whether it is a natural disaster or an attack of some sort, have you thought to ask the management of your office building what their DM plan is? How about the school where your children go? In the event of chaos, what plan of action will be followed? The same list of questions applies to hotels, restaurants and city district planning agencies; the list is endless. In a lot of cases where disaster planning for physical damage is done, people still fail to plan how they are going to rehabilitate themselves back into the system. It's the same sustainability (or lack thereof) challenge all over again. Conducting a Business Impact Analysis (BIA) in your organization will help you figure out what your various operations are and, since most apps are linked to one another, how long you can afford to have one app or area down before it begins to impact the core business function.
You'd like to think that the BIA is something that is done through technology, but it's not. The majority of the BIA planning is done by an analyst who can communicate with each of the departments and areas and actually assess the importance of every step of the organizational and virtual hierarchy.
Before selecting a Disaster Recovery strategy, the Disaster Recovery planner should refer to the company's business continuity plan which should specify the key metrics of Recovery Point Objective (RPO) and Recovery Time Objective (RTO) for various business processes. The metrics specified for the business processes must then be mapped to the underlying IT systems and infrastructure that support those processes.
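The mapping step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the process names, system names and hour values are hypothetical, and the rule that a shared system inherits the strictest (smallest) RPO and RTO of the processes it supports is one reasonable reading of "mapping", not something the continuity plan itself dictates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Objective:
    rpo_hours: float  # maximum tolerable data loss, in hours
    rto_hours: float  # maximum tolerable downtime, in hours


# Hypothetical objectives, as they might appear in a business continuity plan.
PROCESS_OBJECTIVES = {
    "order_entry": Objective(rpo_hours=1, rto_hours=4),
    "payroll": Objective(rpo_hours=24, rto_hours=48),
    "customer_support": Objective(rpo_hours=4, rto_hours=8),
}

# Hypothetical list of IT systems that each business process relies on.
PROCESS_SYSTEMS = {
    "order_entry": ["web_frontend", "orders_db"],
    "payroll": ["hr_db"],
    "customer_support": ["orders_db", "ticketing"],
}


def map_objectives_to_systems(objectives, dependencies):
    """Give each system the strictest RPO/RTO of the processes it supports."""
    result = {}
    for process, systems in dependencies.items():
        target = objectives[process]
        for system in systems:
            current = result.get(system)
            if current is None:
                result[system] = target
            else:
                result[system] = Objective(
                    rpo_hours=min(current.rpo_hours, target.rpo_hours),
                    rto_hours=min(current.rto_hours, target.rto_hours),
                )
    return result


SYSTEM_OBJECTIVES = map_objectives_to_systems(PROCESS_OBJECTIVES, PROCESS_SYSTEMS)
```

In this sketch `orders_db` supports both order entry and customer support, so it ends up with the tighter order entry targets, while `hr_db` keeps the relaxed payroll targets.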
While it is important to have disaster RPOs and RTOs in place, here's something to think about: what if the critical data you are currently using becomes corrupt? Worse yet, what if someone accidentally deletes a portion of it? The IT manager will head over to the most recent backup and simply recover. But when there is no crisis as such, data backup is usually done once a day, on a 24-hour cycle. Think about the situation this creates for the organization: the day-to-day RTO and RPO are 24-24 (24 hours each), while the disaster RTO and RPO might be defined as, let's say, 4-4. In the event of an unplanned incident which is not necessarily a disaster, you can't get to the data until 24 hours later, which means that unless you 'declare' the organization to be in a state of disaster, you will have lost 24 hours' worth of data! So it is imperative that your regular metrics match your disaster or in-crisis metrics.
You can always rebuild brick and mortar; however, once your virtual operations are compromised, there is nothing you can do to bring them back.
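The gap between day-to-day and disaster metrics can be checked with simple arithmetic. The sketch below is a hypothetical helper, not a standard tool: it assumes that the worst-case data loss equals the interval between backups, and it takes the 24-hour and 4-hour figures from the example above.

```python
def worst_case_data_loss_hours(backup_interval_hours):
    """Data written just after a backup stays unprotected until the next one,
    so the worst-case loss is assumed to equal the backup interval."""
    return backup_interval_hours


def check_metrics(backup_interval_hours, restore_hours, rpo_hours, rto_hours):
    """Return a list of mismatches between everyday practice and the targets."""
    problems = []
    loss = worst_case_data_loss_hours(backup_interval_hours)
    if loss > rpo_hours:
        problems.append(
            f"worst-case data loss of {loss}h exceeds the RPO of {rpo_hours}h"
        )
    if restore_hours > rto_hours:
        problems.append(
            f"restore time of {restore_hours}h exceeds the RTO of {rto_hours}h"
        )
    return problems


# Daily backups and a 24-hour restore, measured against 4-4 disaster targets.
daily_regime = check_metrics(
    backup_interval_hours=24, restore_hours=24, rpo_hours=4, rto_hours=4
)

# A regime whose everyday metrics match the disaster targets.
matched_regime = check_metrics(
    backup_interval_hours=4, restore_hours=4, rpo_hours=4, rto_hours=4
)
```

With these inputs `daily_regime` contains two mismatches and `matched_regime` is empty, which is the point of aligning regular metrics with in-crisis ones.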
Initiating the Process for Business Continuity
Business Continuity is the umbrella which sits on top of Disaster Recovery. Recovering lost data or assets is simply a part of business continuance.
Depending on how your organization is set up and structured, you or your clients need to have real-time access to specific bits of data so that business can be continued.
If you run a company where your major interaction with your customers is through a website, it would serve you well to mirror that website in a secondary location, so that customers trying to resolve the DNS name of one server can be redirected to another. You can also cluster the application so that you are able to make that switch. Clustering will also help you reroute your data through an alternate node should an unplanned incident happen on a node outside your organizational premises. This helps you become more fault tolerant even if the solution you are running isn't a high availability solution. So if PTCL gets a cable fault, at least you can still be running your operation.
It is also important to have role-based recovery in place, rather than naming one single individual who will be responsible for a specific task at the height of the crisis. Different people react differently in a time of crisis, and you don't want to put someone in a position he or she can't handle. Rather, assign the recovery to the position or job description.
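The redirect logic behind a mirrored site can be sketched as follows. This is a minimal illustration, not a DNS or clustering implementation: the hostnames are hypothetical, and the health check is passed in as a function so that the selection rule stays independent of how a real probe (an HTTP request, a ping, a cluster heartbeat) would be done.

```python
from typing import Callable, Sequence


def choose_site(sites: Sequence[str], is_healthy: Callable[[str], bool]) -> str:
    """Return the first healthy site, trying them in order of preference."""
    for site in sites:
        if is_healthy(site):
            return site
    raise RuntimeError("no healthy site available")


# Hypothetical primary site and its mirror in a secondary location.
SITES = ["primary.example.com", "mirror.example.com"]

# Simulated outage: the primary is unreachable, for instance after a cable fault.
down = {"primary.example.com"}


def simulated_health_check(site: str) -> bool:
    return site not in down


active_site = choose_site(SITES, simulated_health_check)
```

Here `active_site` is the mirror, because the simulated check reports the primary as down. Once the primary recovers, the same call returns it again, since the list is ordered by preference.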
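Role-based recovery can be pictured as two separate lookups: tasks point to roles, and roles point to whoever currently holds them. The sketch below assumes hypothetical task names, role names and placeholder people. It illustrates the idea only and is not a prescribed structure for a recovery plan.

```python
# Recovery tasks are tied to roles, never directly to named individuals.
RECOVERY_TASKS = {
    "restore_database": "database_administrator",
    "redirect_traffic": "network_engineer",
    "notify_customers": "communications_lead",
}

# Each role lists its holders in order of preference (placeholder names).
ROLE_HOLDERS = {
    "database_administrator": ["Person A", "Person B"],
    "network_engineer": ["Person C"],
    "communications_lead": ["Person D", "Person A"],
}


def assign_tasks(tasks, holders, available):
    """Assign each task to the first available holder of its role.

    A task maps to None when nobody holding the role can be reached, which
    flags a gap in the plan rather than hiding it.
    """
    assignments = {}
    for task, role in tasks.items():
        candidates = holders.get(role, [])
        assignments[task] = next(
            (person for person in candidates if person in available), None
        )
    return assignments


# During the incident, only some people can be reached.
reachable = {"Person B", "Person D"}
assignments = assign_tasks(RECOVERY_TASKS, ROLE_HOLDERS, reachable)
```

In this example the database restore falls to the second holder of the role, and the traffic redirect comes back as `None`, showing that the network engineer role needs a deputy.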
Here are some of the standard backup measures which you may want to keep in mind:
— Backups made to tape or high capacity, highly available media and sent off-site at regular intervals (preferably daily)
— Backups made to disk on-site and automatically copied to off-site disk, or made directly to off-site disk
— Replication of data to an off-site location, which overcomes the need to restore the data (only the systems then need to be restored or synced). This generally makes use of Storage Area Network (SAN) technology
— High availability systems which keep both the data and system replicated off-site, enabling continuous access to systems and data
In many cases, an organization may elect to use an outsourced disaster recovery provider to provide a stand-by site and systems rather than using their own remote facilities.
In addition to preparing for the need to recover systems, organizations must also implement precautionary measures with an objective of preventing a disaster situation in the first place. These may include some of the following:
— Uninterruptible Power Supply (UPS) and/or backup generator to keep systems going in the event of a power failure
— Appropriate fire prevention and anti-virus tools