Today, enterprises operate in a world where natural or man-made disasters can bring a business to its knees. It is therefore critically important for enterprises to recognise that disasters are real and do happen, and that a structured programme is essential to protect their information from internal and external threats.
Such threats, if realised, may impact an organisation's business operations, reputation and brand image. They can be both internal and external.
Business Continuity Management (BCM) is a holistic management process that identifies these potential threats and provides a framework for building organisational resilience, with the capability for an effective response that safeguards the interests of key stakeholders, reputation, business operations and brand image.
Generally, most enterprises need to be back in business with minimum downtime after a disaster.
There is no “one size fits all” generic BCM and disaster recovery plan. Each enterprise needs its own customised plan to bring it back to business. Nevertheless, there are useful guidelines available for managing a disaster. The British Standards Institution (BSI) has released a new independent standard for business continuity planning (BCP): BS 25999-1. Prior to the introduction of BS 25999, BCP professionals relied on the BSI information security standard BS 7799, which addressed BCP only peripherally, as a means of improving an organisation's information security compliance. BS 25999, however, applies to organisations of all types, sizes and missions, whether governmental or private, profit or non-profit, large or small, and regardless of industry sector. Using these guidelines, each enterprise then needs to develop its own customised BCP.
A well-defined BCM programme has the following essential components:
- BCM Strategy
- Organisation wide awareness
- Identification of Information assets
- Risk assessment
- Impact Analysis
- Risk mitigation
- Business Continuity Planning
- DR site strategy and implementation
- DR drills
- Audit and continuous improvement
The structured programme to secure an organisation’s business operations starts with a clearly articulated vision. At Mindtree, we believe that this vision should come from none other than the CEO and that the initiatives should be driven from the top. The vision then needs to be adapted to all departments, because a disaster may not spare any of them. It is also critical to articulate this vision to the board and incorporate it as part of corporate governance.
The next stage is to define a well-articulated strategy for recovery from disaster, covering the essential functions that need to be recovered and the timelines for recovery. The strategy should clearly focus on recovery of business operations, brand image and reputation.
The strategy should typically follow these lines:
- A BCP budget should be formalised and approved by senior management.
- Disaster declaration authorities, who will be responsible for implementing the continuity strategies in the event of a disaster or business interruption, should be identified.
- An incident management system or process for monitoring, recovering from and stabilising after a disaster or business interruption should be identified.
- The plan should be reviewed periodically and benchmarked against industry standard practices and other similar organisations’ best practices.
Organisation wide awareness:
One of the main challenges of BCM is lack of interest. BCM is often treated as an initiative of the IS or Security department alone. It is important to create awareness among the organisation's employees, partners and vendors of the BCM initiatives and of their roles and responsibilities within them. A training plan should be developed, and training should be conducted at regular, defined intervals.
Identification of information assets:
Information resides everywhere in an organisation: in printed sheets, in files, in computers, in storage racks, in offsite data centers, in tapes stored at a remote location and even in employees’ heads. All these sources of information are vulnerable to external and internal threats, and the damage can be significant. These information assets need to be identified along with their location. Once the assets are located and identified, their criticality needs to be documented.
Risk assessment:
Two important characteristics of a risk are:
- Probability of occurrence of the risk (low, medium or high)
- Severity of the risk (low, medium or high)
Develop a risk table as follows:
- List all the risks
- Categorise the risks
- Analyse the probability
- Analyse the severity
- Sort the risks and identify the risks to be managed
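The risk table steps above can be sketched in a few lines of Python. This is only an illustrative sketch: the numeric weights for low, medium and high, the score threshold, the class and function names, and the sample risks are all assumptions made for the example, not part of any standard.

```python
from dataclasses import dataclass

# Assumed numeric weights for the three-level scale (illustrative only).
LEVELS = {"low": 1, "medium": 2, "high": 3}


@dataclass
class Risk:
    name: str
    category: str
    probability: str  # "low", "medium" or "high"
    severity: str     # "low", "medium" or "high"

    @property
    def score(self) -> int:
        # Simple probability x severity product used for ranking.
        return LEVELS[self.probability] * LEVELS[self.severity]


def build_risk_table(risks, threshold=4):
    """Sort risks by score, highest first, and flag those at or above the threshold."""
    ranked = sorted(risks, key=lambda r: r.score, reverse=True)
    return [(r.name, r.category, r.score, r.score >= threshold) for r in ranked]


# Hypothetical sample risks.
risks = [
    Risk("Earthquake", "natural", "low", "high"),
    Risk("Virus attack", "technical", "high", "low"),
    Risk("Server hardware failure", "technical", "medium", "high"),
    Risk("Power outage", "infrastructure", "medium", "medium"),
]

table = build_risk_table(risks)
for name, category, score, manage in table:
    print(f"{name:25} {category:15} score={score} manage={manage}")
```

The last column marks the risks selected for active management. An organisation would choose its own scale and threshold.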
Impact analysis:
Risk analysis needs to cover the impact of each risk.
An earthquake of magnitude 8.0 on the Richter scale is a low-probability event in London, but it would have a high impact on your information assets. On the other hand, a virus attack can be high probability but low impact if all the security measures needed to prevent it are in place.
The impact analysis should also cover financial, brand and other damages, which should be clearly quantified.
Identify key business processes and critical dependencies. The impacts of potential business interruptions should be identified.
Once the impacts are analysed, MindTree recommends developing a mitigation strategy for each category of risk. The next step is to take measures to manage the risk.
Risk mitigation involves:
- Analysis of threats most likely to occur
- Identifying the threats that would have the most impact
- Minimising service disruptions and financial loss
- Having a contingency plan for mitigating risks
For example, the risk mitigation strategy for hardware failure of a mission-critical server is to keep spares onsite so that downtime is minimised.
Business Continuity Plan:
The business continuity plan should define the optimum recovery time for your business. For example, if it is acceptable for your business recovery time to be measured in days, then you may opt for just offsite tape storage. However, if the acceptable business recovery time is just a few hours, then a hot standby system at a disaster recovery site may be needed.
The BCP needs to cover the following aspects:
- Identify process specific Recovery Time Objective (RTO)
- Identify the minimum capacity required to run business operations at an acceptable level
- Calculate recovery efforts based on RTO
- Review Service Level Agreements between the organisation and external partners
- Identify critical information resources
- Prioritise these resources in order of recovery
- Identify procedure for acquiring critical resources in the event of disaster
- Identify contact information and procedures for disaster authorities
- Identify and keep ready a disaster recovery site
- Conduct a cost benefit analysis of moving the business processes to DR site
- Define standard procedures for response, recovery and restoration
- Develop procedures for relocating the business processes to DR site
- Define emergency response procedures that are time based, team based and checklist based
- Identify ER team members with contact information
- Create response, recovery and restoration processes for security and safety
- Document crisis communication procedures and train staff on them
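As a rough illustration of how process-specific RTOs can drive both the order of recovery and the choice of recovery approach, consider the sketch below. The process names, RTO values, capacity figures and the 24-hour cut-off between hot standby and offsite tape are hypothetical values chosen for the example. A real plan would derive them from the impact analysis.

```python
from dataclasses import dataclass


@dataclass
class BusinessProcess:
    name: str
    rto_hours: float       # process-specific Recovery Time Objective
    min_capacity_pct: int  # minimum capacity needed to operate acceptably


def recovery_strategy(rto_hours, hot_standby_below_hours=24):
    """Suggest a recovery approach from the RTO (the cut-off is an assumed value)."""
    if rto_hours < hot_standby_below_hours:
        return "hot standby at DR site"
    return "offsite tape storage"


def recovery_order(processes):
    """Order processes by RTO, shortest first, with a suggested strategy for each."""
    ordered = sorted(processes, key=lambda p: p.rto_hours)
    return [(p.name, p.rto_hours, recovery_strategy(p.rto_hours)) for p in ordered]


# Hypothetical business processes.
processes = [
    BusinessProcess("Payroll", 72, 50),
    BusinessProcess("Order processing", 4, 80),
    BusinessProcess("Customer support", 12, 60),
]

plan = recovery_order(processes)
for name, rto, strategy in plan:
    print(f"{name:20} RTO={rto}h -> {strategy}")
```

Processes with the shortest RTO come first in the recovery order, which mirrors the prioritisation step in the list above.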
DR site strategy and implementation:
If a disaster has a major impact on the primary business site, the business processes may have to be relocated to an alternate site. The business processes may include people, machinery and IT assets. The location of the DR site has to be selected carefully so that a disaster striking the primary site does not affect the DR site at the same time.
For example, if the probability of a forest fire spreading across the entire area is very high, the DR site should be located several hundred miles away from the primary site.
It is also important to identify the minimum-capacity operations to be duplicated at the disaster recovery site so that an acceptable level of business continues until the primary site becomes functional again.
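One simple check when shortlisting DR locations is the straight-line distance from the primary site. The sketch below uses the standard haversine formula. The coordinates are approximate values for two example cities, and the minimum separation is passed in as a parameter because the right figure depends on the threats being considered. All function names are hypothetical.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, using the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def sufficiently_separated(primary, dr_site, min_miles):
    """True if the DR site is at least min_miles away from the primary site."""
    return distance_miles(*primary, *dr_site) >= min_miles


# Approximate coordinates, for illustration only.
primary = (29.76, -95.37)    # Houston
candidate = (32.78, -96.80)  # Dallas

print(round(distance_miles(*primary, *candidate)), "miles apart")
print(sufficiently_separated(primary, candidate, min_miles=200))
```

Distance alone is not sufficient: two sites far apart can still share a flood plain, power grid or network provider, so this check complements the risk assessment rather than replacing it.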
Disaster Recovery Drills:
Disaster recovery drills need to be drawn up and conducted at regular intervals to ensure your preparedness for a disaster.
BCP and DR should cover all aspects of the business, from sales to operations and from people functions to IT, specifically information management. Testing approaches such as top-down drills and full plan tests should be used.
Drills often cover only certain aspects of the business. Our view is that it is worthwhile to create disaster simulation models to test areas where an actual drill cannot be carried out.
The drill should involve all critical business units, departments and functions. The roles and responsibilities for BCP testing should be assigned in advance.
Audit and continuous improvement:
A post-test review and analysis process needs to be created.
The BCM process needs to be periodically audited to ensure compliance with company standards.
Specific timelines need to be defined for updating the BCM based on the organisation's change management process.
Though BCM is an absolute necessity for every enterprise, its implementation often faces several challenges. Some of them are:
- BCM doesn’t have ROI
- BCM does not generate revenue
- Can BCM be replaced by insurance?
- Planners’ overkill budget
- Lack of interest from senior management
- No budget for BCM
- It will not happen to us
It is important to make sure that the BCM is lean and requires only the minimum capacity needed to run business operations at an acceptable level in the event of a disaster. It is also important to quantify the impact of a disaster on the company's business, brand and image.
This was brought home to one of MindTree’s customers very recently when Hurricane Ike struck their Houston data center. With the help of a well-planned and well-articulated BCP and BCM plan, MindTree IMTS engineers were able to ensure recovery from the disaster within 48 hours without disrupting the client’s business. This was possible only because of several months of planning and implementation of BCP and DR. The critical business operations were moved to a disaster recovery site within 24 hours, without any physical movement of people, hardware or software.
By Ram Mohan, Executive Vice President and Head of Infrastructure Management and Tech Support at Mindtree Ltd.