Managing an enterprise BCM program requires BCM Practitioners to address many program initiatives and tasks that must work together seamlessly. I liken BCM programs to a watch with many moving parts, some critical and others not so critical to its operation and its ability to keep accurate time.
In today’s high-pressure environment, we see BCM Practitioners being overrun not only by managing the program daily but also by dealing with external influences (e.g., audit requests, questionnaires, etc.) that take up their time. Yet many BCM Practitioners continue to attempt to work on everything at once in an effort to maximize productivity, but end up producing less and making more mistakes. Are you and your team experiencing any of these symptoms?
- Are you and your BCM team stretched too thin?
- Do you simultaneously feel overworked and underutilized?
- Are you often busy but not productive?
- Do you feel like your time is constantly being hijacked by other people’s agendas?
If you answered yes to any of these, the way out is the Way of the Essentialist.
I have learned from being a BCM practitioner, and now from running multiple BCM-related companies, that to be successful you must be mindful and, more importantly, be an essentialist: the goal is not to get more done in less time, but to get the right things done that make the most difference. A member of my Board of Directors had me create a list of everything I was doing and/or felt I needed to do in managing our companies. The list was exhaustive and made it clear how scattered my efforts were and how little they were focused on the essential tasks that bring the greatest return on investment to me and our organizations. Eliminating unnecessary tasks was not easy; it required me to train others to take on tasks, hire where possible, outsource to external parties, forget about some and, most importantly, trust that the minimum set of tasks was what I needed to do.
So, how do we apply this to our BCM teams and our programs?
- List all of the tasks you and your team members perform.
- Inventory all of the program initiatives (Policy, Plans, Strategies, Audits, BIAs, etc.) you are working on currently.
- Starting with your team members’ list of tasks, categorize each task as essential or non-essential by looking at which tasks permit you to make the highest possible contribution. Determine what to do with the non-essential tasks (e.g., eliminate, transfer, outsource, etc.).
- Based on your review of your program initiatives, which ones provide the greatest return on compliance, resiliency and maturity? Which ones are window dressing?
- Revise the tasks you and your team members will perform based on what is essential and brings the highest possible contribution.
- Generate a program roadmap with the most essential initiatives that will heighten the sophistication and maturity of your program.
Essentialism is a systematic discipline for identifying what is absolutely essential, then eliminating everything that is not, so we can make the highest possible contribution towards the things that really matter. By applying more selective criteria for what is essential, the disciplined pursuit of less empowers us to reclaim control of our own choices about where to spend our precious time and energy, so that we make the highest possible contribution to our team and organization.
As BCM Practitioners we are often required to dream up, plan, implement and facilitate a mock disaster exercise for our Crisis Management teams. The planning process is crucial to developing an exercise that meets the needs of your organization. Steps in planning a successful mock disaster exercise are:
- Consider the list of scenarios you have presented to the team in the past. Does a past exercise suffice, or do you need to develop a brand new exercise? A past exercise can be reused if it exposed significant gaps that require you to replay it to validate the team’s response. Always consider the maturity of the team.
- Review action items from previous exercises to make sure they have been resolved and do not cause gaps in the upcoming exercise.
- Identify the key objectives of the exercise; what are you trying to stress test and validate? Focus on a core set of objectives that you would like the exercise to meet. Less is more here.
- Based on the objectives, identify Subject Matter Experts who will aid you in building the exercise. These individuals can be internal and/or external personnel who will provide you with expertise to build your scenario. These people typically do not participate in the exercise since they built it.
- Hold multiple brainstorming sessions with your Subject Matter Experts to build the exercise based on the objectives you are trying to meet. Typically, a couple of these sessions will produce the framework that you can use to create the detailed events. Validate that the exercise framework meets the objectives.
- Build the detailed timeline and list of events based on the framework you developed with the Subject Matter Experts. Consider how long you have for the exercise, and give people time to address events and respond as needed. I consider the maturity of the team in determining how long I give them to address and respond to events in the exercise.
- Validate the scenario, timeline and events with your Subject Matter Experts; ensure it makes sense and meets the objectives. Identify gaps or areas that are confusing; you don’t want participants pointing at holes in your exercise that will derail it.
- Revise the scenario and you are ready.
- Make sure you have a good facilitator ready to lead the exercise. This person must be prepared to lead the team from the beginning to the end of the exercise. He or she must know the exercise in and out as well as assess how the team is doing. If the exercise needs to be slowed down or sped up, the facilitator must address it.
- Have fun and enjoy the exercise. It will never go as perfectly scripted but when does a disaster fit our plans?
A recent Harvard Business Review article in the December 2013 edition, entitled “The Hidden Benefits of Keeping Teams Intact,” discussed the benefits of and reasons for keeping teams familiar with each other. The article argues that team familiarity raises performance, leads to fewer mistakes, encourages better decision making, etc.
So how does this apply to us? In our role in BCM, we deal with a number of different teams, including Fire Life Safety, Crisis Management, Business and IT Recovery Teams, etc. Maintaining familiarity and consistency across team members is difficult as existing team members leave and new members arrive.
In my experience, I agree with this article: I can say that the performance of Crisis Management Teams who have worked together for a number of years, or at least have some familiarity, is much higher than that of teams who do not have familiarity and/or long-term working relationships. So what data substantiates this theory?
- Defense – Special ops teams such as the Navy SEALs are kept intact over many years.
- Aviation – NASA found that fatigued but familiar crews made about half as many errors as rested but unfamiliar teams.
- Surgery – A study of surgeons who worked across multiple hospitals found performance varied perhaps because of their varying levels of familiarity with the OR teams.
In our consulting firm, we have a high degree of familiarity, as the majority of us have worked together for over 10 years. This familiarity has led us to a high level of performance, as we are well versed in each other’s strengths, weaknesses and areas of expertise.
So, how do we make this work? We can’t keep team members forever; however, we can work with teams to build some level of familiarity, which is better than none at all. Hold short training and awareness sessions, short 30-minute mock disaster exercises, etc.
Does having a BCM program compliant with industry best practices, standards and guidelines equate to recoverability? I do not believe it always does. Being compliant, in my opinion, ensures the best underlying infrastructure has been assembled, implemented and integrated to maximize program efforts and the potential for success in a disruption. It does not mean, however, that you will recover without a hitch or difficulty in all situations.
Let’s use the athlete analogy. Being Tiger Woods doesn’t mean you will win 100% of the golf tournaments you play. Because of his talent, preparation and work ethic, it does mean he will win more than a good share of those he plays in, and so it goes for being compliant. Working to be compliant is like building the best possible athlete to compete, but you will not always dominate; there are too many variables, like the people factor, events we never saw coming, just plain bad luck, etc., that can derail us.
So, working towards having a high level of compliance with industry best practices, standards and guidelines is the right thing to do. I liken the industry best practices, standards and guidelines to a fitness program for your organization. Some organizations get on it but quit because they get tired, lose interest or don’t want to do it on a routine basis. Others work through the soreness, the daily grind and the sweat to build a BCM program that is strong, resilient and ready for any disruption that comes its way.
Get your BCM program on a workout routine today!
Some tests only involve two people while others can include an entire department. All tests require preparation time. This is necessary to coordinate schedules of people, exercise control rooms, and equipment. At a minimum, every plan should be tested annually. Plans to test should include business processes, IT systems, work area recovery, pandemic, and more. The following is a typical testing schedule and what to include:
- Inspect Command Center sites for availability and to ensure their network and telecommunication connections are live.
- Data Backups
  - Verify that data backups are readable.
  - Ensure that every disk in the data center and key personal computers are included in the backups.
  - Inspect safe and secure transportation of media to off-site storage.
  - Inspect how the off-site storage facility handles and secures the media.
- All business process owners verify that their employee recall lists are current.
- Issue updated versions of plans.
- Conduct an IT simulation at the recovery site.
- Conduct a work area recovery simulation at the recovery site.
- Conduct a pandemic table-top exercise.
- Conduct an executive recovery plan exercise with all simulations.
- Review Business Continuity Plans of key vendors.
- All managers submit a signed report that their recovery plans are up to date.
- Practice a data backup recall from the secured storage area to the hot site.
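The rule that every plan should be tested at least annually lends itself to a simple check. The following is a minimal sketch, not a prescribed tool: the `PlanRecord` type, the plan names and the dates are all hypothetical, and "annually" is approximated here as "within the last 365 days".

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List

# Assumption: "tested annually" is approximated as "tested within 365 days".
MAX_TEST_AGE = timedelta(days=365)


@dataclass
class PlanRecord:
    name: str          # hypothetical plan label
    last_tested: date  # date of the most recent test of this plan


def overdue_plans(plans: List[PlanRecord], today: date) -> List[str]:
    """Return the names of plans whose last test is older than MAX_TEST_AGE."""
    return [p.name for p in plans if today - p.last_tested > MAX_TEST_AGE]


# Illustrative data only; none of these dates come from a real program.
plans = [
    PlanRecord("IT systems recovery", date(2013, 3, 1)),
    PlanRecord("Work area recovery", date(2012, 11, 15)),
    PlanRecord("Pandemic", date(2013, 9, 30)),
]

for name in overdue_plans(plans, today=date(2013, 12, 31)):
    print(f"Overdue for testing: {name}")
```

Passing `today` in explicitly, rather than reading the system clock inside the function, keeps the check repeatable when you review the schedule for a specific reporting date.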
MHA Consulting CEO Michael Herrera discusses the Business Continuity Management (BCM) trends that he and his team have experienced across their global customer base in 2013:
- Business Continuity staffing in most organizations is not increasing. Many organizations continue to either staff minimally or use outside consultants to augment the program. Business units are having to take more accountability for their plans and use the continuity staff as Subject Matter Experts (SMEs). MHA continues to heavily augment or serve as the BCM or Disaster Recovery Office for a good number of its clients.
- Business Continuity Management (BCM) is the new Business Continuity Planning (BCP). The majority of organizations are renaming their enterprise continuity programs to Business Continuity Management.
- Enterprise Risk Management (ERM) is integrating BCM into its process and utilizing the information gathered through BIAs and Threat & Risk Assessments to support identification of risks and exposures; a good sign.
- Business Impact Analysis (BIA) studies remain the foundational component that drives the development of the BCM program. However, senior management is continually looking for us to refine the BIA process, shorten business unit participation time in the studies and ensure the rigor in the process is strong enough to clearly identify the most critical activities and dependencies. A common weakness in most BIA studies is not having management sign off on the results, which affects alignment discussions between IT and the business.
- Recovery Time Objectives (RTOs) continued to get shorter and shorter (e.g., no downtime, 1 hour, 4 hours, etc.) in many of the companies we worked with in 2013. The influx of complex technology, automated workflows and customer demands for uptime requires business activities and their dependent systems/applications to be recovered in timeframes that mandate “real time” recovery strategies that can be activated immediately. Few companies can support this at all levels, which causes gaps between the RTOs and the Recovery Time Actuals (RTAs).
- The new norm for tolerance for data loss or Recovery Point Objectives (RPOs) across critical business activities is zero or near zero in many companies due to the use of complex technology and automated workflows that virtually eliminate manual workarounds. However, in many cases, senior management continues to believe they don’t need the data backup technology to meet the RPOs because they believe they can work manually for a period of time. We also find cases where IT cannot afford the technology to provide the short RPOs and/or the business has no idea what their RPOs are currently or what they should be.
- Business and IT RTO/RPO Alignment – Alignment remains a critical gap across a majority of companies whether they are small, medium or large. Recovery Time Objectives (RTOs) and Recovery Point Objectives (RPOs) continue to be driven by Information Technology (IT) versus by the needs of the business.
- Emergency Notification Systems – The use of ENS is becoming widespread. However, organizations routinely struggle with the processes to effectively and efficiently notify associates, with getting good contact information from associates and with holding tests on a regular basis. Keep in mind, too, that ENS is only good if we have electricity for our technology.
- Big Data – We have heard a lot about “Big Data”: the monster-sized data warehouses that drive today’s businesses. In the old days, data warehouses had low recovery priorities. However, Big Data is now driving mission critical applications requiring short RTOs and RPOs, a huge challenge for Information Technology.
- Companies continue to struggle with Recovery Strategies, particularly for the business units of the organization. Yes, work at home will work, but only for a limited time, and Information Security concerns are limiting its use. Information Technology strategies are making it easier and easier to recover the critical systems and applications. The problem that remains is how the business will get to that data based on its strategy. It is our opinion that in today’s complex business environments, recovery strategies for RTOs of 72 hours need to be fully in place before an event occurs.
- Our most mature clients (financial, utilities) are holding live Recovery Exercises. They shut down production operations and migrate production work to their alternate sites (data center and business) for a day to validate their plans and strategies. Other clients are building in resiliency through diversity of operations, which permits them to transfer workloads across their network. Sadly, recovery exercises at many organizations are limited to desktop plan reviews, a minimal examination of true recovery capability.
- Customer Audits are filling the inbox of the BCM Office and lowering staff productivity. The sheer number and diversity of questions require management to spend hours completing these audits and reviewing them with the customer. We strongly recommend that our clients build a Customer Audit process to streamline the work, ensure consistency in responses, minimize the opportunity for unauthorized information to be disclosed and take less time.
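The gap between RTOs and Recovery Time Actuals (RTAs) mentioned in the trends above is simple arithmetic: for each business activity, subtract the RTO from the RTA and look at the cases where the result is positive. The sketch below illustrates that calculation only; the `Activity` type, the activity names and the hour values are hypothetical and do not come from any client data.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Activity:
    name: str         # hypothetical business activity
    rto_hours: float  # Recovery Time Objective requested by the business
    rta_hours: float  # Recovery Time Actual, e.g. observed in an exercise


def recovery_gaps(activities: List[Activity]) -> List[Tuple[str, float]]:
    """Return (name, gap in hours) where the RTA exceeds the RTO, largest gap first."""
    gaps = [
        (a.name, a.rta_hours - a.rto_hours)
        for a in activities
        if a.rta_hours > a.rto_hours
    ]
    return sorted(gaps, key=lambda item: item[1], reverse=True)


# Illustrative values only.
activities = [
    Activity("Payments processing", rto_hours=1.0, rta_hours=6.0),
    Activity("Customer call center", rto_hours=4.0, rta_hours=3.0),
    Activity("Data warehouse reporting", rto_hours=24.0, rta_hours=36.0),
]

for name, gap in recovery_gaps(activities):
    print(f"{name}: recovery took {gap:.1f} hours longer than the RTO")
```

Sorting by absolute hours is only one possible choice; a one-hour RTO missed by five hours may matter more to the business than a 24-hour RTO missed by twelve, so a ratio or a criticality weighting could be a better fit for alignment discussions.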
Overall, 2013 was a good year for BCM. Companies are continuing to recognize the need for BCM in their environments. I was reminded by our Director of Operations that BCM is still a relatively new field and we are still figuring out how to make it a refined, streamlined process.
Happy New Year to You from MHA Consulting
Exercises and testing can consist of talking through recovery actions or physically recovering things. Testing can be discussion-based or operations-based. There are several different kinds of testing, each categorized by its complexity, the set-up involved and the number of participants needed.
- Standalone Testing – the person who authored the plan reviews it with someone who has a similar technical background (e.g., a manager or backup support person). It is useful for catching omissions in the plan and can also provide insight into the process for the backup support person.
- Integrated System Testing – occurs when all components of an IT system are recovered from scratch. This type of testing can reveal many of the interfaces between IT systems required to recover a specific IT function.
- Table-Top Exercises – these simulate a disaster, but the response to it is conducted in a conference room. A disaster scenario is provided and participants work through the problem. This is similar to walk-through testing, except that the team responds to an incident scenario.
- Simulation Exercises – these take a table-top exercise one step further and include the actual recovery site and equipment. A simulation is the closest that a company can come to experiencing (and learning from) a real disaster. Simulations provide numerous dimensions that most recovery plan tests never explore. They are time consuming and expensive to conduct.
The Emergency Operations Center (EOC) should be located as close to the problem site as is safe. If you knew where and when a disaster would strike, you would take steps to prevent it. Therefore, unless you’re the cause of the problem, you don’t know where it will be. When establishing an EOC, evaluate possible sites based on a few criteria. Because very few companies can afford to leave a fully equipped room sitting idle until needed, most companies convert an existing facility to an EOC when needed. Often, with a bit of rearranging and some additions, a room that is already wired for data and equipped with computers can turn into an EOC.
A typical center is between 500 and 2000 square feet and should have a large closet to hold supplies for set up. It should also be close to a building exit. It must be easily accessible by road and have ready access to delivery services, food service, and hotels. Other things to keep in mind when setting up an EOC are the power source and the telephone company. Both should be served by different companies than the central office. This way, your primary EOC can become a back-up EOC if you have another facility in a nearby city or town.
A few options for an EOC are a personal computer training room, a large conference room with wiring, or a hotel wired for PC training that has sufficient outbound telecommunications capacity.
A note on using a backup EOC to control recovery operations: expect to relocate closer to the disaster site within 48 hours, as it will quickly become unwieldy to control operations from a distance. However, for the first few hours, even a remote facility will be extremely valuable.
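The site criteria above can be turned into a rough screening checklist for candidate rooms. This is a minimal sketch under stated assumptions: the `CandidateSite` fields, the site names and the company names are hypothetical, the 500 to 2000 square foot range is treated as a typical size rather than a hard requirement, and softer criteria such as nearby hotels and food service are left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CandidateSite:
    name: str                 # hypothetical room or facility name
    square_feet: int
    has_supply_closet: bool   # large closet for set-up supplies
    near_building_exit: bool
    road_accessible: bool
    power_company: str
    telephone_company: str


def screen_site(site: CandidateSite, office_power: str, office_telephone: str) -> List[str]:
    """Return a list of findings; an empty list means no concerns were flagged."""
    findings = []
    if not 500 <= site.square_feet <= 2000:
        findings.append("size outside the typical 500-2000 sq ft range")
    if not site.has_supply_closet:
        findings.append("no large closet for set-up supplies")
    if not site.near_building_exit:
        findings.append("not close to a building exit")
    if not site.road_accessible:
        findings.append("not easily accessible by road")
    if site.power_company == office_power:
        findings.append("shares a power company with the central office")
    if site.telephone_company == office_telephone:
        findings.append("shares a telephone company with the central office")
    return findings


# Illustrative candidates only.
training_room = CandidateSite("PC training room", 1200, True, True, True, "Power Co B", "Telco B")
conference_room = CandidateSite("Large conference room", 400, False, True, True, "Power Co A", "Telco B")

for site in (training_room, conference_room):
    findings = screen_site(site, office_power="Power Co A", office_telephone="Telco A")
    print(site.name, "->", findings or "no concerns flagged")
```

A list of findings, rather than a single pass/fail flag, keeps the judgment with the planner: a room that is slightly too small but otherwise ideal may still be the best available option.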
A disaster scenario is a hypothetical incident that gives participants a problem to work through. The scenario may describe any disruption to the normal flow of business. When selecting a scenario, be sure to make it one that is realistic as well as broad enough to include several teams to test intergroup communications. Also, make sure the final solution is achievable. The following is a list of potential testing scenarios.
- Natural Disasters
  - Hurricane/heavy winds and rain
- Civil Crises
  - Labor strike
  - Workplace violence
  - Serious supplier disruption
  - Terrorist target neighbor (judiciary, military, federal, or diplomatic buildings)
  - Limited or no property access
- Location Threats
  - Nearby major highway, railway, pipeline
  - Hazardous neighbor
  - Offices above 12th floor (limit of fire ladders)
  - Major political event
- Network/Information Security Issues
  - Computer virus
  - Hackers stealing data
  - Data communication failure
- Data Operations Threats
  - Roof collapse
  - Broken water pipe in room above data center
  - Fire in data center
  - Critical IT equipment failure
  - Telecommunications failure
  - Power failure
  - Service provider failure