The cost of not having an IT disaster recovery team can range from being unable to recover from a disruption, to overspending. In today’s post we’ll look at the right way to build an IT/DR team, focusing on what roles need to be represented to ensure that the organization’s various computing services each receive an appropriate level of protection.
The Price of Neglecting IT/DR
Being a business continuity consultant can be frustrating. We know what organizations should do to protect themselves from disruptions; however, many of our clients try to get by with doing the minimum or nothing at all. Many companies pay a price for this approach, but the confidential nature of our work means we must keep their stories to ourselves.
This means that in the blog when I talk about negative consequences, I’m limited to saying, “Believe me, I’ve seen it over and over again: Companies that neglect business continuity planning frequently live to regret it.”
This is especially true of IT disaster recovery planning (IT/DR), the aspect of business continuity that is concerned with the protection and recovery of IT systems, data, and applications. Organizations have never been more dependent on IT than they are now, but at the same time the threats to our computing systems—whether from cyberattack, supply-chain problems, global disruption, or other challenges—have never been greater.
Inadequate or nonexistent IT/DR planning is a problem that rears its ugly head everywhere we go. The cost can range from grossly over-architecting IT/DR capacity, to neglecting it, to being unable to recover after a catastrophic event. When organizations lack a comprehensive planning team that meets regularly and makes a concerted effort to identify and close the gaps between the needs of the business units and the capabilities of IT, IT has no clue about what it needs to do in recovery and when. All too often, when something happens, the result is delayed recovery, possibly no recovery, and finger-pointing as each side blames the other for the meltdown.
Building a Solid Disaster Recovery Team
Intelligent IT/DR planning is a must, and the starting point is a strong disaster recovery team. Let’s look at how to build such a team, focusing on the roles that need to be represented to ensure that the most important IT processes are identified and appropriate levels of protection put in place.
Your IT disaster recovery planning team should consist of the following:
Management Steering Committee. This group is made up of executives who oversee the process at a high level. They may not technically need a seat at the table, but they should be standing in the room. These leaders play an important role when it comes to approvals for things like budget, policy, strategic direction, and overcoming roadblocks or intradepartmental issues.
Disaster Recovery Coordinator. The disaster recovery coordinator is an individual usually from the IT department who manages the overall recovery in the event of a disruption. Typically a member of the emergency management team, the DR coordinator is responsible for setting recovery plans in motion and coordinating efforts as they progress. The coordinator also helps resolve problems encountered and removes roadblocks that might slow the process down.
Business Continuity Representative. The BC professional on the team ensures that IT recovery plans align with business needs. Business needs are determined by a Business Impact Analysis (BIA) completed before disaster recovery planning begins. The BIA, whether formal or informal, is critical to DR. The BIA identifies the business processes whose interruption would cause the greatest impact to the organization, providing critical guidance to the DR effort. The BC representative bridges the gap between business and IT to ensure that critical business needs will be met through IT recovery plans and that any gaps in alignment are addressed. The BC person additionally contributes knowledge of essentials such as crisis management, how to report information during an event, contact lists for key personnel, information on vendors, and so on, helping ensure a smooth, effective recovery process.
IT Infrastructure Experts. These team members do the lion’s share of the recovery work. They are responsible for identifying strategies and solutions that will recover critical operations in their areas of expertise, then implementing and testing them to ensure they work. The strategies they design must meet the requirements for critical business units as outlined in the BIA. The team should include three IT infrastructure experts, one from each of the following areas:
- Servers, Storage, and Databases. Almost all technology runs on some type of server. The person in charge of this area should be intimately familiar with the server and operating system infrastructure, with the backup or replication technologies needed to meet the recovery needs, and with the implications of using physical versus virtual environments. Regarding storage, data protection or replication is a critical recovery component; in fact, it is often the major component of the recovery strategy and capability. In most organizations, the storage used in the processing environment is not completely local to the servers (whether physical servers or the server running the virtual environment). With regard to database administration, databases house the data that applications depend on and represent an architecture unto themselves. Databases may be shared across applications or run on individual or shared servers. Depending on the organizational structure, database administration may be part of the infrastructure or application team.
- Networks and Telecom. This is a critical role since no IT system can work without firewalls and connections to servers, storage, and so on. The expert overseeing this area should be intimately familiar with the organization’s network infrastructure and be able to take charge of recovery strategies related to it. Another reason for the importance of this role is that disruptions often affect voice communication infrastructure, making it difficult for employees to communicate with others internally and externally.
- IT Applications. Depending on the IT infrastructure recovery plans and the extent of the disruption, the individual(s) responsible for applications may play a greater or lesser role in recovery. This person’s duties include understanding (based on how the infrastructure team proposes to restore the environment) what additional application tasks may need to occur, e.g., changes to application configurations and settings, data consistency, or application integrations. This team member should work closely with the other infrastructure representatives to identify recovery steps and design an appropriate plan that meets the needs of the critical business units.
Cybersecurity Experts. This group is unbelievably important. These folks are responsible for implementing comprehensive information security procedures, policies, and systems to protect the organization. Their job is to ensure that the current environment—including technology staff, operating system, applications, infrastructures, server, network, and data backups—is protected and properly architected. They must also know how, if a disruption occurs, these components will look in the recovered environment. (Will the new environment be able to provide the same level of recoverability, availability, capacity, and security as the original?)
Core Services Users. Last but not least are the core services users, representatives of mission-critical departments that make heavy use of IT in conducting the essential functions of the organization, whether this is making widgets, providing services, caring for patients, or educating students. These people should be consulted regarding how they use IT, how long they could manage without it, and how long they could work manually, if necessary. The planning team should take this information into account as they develop the IT/DR plans, and the final plans should be the product of a conversation between the business units and IT. This is the best way to integrate the needs of the business units and IT, minimizing gaps and boosting the likelihood of the plan’s success.
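For teams that like to see the idea in concrete form, the comparison between what the business units can tolerate and what IT can currently deliver might be sketched roughly as follows. This is only an illustration under stated assumptions: the `Service` record, the `find_gaps` helper, the service names, and every hour figure are hypothetical and not drawn from any real engagement.

```python
from dataclasses import dataclass


@dataclass
class Service:
    name: str
    # How long the business unit says it could manage without the service
    # (hypothetical figure gathered from core services users).
    tolerable_downtime_hours: float
    # How long IT currently estimates it would take to restore the service
    # (hypothetical figure from the infrastructure experts).
    it_recovery_hours: float


def find_gaps(services):
    """Return (name, shortfall_hours) for services IT cannot restore
    within the business's tolerance, largest shortfall first."""
    gaps = [
        (s.name, s.it_recovery_hours - s.tolerable_downtime_hours)
        for s in services
        if s.it_recovery_hours > s.tolerable_downtime_hours
    ]
    return sorted(gaps, key=lambda gap: gap[1], reverse=True)


# Illustrative data only.
services = [
    Service("order entry", 4, 24),
    Service("email", 24, 8),
    Service("patient records", 2, 6),
]

for name, shortfall in find_gaps(services):
    print(f"{name}: recovery exceeds tolerance by {shortfall} hours")
```

A list like this is a starting point for the conversation between the business units and IT, not a substitute for it. A service with no shortfall on paper may also point to overspending on recovery capacity that the business does not need.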
Safeguarding Your Systems
It would be difficult to overstate the importance of building a robust IT disaster recovery team. Neglecting IT/DR planning can lead to consequences ranging from overspending on unnecessary infrastructure to being unable to recover from a disruption, with all the finger-pointing that typically ensues.
A well-structured disaster recovery team includes a management committee, an IT coordinator, business continuity representatives, infrastructure experts, cybersecurity specialists, and core services users. This mix ensures that the team can safeguard the organization’s computing systems by identifying critical IT processes, ensuring processes receive a level of protection commensurate with their importance, and closing any gaps that might exist between the needs of the business units and the capabilities of the IT team.