Who Does What: The Most Critical Job Roles in IT Disaster Recovery

Richard Long

There’s a lot more to IT Disaster Recovery than backups and recovery. In today’s post we’ll look at the IT/DR job functions that must be performed in order to ensure that any organization’s IT position is resilient and recoverable.

 


 

Arguably, the area of business continuity that is the subject of today’s post is misnamed. We use the term IT/Disaster Recovery, but it would be more accurate to call it IT/Disaster Resiliency. That’s because in today’s world, recovery is not enough. IT/DR is not just about recovering from outages; it’s also about preventing them.

Whatever term is used, there is no doubt that understanding the critical job roles in IT/DR can help a BC professional know what functions need to occur to protect an organization’s IT systems and data.

IT/DR is more than a technologist ensuring there is redundancy in place.

Below is a breakdown of the critical IT/DR roles.

It’s not correct to call them job titles because at smaller organizations, multiple roles are often performed by one person.

This list is organized from the ground up. The first roles given are important for every organization. Those that appear at the end will typically be most important at the largest companies.

Why is it so important to make sure your organization’s IT/DR situation is protected? We’ll cover that at the end of the post, after walking through the roles.

 

DR PROGRAM/PROCESS ANALYST

The IT/DR Program/Process Analyst coordinates with the business and technical teams on the planning and implementation of disaster recovery capabilities and processes. He or she coordinates the development and implementation of policies, procedures, plans, and programs to ensure the business is positioned to recover from a disaster.

The DR Program/Process Analyst monitors and reports on:

  • The organization’s compliance with regulatory or standards-based DR requirements.
  • Compliance with DR policies, documentation maintenance, and training.
  • Overall DR program state and capability.
  • Third-party compliance with contracted DR capabilities and documentation.

The Program/Process Analyst also participates in the Business Impact Analysis (BIA) process.

 

DR TESTING/RECOVERY ANALYST

The DR Testing/Recovery Analyst monitors and reports on the IT recovery state and the capabilities associated with the technical solutions. These capabilities include:

  • Virtual server protection
  • Data replication and backups
  • Network resiliency
  • Application-level resiliency

The Testing/Recovery Analyst is responsible for reporting on cloud DR processes and protections to ensure that SaaS, IaaS, PaaS, or other cloud solutions meet the defined Recovery Time Objectives and Recovery Point Objectives.
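As a rough illustration of the kind of check this reporting involves, the sketch below compares measured test results against the defined objectives and lists any misses. The service names, field names, and figures are hypothetical and chosen only for the example; a real program would pull them from its own test records.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class RecoveryResult:
    service: str                   # hypothetical service name
    rto: timedelta                 # Recovery Time Objective
    rpo: timedelta                 # Recovery Point Objective
    measured_recovery: timedelta   # time taken to restore service in the last test
    measured_data_loss: timedelta  # window of data lost in the last test


def find_gaps(results):
    """Return (service, reason) pairs for every missed objective."""
    gaps = []
    for r in results:
        if r.measured_recovery > r.rto:
            gaps.append((r.service, "RTO missed"))
        if r.measured_data_loss > r.rpo:
            gaps.append((r.service, "RPO missed"))
    return gaps


# Illustrative figures only (assumptions, not real measurements).
results = [
    RecoveryResult("payroll-saas", timedelta(hours=4), timedelta(hours=1),
                   timedelta(hours=3), timedelta(minutes=30)),
    RecoveryResult("orders-iaas", timedelta(hours=2), timedelta(minutes=15),
                   timedelta(hours=5), timedelta(minutes=10)),
]

for service, reason in find_gaps(results):
    print(f"{service}: {reason}")
```

With these sample figures, only the second service is flagged, because its measured recovery time exceeds its RTO. Each flagged item would become a gap for the analyst to report and follow up on.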

This role coordinates the development of plans to ensure that recovery meets the defined requirements, including those for security functions, data integrity, and application functionality.

The Testing/Recovery Analyst also assists in documentation gap remediation. 

Finally, the person in this role coordinates the DR test schedule, test type, scope, and planning; monitors each test and records the results, issues, and follow-up items; ensures that summary test reports are completed; and tracks follow-up and action items.

 

DR ARCHITECT

The DR Architect leads the technical team’s activities around system design and architecture and the implementation of systems and infrastructure to ensure the availability of critical applications in the event of a disaster affecting one of the primary data centers. The Architect also assists in the development and testing of disaster recovery plans and in discovering, tracking, and facilitating the remediation of recovery capability gaps.

In addition, the DR Architect provides guidance on and documents DR and resiliency strategy patterns and standards for current and new application or technology implementations.

 

DR PROGRAM MANAGER

The DR Program Manager oversees the overall IT/DR program. The Program Manager monitors DR capability and reports to senior management on the DR program and recovery capability. The Program Manager takes the lead in ensuring senior management understands the needs and risks associated with the current DR resiliency state and advocates for an appropriate level of resources. 

The Program Manager also oversees the detailed program, testing, recovery, and architecture responsibilities noted in the Analyst and Architect roles.

Finally, the DR Program Manager develops and continually updates an overall roadmap setting forth the organization’s priorities and needs and how they will be implemented.

 

IT/DR DIRECTOR

The IT/DR Director has responsibility for the overall DR program. The Director provides direct leadership for the IT/DR team and strategic direction for the DR program. The person in this role also performs a continuous review of DR program maturity, keeping an ongoing prioritized roadmap of the program.

In addition, the IT/DR Director chairs the DR steering committee, which is made up of members of the senior leadership team. The Director informs the committee about the status, needs, and risks of the program and provides advice.

 

THE IMPORTANCE OF HAVING A SOUND IT/DR PROGRAM

Making sure these roles are performed capably is not a nice-to-have; it is a must-have for any organization that wants to have a resilient IT/DR program. And having a resilient IT/DR program is critical for any company that wants to protect itself and its stakeholders from potentially devastating impacts from adverse tech events.

Having a strong IT/DR program reduces the chances that the organization will experience a technology outage. It also protects the company if and when such outages occur. Companies with inadequate IT/DR programs run an increased risk of being hit with outages and of suffering severe impacts from them. Adverse tech events can bring an unprotected company to its knees. They can prevent it from carrying out its mission-critical functions, causing it to lose revenue and customers, damaging its brand reputation, and exposing it to adverse regulatory impacts.

That’s the main reason it is imperative for every serious organization to develop a strong IT/DR position.

Having a good IT/DR program also brings many supplemental benefits. These include:

  • Uncovering redundancies that can be eliminated, thus saving on costs.
  • Identifying process improvements.
  • Enhancing understanding of business processes and dependencies.
  • Increasing communication and understanding between the technology and functional teams, which can also benefit non-BC project implementations.

 

COVERING THE CRITICAL ROLES

To achieve a resilient IT/DR position, your organization must cover all of the critical IT/DR roles. Having a strong IT/DR program reduces your company’s chances of suffering a technology outage and provides crucial protection if such an outage does occur. Not having such a program exposes your organization to severe costs: interrupted processes, lost revenue and customers, reputational damage, and adverse regulatory impacts.

 


About
Richard Long is one of MHA’s practice team leaders for Technology and Disaster Recovery related engagements. He has been responsible for the successful execution of MHA business continuity and disaster recovery engagements in industries such as Energy & Utilities, Government Services, Healthcare, Insurance, Risk Management, Travel & Entertainment, Consumer Products, and Education. Prior to joining MHA, Richard held Senior IT Director positions at PetSmart (NASDAQ: PETM) and Avnet, Inc. (NYSE: AVT) and has been a senior leader across all disciplines of IT. He has successfully led international and domestic disaster recovery, technology assessment, crisis management and risk mitigation engagements.