RTO and RPO: Making It Simple

RTO and RPO are two of the most important concepts in business continuity and IT disaster recovery. Today’s post will explain what they are, why they matter, and how to use them, illustrating their use with straightforward examples.

Two Critical Concepts

The concepts of recovery time objective (RTO) and recovery point objective (RPO) are critical in developing a solid business continuity management (BCM) and IT disaster recovery (IT/DR) program. Let’s define them:

Recovery time objective (RTO)

Relates to business processes and their supporting applications. The maximum length of time that a business process and its associated applications can be unavailable following a disruption before the impact on the organization becomes unacceptable.

Example: If a company’s RTO for its online customer sales process is four hours, it means that the organization must recover and resume its online storefront within four hours of a disruption.

Recovery point objective (RPO)

Relates to technical processes. The RPO for a given process is the maximum amount of its data, measured in time, that can be lost in an outage and then recreated manually once the process is restored.

Example: If a company has an RPO of one hour for its customer database, it means that after a disruption, the organization can only afford to lose up to one hour’s worth of data, and the recovery process must restore the data to a state that is no more than one hour old from the time of the disruption.
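The two examples above can be expressed as simple time comparisons. The following is a minimal sketch, not a prescribed method: the function names `meets_rto` and `meets_rpo`, and all of the timestamps, are hypothetical and chosen only to mirror the four-hour storefront RTO and the one-hour customer database RPO.

```python
from datetime import datetime, timedelta


def meets_rto(disrupted_at: datetime, restored_at: datetime, rto: timedelta) -> bool:
    """Return True if the process resumed within its RTO."""
    return restored_at - disrupted_at <= rto


def meets_rpo(last_recovery_point: datetime, disrupted_at: datetime, rpo: timedelta) -> bool:
    """Return True if the newest restorable data is no older than the RPO."""
    return disrupted_at - last_recovery_point <= rpo


# Hypothetical disruption time, used for both checks.
disrupted_at = datetime(2024, 3, 1, 14, 0)

# Storefront restored 3.5 hours later, checked against a 4-hour RTO.
storefront_ok = meets_rto(disrupted_at, datetime(2024, 3, 1, 17, 30), timedelta(hours=4))

# Newest usable database copy is 1 hour 45 minutes old, checked against a 1-hour RPO.
database_ok = meets_rpo(datetime(2024, 3, 1, 12, 15), disrupted_at, timedelta(hours=1))

print(storefront_ok, database_ok)
```

In this hypothetical outage the storefront meets its RTO, but the customer database misses its RPO because the most recent recovery point is older than one hour.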

Both RTO and RPO are essential components of an organization’s business continuity and disaster recovery planning. They help determine the necessary strategies, resources, and technologies required to ensure the continuity of critical business functions and minimize the impact of disruptions.

The organization determines the RTOs of its key business processes and their supporting applications through an analysis of its needs. It determines the RPOs of its key technical processes through an analysis of its capabilities. (See below for a more in-depth discussion of how RTOs and RPOs are determined.)

Any gaps between the RTO and RPOs relating to an essential business process must be addressed by business continuity plans and strategies.

The Long and Short of It

A process can have a short RTO and a long RPO or vice versa. Alternatively, both the RTO and RPO can be short, or both can be long.

The following examples illustrate these possibilities:

Accounting: Long RTO, short RPO. In most organizations, general ledger (GL) accounting is a business process with a fairly long RTO, typically several days. This is because, if the accounting process is disrupted, it usually takes several days before the outage has a serious impact. However, the RPO of the technical and data side of the accounting function is very short. It might be four hours, but it could be as short as zero. This is because it is virtually impossible to recreate accounting data after the fact.

Public-facing website: Short RTO, long RPO. This is your typical Company.com website providing basic information to the public. These sites typically have a short RTO because if the site goes dark it can immediately attract negative attention and undermine the company’s reputation. However, the RPO for the site is generally fairly long (e.g., 24 hours or more) because the information on such sites tends to be relatively static and any updates that are lost can be recreated fairly easily.

Storefront website: Short RTO, short RPO. The company site that takes orders, tracks stock, and so on. This function has a short RTO because when such a site goes down, a meaningful impact on the company’s revenues and reputation can begin almost immediately. The function has a short RPO because the information in the system changes quickly and there’s no way to recreate it if lost.

Policy and standards oversight: Long RTO, long RPO. This process is important over the long term, but an outage of a few days is unlikely to have a serious impact on the organization. Hence the long RTO. And while policies and standards do change from time to time, the rate of updating is generally slow, and up to 24 hours of lost data could most likely be recreated with little difficulty. This means the technical and data processes pertaining to this area will have a long RPO.
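The four combinations above can be sketched in a few lines of code. This is an illustration only: the article does not define "short" and "long" numerically, so the cutoffs `SHORT_RTO` and `SHORT_RPO`, the `profile` function, and the sample values for each process are all assumptions loosely based on the figures mentioned in the examples.

```python
from datetime import timedelta

# Assumed cutoffs: the article gives no numeric definition of "short".
SHORT_RTO = timedelta(hours=8)
SHORT_RPO = timedelta(hours=4)


def profile(rto: timedelta, rpo: timedelta) -> str:
    """Label a process by whether its RTO and RPO are short or long."""
    rto_label = "Short" if rto <= SHORT_RTO else "Long"
    rpo_label = "short" if rpo <= SHORT_RPO else "long"
    return f"{rto_label} RTO, {rpo_label} RPO"


# Hypothetical (RTO, RPO) values for the four example processes.
examples = {
    "Accounting": (timedelta(days=3), timedelta(hours=4)),
    "Public-facing website": (timedelta(hours=4), timedelta(hours=24)),
    "Storefront website": (timedelta(hours=4), timedelta(hours=1)),
    "Policy and standards oversight": (timedelta(days=5), timedelta(hours=24)),
}

for name, (rto, rpo) in examples.items():
    print(f"{name}: {profile(rto, rpo)}")
```

The point of the sketch is that the two objectives are set independently: knowing one tells you nothing about the other.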

RTOs and RPOs in Practice

The RTO for a given business process and its supporting applications is arrived at through an analysis of the company’s overall operations and prioritization by staff. The question to ask in determining an RTO is, how long can the process be down before the impact on the company becomes unacceptable?

The RPO for a given application is determined by identifying how much data from the application the staff could manually recreate. As mentioned previously, this is measured in terms of time (e.g., up to two hours’ worth, up to eight hours’ worth, and so on).

Manually recovering the data means recreating it by various methods such as reproducing it from memory, locating it in other applications or in hard copy, or contacting customers and asking them to resubmit their orders.

Knowing the RTOs and RPOs for the processes and technologies used across your organization helps you understand how much protection each process and its supporting technology needs. Knowing these metrics helps ensure that your strategies, implementation, and plans are neither overly aggressive (wasting resources) nor inadequate (providing insufficient protection).
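One way to picture the "overly aggressive" versus "inadequate" distinction is to compare each objective with what the current recovery strategy can actually deliver. The sketch below is an assumption-laden illustration: the `assess` function and the `slack` fraction, which decides when a capability counts as far more than is needed, are hypothetical and not part of any standard.

```python
from datetime import timedelta


def assess(objective: timedelta, capability: timedelta, slack: float = 0.25) -> str:
    """Compare an RTO or RPO with the time the current strategy achieves.

    `slack` is an assumed fraction: a capability below objective * slack
    is treated as spending more on protection than the objective requires.
    """
    if capability > objective:
        return "inadequate"
    if capability < objective * slack:
        return "overly aggressive"
    return "aligned"


# Hypothetical figures for three processes.
print(assess(timedelta(hours=72), timedelta(hours=1)))   # far faster than needed
print(assess(timedelta(hours=4), timedelta(hours=12)))   # slower than the objective
print(assess(timedelta(hours=24), timedelta(hours=20)))  # within the objective
```

An organization would pick its own threshold for what counts as excessive; the 25 percent default here is only a placeholder.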

Devising Your Categories

Every organization must devise its own scale of RTO and RPO categories. It is best to limit the number of categories to around five or six. More can be a maintenance nightmare.

The following is a scale of RTOs that we have seen work well for many organizations:

   
RTO 0 Immediate/high availability
RTO 1 < 8 hours
RTO 2 < 24 hours
RTO 3 < 72 hours
RTO 4 < 5 days
RTO 5 > 5 days

 

And here is a scale of RPOs that many organizations have used successfully:

   
RPO 0 Zero data loss
RPO 1 < 4 hours of data loss
RPO 2 < 12 hours
RPO 3 < 24 hours
RPO 4 > 24 hours
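Both scales above can be encoded as lookup tables so that a target duration is assigned to a category consistently. This is a minimal sketch under stated assumptions: the helper names (`categorize`, `rto_category`, `rpo_category`) are hypothetical, the RPO categories are labeled RPO 0 through RPO 4, and a target that falls exactly on a boundary (for example, exactly 5 days or exactly 24 hours of data loss) is placed in the next category up, since the scales use strict "<" bounds and do not say where such values belong.

```python
from datetime import timedelta

# (label, exclusive upper bound) pairs taken from the two scales.
RTO_SCALE = [
    ("RTO 1", timedelta(hours=8)),
    ("RTO 2", timedelta(hours=24)),
    ("RTO 3", timedelta(hours=72)),
    ("RTO 4", timedelta(days=5)),
]
RPO_SCALE = [
    ("RPO 1", timedelta(hours=4)),
    ("RPO 2", timedelta(hours=12)),
    ("RPO 3", timedelta(hours=24)),
]


def categorize(target: timedelta, scale, zero_label: str, overflow_label: str) -> str:
    """Return the first category whose upper bound the target is under."""
    if target <= timedelta(0):
        return zero_label  # immediate recovery or zero data loss
    for label, upper_bound in scale:
        if target < upper_bound:
            return label
    return overflow_label  # beyond the last bound (boundary handling is assumed)


def rto_category(target: timedelta) -> str:
    return categorize(target, RTO_SCALE, "RTO 0", "RTO 5")


def rpo_category(target: timedelta) -> str:
    return categorize(target, RPO_SCALE, "RPO 0", "RPO 4")


print(rto_category(timedelta(hours=4)), rpo_category(timedelta(hours=1)))
```

Keeping the bounds in one place also makes the periodic review easier: if the organization revises its scale, only the tables change.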

 

Once a company devises its categories, each of its key business processes is analyzed and placed into an RTO category and an RPO category. These designations guide the subsequent development of the company’s recovery plans and strategies.

Determining RTOs and RPOs

How does a company go about determining the RTO and RPO categories for its processes and applications?

The BCM office should develop proposed categories for RTOs and RPOs based on the organization’s known risks and needs. The IT team can be a good place to start: the BCM team should make note of the recovery times IT uses for its current protection and recovery strategies. Using those values as a baseline, the BCM office can make adjustments based on discussions with management about the general timeframes within which departments would need to be recovered.

After the categories are defined, the organization should perform a Business Impact Analysis (BIA). Making the best choices depends on factoring in information and insights held at many different levels of the organization.

The proposed RTOs and RPOs for each process should emerge from the BIA. Once defined, those proposals should be submitted to upper management for review.

Throughout this process, the BCM office has the job of educating others, facilitating the discussion, seeking consensus, and obtaining the necessary approvals.

Every organization should review its RTOs and RPOs on a regular basis. This is because organizations and the environment change. A company that has outgrown its recovery plan has no recovery plan. It is critical that RTOs and RPOs be kept up to date.

Program Cornerstones

RTOs indicate how soon after a disruption a given business process and its supporting applications must be restored to prevent an unacceptable impact to the organization. RPOs are a metric of how much data from a given technical process, as measured in time, can be manually recovered in the event of an outage.

RTOs and RPOs for key processes and technologies are typically determined through a collaborative process led by the BCM team and calling on the judgment and expertise of people from across the organization. Once determined, the two types of objectives become cornerstones of the organization’s business continuity and IT/DR program.

Richard Long is one of MHA’s practice team leaders for Technology and Disaster Recovery related engagements. He has been responsible for the successful execution of MHA business continuity and disaster recovery engagements in industries such as Energy & Utilities, Government Services, Healthcare, Insurance, Risk Management, Travel & Entertainment, Consumer Products, and Education. Prior to joining MHA, Richard held Senior IT Director positions at PetSmart (NASDAQ: PETM) and Avnet, Inc. (NYSE: AVT) and has been a senior leader across all disciplines of IT. He has successfully led international and domestic disaster recovery, technology assessment, crisis management and risk mitigation engagements.


© 2024 · MHA Consulting. All Rights Reserved.
