If your data recovery plans are lengthy, detailed, and/or “bare metal” based, requiring comprehensive operating system, database, and application recovery steps, then they are almost certainly out of date and not functional. If that is the case, you should revisit your recovery strategy and ensure that it meets your business needs (that is a topic for a different blog post). Even if your plans are not based on “bare metal” recovery, they are probably not functional.
With current technologies (e.g., virtual servers, virtual storage, storage-based replication, application-based replication, disk-to-disk backup), data recovery plans should be very different from what they were even 10 years ago, when these technologies were first becoming common.
To make your data recovery plans functional, ensure that the following items are included:
- Detailed dependencies (servers, storage, or specific applications).
  - Don’t assume everything will be up when it is needed. Also consider support and third-party dependencies.
- Proprietary information, such as the amount of storage used or the specific scripts needed.
  - Remember, contractors or secondary resources may be performing tasks, and they will not be as familiar with your environments.
- Detailed execution steps (even though there may be few steps).
  - Even when the technical steps seem automated, there are often scripts or steps that still need to be executed manually. Screenshots can also help, because these tasks are not performed daily.
- Technical validation steps.
  - Just because the tool shows “green” does not mean everything is really available.
  - Communication between systems (DB connections, FTP, APIs, etc.).
- Data validation and synchronization steps.
  - Many organizations also assume that data will be synchronized between environments. This may be the case, but depending on the replication schedule, the data may not be fully current. Validate that replication occurred at the scheduled time and without errors.
  - Include validation, rerun, or correction of interfaces between systems.
- Application validation steps.
  - It is much better to find issues during validation than to discover data or performance problems when the entire organization tries to access an application.
- Performance expectations.
  - Understanding how the environment will run, including any known performance impacts, will help with communication and troubleshooting.
- Changes to the processing schedule.
  - Determine whether the interfacing or processing schedule will need to be adjusted because of the recovery order of systems or recovery issues.
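
As an illustration of the technical validation and data validation items above, here is a minimal Python sketch of two checks a plan might script: whether each dependency actually accepts a connection (rather than trusting a “green” dashboard), and whether the last successful replication falls within the expected schedule. The names `Dependency`, `failed_dependencies`, and `replication_is_current`, along with any hosts, ports, and lag thresholds you pass in, are hypothetical placeholders, not part of any particular tool. How you obtain the last replication timestamp depends on your replication product and is assumed to be supplied by you.

```python
import socket
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Callable, List, Optional


@dataclass
class Dependency:
    """One entry from the plan's dependency list (hypothetical structure)."""
    name: str   # label used in the recovery plan, e.g. "orders-db"
    host: str   # hostname or IP in the recovery environment
    port: int   # TCP port the service should be listening on


def check_dependency(dep: Dependency, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the dependency can be opened.

    This only proves the port is reachable; it does not prove the
    application behind it is healthy.
    """
    try:
        with socket.create_connection((dep.host, dep.port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


def failed_dependencies(
    deps: List[Dependency],
    probe: Callable[[Dependency], bool] = check_dependency,
) -> List[str]:
    """Return the names of dependencies whose probe did not succeed."""
    return [dep.name for dep in deps if not probe(dep)]


def replication_is_current(
    last_success: datetime,
    max_lag: timedelta,
    now: Optional[datetime] = None,
) -> bool:
    """Return True if the last successful replication is within max_lag.

    last_success must be timezone-aware; now defaults to the current UTC time.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now - last_success <= max_lag
```

In a recovery test, you might call `failed_dependencies` with the dependency list from the plan and treat any returned name as a blocker before moving on to application validation. A reachable port is only a first-level check, so it complements the application and data validation steps rather than replacing them.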
This post on data recovery plans has been updated from a previous version published in November 2010.