Everyone knows that failures are simply a part of the IT world, and with them come the risk and reality of major loss and high costs. Below you will learn the keys to improving performance and preventing failures in your backup and restore processes.
Because of the reality of Murphy’s Law, a backup success rate higher than 98% is rarely achievable. Human error, power failures, and network failures are often the culprits. Did you know that several companies maintain success rates of 80% or LESS? Those who boast of 100% success are probably viewing an isolated moment or have altered various data points. In the end, this does not serve anybody well, as trust is lost when the system and spreadsheet data don’t match.
Below you will see the major concerns of backup professionals with respect to their backup success rates.
The Process of Monitoring
The Concern: Many IT environments keep growing larger, and the number of systems is often much greater than originally planned for. As a result, many add-ons are required to monitor systems successfully. Backup failures are therefore harder to spot and understand, which in turn creates an opportunity for more failures in the future.
The Solution: A graphical user interface is needed that allows IT departments to view the backup environment as a WHOLE, along with each separate server and client. This system should also combine data automatically and work across multiple vendors.
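To make the idea of a combined view concrete, here is a minimal sketch of how success rates could be rolled up for the whole environment and for each server. It assumes the job results from every vendor have already been normalized into one record format; the field names (vendor, server, client, ok) and the sample data are hypothetical, not taken from any particular backup product.

```python
from collections import defaultdict

# Hypothetical job records, already normalized from different backup products.
JOBS = [
    {"vendor": "vendor_a", "server": "bk01", "client": "db1", "ok": True},
    {"vendor": "vendor_a", "server": "bk01", "client": "web1", "ok": False},
    {"vendor": "vendor_b", "server": "bk02", "client": "mail1", "ok": True},
    {"vendor": "vendor_b", "server": "bk02", "client": "file1", "ok": True},
]


def success_rates(jobs, key):
    """Return {group: percent of successful jobs}, grouped by the given field."""
    totals = defaultdict(int)
    passed = defaultdict(int)
    for job in jobs:
        totals[job[key]] += 1
        passed[job[key]] += job["ok"]  # True counts as 1, False as 0
    return {group: 100.0 * passed[group] / totals[group] for group in totals}


def overall_rate(jobs):
    """Success rate for the whole environment, as a percentage."""
    if not jobs:
        return 0.0
    return 100.0 * sum(job["ok"] for job in jobs) / len(jobs)
```

Calling success_rates(JOBS, "server") gives the per-server picture, success_rates(JOBS, "client") the per-client one, and overall_rate(JOBS) the environment as a whole. A real tool would feed these numbers into graphs rather than print them.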
The Concern: Protection alerts are often sent by email to specific administrators, and there is concern that changes over time WILL cause these alerts to be missed. Discipline and detailed instructions are needed to ensure the appropriate person gets each message. Timely receipt of alerts allows for prompt investigation, which is essential in the world of backup.
The Solution: Proper tools must be used that alert command centers and administrators in real time through email, SNMP integration, and SMS. With tools such as these, the appropriate person can directly address error conditions in the backup system. They also make it possible to quickly generate detailed information about the failure, which is certainly a time saver!
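As a rough illustration of multi-channel alerting, the sketch below routes an alert to email, SNMP, and SMS depending on its severity. The severity names, the routing table, and the sender functions are all assumptions made for this example; real delivery would go through a mail server, an SNMP trap sender, and an SMS gateway, which are replaced here by stubs that only record what would have been sent.

```python
# Hypothetical routing table: severity -> channels to notify.
ROUTES = {
    "critical": ["email", "snmp", "sms"],
    "warning": ["email", "snmp"],
    "info": ["email"],
}


def route_alert(severity, message, senders, routes=ROUTES):
    """Send message over every channel configured for severity.

    Returns the list of channels the message was actually handed to.
    """
    channels = routes.get(severity, ["email"])  # unknown severities fall back to email
    delivered = []
    for channel in channels:
        send = senders.get(channel)
        if send is None:
            continue  # channel is not configured in this environment
        send(message)
        delivered.append(channel)
    return delivered


# Stub senders that only record messages instead of delivering them.
outbox = []
SENDERS = {
    name: (lambda msg, name=name: outbox.append((name, msg)))
    for name in ("email", "snmp", "sms")
}
```

Because routing lives in one table instead of individual mailboxes, a staffing change means editing a single entry, which is exactly the kind of drift the concern above describes.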
Command Line Driven Operation is Prone to Errors
The Concern: Even though the command line interface is the first choice for most administrators who want to finish something quickly, it makes consistent backup operations hard to maintain. As staff changes over time, so do operations. It is important to implement best practices for these procedures.
The Solution: A user interface must be added to the backup system so that backup operations can be run from a GUI. The error-prone command line should not be the standard way of working.
Reports and Planning are NOT Given Enough Time
The Concern: Administrators usually focus on one report: the one for whichever system sent an alert last. Even though this is important, it is equally important to create reports on alerts, trends, and forecasting. It is also important to note that data is constantly being removed from the distributed servers to make space for incoming data. Information explaining why a backup failed could be in the data that was removed, causing the analysis to take much longer.
The Solution: Data must be collected from the backup servers and stored in separate, dedicated databases. This way, the history needed for reporting is kept, and the daily tasks involved in backup operations will not be disrupted.
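One way to picture this is a small history database that sits apart from the backup servers and receives a copy of each job record before the server purges it. The sketch below uses Python's built-in sqlite3 module; the table name, the columns, and the status values are assumptions for illustration and do not reflect any specific product's schema.

```python
import sqlite3


def open_history(path=":memory:"):
    """Open (or create) the separate reporting database."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS job_history ("
        " server TEXT, client TEXT, finished TEXT, status TEXT, message TEXT,"
        " UNIQUE (server, client, finished))"
    )
    return conn


def archive_jobs(conn, rows):
    """Copy job records into the history table, skipping ones already stored."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR IGNORE INTO job_history VALUES (?, ?, ?, ?, ?)", rows
        )


def failures_by_server(conn):
    """Return (server, failed_job_count) pairs, ordered by server name."""
    cur = conn.execute(
        "SELECT server, COUNT(*) FROM job_history"
        " WHERE status = 'failed' GROUP BY server ORDER BY server"
    )
    return cur.fetchall()
```

Since the reports query this copy, trend and failure analysis can run at any time without touching the production backup servers, and a purge on those servers no longer erases the evidence.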
The Concern: Misconfiguration of backup and recovery systems can cause various challenges, and often happens as data and servers grow. Below you will see some common challenges caused by misconfiguration.
-Recovery Logs Not Sized Properly: Backup information passes through a recovery log on its way to a database. To make room for new data, the recovery log is purged. Enlarging the file space of the recovery log is a manual step that includes a restart, and the log is not available until that step is complete. Problems arise when a failure takes place while the log is not available.
-Disk to Tape Error: Where tape is still commonly used, writing from backup disk to tape is a routine occurrence. A small disk pool can cause delays in backup and a missed backup window. Technology is needed that allows more than a single thread from disk to tape. Speed is also an issue: if tape cannot empty the disk pool as fast as backup data arrives, the pool fills up and refuses further backup data.
-Multiple Backup Sessions Occurring: It is easy for the number of clients to exceed what is acceptable for a given backup system, especially in new environments. This overload of clients can also cause missed backup windows.
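The disk pool problem in the list above comes down to simple arithmetic, sketched here under stated assumptions: data arrives in the pool at a steady rate, tape drains it at a steady rate, and all figures (pool size, rates, window length) are made-up inputs rather than measurements from a real system.

```python
def hours_until_pool_full(pool_gb, used_gb, ingest_gb_per_h, drain_gb_per_h):
    """Hours until the disk pool fills up.

    Returns None when tape drains the pool at least as fast as data arrives,
    because in that case the pool never fills.
    """
    net_growth = ingest_gb_per_h - drain_gb_per_h
    if net_growth <= 0:
        return None
    return (pool_gb - used_gb) / net_growth


def window_at_risk(pool_gb, used_gb, ingest_gb_per_h, drain_gb_per_h, window_h):
    """True if the pool would fill before the backup window ends."""
    hours = hours_until_pool_full(pool_gb, used_gb, ingest_gb_per_h, drain_gb_per_h)
    return hours is not None and hours < window_h
```

For example, a 2000 GB pool with 500 GB already used, taking in 400 GB per hour while tape drains 250 GB per hour, has 1500 GB of room and grows by 150 GB per hour, so it fills in 10 hours. That is safe for an 8-hour window but not for a 12-hour one. Adding clients raises the ingest rate, which is how too many sessions and an undersized pool end up causing the same missed window.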
The Solution: Nowadays, better monitoring systems make it easier for IT departments to watch their backup environments closely. These systems allow for quick identification of errors and help teams keep up with changing environments. The truth is, however, that mistakes DO happen, and while it would be lovely to set up a backup system and walk away in confidence, it is important that backup software is paired with a good monitoring system. This way, success is more likely.
So…Is it a Matter of the Arts or the Sciences?
The answer is…BOTH! Backup environments are fragile spheres, depending on rates of failure, capacity, speed and timing. There is a precise science behind this world that is under continual pressure and variation. But let’s not forget that managing this scientific sphere is also an art, relying heavily upon tools which project trends and upcoming challenges. The art of backup administration needs to be passed on carefully, as there is a steep learning curve when this knowledge is transferred. Those who have high success rates can attribute some of it to the effective art of combining monitoring and reporting tools with their backup software. With these monitoring and reporting systems, the backup sphere is carefully watched, which helps ensure high completion rates for backup and restore processes.
About the guest author:
Mike Johnson is a Technical Writer for Rocket Software. He writes on topics like data protection, backup monitoring and reporting software. He holds a Bachelor of Science Management degree from DeVry University.