Social Network For Security Executives: Network, Learn & Collaborate
Part 14 of 15: System Recovery Capabilities
What Is It? Often viewed as a purely technical capability, being able to recover your systems to operational capacity is imperative. Your systems are the heart and soul of your enterprise and central to your mission and ability to deliver what your clients pay you for. Whether a database server, piece of switchgear, workstation, or mobile device, the ability to recover it to a usable state and preserve the data is often overshadowed by the daily grind of using these systems.
Consider system recovery capabilities like a good insurance policy that must be there when you need it most. Lately, the flurry of ransomware attacks and other compromises attributed to breaches and malware has made it essential to get back on your feet quickly and carry on. The ever-present threat of equipment failure and other non-malicious events, such as loss of an office and accidental destruction of systems and data, proves system recovery is not a nice-to-have, but a must-have.
Some examples include virtualisation that leverages system snapshots to rapidly recover a system and mitigate the impact of a malware attack, migrating to a cloud-based “as a service” environment to take advantage of a service provider’s capabilities, deployment of your SOE from trusted sources to minimise time spent restoring remote systems, and enterprise mobility leveraging virtual desktops from mobile devices and laptops. Anything that gets you back on track and working quickly can be considered.
Some may view this as part of DR/BCP (which it is), discussed in a previous article, but this is a bit more specific, smaller-scale, and agile, focusing on anything from a single system to an entire office if need be. If you lose a file server, you shouldn’t have to invoke the full DR/BCP, but only a small subset (or even a separate plan) to get it back up.
By the way, don’t overlook vendor support (onsite or remote) to help repair and replace damaged computers and network devices such as switches, routers and IP-based telephones. Sometimes systems are lost beyond repair and must be replaced. Support agreements are worth their weight in gold, especially when it comes to critical, physical hardware!
So, remember that the next time you let the smoke out of a router and your local shop has run out of electrical smoke to replace it :)
Where Do I Start? A good inventory of all systems and services is a good place to start, mapping out the dependencies between them all. Sometimes one server can take out a dozen services or a single switch can bring an entire network to its knees. Know what you have, what it does, and what depends on it. There are some cool mapping applications out there that can discover and map your entire network and even monitor devices to alert you to failures. They might be worth a look.
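To make the dependency idea concrete, here is a minimal Python sketch. The inventory, the device names, and the `impacted_by` helper are all made up for illustration; a real inventory would come from your own records or a discovery tool. It simply walks the map to list everything that goes down, directly or indirectly, when one item fails.

```python
from collections import deque

# Hypothetical inventory: each item lists what it depends on.
DEPENDS_ON = {
    "core-switch": [],
    "file-server": ["core-switch"],
    "db-server": ["core-switch"],
    "finance-app": ["db-server", "file-server"],
    "email": ["core-switch"],
    "help-desk-portal": ["db-server"],
}


def impacted_by(failed, depends_on):
    """Return every item that directly or indirectly depends on `failed`."""
    # Invert the map so we can ask "who relies on this item?"
    dependents = {name: [] for name in depends_on}
    for name, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(name)

    # Breadth-first walk outward from the failed item.
    impacted, queue = set(), deque([failed])
    while queue:
        current = queue.popleft()
        for child in dependents.get(current, []):
            if child not in impacted:
                impacted.add(child)
                queue.append(child)
    return sorted(impacted)


if __name__ == "__main__":
    print(impacted_by("db-server", DEPENDS_ON))
    print(impacted_by("core-switch", DEPENDS_ON))
```

In this made-up example, losing the database server takes out two applications, while losing the core switch takes out everything else on the list, which is exactly the kind of surprise you want to find on paper rather than in production.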
In the case of other services that may continue running and not flag an error (such as one hit by malware which continues running but in a crippled state) you may need to rely on secondary detection of incidents and events. Of course, there is always the phone call to the help desk where a user says, “The Internet is down!” or “My computer is doing funny things” that could alert you to a potential reason to perform a system restore. Intel gathered from logs and alerts, such as from a SIEM or even manually, may provide the detail you need so you can troubleshoot, recover, and resume operations.
While doing your inventory, double-check that everything is up to date – operating systems, applications, AV signatures, etc. Check the firmware on your devices as well. Check with your vendors and service providers to understand the support in place for each item: none, expiring soon, good for x more months, device is end-of-life, and so on. Your vendors and service providers can be your best friends when you need a system replaced and restored, so be nice to them!
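If you keep support expiry dates in your inventory, sorting items into those buckets is easy to automate. The sketch below is only an illustration: the `support_status` function, the 90-day warning window, and the rough 30-day month are my own assumptions, not a standard.

```python
from datetime import date


def support_status(expiry, today, warn_days=90, end_of_life=False):
    """Classify a device's support position.

    `expiry` is the support end date, or None if there is no agreement.
    `warn_days` is an assumed threshold for "expiring soon".
    """
    if end_of_life:
        return "end-of-life"
    if expiry is None or expiry < today:
        return "none"
    days_left = (expiry - today).days
    if days_left <= warn_days:
        return f"expiring soon ({days_left} days)"
    # Approximate months as 30 days; good enough for a planning report.
    return f"good for {days_left // 30} more months"


if __name__ == "__main__":
    today = date(2025, 1, 1)
    print(support_status(date(2025, 3, 1), today))
    print(support_status(date(2026, 1, 1), today))
    print(support_status(None, today))
```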
With an understanding of what you have, whether it’s supported, and a way to tell when there is an issue, you need to revisit your backup and recovery scheme for these systems. If you have been following our recommendations, you already do daily backups of critical data – it’s in the Essential Eight, after all. What about the operating system and device configurations? You should have a plan in place for each of these, either as part of your DR/BCP or individually, depending on how you do things. These plans must articulate how to back systems up, how to recover systems, and what to do when things don’t go as planned.
Back up everything to a reliable location: not just data, but also the configuration of computers, mobile devices, network devices, and peripherals. Oh yes – PLEASE test your ability to restore and recover! It needs to work, and when the network is down because a router fell over is not the time to test your plans!
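One simple way to test a restore is to recover to a scratch location and compare it, file by file, against the original. Here is a rough Python sketch of that check using SHA-256 hashes from the standard library. The `verify_restore` helper and the directory layout are assumptions for illustration; it only proves the files match, not that the restored system actually boots and runs.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large files do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(source_dir, restored_dir):
    """Return a list of files that are missing or differ in the restored copy."""
    source_dir, restored_dir = Path(source_dir), Path(restored_dir)
    problems = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        restored = restored_dir / rel
        if not restored.is_file():
            problems.append(f"missing: {rel}")
        elif sha256_of(src) != sha256_of(restored):
            problems.append(f"changed: {rel}")
    return problems
```

An empty list means every file in the source was restored intact. Anything else is worth chasing down before you need that backup for real.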
For the recoveries that can’t be done via software and configuration, you may want to keep some spares around, either new in boxes or refurbished, or have an agreement with a vendor for replacement. You can set the response level according to criticality, from 1-hour on-site to next business day to best effort. Everything needs to work together, and you can’t always count on your local distributor to have parts or systems in stock. THIS is why planning and support are so critical.
How Do I Make It Work? As you’ve probably guessed, planning is critical. Use that system inventory to figure out what you have, what you need, whether it’s supported, how you back it up and recover it, how you will monitor your systems for events that dictate recovery, who’s going to support it, and all points in between. Test, rinse, repeat!
Another good exercise is figuring out the odds and probabilities of different events and what is needed to recover from each of them. Restoring a server after a malware attack is a lot different than recovering the server from a failed hard drive or accidentally deleted files. There can be knock-on effects that impact other systems, or it can be completely self-contained. Be ready for nearly anything that might require system recovery.
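A back-of-the-envelope way to run this exercise is to multiply how often you expect an event in a year by how many hours it takes to recover, then rank the results. The Python sketch below does just that. Every figure in it is invented for illustration, and the `rank_scenarios` helper is hypothetical; plug in estimates from your own environment.

```python
def rank_scenarios(scenarios):
    """Rank scenarios by expected downtime per year.

    Each scenario is (name, expected events per year, recovery hours).
    Expected downtime = events per year x recovery hours.
    """
    ranked = [
        (name, events_per_year * recovery_hours)
        for name, events_per_year, recovery_hours in scenarios
    ]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Made-up estimates, for illustration only.
SCENARIOS = [
    ("failed hard drive", 0.25, 4),
    ("accidentally deleted files", 2.0, 1),
    ("malware on server", 0.125, 24),
]

if __name__ == "__main__":
    for name, hours in rank_scenarios(SCENARIOS):
        print(f"{name}: {hours} expected hours of downtime per year")
```

With these invented numbers the rare malware event still tops the list because recovery takes so long, which is a handy reminder that frequency alone does not tell you where to focus.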
Pitfalls? Adopting a one-size-fits-all approach is a disaster waiting to happen. Larger enterprises often issue singular plans without understanding the nuances of their various operations. Recovering a finance system is far different than recovering an email server, so we can never assume that restoring servers, workstations, mobile devices, laptops and so on is the same regardless. If you decide it’s acceptable to take 8 hours to recover from a failed hard drive in a laptop, good luck explaining that to your CEO, even if Jimmy in the mail room thinks that’s fine. Understanding the needs of your business, the criticality of the systems being restored, and who uses them is just as important as the technical elements.
Ghosts in the Machine? Be sure that the backups you rely on for system recovery, and your “trusted source”, are actually clean and reliable. I’ve seen systems get backed up that were already infected and, when recovered, resolved nothing. In one case, the infected version had been around just long enough that previous clean versions were no longer available, necessitating a lengthy and expensive system recovery and rebuild. Also, be wary of single points of failure. Having that layer of redundancy is gold while you work to restore the failed system.
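The retention trap is easy to check for once you have a list of the backup dates you still hold and a best guess at when the infection started. This small Python sketch, with a hypothetical `last_clean_backup` helper and made-up dates, finds the newest backup taken before that date, or tells you there isn't one.

```python
from datetime import date


def last_clean_backup(backup_dates, infected_on):
    """Return the most recent backup taken before `infected_on`, or None.

    Assumes any backup taken on or after the infection date is suspect.
    """
    clean = [taken for taken in backup_dates if taken < infected_on]
    return max(clean) if clean else None


if __name__ == "__main__":
    # Made-up weekly backups still held under the retention policy.
    held = [date(2024, 5, 1), date(2024, 5, 8), date(2024, 5, 15)]
    print(last_clean_backup(held, date(2024, 5, 10)))
    print(last_clean_backup(held, date(2024, 4, 20)))
```

If the answer comes back as None, your retention period is shorter than the time it took to notice the problem, and that is the gap to close.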
Anything Missing? Plan and test. Plan and test. Have a backup plan and test. Have a backup plan for the backup plan and test. You MUST be sure you can recover when the chips are down and it all comes off the rails. Often, these individual pieces are part of a larger DR/BCP and you need to be able to recover your systems.
Psssst…….one more thing…….keep your documentation up to date!
Disclaimer: The thoughts and opinions presented on this blog are my own and not those of any associated third party. The content is provided for general information, educational, and entertainment purposes and does not constitute legal advice or recommendations; it must not be relied upon as such. Appropriate legal advice should be obtained in actual situations. All images, unless otherwise credited, are licensed through ShutterStock