Backup, DR and BCP for Dummies

In the first part of this blog we discussed some high-level concepts around backup, DR, DR planning and BCP. We also explained the use cases for each and how each part fits into the BCP ecosystem. In part two I would like to expand the discussion around each disaster scenario and provide readers with some considerations for each of them.

As discussed in part one, the three most common IT disaster scenarios we plan for are:

An event occurs that renders IT systems unavailable for a sustained period
An event occurs that renders the business’s main offices uninhabitable or unavailable
An event occurs that stops users connecting to IT resources and data repositories

The first scenario has traditionally been the one that most IT teams plan for when considering DR. Events such as crypto attacks, server failure or corruption, and natural disasters like the Christchurch earthquakes are common causes of IT system failure. One way to recover from these scenarios is to restore from backup. But are the downtime and the potential data loss from this approach acceptable to the business? If the answer to that question is “yes”, then robust and regularly tested backups are probably all your business requires in this scenario. If the answer is “no”, then the points below should be considered as part of your IT DR planning.

How robust is your physical hosting environment? Moving your infrastructure to a purpose-built data centre that has best-practice resilience for power and cooling, and has been strengthened to withstand natural events, is recommended.
Do you have enough capacity within your infrastructure to cope with the failure of a single piece of equipment? Moving to a cloud model such as vBridge’s IaaS platform helps ensure that your environment is not susceptible to single points of failure.
Do you have replication enabled within your environment? Cloud service providers such as vBridge can easily replicate your workloads to a different geographic destination and infrastructure stack. With the right planning around networking and connectivity, recovery can occur within minutes of a disaster occurring.
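
Two of the questions above come down to simple arithmetic, so a rough sketch may help frame the conversation with the business. The Python below is illustrative only: the names RecoveryProfile, meets_tolerance and survives_single_failure are made up for this example, and every figure in it is a hypothetical placeholder rather than a measurement of any real backup or replication service.

```python
from dataclasses import dataclass


@dataclass
class RecoveryProfile:
    """Worst-case figures for one recovery approach (all values in hours)."""
    name: str
    data_loss_hours: float  # gap between recovery points, i.e. worst-case data loss
    downtime_hours: float   # time needed to bring systems back after a failure


def meets_tolerance(profile: RecoveryProfile,
                    max_data_loss_hours: float,
                    max_downtime_hours: float) -> bool:
    """Return True if the approach stays within what the business can tolerate."""
    return (profile.data_loss_hours <= max_data_loss_hours
            and profile.downtime_hours <= max_downtime_hours)


def survives_single_failure(host_capacities: list[float],
                            required_capacity: float) -> bool:
    """Return True if the remaining hosts can carry the load after the largest one fails."""
    if not host_capacities:
        return False
    remaining = sum(host_capacities) - max(host_capacities)
    return remaining >= required_capacity


if __name__ == "__main__":
    # Hypothetical figures for illustration only.
    nightly_backup = RecoveryProfile("nightly backup", data_loss_hours=24, downtime_hours=8)
    replication = RecoveryProfile("replication", data_loss_hours=0.25, downtime_hours=0.5)

    # Assumed business tolerance: lose at most 1 hour of data, be down at most 4 hours.
    for option in (nightly_backup, replication):
        ok = meets_tolerance(option, max_data_loss_hours=1, max_downtime_hours=4)
        print(f"{option.name}: {'acceptable' if ok else 'not acceptable'}")

    # Three hosts of 64 units each, with a workload that needs 120 units.
    print("survives one host failure:", survives_single_failure([64, 64, 64], 120))
```

The useful part is not the code itself but the inputs: the tolerance figures have to come from the business, and the worst-case figures should come from an actual restore or failover test rather than an estimate.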

The second scenario above is a situation that we have all had to respond to within the last 18 months, with the advent of COVID-19 and the subsequent lockdowns that saw many of us working from home. There are many remote working solutions available, such as VPNs, Remote Desktop servers and virtual desktops. These solutions, although fit for purpose, are somewhat legacy, and it may be worth considering what a modern desktop experience would look like in your organisation. Services such as OneDrive, Teams, SharePoint Online and cloud-based SaaS applications can make remote working far simpler than some of the legacy methods described above.

Many organisations had to implement remote working solutions at pace at the time of the initial lockdown, and potentially did not plan the solution as well as they normally would have because the business needed it quickly. We therefore encourage IT teams to carry out a retrospective review of how their business’s solution performed over that period, and what could be improved if and when a similar event occurs in the future. One strong recommendation is to examine the security of your work from home solution and ensure you are applying the same security methodology to home users as you do to the head office infrastructure, as far as is practicable. Be aware that malicious actors know about the rise of work from home solutions and the potential security issues that accompany them, and they will target your work from home users.

The third common scenario is probably the simplest one to plan for, and it is all about your network and connectivity. Consider your network infrastructure and how resilient it is. It goes without saying that in this modern world of cloud-based infrastructure, external connectivity is more important than in the days when everything was onsite and accessible over the LAN. We are very fortunate here in New Zealand with the wealth of fibre we have in this country, and also with the price of UFB connectivity. These factors make resilient networking much easier than in countries that do not have our network infrastructure. Ensure you do everything you can to mitigate network outages. This includes redundant fibre to your key sites, but also consider other backup connectivity options, such as copper-based backup circuits, 4G/5G mobile connections and even a satellite-based connection such as Starlink. Also consider your on-premises network infrastructure and ensure the design is resilient and highly available.
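
To get a feel for why a backup circuit is worth having, the sketch below estimates the combined availability of several links when any one working link is enough to stay connected. It rests on a big assumption: that the links fail independently of each other, which will not hold if, for example, both circuits share the same duct or the same power supply. The function names and the availability figures are hypothetical and are not taken from any provider’s published numbers.

```python
from math import prod


def combined_availability(link_availabilities: list[float]) -> float:
    """Availability of a set of links where any single working link is sufficient.

    Assumes link failures are independent, so the chance that every link is
    down at once is the product of the individual unavailabilities.
    """
    for availability in link_availabilities:
        if not 0.0 <= availability <= 1.0:
            raise ValueError("availability must be between 0 and 1")
    return 1.0 - prod(1.0 - a for a in link_availabilities)


def expected_downtime_hours_per_year(availability: float) -> float:
    """Convert an availability fraction into expected hours of outage per year."""
    return (1.0 - availability) * 24 * 365


if __name__ == "__main__":
    # Hypothetical figures: a 99.9% fibre circuit and a 99% mobile backup.
    fibre_only = combined_availability([0.999])
    fibre_plus_mobile = combined_availability([0.999, 0.99])

    for label, value in (("fibre only", fibre_only),
                         ("fibre plus mobile backup", fibre_plus_mobile)):
        hours = expected_downtime_hours_per_year(value)
        print(f"{label}: {value:.5%} available, about {hours:.2f} hours down per year")
```

Under these assumed figures, adding the mobile backup takes the expected outage from roughly 8.8 hours a year to about five minutes. Real-world results will be worse wherever the links share a common point of failure, which is why diverse paths and diverse technologies matter.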

In summary, access to our data and systems is more important than it has ever been. Without access to systems and data, most businesses cannot survive for more than a short period. For IT leaders, your challenge is to plan what this looks like for your business and ensure you have the whole business’s buy-in for the plan. Implement and test your plan. Testing your DR systems builds “muscle memory” and will stand you in good stead to respond and deliver for your business when the pressure is on. As always, we at vBridge are happy to assist you in the planning and delivery of your DR solutions.