Ed Toner

Plan to Fail(over)

I remember being asked a few years ago, “What are you going to do if we need to fail over to our disaster recovery site?” My quick but not very politically correct answer was, “I will be working on my resume.” We did not have a plan to fail over. We had not provisioned the disaster recovery environment to replicate our production environment, and it would have taken days to build it out. In short, we lacked a plan.

As with most companies, we had a secure copy of our data backed up nightly at an offsite disaster recovery site. Many people (yes, CIOs too) mistakenly believe that routine data backups have them covered in the event of an outage or disaster. Data backup and disaster recovery are not the same. Backing up your data without having a recovery environment that reflects your production environment is almost the same as not backing it up at all. The servers, operating systems, external and internal connections, storage, and so on must be in place. The people, processes, and tools necessary to restore and recover successfully are often an afterthought that, unfortunately, you must deal with for the first time in the middle of an outage or disaster. Failing to deploy appropriate high availability strategies for critical applications results in significant loss, whether in the cost of downtime, loss of data, or damage to the organization’s reputation and customer experience. In the past two years we have read about outages that lasted days. The Delta Air Lines and British Airways data center outages lasted nearly three days each and cost more than $100 million in lost revenue.


Business Continuity and Data Availability

The OCIO decided early in my tenure to eliminate its disaster recovery site in favor of two active data centers. There were several reasons for this decision, the most important being availability. A secondary reason was that the existing disaster recovery site was not a realistic option in the event of an actual outage or disaster. The main purpose of a disaster recovery site is to recover quickly so that the business continues to operate with minimal impact on availability for our customers. While it may be desirable to recover all applications as quickly as possible, the recovery process has to be prioritized so that the most critical applications and services come first. This is in line with the Application Portfolio project I discussed in another blog post.

With a primary focus on availability, a traditional disaster recovery site is no longer good enough. Critical applications, such as those hosted on the State’s mainframe, must be accessible at all times for the State to serve its citizens efficiently. The State of Nebraska addressed this by implementing a design that allows its mainframe to move between the primary site in Lincoln and the secondary site in Omaha in case of a failure at either site. This was tested successfully on Saturday (4/28), with the assistance of testing teams across the State. Rather than waiting for a disaster to occur, the State proactively moved between sites to ensure the citizens of Nebraska have access to business-critical applications at all times, with limited interruptions or downtime.

Making certain that critical applications can fail over to a secondary location means we are protecting our business from data loss. This solution provides real-time replication of all data to a separate recovery site, along with point-in-time recovery, which enables us to restore an earlier version of our customers’ data. Delivering application high availability across interdependent systems is increasingly difficult. Designing an application for continuous availability begins with network architecture, which the State of Nebraska redesigned as part of its consolidation efforts. The new configuration allows the State to deliver high availability services for citizens in the dynamic and agile 21st century.
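For readers who want to see the idea behind point-in-time recovery, here is a minimal sketch in Python. It is only an illustration of the concept, not a description of the State's actual replication product: the `VersionedStore` class, its method names, and the sample dates are all hypothetical. The store keeps every write along with its timestamp, so a read can ask for the value as it stood at any earlier moment.

```python
from bisect import bisect_right
from datetime import datetime


class VersionedStore:
    """Hypothetical in-memory store that keeps every version of every key."""

    def __init__(self):
        # key -> (sorted list of write times, list of values in the same order)
        self._history = {}

    def write(self, key, value, at):
        """Record a new version of `key` written at time `at`."""
        times, values = self._history.setdefault(key, ([], []))
        if times and at < times[-1]:
            raise ValueError("writes must arrive in time order")
        times.append(at)
        values.append(value)

    def read_as_of(self, key, at):
        """Return the value of `key` as it stood at time `at`, or None."""
        times, values = self._history.get(key, ([], []))
        # Index of the first write strictly after `at`; the version
        # just before it is the one that was current at `at`.
        idx = bisect_right(times, at)
        if idx == 0:
            return None  # nothing had been written yet
        return values[idx - 1]


# Illustrative usage with made-up data.
store = VersionedStore()
store.write("record", "original", datetime(2020, 1, 1, 9, 0))
store.write("record", "corrupted", datetime(2020, 1, 2, 9, 0))

# Recover the record as it was just before the bad write.
restored = store.read_as_of("record", datetime(2020, 1, 2, 8, 59))
```

A nightly backup can only return the last saved copy. Keeping a timestamped history is what makes it possible to step back to a chosen moment before the data went bad.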

Thank You

Thanks to the Agency and City Teams involved on Saturday:

Nebraska Department of Transportation ● Department of Administrative Services ● State Patrol
Department of Labor ● Department of Motor Vehicles ● Department of Correctional Services
Department of Health and Human Services ● Department of Revenue ● City of Lincoln ● The Office of the CIO

As always, I appreciate all you do for the citizens of Nebraska. A special thanks to all the teams that supported our efforts on Saturday.
Ed
