400 Servers In Under 4 Hours

In the immortal words of Forrest Gump, “it happens.” In technology, truer words might never have been spoken.

But when “it” does happen, Azzaron’s team of system admins and network of support personnel work very hard to ensure “it” has little impact on end users.

Responding to a critical outage of its entire storage infrastructure, Azzaron engineers were able to shut down, repair, and restore over 400 virtual servers in under four hours!  The coordination involved the installation and subsequent upgrade of an entirely new storage rack system and a full upgrade of Azzaron’s switch infrastructure.

On Monday night, December 26th, Azzaron experienced a catastrophic hardware failure affecting 60 hard drives at its primary datacenter in Phoenix. Azzaron employs numerous redundancies for several different types of disasters. One eventuality that was not accounted for was literally the failure of the metal case that holds the SAN (Storage Area Network) drives. That failure resulted in immediate downtime for many of Azzaron’s customers, who currently span 15 states. Azzaron’s team responded by recovering most primary services and, thanks to a robust backup and disaster recovery plan, no data was lost.

Working with support teams from around the world, Azzaron engineers determined that the enclosure itself had failed. Since Azzaron keeps all of its storage hardware contracts current, a replacement unit was sent out for early next-day delivery and a plan was formulated. (That plan was thwarted by a FedEx airplane mechanical failure that grounded our parts, but Azzaron customers could still work.)

The parts came in none too soon, as additional failures of the dying system rocked the network all day Thursday.

On Thursday night, December 29th, the work started.  Taking care to preserve physical drives and data, techs replaced the hefty storage enclosure; Azzaron employees literally did the heavy lifting of these gigantic units.  In under four hours, more than 400 servers were cleanly shut down and brought back online for Friday’s business day.

Google, Microsoft, Amazon, Dyn, SalesForce, and others all experienced significant service outages in 2016, in some cases with data loss.1

Hardware will fail.  “It” will happen to all businesses.  When it does, you need an infrastructure partner with the skill, expertise, rapid response, and personal touch of Azzaron.

 

1 - http://www.crn.com/slide-shows/cloud/300081477/the-10-biggest-cloud-outages-of-2016-so-far.htm
