A Crash Course in Network Reliability
November 12, 2010
Late last month, the popular e-commerce service PayPal was knocked offline for approximately 80 minutes by what’s commonly referred to in the industry as a “crash.” As PayPal puts it, “a network hardware failure in one of our data centers resulted in a service interruption for all PayPal users worldwide.” Following the initial hardware failure, the standard backup procedure (in this case, routing service to a separate data center facility in Denver, CO) also ran into difficulties before the site was up and running again.
When you consider the scale at which PayPal operates, with 87 million active accounts worldwide, it’s easy to see why such an outage sends ripples across the IT industry. Yet businesses of all sizes deal with network reliability and troubleshooting issues on a daily basis. While the specific nature of PayPal’s incident has yet to be revealed, it is a good reminder of the basic ways in which you can safeguard your data-rich network infrastructure. Outages like this can and will happen, but you can take steps to minimize their impact, including:
- Power Management – ensure the safe and reliable distribution of power to the data center using Power Distribution Units (PDUs) capable of providing real-time feedback
- Cable Management – poor cabling in the data center can lead to a loss in the integrity of the data being transmitted – it pays to plan ahead
- Environmental Monitoring – closely monitor and oversee the data center’s real-time atmospheric conditions, including temperature, humidity, smoke detection and more
- Security – everything from data center room motion detection to keyed cabinet locks and more helps guard against tampering with your data or sabotage of your network
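To make the environmental monitoring item a little more concrete, here is a minimal Python sketch of a threshold check on a single sensor reading. Everything in it is an assumption for illustration: the field names (`temperature_c`, `humidity_pct`, `smoke_detected`), the `check_reading` helper, and the limits themselves are hypothetical placeholders, not vendor or standards guidance. A real deployment would pull readings from whatever sensors or PDUs you actually run and use limits appropriate to your equipment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limit:
    """Acceptable range for one measured quantity."""
    low: float
    high: float


# Hypothetical limits, chosen only to make the example concrete.
LIMITS = {
    "temperature_c": Limit(18.0, 27.0),
    "humidity_pct": Limit(20.0, 80.0),
}


def check_reading(reading: dict) -> list[str]:
    """Return a list of alert messages for one sensor reading."""
    alerts = []
    for name, limit in LIMITS.items():
        value = reading.get(name)
        if value is None:
            # A missing value is itself worth flagging: the sensor may be down.
            alerts.append(f"{name}: no reading")
        elif not (limit.low <= value <= limit.high):
            alerts.append(f"{name}: {value} outside {limit.low}-{limit.high}")
    if reading.get("smoke_detected"):
        alerts.append("smoke detected")
    return alerts


if __name__ == "__main__":
    sample = {"temperature_c": 31.5, "humidity_pct": 45.0, "smoke_detected": False}
    for message in check_reading(sample):
        print(message)
```

The point of the sketch is simply that real-time feedback only helps if something compares it against limits and raises a flag; how those alerts reach a person (email, pager, dashboard) is left out here.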
There’s no denying that as the volume of data in our digital lives continues to grow at a rapid rate, factors both in and out of our control will affect how usable and accessible that information is. Understanding what we can do before and after an outage is critical to keeping our data centers performing at their best – after all, we’re all in this together!