On Wednesday, various computer glitches resulted in the cancellation of more than 800 United Airlines flights and delays for hundreds of others, the closing of the New York Stock Exchange for roughly four hours, and similar problems at the Wall Street Journal. While three such occurrences in one day appear to be unprecedented, it should hardly come as any great surprise.
Our technological world runs on computers, and almost every business of any size is hampered, if not brought to a halt, by anything that crashes its computer systems or cuts its online/internet communications. Equally to the point, all too many businesses do not have back-up plans or systems because: (a) they can’t or don’t want to spend the money; (b) back-ups aren’t technically feasible, usually because alternative access to the internet isn’t available; or (c) no one even considered the necessity.
Most of these problems come back to money. As I noted before, several years ago a single backhoe in the wrong place knocked out fiber-optic internet access for much of Southern Utah for more than a day. Did anyone even think about a parallel line? Hardly.
Then add to this the continual changes and upgrades to computer systems. Some companies can barely keep up with upgrading one system, and if the back-up systems aren’t upgraded and maintained as well, they’ll soon be useless.
And what about all that data? Is it backed up and stored elsewhere? Just how reliable and accessible will it be if internet connections are disrupted? Yet if it’s kept onsite, that’s a different vulnerability. What tends to be either overlooked or minimized is that the world wide web not only maximizes opportunities but also maximizes vulnerabilities, and minimizing those vulnerabilities takes time, resources, and money… and those measures don’t always work, as Wednesday’s events just proved.
Good computer systems can multiply advantages, but those systems are nowhere near as cost-saving as too many individuals and businesses seem to think. Cyber cheap is courting disaster… but I’d wager that lesson will be lost on too many CIOs and corporate managements.
IT resilience is indeed an issue, and one where the expense is rarely easy to justify. “Guaranteed 99% uptime” means you are permitted 87.6 hours of downtime per year before penalties apply (1% of 365×24 = 8,760 hours). Move this to 99.9% and you are down to 8.76 hours. The IT and associated costs escalate dramatically to achieve this. Even then, being without service for nearly nine hours a year can be annoying.
Achieving 99.99% (under an hour of downtime a year) means hot standby computers, physically diverse comms routes, and so on, plus lots of testing. Very expensive indeed, especially when, as you say, even one component in the end-to-end chain needs to be upgraded.
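To make the arithmetic concrete, here is a back-of-the-envelope sketch (plain Python; the three tiers are just the ones mentioned above, not anyone’s actual SLA terms):

```python
# Annual downtime budget implied by an uptime guarantee.
HOURS_PER_YEAR = 365 * 24  # 8,760

for uptime_pct in (99.0, 99.9, 99.99):
    downtime_h = HOURS_PER_YEAR * (1 - uptime_pct / 100)
    print(f"{uptime_pct}% uptime allows {downtime_h:.2f} hours "
          f"(~{downtime_h * 60:.0f} minutes) of downtime per year")
```

Each added nine divides the downtime budget by ten, while the cost of the hot standbys and diverse routing needed to meet it rises much faster than that.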
In your example of parallel fibre routes, the cost of a diversely routed cable could never be justified by small-business and domestic use alone. Today’s comms technology (e.g. SONET/SDH) does, however, permit automatic rerouting at high bandwidths; performance can suffer, since all traffic is squeezed onto the one surviving route, but at least the service stays operational. If the whole of Southern Utah had no alternative route, then someone decided to keep costs down and risk isolating half a state. That did surprise me.
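For anyone unfamiliar with the term, the “alternative routing” above just means automatic failover: the moment the working route dies, traffic moves to the standby. A toy sketch of that decision (plain Python; the link names and health flag are made up for illustration, and real SONET/SDH protection switching is done in hardware, typically within about 50 ms):

```python
# Toy illustration of the idea behind protection switching: traffic rides
# the working route while it is healthy and fails over to the standby
# route otherwise. Names and the health flag are invented for illustration.

class Link:
    def __init__(self, name: str, healthy: bool = True):
        self.name = name
        self.healthy = healthy

def select_route(working: Link, protection: Link) -> Link:
    """Return the route traffic should use right now."""
    if working.healthy:
        return working
    if protection.healthy:
        return protection  # degraded: all traffic on one route, but operational
    raise RuntimeError("both routes down: the backhoe scenario")

primary = Link("fibre via route A")
standby = Link("fibre via route B")

primary.healthy = False  # a backhoe cuts route A
print("sending traffic over:", select_route(primary, standby).name)
```

The degraded single-route state is slower but alive, and it only exists if someone paid for route B in the first place.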
Only the military and major financial institutions such as stock exchanges can justify and afford high resilience. We domestic users need to recognize that cost equation unless we want to pay lots more for guaranteed service.
With five computers at home, plus a couple of mobile devices, running various OSes and versions, I find myself updating regularly. I know my way around all of them pretty well, and I’ve made it fairly easy for myself (reducing the number of OS versions would help, but there are reasons other than just laziness not to); but if I fall more than a couple of weeks behind on something, it’s a mess. For instance, one of my Macs can dual-boot Windows, for the rare occasions I need it enough to put up with it. But as rare as those are, the first thing I end up doing is spending at least an hour updating it.
That’s all _very_ easy compared to enterprise-level situations. I don’t know how private industry fares, but in government, if it’s operational, it’s probably “legacy” and not sexy compared to whatever isn’t yet ready to replace it, and there’s little budget for either the tools or the expertise to manage it all effectively. It does help that there are more open-source tools available for managing large numbers of systems; but open source is NOT truly free, because it’s more labor-intensive and does not include support (unless you pay for that).
There was some discussion back in March that the open-source NTP (Network Time Protocol) software, essential to much of the Internet, was in danger because its primary maintainer couldn’t afford to keep working on it unpaid. Someone from Apple allegedly said they couldn’t pay because of the difficulty of explaining to the accountants why you’d pay for something that’s free. Of course, it’s cheaper to pay the existing experts than to build your own expertise, and if the maintainer had withheld Apple-specific fixes pending payment, they could probably have justified it.
It’s a strange world, and people (as an unwarranted courtesy, including lawyers, accountants, and bureaucrats in that category) need to adapt. 🙂