The past week has certainly given us a lot to talk about in IT. The global outage caused by the CrowdStrike update thankfully had minimal impact on NYU as an institution, though some individuals may have experienced issues. Still, I think there are some lessons we can learn from it.
Follow Maintenance Procedures
First, don’t put too much faith in the “non-impacting” label when doing maintenance and updates. Many changes are non-impacting until they aren’t. NYU IT has established processes and maintenance windows to make sure we do everything we can to avoid a significant outage. If you know them, follow them; if you don’t, now is a good time to learn.
Make Disaster Recovery Plans
Second, you can do everything right to avoid a disaster, but sometimes the disaster comes for you anyway. This is true with weather, and this is true with IT. We rely on many different partners, and we trust that our partners are being diligent. Sometimes, though, something happens that is outside of our control. This is why it’s important to have good emergency communication and a well-tested but agile disaster recovery plan, since few disasters unfold exactly as planned. Make sure people know the plan. Make sure it’s stress-tested and updated regularly. And if you have a plan A, it doesn’t hurt to develop plans B and C.
Don’t Skip Updates
Finally, don’t let the news of this bad update become an excuse for not running your own updates and keeping systems patched. What happened last week was an extraordinary case. Run your updates responsibly but DO run them. Build this practice into a routine maintenance schedule. I guarantee you it’s better than dealing with the fallout from a successful ransomware attack.