(This is an update to a post I made about admitting your mistakes, which you can read here!)
As some of you who read this may be aware, I recently posted on Twitter about an exceptionally bad day I had at work. I had a MAJOR screw-up. This wasn’t the kind of screw-up you can brush under the carpet and hope no one notices. This was big.
We run WSUS to update our machines. In WSUS, we have specific computer groups – we have ours separated by their environment: development, testing or production. This makes it easier for us to deploy to just our dev boxes, then our test boxes and (finally) when we’re happy that the updates aren’t going to break anything, we can deploy to our production machines.
I was removing some computers from WSUS as they were being decommissioned. For anyone who has used WSUS or is still using it, let this be a warning. I was stupid. I didn’t read the dialogue box that popped up. So when I selected the machines and clicked on the little red ‘X’, I blindly clicked through, thinking I was removing just these two servers.
Oh how wrong I was.
That stupid dialogue box wasn’t asking if it was okay to delete the servers. That stupid dialogue box was asking me if I wanted to delete the GROUP the servers were in or just the servers themselves.
Bet you can guess which one I clicked on blindly.
So I watched every single production server (all 400 and something of them) disappear from my WSUS box. To make matters worse, this was on Wednesday…in the middle of production patching week.
My timing couldn’t have been worse.
Someone told me not to worry, not to tell anyone, and that it would be fine. This didn’t sit well with me. I’m very much a “share and care” kinda gal, and I knew I was going to need help if I was to get all of these servers reporting back to our WSUS server before the end of the day, so that the ones due to receive updates the following morning would still get them.
So I did what any self-respecting admin would do…had a mini-teary at my desk, worked out what needed doing, then told the people I knew could help what I’d done.
The support I got was absolutely amazing – three of my colleagues immediately volunteered to help add these servers back in. For those who are unaware, this involves running a simple script, but it has to be done on the server…so someone has to jump onto each server and run this. Or so I thought.
Another colleague, who came in late to this whole debacle, looked at me, disappeared for about 30-45 minutes, then came over and showed me a script he’d written. This script was a psexec script (how could I have forgotten psexec in my time of need?!?!) that ran the command on every machine in a specific domain. So instead of logging into 400+ machines, I only had to log into about 6 and run this script.
I swear, if I wasn’t already engaged, I would’ve said I’d marry him at that point. Gratitude++. I was ecstatic.
Ran the script – sure enough a few failures, but logging onto 20-30 machines is far better than logging into 400+. And I couldn’t have done it without the help of my colleagues. And I wouldn’t have had that help if I hadn’t owned up to my incredible stupidity and admitted I’d screwed up.
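For the curious, here is a rough sketch of the general idea in Python. To be clear, this is not my colleague’s actual script: the function names are made up for illustration, and you pass in whatever command re-registers a machine with your WSUS server. All it assumes is that psexec is on your PATH and that you have rights on the target machines. It runs one command per host via psexec and hands back the hosts that failed, which is the list you end up logging into by hand.

```python
import subprocess
from typing import Callable, Iterable, List, Sequence


def build_psexec_command(host: str, remote_command: Sequence[str]) -> List[str]:
    """Build the argument list that runs remote_command on host via PsExec."""
    # PsExec addresses a remote machine as \\hostname.
    return ["psexec", rf"\\{host}", *remote_command]


def run_everywhere(
    hosts: Iterable[str],
    remote_command: Sequence[str],
    runner: Callable[[List[str]], int] = subprocess.call,
) -> List[str]:
    """Run remote_command on every host and return the hosts that failed.

    runner is injectable so the loop can be dry-run without touching
    real servers; by default it is subprocess.call, which returns the
    exit code of the process it launched.
    """
    failed: List[str] = []
    for host in hosts:
        try:
            exit_code = runner(build_psexec_command(host, remote_command))
        except OSError:
            # psexec itself could not be started (e.g. not installed).
            exit_code = -1
        if exit_code != 0:
            failed.append(host)
    return failed
```

You would call something like `run_everywhere(hosts, command)` with your own host list and command, and anything that comes back in the returned list is a machine to visit manually. Passing a fake `runner` first is a cheap way to check the command lines before pointing it at production (ask me how I learned to double-check things).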
Silver linings do come out of things like this – it turns out that we hadn’t cleaned out our WSUS server for quite some time, and had approximately 100 stale computer accounts lying dormant there. So it gave us an opportunity to clear them out. Though I wouldn’t suggest doing it in the same spectacular fashion that I did.