I survived the weekend… thus far!

So it has been quite the weekend. My original plan was to sleep from Friday night until about midday Saturday, do some cleaning around the apartment, rest some more while watching a movie, and then maybe hit up church with the rents. Sadly, it didn’t happen as I planned.

Friday Night 11pm

So I see servers start going down and coming back up. I dug into my email and, sure enough, I had overlooked a notice about some maintenance. It was supposed to be brief, so I thought nothing of it. It was nearly midnight on the west coast, and for most of my customers, who are non-stateside, some minor outages while networking goodies are tested/replaced would not be a big deal. There was a second note about a required reboot of a node I am on, but the last 2 reboots took no more than 30m total before I was back online 100%, so again, not huge outages. So even though I had missed sending out a notification, not a huge deal, right?

Saturday Morning 1am

I started noticing that a group of servers was not coming back online. I did some tests, monitoring from Pingdom confirmed it, and I contacted the provider to check in and see what the reason was for the extended outage. They assured me it was part of the plan:

We are aware of the intermittent issues with connectivity as part of the scheduled network maintenance and we are working to resolve them as soon as possible. You should see connectivity restored soon. I am sorry for any inconvenience that our scheduled maintenance may have caused you.

So thinking that “soon” would indeed be soon, I gave it some time.

Saturday 2am->3pm

So the journey continued. After giving “soon” an hour, tops, I got back in touch and was given the quick note of “I am working with our network team to provide you with some solid info.” Not much happened from there. The only downside was that they had a 6hr maintenance window, so there was not a ton I could do, and I started wishing I had noticed the email and let folks know. Around 4am some people started trying to track me down. One customer had even copied the Google Voice # I had slipped onto the website and gave me a call asking what was up. At that point there was not a lot to tell him, as I could see his website but he couldn’t.

The journey continued with 2 issues in total:

  1. One of the nodes being rebooted was running a filesystem check on its storage array. This is not your common 250GB hard drive at home that takes 30 minutes to ScanDisk, but more likely 1-10TB, so the scan takes quite a bit more time, which led to over 13 hours of total outage. Thankfully only about 10 of my own customers were affected; the other server included quite a few more :(
  2. Network issues on various servers were causing incorrect routes downstream. While I could see Shepherd, customers back east could not, and monitoring reported enough failures to classify it as an outage. Not good. But they fixed that around 10am.

So it was quite a night. I could have gone to sleep, I suppose, but I was getting emails and calls from customers, and I was not about to just block them out and leave them guessing. I am thankful that in the end no customers were lost. As this was one of the first major outages in over 2 years, most were very patient about it. Some were a little upset, but there was nothing I could do but ask for their prayers, and it all worked out.

I am going to be making some changes to how the primary business websites function. I might off-site the blogs and push forward on launching the Facebook and Twitter pages more publicly. All in all, I am happy it’s over and done with, and I hope to get some more rest this evening. For those wanting a more detailed rundown of what happened, there it is, or at least most of it that I can recall. I will add in the aftermath: my cousin took me to dinner @ Marie Callender’s (Chicken Caesar! Woohoo!), then Target, and Rite Aid for some Thrifty Ice Cream (YUM!), so it all ended well with me sleeping on the living room floor as she watched Stargate SG1 :)

So that’s it – I am off to find dinner, and watch The Soloist. Happy Sunday All.

August 9, 2009 in General