What happens when Murphy strikes…

Ahh, the joys of the Internet.  After suffering 3 days without real Internet access, I can say it’s good to be back again.  My business telco DSL provider had a 3-day outage.  Now, if this were the height of summer and I wanted to spend more time outdoors, this would have been perfect.  Not so in this case.

When I designed the infrastructure for ITInTheDataCenter.com, I knew the WAN would be the weak link.  With only one connection, all my eggs are in one basket.  It goes away about 2 or 3 times a year, but usually for no more than 4 hours, so it’s tolerable.  This time it was 72 hours, and that’s a little much.

So, I managed to procure a low-speed backup connection for occasional use.  I decided on Rogers Portable Internet.  It works.  The WiMAX modem is a little strange, in that it only likes 10 Mbps half-duplex Ethernet connections, but otherwise it works OK.  For $40 a month (which I’ll only activate when needed), it’ll save me headaches when travelling, or when I need access while out with customers.

The moral of this story… Murphy will strike, it’s just a matter of when.  Always have a backup.

Active Directory and BSOD

Ever had one of those days?  During routine maintenance to move some VMs around to different disks (in an effort to get ready for some new storage), my Active Directory system went down, hard.

The volume was migrated correctly using Storage vMotion, or so I thought.  I rebooted the server after the move to test, and about 10 seconds after getting to the desktop, Windows 2008 hit a BSOD with the error message “A device attached to the system is not functioning properly”.  So, I booted off the second plex of the mirror.  Same thing happened.  Now I was concerned.  Typically, booting the second plex gets things going.  This was more fundamental.

I booted into Directory Services Restore Mode (DSRM), and hunted through log files and event logs.  I carefully addressed each error as it came up, then decided to sleep on it and rebuild the disk plex, just to be safe.  What concerned me was that the system would boot into DSRM, but not into Safe Mode.  Definitely something was up!

Rebooting in the morning with a new plex did not fix it.  At this point, I started going through some Microsoft material, and noticed that even in DSRM, Active Directory should start.  Looking through AD’s event log, I found the error “The log file is corrupt”, and AD would not start.  I’ve seen this before on Exchange, so I tried to repair the AD logs.

Once I removed the corrupted log files and rebooted, the system came up, and it is now working properly.  How a volmgr error and AD are related, I’m not sure.  Sometimes it helps to sleep on it.