Hell of an annoying "server down" 911 call yesterday (Saturday)

YeOldeStonecat

So yesterday I hop on the Harley and take a ride over to visit my parents....been quite a while since I've seen them, plus being Labor Day weekend..proper thing to do is visit family. Not too far...just under an hour's ride.

Just get there...mom showing me a few things with their property...my phone rings, thinking it's my wife updating me on some stuff she was doing...I answer it without really checking who is calling, and it's one of my clients, a site management office. This particular client gets hot if anything so much as farts on her network. Matter of fact, just shook hands with them on Friday for a brand new server to take over their current 3 servers (will Hyper-V them). Their current hardware is getting near 7 years old...I've been begging them for a couple of years (I don't like to let servers get over 5 years old).

Surer than sh|t...it's freakish I tell ya...I've seen this before, soon as you sign the deal for a new server replacement..the old server seems to know about it and starts giving you grief. Even though it's been doing its thing fine for 5 or 7 years...it starts flipping out as soon as it hears about being replaced.

So I'm on the phone with her...she's just telling me her Outlook was giving an error..couldn't connect. I have her walk over to the server...shake the mouse..."What's on the screen?" "HP Proliant...looks like some DOS-like screen! Fans are loud". OK...power it down...let's wait a minute...power it back up. She sees it begin to boot...sees Windows Server start....sees applying network settings....and then she tells me it clicked and went back to the HP Proliant screen. "Ugh" I think to myself. "OK I'll be over....be there in about 45 minutes....power off the server til I get there."

So much for my visit. Hop on..ride over.

So the server had some “thermal issues”….she’d get hot (or think she was getting hot)…fans would start screaming, red light up front would go on, and the server would do that thermal protection reboot. It was a sudden reboot for the OS, like you pulled the power cord, not a graceful shutdown. That server has a recent history of fans that would scream now ‘n then. So when she called, I had her try to reboot it..soon as it got near applying computer settings..“click”…reboot. Once it got to login and desktop, but since infostore ‘n other stuff was still loading…fans screamed, click…reboot. So I check safe mode....just to make sure the OS and data and RAID volumes are all fine, got into safe mode fine. I always breathe a sigh of relief when I go up to a crashing server and can at least get to safe mode. Got into safe mode w/network support fine. So I really started thinking it was just a load/thermal issue.

So I went into the BIOS to see if there was a way I could shut off that damn integrated thermal switch. I found a setting where I could change it from the default “kill power-reboot”….to “wait 10 minutes and initiate a graceful shutdown of the OS”. Tried that…no change, still just a “click” shutoff just before login. There was no “turn off this thermal protection” thing. But I did find some processor settings. From the default “full performance”…to “dynamic power”…to “low power”. I went right to low power on the processor….which it warns will slow down CPU-intensive applications. And I also disabled “Hyper-threading”. She was a dual-socket Xeon server. Booted up the server…she booted up with minimal to medium fans spinning…..and remained that way, the fans never got to screaming like before. For good measure I also grabbed one of the girls’ desktop fans…stuck it on the floor in front of the server…right in front of the hot swap drive bay, blowing in.

So there went almost 3 hours of my holiday weekend Saturday. Today's weather is looking worse....hopefully it'll clear up a little this afternoon so I get a chance at a good motorcycle ride.

So I'll have to rush this server replacement....and I already have another big server migration down in NYC scheduled for the latter 2 weeks of Sept. There's 2x good server jobs for the month!
 
Yikes man, glad you got it figured out. Hope the weather is clear there today and you can enjoy some time off!
 
Do they have a dedicated AC unit for their server room?

Typical large accounting office, server in sorta central area....ambient temp of room is fine, the usual air conditioned office settings in a large professional center. She wasn't actually overheating....just faulty sensors that "told it" it was too hot.
 
Whelp...just finished my QuickBooks quotes for him....e-mailed over already. Hopefully pick up a check for a new server by end of today or tomorrow.
New Proliant ML350 G8, 32 gigs, 6x core Xeon, 6x 2.5" 300gig 15k SAS drives, 1 gig RAID controller, iLO Advanced, redundant PS's.

He tells me he also wants to go "dual monitor" at the office...so added 20x monitors and some labor for that.

New HP ProCurve 1810-24G switch to replace his old Stinksys SRW2024

And enough "after hours" labor for the server P2V migration to let me buy a few upgrades for my Harley. :D
 