Hurricanes, Storms and Mother Nature can make you review your power strategy

Hurricane Sandy made me think long and hard about my power strategy for the servers I maintain. Without going deep into the infrastructure, suffice it to say that I have 17 remote servers plus 8 internal servers under my watch. My internal units are on a UPS, and that data center is also backed up by a generator. So internally I have my belt and suspenders to keep things going.

At my remote sites, all of my units are on UPS units with appropriate management software, so if power goes away the servers will shut down gracefully. The UPS units also condition the power to protect against spikes and brownouts.

So enter Sandy. As I watched the reports of her path, I figured that, based on the above, my servers should "weather" the storm, no pun intended. But along with these 17 remote sites I have 17 remote administrators who tend to think outside the box at the worst times.

My cell phone goes off, and it is one of the sysadmins saying they are going to turn off my server and their own, since the storm is coming and they want everything shut down. I ask about my unit and find out that he took it upon himself to change the power configuration in his data center, and none of the servers actually talk to the UPS units. He says he has ample run time to shut them off. I try to find his reasoning, but I am looking at a storm hitting in 10 hours, so I will leave it for after the storm.

I get another call from another site, asking what she should do: shut down or ride it out? I asked whether my server was still connected to the UPS. It was, and the UPS was functioning. So I said, why take away your safety net? If the other units were not on a UPS I would shut them down, but hopefully she had good backups in case they did not start up.

Well, Sandy hits the Jersey coast and steams toward the Philadelphia area. I don't lose power and neither does my internal NOC. I don't even lose Internet. The City of Philadelphia shuts down for three days, and from my alerts I see that I have four remote units that may be down or without connectivity. By Thursday I have two out. One of them is the unit where the creative sysadmin powered off everything. That place did not come online until the following Monday. Luckily my server powered up fine.

I did have one casualty, and I am not sure what caused it. At one location the admin decided to walk in and power things off, except that he just hit the power button on my unit. No call to me, although he was trained on how to power the box down. Anyway, late Thursday night I get a call saying my server's screen says Insert Boot disk. Way to go. Come to find out that they lost power Monday night, along with heat, and we had a cold snap with temperatures dipping to 45 degrees. The server room has a wall of windows, so the room temperature was close to the outside temperature. Perfect for a server that had been up for 8+ months. After a little coaxing and getting the room back to 65 degrees, it started to work and eventually came up.

This raises the question: do you power down your servers before a major storm and run the risk of them not starting again? I know from this case that I now have to rethink my power strategy and school my remote admins on the proper procedure. But powering servers down, to me, seems to carry more of a chance of disaster than letting them run until the power goes out and the UPS software and UPS unit shut them down gracefully. Also, by turning things off, don't you increase the risk from power surges, as the electronics in the UPS units that are there to protect you get taken out of the mix?
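
To make the "let it run until the UPS says otherwise" policy concrete, here is a rough sketch in Python of the kind of decision UPS management software makes. This is not taken from any particular UPS package; the UpsStatus fields, the should_shut_down function and the default timings are all hypothetical names and numbers chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class UpsStatus:
    """Snapshot of a UPS as an agent might report it (hypothetical fields)."""
    on_battery: bool        # True when utility power is gone
    runtime_seconds: int    # estimated battery runtime remaining


def should_shut_down(status: UpsStatus,
                     shutdown_seconds: int = 120,
                     safety_margin_seconds: int = 180) -> bool:
    """Return True once the server should begin a graceful shutdown.

    Assumption: the server needs about `shutdown_seconds` to stop cleanly,
    and we want `safety_margin_seconds` of battery left over on top of that.
    """
    if not status.on_battery:
        # Utility power is present, so keep running behind the UPS.
        return False
    # On battery: shut down only when the remaining runtime is about to
    # drop below what a clean shutdown plus the margin requires.
    return status.runtime_seconds <= shutdown_seconds + safety_margin_seconds


if __name__ == "__main__":
    print(should_shut_down(UpsStatus(on_battery=False, runtime_seconds=1800)))
    print(should_shut_down(UpsStatus(on_battery=True, runtime_seconds=1800)))
    print(should_shut_down(UpsStatus(on_battery=True, runtime_seconds=240)))
```

The point of the sketch is that the server stays up, and stays protected by the UPS, for as long as there is power or enough battery, and only goes down when it truly has to. It also shows why a server that no longer talks to its UPS loses this safety net: nothing is left to supply the on-battery and runtime readings.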

Just to let everyone know, I use Big Brother to monitor these servers, and they sit spread out over a five-county area. They run just one application, no email or web servers, so the majority of the time they idle along.