I read the following blog post about automated testing in software projects, which made me think about how “monitoring” is the system administration equivalent. I had never thought about this analogy before, but I think it’s valid, and I might try floating it at work, seeing as we’ve been doing some monitoring enhancements lately.
I also think that many of the arguments the author addresses about why people don’t do automated testing apply equally to monitoring. Too often I’ve seen proper monitoring pushed to the back burner because it’s hard, because we don’t have time, or “because the last thing we need is more email.” The concept of using monitoring as a continuous testing service that validates your changes against the environment, in order to reduce alerts, breakage and general frustration, seems odd to some people. The idea that monitoring isn’t just ping latency or disk space utilization pulled over SNMP, but a test of your assertions about service functionality, routinely blows people’s minds. [1] At least, these are all things I’ve heard before.
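To make “testing your assertions about service functionality” a bit more concrete, here is a minimal sketch in Python. It is not taken from any particular monitoring system: the names `check_service`, `http_fetch` and `CheckResult` are made up for illustration, and it assumes the service in question is a web page that should contain some known text. The point is only that the check asserts what the service should actually do, rather than that a process exists or a host answers ping.

```python
from dataclasses import dataclass
from typing import Callable, Tuple
import urllib.request


@dataclass
class CheckResult:
    ok: bool
    detail: str


def http_fetch(url: str, timeout: float = 5.0) -> Tuple[int, str]:
    """Fetch a URL and return (status code, body text)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8", errors="replace")


def check_service(
    url: str,
    expected_text: str,
    fetch: Callable[[str], Tuple[int, str]] = http_fetch,
) -> CheckResult:
    """Assert that the service answers and returns the content we expect.

    `fetch` is injectable (a hypothetical design choice) so the check
    itself can be exercised without touching the network.
    """
    try:
        status, body = fetch(url)
    except Exception as exc:  # connection refused, timeout, HTTP error, ...
        return CheckResult(False, f"request failed: {exc}")
    if status != 200:
        return CheckResult(False, f"unexpected status {status}")
    if expected_text not in body:
        return CheckResult(False, f"response missing {expected_text!r}")
    return CheckResult(True, "service answered as expected")
```

A web server that is up but serving a “Database error” page would pass a process check or a ping check, yet fail this one, because the assertion is about what a user would actually get back.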
Sadly, I never seem to have a good and articulate response other than “But it makes your life so much easier!” So, how do you evangelize monitoring?
—
1 – Well, probably not for LOPSA members, the choir to whom I am preaching. But I assert that LOPSA members are a small, distinct minority of the people out there doing systems administration. I actually met a consultant/VAR for a certain enterprise monitoring system who believed that an SNMP/WMI process check was the best way to see if a service was up and running, and that anything further was “user experience monitoring that few people really do.”