First thoughts on bcfg2

In my last post I mentioned starting a quick eval of the existing config management tools. I ended up with bcfg2, so I got to spend some time Friday starting to look at it. Armed with the SAGE [short topics book|http://www.sage.org/pubs/19_bcfg2/] and the online docs, I managed to get it installed and doing some basic things. These are just my initial observations based on a couple of hours poking around.

====What I like====
Everything (or at least a large amount of stuff) seems to get logged to syslog.

bcfg2 has some neat ideas compared to its contemporaries. For example, it looks like you can pull the version of a config file off a managed machine and into your config management repository. I have yet to get this working, though.

There is an interactive shell that has the same environment as the server. This lets you do things like show which configs a particular client will receive. Likewise, configurations appear to be built entirely on the server before being distributed to clients. I can see where this will help a lot with reporting, and determining what will/should happen on a client.

bcfg2 has an interactive mode on the client. This has the potential to make it easier to sell to other admins at the site. One of the complaints I often get with cfengine2 is that folks don’t know what it’s going to do, and that makes them wary of putting it on new machines. With interactive mode, they can iteratively refine their configurations until they’re completely integrated into the config management infrastructure.

Unlike cfengine2, all you need to bootstrap a bcfg2 client are some command line options to the client. For some reason, this strikes me as easier than having to copy out an update.conf.

Supposedly there’s an extensive statistics/reporting mechanism. I have yet to get it to work (see later about python tracebacks).

====What I dislike====
The RPM spec file lacks some important dependency information. It will happily build and then not run correctly. In my case, I didn’t have a requisite python SSL library installed. From the little digging I did, it seems that you either need a newer version of python than comes with CentOS 5, or one of two different extra SSL libraries for python.
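A quick way to catch this sort of missing dependency before starting the daemon is to try importing the modules it needs. This is only a minimal sketch: the helper name `missing_modules` is made up for illustration, and using `ssl` as the module to probe is my assumption, since the exact library bcfg2 wants depends on the Python version.

```python
def missing_modules(names):
    """Return the subset of module names that cannot be imported."""
    missing = []
    for name in names:
        try:
            __import__(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # "ssl" is an assumed example; substitute whatever the docs call for.
    print(missing_modules(["ssl"]))
```

An empty list means everything imported cleanly. Anything listed would need to be installed before the RPM-built package can run.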

SSL error messages suck. However, this is a problem with OpenSSL and not just bcfg2. For example, I spent an hour tracking down two different problems. First, I was working on totally unconfigured VMs (to see how easy it was to get bcfg2 to configure them), and it turns out my client had a lot of clock skew. All I got for an error message was an invalid certificate error. I had to run “openssl s_client” to get the message about the server certificate not being valid until some time in the future. The second may be more of a bcfg2 problem: your server cert has to have the FQDN. This one was my fault, though. It was in the docs, and I just missed it.
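The clock skew case is easy to check for once you know to look. Here is a rough sketch that compares a certificate’s validity window against a clock reading. The function name `diagnose_validity` is hypothetical, and it assumes you have already pulled the notBefore and notAfter strings out of the certificate yourself (for example from the output of `openssl x509 -noout -dates`), in the format that Python’s `ssl.cert_time_to_seconds` accepts.

```python
import ssl
import time


def diagnose_validity(not_before, not_after, now=None):
    """Classify a certificate validity window against a clock reading.

    not_before / not_after are strings such as "Jan  5 09:34:43 2018 GMT".
    now is seconds since the epoch; it defaults to the local clock.
    """
    if now is None:
        now = time.time()
    start = ssl.cert_time_to_seconds(not_before)
    end = ssl.cert_time_to_seconds(not_after)
    if now < start:
        # A clock that is behind makes a fresh cert look "not yet valid".
        return "not yet valid (clock is %d s before notBefore)" % (start - now)
    if now > end:
        return "expired (clock is %d s after notAfter)" % (now - end)
    return "ok"
```

If this reports “not yet valid” for a certificate you generated a few minutes ago, the client’s clock is the first thing to look at.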

While trying to debug some other issues, I ran into a mental block. Normally, if I’m trying to get a new program running, and it dies (e.g. a segfault), I’ll start up gdb and start poking into things. When bcfg2 dies, however, I get a python traceback, and for some reason this is more intimidating. It shouldn’t be. I’ve worked with python in the past and know how to navigate my way through it.

The python startup overhead for running some of the commands seems to be pretty large. I’m not sure yet if this is because of bcfg2 itself, or limitations of the VMs I’m running.
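One way to separate the two would be to time a bare interpreter start on the same VM and compare that with how long the bcfg2 commands take. This is just a sketch of the baseline half. The function name `interpreter_startup_seconds` is made up, and it measures whichever Python runs the script, which may not be the one bcfg2 uses.

```python
import subprocess
import sys
import time


def interpreter_startup_seconds(runs=5):
    """Return the best wall-clock time to start Python and exit immediately."""
    best = None
    for _ in range(runs):
        start = time.perf_counter()
        # "-c pass" starts the interpreter, runs nothing, and exits.
        subprocess.call([sys.executable, "-c", "pass"])
        elapsed = time.perf_counter() - start
        if best is None or elapsed < best:
            best = elapsed
    return best


if __name__ == "__main__":
    print("bare interpreter startup: %.3f s" % interpreter_startup_seconds())
```

If the bare startup is already slow, the VM is the likely culprit. If it is quick and the bcfg2 commands are not, the time is going into bcfg2’s own imports and setup.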

Documentation is a bit scattered around, and doesn’t necessarily match. I think a lot of this could be because a lot of the docs were written in the 0.9.x era, and I’m working with 1.0.x. In particular, the documentation on the Statistics plugin doesn’t seem to jibe with what I’m seeing on my system.

The package management features seem to need some work on x86_64 systems. The docs indicate that these are mostly issues with RPM and not bcfg2. More investigation is required.

====Conclusion====

Obviously this isn’t a full review, just my experience after 4-ish hours of playing with bcfg2. I’ll post some updates later this week when I’ve had a chance to work with it more, and ask some questions on the mailing list.