What is configuration management and why you need Ansible, Chef, Puppet and others

Imagine you've got a server.

Maybe it's a bare metal machine, maybe it's a virtual machine.

It could be running on your laptop, or it could be located somewhere else - in your company's data centre, or at one of the many infrastructure providers.

Having a server that does nothing is boring.

You need to install something there, like a web server. So you SSH to your machine and you run some commands:

yum install httpd

systemctl enable httpd

systemctl start httpd

But you also want to customize your web server, so you go to the directory that stores the configuration files of httpd and you start editing these files.

Then you realize that you also need to open the firewall, harden your SSH configuration, create some user accounts and so on.

You end up executing lots of commands and modifying lots of files.

Which is totally fine, as long as you have just one server and you don't plan to do all of this configuration ever again.

If you need to do it again, then you save all the commands and configuration files into a nice collection of shell scripts.

You can re-use them to configure another server, but you can't really re-run them on the original server. Unless, of course, you put effort into crafting idempotent shell scripts (scripts that leave the server in the same state no matter how many times you run them) that most likely only you can understand.
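To make the re-run problem concrete, here is a minimal sketch in Python rather than shell. It compares a naive configuration step with an idempotent one. The file names are made up for illustration, and `PermitRootLogin no` simply stands in for any line you might add while hardening SSH.

```python
import tempfile
from pathlib import Path


def append_line(path: Path, line: str) -> None:
    """Naive step: always appends, so every re-run adds a duplicate."""
    with path.open("a") as f:
        f.write(line + "\n")


def ensure_line(path: Path, line: str) -> bool:
    """Idempotent step: appends only when the line is missing.

    Returns True if the file was changed, False if it was already correct.
    """
    text = path.read_text() if path.exists() else ""
    if line in text.splitlines():
        return False
    # Avoid gluing the new line onto an unterminated last line.
    separator = "" if text == "" or text.endswith("\n") else "\n"
    with path.open("a") as f:
        f.write(separator + line + "\n")
    return True


# Run each step twice against throwaway files to compare the results.
with tempfile.TemporaryDirectory() as tmp:
    naive_file = Path(tmp) / "naive.conf"  # hypothetical file names
    safe_file = Path(tmp) / "safe.conf"

    for _ in range(2):
        append_line(naive_file, "PermitRootLogin no")
    changed = [ensure_line(safe_file, "PermitRootLogin no") for _ in range(2)]

    naive_lines = naive_file.read_text().splitlines()
    safe_lines = safe_file.read_text().splitlines()

print(len(naive_lines), len(safe_lines), changed)
```

The naive step leaves a duplicate line after the second run, while the idempotent one reports that it had nothing to do. Every step in every script needs this kind of check, which is why hand-written idempotent scripts get hard to read so quickly.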

Now imagine you have two web servers, a load balancer in front of them and a database behind them.

Each of those requires a different configuration, so you need to triple the number of shell scripts you have.

And what if you need to multiply this by 3, because you need a production, a staging and a testing environment? That's, basically, one huge pile of shell scripts to maintain and a lot of servers to run these scripts on.

The approach that worked fine for one machine becomes very inconvenient when you have five machines and impossible to follow when you have hundreds of them. So at some point, you start thinking: is there a better way to configure my servers?

And there is: configuration management tools.

Configuration management tools, like Chef, Puppet or Ansible, provide you with a mechanism to easily configure and continuously re-configure any number of servers.

Instead of writing imperative shell scripts that tell the computer what to do, you use a special declarative language.

This language allows you to describe the state of your machine.

You are not telling the computer what to do, like "create me a directory owned by this user". You are just describing that the directory with this name and this ownership should exist.

It's the job of the configuration management tool to figure out how to bring your server to the desired state.

It will automatically check whether the directory exists, and if any of the attributes of this directory differ from what you described, the tool will adjust them to match the desired state.

You can re-run this configuration code as many times as you want.

It's common practice to run your configuration code on a schedule, for example every 30 minutes. This keeps your configuration in a consistent state. Even if someone changes your server manually, the next run will revert those changes, at least for the resources described in your configuration code.
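The idea of describing a state and letting the tool converge to it can be sketched in a few lines of Python. This is only an illustration of the principle, not how Chef, Puppet or Ansible are implemented. The `DirectoryState` class and the `converge` function are hypothetical names, and the sketch manages permission bits instead of ownership, because changing the owner requires root. It assumes a POSIX system.

```python
import os
import stat
import tempfile
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DirectoryState:
    """Desired state: this directory exists with these permission bits."""
    path: str
    mode: int


def converge(desired: DirectoryState) -> List[str]:
    """Compare actual state with desired state and fix only the differences.

    Returns a list describing the changes made (empty if nothing was wrong).
    """
    changes = []
    if not os.path.isdir(desired.path):
        os.makedirs(desired.path)
        changes.append("created")
    actual_mode = stat.S_IMODE(os.stat(desired.path).st_mode)
    if actual_mode != desired.mode:
        os.chmod(desired.path, desired.mode)
        changes.append(f"mode {actual_mode:o} -> {desired.mode:o}")
    return changes


with tempfile.TemporaryDirectory() as tmp:
    desired = DirectoryState(path=os.path.join(tmp, "www"), mode=0o750)

    first_run = converge(desired)    # directory is missing: it gets created
    second_run = converge(desired)   # already correct: nothing to do

    os.chmod(desired.path, 0o777)    # someone changes the server by hand
    third_run = converge(desired)    # the drift is detected and reverted

    final_mode = stat.S_IMODE(os.stat(desired.path).st_mode)

print(first_run, second_run, third_run, oct(final_mode))
```

The second run changes nothing, and the third run only touches the attribute that drifted. A scheduled run of a real tool behaves the same way, just across many more resource types.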

Most configuration languages have a way to group code into some type of easily reusable package. Chef has Chef Cookbooks, Puppet has Puppet modules and Ansible has Ansible roles (playbooks are what apply those roles to your servers).

Those packages can be used to configure particular software. For example, you could have a Puppet Module dedicated to configuring HAProxy. Every proper configuration management tool has a public repository of community packages, so you don't have to write all the code from scratch.

The declarative configuration language is just one part of the configuration management system. You still need a way to execute your configuration code.

Most of the tools require you to install additional software on your servers. For example, Chef needs Chef Client and Puppet needs Puppet Agent.

Once this additional software is installed, you can normally use it to apply your configuration code.

The problem is that you need to bring this code to your server and trigger the execution of the configuration code on your own. This could be done in many ways, for example via SSH, Kickstart scripts or cloud-init. You can also set up an automated process to pull new configuration code and re-apply it.

But some of the most powerful features of configuration management systems are provided by their client-server variant.

In this case, you have an additional central management component that keeps track of all of your servers.

You can see the configuration status of each server and various reports.

Instead of distributing the configuration code on your own, this central component will take care of sending configuration to each machine.

Such a central component might have features like secret management, role-based access control, hierarchies and groupings of your servers, on-demand task execution and many others.

Most importantly, this central component serves the role of a central inventory of your servers. Among other things, it allows you, for example, to get the list of application servers and use this list during configuration of your load balancer.
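As a rough sketch of that last point, the Python snippet below filters an inventory by role and uses the result to generate a load balancer backend section. The inventory format, host names, roles and addresses are all invented for illustration. Real tools expose inventory data through their own query mechanisms and templates, and the output here merely follows HAProxy's `backend`/`server` syntax.

```python
# Hypothetical inventory: names, roles and addresses are made up.
INVENTORY = [
    {"name": "web1", "role": "app", "address": "10.0.0.11"},
    {"name": "web2", "role": "app", "address": "10.0.0.12"},
    {"name": "db1", "role": "database", "address": "10.0.0.21"},
    {"name": "lb1", "role": "load_balancer", "address": "10.0.0.1"},
]


def hosts_with_role(inventory, role):
    """Return the inventory entries that have the given role."""
    return [host for host in inventory if host["role"] == role]


def render_backend(inventory, port=80):
    """Render a HAProxy-style backend section listing all app servers."""
    lines = ["backend app_servers"]
    for host in hosts_with_role(inventory, "app"):
        lines.append(f"    server {host['name']} {host['address']}:{port} check")
    return "\n".join(lines)


backend_config = render_backend(INVENTORY)
print(backend_config)
```

When a third application server is added to the inventory, the next configuration run on the load balancer picks it up without anyone editing the load balancer's configuration by hand.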

Examples of such central management components are Chef Server, Puppet Server and Ansible Tower.

Configuration Management Systems are complex beasts. They allow you to treat your infrastructure as code and develop it like a software project. At the same time, those tools do not require you to be an experienced programmer. Learning them might be a perfect first step to move beyond traditional system administration.

Not everyone likes it, but Configuration Management is often associated with DevOps. When people talk about DevOps as a profession, they normally assume the knowledge of at least one of the configuration management systems.

Finally, with the rise of cloud services, there are now alternatives to configuration management systems.

If you would like to learn more about those alternatives, and about when you should and should not use configuration management tools, please leave a comment below.

