Executing your configuration
I ran into this blog post about using a real programming language to define your configuration. I couldn’t agree more; I wrote about it 15 years ago. In fact, I agree so much that I wrote a whole book about the topic.
Configuration is fairly simple, on its face. You need to pass some values to a program to execute. In the simplest form, you simply have a map of strings. If you need hierarchy, you can use dots (.) or slashes (/) for readability. A good example is:
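As an illustrative sketch in Python (the key names and values below are assumptions, not taken from any real system), a flat map of strings with dotted keys can be expanded into a hierarchy like this:

```python
def expand(flat, sep="."):
    """Turn {"a.b": "1"} into {"a": {"b": "1"}}.

    Sketch only: a key that is both a leaf and a prefix
    (e.g. "a" and "a.b") is not handled.
    """
    tree = {}
    for key, value in flat.items():
        *parents, leaf = key.split(sep)
        node = tree
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return tree


# Hypothetical settings; every value is a plain string at this stage.
flat_config = {
    "server.port": "8080",
    "server.host": "localhost",
    "log.level": "warn",
}

config = expand(flat_config)
```

The application still has to convert and validate each string after this step.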
As the original blog post notes, you also need to have comments in the format, if the file is meant to be human readable / editable. From that format, you can transform it to the desired result. Other formats, such as JSON / YAML / XML are effectively all variations on the same thing.
Note that configuration usually takes a nontrivial amount of work to read properly, in particular if you have to run validations. For example, the port above must be greater than 1024 and less than 16,384. The log’s level can be either a numeric value or one of a small set of terms, etc.
The original post talked a lot about reusing configuration, which is interesting. Here is a blog post from 2007 doing exactly that. I’m using a loop to configure an IoC Container dynamically:
However, after doing similar things for a long while, I think that the most important aspect of this kind of capability has been missed. It isn’t about being able to loop in your configuration. That is certainly nice, but it isn’t the killer feature. The killer feature is that you don’t need to have a complex configuration subsystem.
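To give a feel for that work, here is a minimal Python sketch of the two validations just described. The function names and the set of accepted level names are assumptions made for illustration; only the port bounds come from the text.

```python
# Assumed set of level names; the post only says "a small set of terms".
LOG_LEVELS = {"debug", "info", "warn", "error"}


def parse_port(raw):
    """Convert a string to a port, enforcing 1024 < port < 16384."""
    port = int(raw)  # raises ValueError for non-numeric input
    if not 1024 < port < 16384:
        raise ValueError(f"port must be between 1025 and 16383, got {port}")
    return port


def parse_log_level(raw):
    """Accept either a numeric level or one of the known level names."""
    text = raw.strip().lower()
    if text.isdigit():
        return int(text)
    if text in LOG_LEVELS:
        return text
    raise ValueError(f"unknown log level: {raw!r}")
```

Every new setting tends to need a function like these, which is where the cost of a "simple" format starts to add up.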
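The idea can be sketched in Python under loud assumptions: the `Container` class, the `app_module` namespace, and the controller names are all hypothetical stand-ins, not the code from the 2007 post. The configuration itself is the `configure` function, which loops over the available types and registers the ones that match a naming convention.

```python
import inspect


class Container:
    """A toy IoC container (hypothetical): maps a name to a factory."""

    def __init__(self):
        self._factories = {}

    def register(self, name, factory):
        self._factories[name] = factory

    def resolve(self, name):
        return self._factories[name]()

    def names(self):
        return sorted(self._factories)


class app_module:
    """Stands in for the application's module; all types are made up."""

    class HomeController: ...
    class OrdersController: ...
    class EmailSender: ...


def configure(container, module):
    """The 'configuration script': discover types and register by convention."""
    for name, obj in vars(module).items():
        if inspect.isclass(obj) and name.endswith("Controller"):
            container.register(name, obj)


container = Container()
configure(container, app_module)
```

Changing the convention (say, registering everything ending in `Sender` as well) is an edit to the configuration, not to the application.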
In the case above, you can see that we are doing dynamic type discovery. I can do that in the INI example by specifying something like:
I would need to go ahead and write all the discovery code in the app. And the kind of things that I can do here are fixed. I can’t manage them on the fly and change them per configuration.
Here is another good example, passwords. In the most basic form, you can store passwords in plain text inside your configuration files. That is… not generally a good thing. So you might put them in a separate file. Or maybe use DPAPI on Windows to secure them. Something like this:
I have to write separate code for each one of those options. Now, I get a requirement that I need to use Azure Key Vault for one customer. For another, they use a Hardware Security Module that we have to integrate with, etc.
Instead of having to do it all in the software, I can push that kind of behavior to the configuration. The script we’ll run can run arbitrary operations to gather its data, including custom stuff defined on site for the specific use case.
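A minimal sketch of that idea in Python, assuming a convention where the configuration script prints a JSON object on stdout. The `load_config_from_script` helper and the inline demo script are hypothetical; in practice the command would be whatever the site provides (a vault client, an HSM tool, a PowerShell script).

```python
import json
import subprocess
import sys


def load_config_from_script(command):
    """Run the configuration script and parse the JSON it prints on stdout."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    config = json.loads(result.stdout)
    if not isinstance(config, dict):
        raise ValueError("configuration script must print a JSON object")
    return config


# Stand-in for a site-specific script; the keys and values are made up.
demo_script = [
    sys.executable,
    "-c",
    "import json; print(json.dumps({'db.password': 's3cret', 'server.port': 8080}))",
]

config = load_config_from_script(demo_script)
```

The application only knows how to run a command and read its output. How the password was actually obtained is the script's business.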
That gives you a lot of power, especially when you get a list of integration options that you have to work with. Not having to build each of them yourself is huge. That is how RavenDB works, allowing you to shell out to a script for specific values. It means that we have a lot less work to do inside of RavenDB.
With RavenDB, we have gone with a hybrid approach. For most things, you define the configuration using a simple JSON file, and we allow you to shell out to scripts for the more complex / dynamic features. That ends up being quite nice to use and work with.
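One way such a hybrid could look, sketched in Python. The `.Exec` suffix, the key names, and the `resolve` helper are assumptions made for this illustration and are not presented as RavenDB's actual configuration format.

```python
import json
import subprocess
import sys

EXEC_SUFFIX = ".Exec"  # hypothetical convention: the value is a command to run


def resolve(settings):
    """Copy plain values as-is; replace '<Key>.Exec' entries with command output."""
    resolved = {}
    for key, value in settings.items():
        if key.endswith(EXEC_SUFFIX):
            output = subprocess.run(
                value, capture_output=True, text=True, check=True
            ).stdout
            resolved[key[: -len(EXEC_SUFFIX)]] = output.strip()
        else:
            resolved[key] = value
    return resolved


# Mostly static JSON, with one value delegated to a (stand-in) command.
config_text = json.dumps({
    "Server.Port": 8080,
    "Db.Password.Exec": [sys.executable, "-c", "print('from-the-vault')"],
})

config = resolve(json.loads(config_text))
```

The common case stays a boring data file, and only the few values that need custom behavior pay the cost of running code.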
Comments
The biggest problem with executable configuration is that it can quickly devolve into arbitrarily complex programs. This is bad, because configuration needs to be easy to reason about. On the other hand, plain data languages like JSON or ini are not expressive enough to capture common abstractions like "these five things are essentially the same with just that one property different".
Configuration languages like dhall provide a great middle ground: it's not Turing complete, which limits you to expressing configuration in terms of common operations / abstractions and puts an upper bound on complexity. It's got an extremely strong and flexible type system though, which is what you need with configuration (i.e. express interdependencies between properties, validate configuration by type checking it).
The part about the shell scripts is from my POV not as much an issue of configuration, I'd say as it's more of a well designed (and chosen) integration point / API.
Johannes ,
Setting up an application can be a complex task, yep.
But given that the whole point of the script is giving information to the real application, I don't think you'll see too much complexity here.
dhall is a good example of something that looks good, but doesn't solve the problem. How do you integrate with an HSM to pull password data in dhall? With a script, you call the relevant binary and you are done.
Since IT Operations people are already expected to dabble in PowerShell, instead of forcing a completely new and foreign language onto them, I've used full-fledged PowerShell as a configuration DSL in my latest project.
As usual there are some examples that work and some that don't. Linux and configuration files - some are scripts, some just name-value pairs and some have a multi-level structure but no logic. From all these I dislike the .sh scripts the most, mainly because quite often they end up with some unclear or complex code that does something, has no error handling and isn't explained anywhere (or maybe copied from some random stackoverflow). The fact that every distribution and every program has a different way of specifying the configuration just adds some more flavor. CMake file format - is it a programming language or a data structure? I'd really prefer makefiles to be declarative, static data structures. With cmake you never know what to expect. At the same time I'm still a heavy user of Boo and DSLs based on it - Boo is dying slowly but so far I haven't been able to find anything to replace it. But I would never use Boo for just specifying configuration options, it's the configurable logic that I'm after.
Rafal,
The issue is what is the executable configuration doing? The idea is that it is going to create some data to feed to the application. If it is a full blown program, that is something to look at and consider. If you need something like cmake in order to config your system, there are other issues.
And yes, the idea is that the config code is as bare bones as possible.