Puppet from the trenches – How to prevent overwritten user configuration with a custom type

by frank, October 29, 2013

In this installment of the ‘from the trenches’ series I cover the use of Puppet during one of our projects. We used Puppet to provision Jenkins instances as part of a build and deployment platform for a large organization. I discuss the problem of Puppet overwriting user-managed configuration and how we solved it by writing a custom type.

Background

One of the goals of the project we used Puppet in was to ensure that every Java project got Mavenized and built on Jenkins. In order to provision these Jenkins instances we started out with a few manifests as we got familiar with the Puppet DSL and eventually the code evolved into a Jenkins Puppet module. This module followed the ‘Package, Config, Service’ pattern, popularized by Ken Barber.

In this pattern the module first installs a package, then applies templates and adds config files and finally starts the service, in this case Jenkins. This covers 90% of the workflow of configuring a service in Puppet. Another benefit of this approach is that it separates different concerns into separate manifests: package.pp, config.pp and so on.

The problem – Overwritten user-managed configuration

The point of configuration management tools such as Puppet is to provision a machine and keep its configuration in a given state. For example, we wanted to configure the JDK in Jenkins’ config.xml. When we spin up a new node the template is put in place and the JDK is configured. Unfortunately, the job views that are created by the user via the GUI are also stored in this file. Therefore, in a subsequent Puppet run the template overwrites the views managed by the user! One incomplete solution for this problem is to only manage the file if it’s absent:

file { "${jenkins::jenkins_home}/config.xml":
  content => template("jenkins/config.xml.erb"),
  replace => false,
}

This can work, but with replace disabled Puppet never touches the file again once it exists, so later changes to the JDK in the template are not applied. What we really want is for Puppet to properly manage the JDK in the config.xml and leave the rest of the config, like views, intact. We need more fine-grained control over a configuration file than the above snippet offers. Luckily there is a way to solve this problem: we can extend Puppet by implementing a custom type.

The solution – A custom type for managing a Jenkins JDK

Custom types are a way to extend the Puppet language and create a brand new type of resource. Instead of managing files, packages or services provided by Puppet you can create a resource for anything you want to manage, using the familiar Puppet syntax. Puppet comes shipped with dozens of types: file, package, user, group and so on. See the Puppet type reference for a full list of all resource types.

Each resource type implemented in Puppet requires two components: a type definition and a provider. The type definition is the interface of your resource: it declares the parameters and properties you can set on the resource. The provider is an implementation that manages the resource on the system. There can be several providers for a given type. For instance, the package type has several providers, one for each packaging system, such as apt, rpm and so on.

Have a look at the lib/puppet directory of your Puppet installation, in our case /var/lib/gems/1.9.1/gems/puppet-3.0.0/lib/puppet. It contains types at type/{typename}.rb and providers at provider/{typename}/{providername}.rb. For example there is a type/package.rb and several providers, such as provider/package/apt.rb. In our case we want to define JDKs in the config.xml. Therefore we have to create a JDK type and a provider. However, since we will be manipulating XML we only have to create a single provider, not one for every operating system.

Designing the JDK type

Let’s see what the JDK type should look like. Besides a name, also known as its title, a JDK type should contain the path to the JDK on the filesystem as well as the location of the Jenkins config.xml file where we will define it. Something like this:

jenkins_jdk { "JDK7":
  java_home   => "/usr/lib/jvm/java-7-openjdk-amd64",
  config_file => "/var/lib/jenkins/config.xml"
}

Jenkins JDK type implementation

You can check out the code for the custom type and provider on my GitHub. Let’s have a look at the JDK type.

require 'pathname'

Puppet::Type.newtype(:jenkins_jdk) do

  desc "Ensure that Jenkins is configured with the given JDK"

  ensurable

  validate do
    if self[:java_home].nil? || !Pathname.new(self[:java_home]).absolute?
      raise ArgumentError, "'java_home' is missing or is not an absolute path"
    end

    if self[:config_file].nil? || !Pathname.new(self[:config_file]).absolute?
      raise ArgumentError, "'config_file' is missing or is not an absolute path"
    end
  end

  newparam(:name, :namevar => true) do
    desc "The name of the JDK"
  end

  newparam(:config_file) do
    desc "Location of the Jenkins config.xml file"
  end

  newproperty(:java_home) do
    desc "The directory of the JDK"
  end

end

We define the config_file parameter by calling newparam and the java_home property by calling newproperty, each with the symbol that corresponds to its name. The validate block checks that both values are present and are absolute paths.
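To make the validation concrete, here is a minimal sketch of the same check outside of Puppet. The helper name absolute_path? is hypothetical and not part of the module. Note that Pathname#absolute? only looks at the form of the path; it does not check that the directory exists.

```ruby
require 'pathname'

# Hypothetical helper, not part of the module: mirrors the condition used in
# the type's validate block so it can be tried in plain Ruby.
def absolute_path?(value)
  !value.nil? && Pathname.new(value).absolute?
end

puts absolute_path?('/usr/lib/jvm/java-7-openjdk-amd64') # an absolute path
puts absolute_path?('jvm/java-7')                         # a relative path
puts absolute_path?(nil)                                  # a missing value
```

On a Unix-like system only the first call should pass the check; the other two are the cases in which the type raises an ArgumentError.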

Jenkins JDK provider implementation

In the provider we do the actual XML manipulation with XPath expressions using REXML.

require 'rexml/document'
require 'rexml/xpath'
include REXML

Puppet::Type.type(:jenkins_jdk).provide :jenkins_jdk do
    desc "Jenkins JDK provider"

    def exists?
      read_config().root.elements["//jdks/jdk/name[. = '#{resource[:name]}']"] != nil
    end

    def create
      config = read_config()

      # Assumes config.xml already contains a <jdks> element
      jdk = config.root.get_elements('jdks')[0].add_element('jdk')
      jdk.add_element('name').add_text(resource[:name])
      jdk.add_element('home').add_text(resource[:java_home])

      write_config(config)
    end

    def destroy
      config = read_config()

      config.root.delete_element("//jdks/jdk[name[. = '#{resource[:name]}']]")

      write_config(config)
    end

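    # Returns the desired value from the manifest rather than the value stored
    # in config.xml, so Puppet always considers this property in sync
    def java_home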
      resource[:java_home]
    end

    # Helper methods

    def read_config
      return Document.new File.read(resource[:config_file])
    end

    def write_config(config)
      File.open(resource[:config_file], 'w') do |f|
        config.write(f)
      end
    end
end

The most important methods in the provider are create, exists? and destroy. These methods allow the type to be ensurable, so you can enable or disable the configuration of the JDK. REXML is used to check via XPath whether the JDK exists, and to add or remove it. Finally there are two helper methods, read_config and write_config, that are used to read and write the config.xml file.
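The effect of these methods can be tried without Puppet. The sketch below applies the same REXML operations to an in-memory document and shows that a user-created view survives the change. The sample XML is a hypothetical, heavily trimmed config.xml, the function names are illustrative, and the XPath predicates are written in a slightly simpler form than in the provider.

```ruby
require 'rexml/document'

# Hypothetical, trimmed-down config.xml; a real Jenkins file has many more elements.
CONFIG_XML = <<-XML
<hudson>
  <jdks/>
  <views>
    <listView>
      <name>Nightly builds</name>
    </listView>
  </views>
</hudson>
XML

# Counterpart of the provider's exists? method.
def jdk_exists?(doc, name)
  !doc.root.elements["//jdks/jdk[name='#{name}']"].nil?
end

# Counterpart of create: append a <jdk> entry to the existing <jdks> element.
def add_jdk(doc, name, home)
  jdk = doc.root.get_elements('jdks')[0].add_element('jdk')
  jdk.add_element('name').add_text(name)
  jdk.add_element('home').add_text(home)
end

# Counterpart of destroy: remove only the matching <jdk> entry.
def remove_jdk(doc, name)
  doc.root.delete_element("//jdks/jdk[name='#{name}']")
end

doc = REXML::Document.new(CONFIG_XML)

# Simulate two Puppet runs with ensure => present: the JDK is added only once.
2.times do
  add_jdk(doc, 'JDK7', '/usr/lib/jvm/java-7-openjdk-amd64') unless jdk_exists?(doc, 'JDK7')
end

puts doc.root.get_elements('jdks/jdk').size        # number of JDK entries
puts doc.root.elements['views/listView/name'].text # the user's view is untouched
```

Because only the jdks element is edited, everything else in the document is written back unchanged, which is exactly what the file resource with a template could not give us.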

Demo

If you want to see it working, clone the repository and run

$ vagrant up

which will start a VM, install Jenkins and OpenJDK 7 and apply the JDK via the custom type.

Closing thoughts

One issue with custom types is the amount of development effort that goes into them. If you want to be able to fully configure the config.xml in this unobtrusive way you would have to create dozens of custom types just for Jenkins. This can be a lot of work. We were lucky that the JDK was the only setting we had to manage in config.xml. We were managing more configuration for Jenkins, such as Maven and Kerberos settings, but these lived in different files that could not be changed by the user, so we could use a template instead of a custom type. At this point I think the module is too simple to be fit for reuse by others, but if this solves a problem you have, let me know. One thing I haven’t implemented yet: the custom type does not show a diff when Puppet applies the changes. If you know how to fix this in the type or provider, let me know!
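One possible direction for the missing diff, sketched here without having been run inside Puppet: Puppet compares the value returned by a provider's property getter with the desired value, and calls the matching setter when they differ. The provider above returns resource[:java_home] from the getter, which is the desired value itself, so the two always match. The sketch below shows what reading and updating the stored value could look like in plain Ruby. The names current_java_home and update_java_home are hypothetical; in a provider they would correspond to java_home and java_home=(value), followed by a write_config call.

```ruby
require 'rexml/document'

# Hypothetical sample containing only the elements relevant here.
doc = REXML::Document.new(<<-XML)
<hudson>
  <jdks>
    <jdk>
      <name>JDK7</name>
      <home>/opt/old-jdk</home>
    </jdk>
  </jdks>
</hudson>
XML

# Find the <home> element of the JDK with the given name, or nil.
def home_element(doc, name)
  doc.root.elements["//jdks/jdk[name='#{name}']/home"]
end

# Getter counterpart: the value actually stored in the document.
def current_java_home(doc, name)
  home = home_element(doc, name)
  home.nil? ? nil : home.text
end

# Setter counterpart: change the stored value in place.
def update_java_home(doc, name, value)
  home = home_element(doc, name)
  home.text = value unless home.nil?
end

desired = '/usr/lib/jvm/java-7-openjdk-amd64'

# What Puppet would do for a property: compare, then sync only on a difference.
if current_java_home(doc, 'JDK7') != desired
  update_java_home(doc, 'JDK7', desired)
end

puts current_java_home(doc, 'JDK7')
```

With a getter that reports the stored value, I would expect Puppet to log the old and new java_home when it syncs the property, but treat that as an assumption to verify rather than a confirmed fix.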

Ideally I would like my Puppet code to consist of several custom types plus a little glue code. I think this is better than working with complete ‘package-config-service’ Puppet modules from the Forge. A custom type is a cohesive piece of code that manages a single bit of configuration well. Many ready-made ‘package-config-service’ modules, on the other hand, often have tight coupling between certain resources. For instance they might install certain services or packages, may not work for your OS and may introduce dependencies between resources. So the trade-off is between using a ready-made module with a quick setup, versus a certain amount of development work to create a few cohesive, reusable custom types and some glue code.

In closing I would like to know your opinion about how to solve these issues, your take on Puppet, modules and custom types. Feel free to comment! 🙂