Summary of Configuration Management sprint

Last week, before DrupalCamp Colorado, Greg “heyrocker” Dunlap, David Strauss, Larry “Crell” Garfield, and Karoly “chx” Negyesi met to hash out architectural details and whip up some prototype code for the Drupal 8 Configuration Management initiative.

What problems are we trying to solve?
Currently there is no good way to move Drupal configuration information between environments because this data is scattered throughout the database in a variety of formats, oftentimes intermingled with content.
This also makes it impossible to version control this information, to store history, and to roll back changes.
Every module stores its configuration data in a different format; there is no standardization at all, even within core.
There is also no standard API for saving this information (aside from the simple case of the variable table), so developers often roll their own solutions.
The entire contents of the variable table are loaded on each page request, even for rarely accessed data, leading to memory bloat.
It is cumbersome to manage information that differs between server environments for the same project (database information, API keys, etc.).
We specifically are NOT (yet) trying to solve the problem of contextual configuration; only the underlying API that handles getting/setting this configuration data and shuffling it around to different sites.
The code that was developed at the sprint as a prototype is available at http://drupal.org/sandbox/heyrocker/1145636 for your reviewing pleasure. The main things to look at are the includes/config.inc and modules/config/config.test files.
What follows is a summary of the results. Your feedback is welcomed!

Overall Architecture
The proposed configuration system employs a three-level architecture:

Diagram summarizing the information below.

Level 1: Signed file storage
At the lowest level, all configuration data will be stored on disk as digitally signed JSON files in a sites/$sitename/config directory. See “Security considerations” below for more detail on the protections described in this section.

Individual configuration files (prefix.example.json.php) will look like the following:

<?php die(); 723fd490de3fb7203c3a408abee8c0bf3c2d302392[snip]
{
  "string_value": "string",
  "integer_value": 1,
  "array_or_object_value": [
    "value1",
    "value2"
  ],
  "boolean_value": false
}

The first line prevents direct browsing of the file and carries a digital signature used to verify the file’s integrity.
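
To make the file layout concrete, here is a minimal sketch of how such a file could be split into its signature and its JSON payload. The function name example_config_parse() is hypothetical and not part of the prototype, and the sketch assumes the first line always has the form `<?php die(); SIGNATURE`.

```php
<?php
/**
 * Hypothetical helper: splits the contents of a .json.php file into the
 * signature from the first line and the decoded JSON payload.
 */
function example_config_parse(string $contents): array {
  // Separate the header line from the JSON body.
  $parts = explode("\n", $contents, 2);
  $header = $parts[0];
  $body = $parts[1] ?? '';

  $prefix = '<?php die();';
  if (strncmp($header, $prefix, strlen($prefix)) !== 0) {
    throw new InvalidArgumentException('Missing "<?php die();" header.');
  }
  // Whatever follows the prefix on the first line is the signature.
  $signature = trim(substr($header, strlen($prefix)));

  // Decode JSON objects as associative arrays.
  $data = json_decode($body, TRUE);
  if (!is_array($data)) {
    throw new InvalidArgumentException('Body is not a valid JSON object.');
  }
  return array('signature' => $signature, 'data' => $data);
}

$file = "<?php die(); abc123\n{\"string_value\": \"string\", \"integer_value\": 1}";
$parsed = example_config_parse($file);
```

Verifying the signature is a separate step and is left out here.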

Typically, a module will expose one such file containing all of its configuration data, named like module.$modulename.json.php. Core might use files like core.site_information.json.php. Something like Flag, which other modules hook into to provide “discoverable” items, might ship both a module-specific file, module.flag.json.php, for “global” module settings and a flag.$flagname.json.php file for each flag.

The advantage of storing configuration this way is that we can load only the specific parts of configuration needed for the current request. Under the current system, the entire contents of the variable table are loaded into memory on every page request, which limits the data that can be stored there.

Level 2: Active configuration wrapper
This layer moves the configuration data from the filesystem to something that can be read and accessed much more readily. For the vast majority of Drupal sites this will be database storage, but for high-performance sites it could be something like MongoDB or Redis.

In the default configuration, this will push data into a central table called “config”:

CREATE TABLE config (
  name varchar(255) NOT NULL DEFAULT '' COMMENT 'The identifier for the configuration entry, such as module.example (the name of the file, minus .json.php).',
  data longtext NOT NULL COMMENT 'The raw JSON data for this configuration entry.',
  PRIMARY KEY (name)
);

At first glance the structure looks like the “variable” table in Drupal 7 and below. The fundamental difference is that this table stores configuration objects (say, every site information setting: name, mission, etc.), whereas the variable table stored single values (like the site name). Also, as noted above, we are not loading the whole table into memory.

All site configuration data gets read out of this wrapper. The data here gets updated on two conditions:

UI changes (automatic): When the save button is clicked on an admin page, data gets written to both the wrapper and the .json.php file from level 1 (the digital signature for the changed file is regenerated).
Code changes (manual): When migrating configuration from dev to prod, for example, the underlying files will have changed, but the data in the database table will not. Data will continue to be read from the active store so that the site doesn’t break. Site admins can replace the contents of the active store with the contents on disk via an administrative interface and/or a drush command.
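
The two update paths can be modelled in a few lines. This is only an illustration: the class ExampleConfigStorage and its method names are hypothetical, and plain arrays stand in for the files on disk and for the config table.

```php
<?php
/**
 * Hypothetical model of the two storage levels. Plain arrays stand in for
 * the files on disk and for the "config" table.
 */
class ExampleConfigStorage {
  public $disk = array();
  public $active = array();

  // UI change: write to the active store and to disk in one step.
  public function saveFromUi(string $name, array $data): void {
    $this->active[$name] = $data;
    $this->disk[$name] = $data;
  }

  // Code change: only the file changes (for example, after a deployment).
  public function deployFile(string $name, array $data): void {
    $this->disk[$name] = $data;
  }

  // Reads always come from the active store.
  public function read(string $name): ?array {
    return $this->active[$name] ?? NULL;
  }

  // Manual step: replace the active store with what is on disk.
  public function syncFromDisk(): void {
    $this->active = $this->disk;
  }
}

$storage = new ExampleConfigStorage();
$storage->saveFromUi('core.site_information', array('name' => 'Dev'));
$storage->deployFile('core.site_information', array('name' => 'Prod'));
// The deployed file is not visible until the explicit sync.
$before = $storage->read('core.site_information');
$storage->syncFromDisk();
$after = $storage->read('core.site_information');
```

The point of the model is that a deployment alone never changes what the site reads; only the explicit sync does.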
Level 3: Configuration API
At this level are the actual API functions that module developers will interact with in order to manipulate configuration values; essentially, a replacement for variable_get()/variable_set().

<?php
// Load a set of configuration out of the active store.
// 'prefix.name' refers to the filename of the .json.php file, without the extension.
$config = config('prefix.name');
// Access a single value out of the store.
echo $config->value;
// Change a value and save it back to both the active store and the filesystem.
$config->value = 'new value';
$config->save();
?>
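
For readers wondering how property-style access like this can work, one possible approach uses PHP’s magic __get() and __set() methods. The sketch below is an assumption about the mechanism, not the prototype’s code: ExampleConfig and example_config() are hypothetical names, a static array stands in for the active store, and writing the signed file is omitted.

```php
<?php
/**
 * Hypothetical stand-in for the object returned by config().
 */
class ExampleConfig {
  // Stand-in for the active store, keyed by configuration name.
  public static $activeStore = array();

  private $name;
  private $data;

  public function __construct(string $name) {
    $this->name = $name;
    $this->data = self::$activeStore[$name] ?? array();
  }

  // Reads of undeclared properties are served from the loaded data.
  public function __get($key) {
    return $this->data[$key] ?? NULL;
  }

  // Writes only touch the in-memory copy until save() is called.
  public function __set($key, $value) {
    $this->data[$key] = $value;
  }

  public function save(): void {
    // A real implementation would also rewrite the signed file here.
    self::$activeStore[$this->name] = $this->data;
  }
}

function example_config(string $name): ExampleConfig {
  return new ExampleConfig($name);
}

ExampleConfig::$activeStore['prefix.name'] = array('value' => 'old value');
$config = example_config('prefix.name');
$config->value = 'new value';
$unsaved = ExampleConfig::$activeStore['prefix.name']['value'];
$config->save();
$saved = example_config('prefix.name')->value;
```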

I’m a module developer. How do these proposed changes affect me?
variable_set()/variable_get()
In Drupal 7 and below, all variables are global, so accessing and saving them is done this way:

<?php
// Load the site name out of configuration.
$site_name = variable_get('site_name', 'Drupal');
// Change the site name to something else.
variable_set('site_name', 'This is the dev site.');
?>

In Drupal 8, configuration will only be lazy-loaded when needed. The above code would therefore change as follows:

<?php
// Load the site name out of configuration.
$site_name = config('core.site_information')->name;
// Change the site name to something else.
$config = config('core.site_information');
$config->name = 'My Awesome Site';
$config->save();
?>

For “discoverable” chunks like entities or views, you can load the names of all “sub-configuration” files (e.g. a list of all views, or a list of all node types) with the following call (exact API call TBD): config_get_names_with_prefix('entity.node.');. This will return something like array('entity.node.page', 'entity.node.page.article');. So to retrieve all settings for the page node type, run config('entity.node.page'). As a side note, this does mean that config('entity.node') will not return anything (TBD whether it throws an exception).
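
Since the exact API is still to be decided, the following is only a sketch of the prefix-matching idea. The function example_config_get_names_with_prefix() is hypothetical, and it takes the list of known names as an argument rather than reading them from the active store. The names in the sample list are made up for illustration.

```php
<?php
/**
 * Hypothetical prefix lookup over a list of configuration names.
 */
function example_config_get_names_with_prefix(string $prefix, array $names): array {
  $matches = array();
  foreach ($names as $name) {
    // Keep only names that start with the given prefix.
    if (strncmp($name, $prefix, strlen($prefix)) === 0) {
      $matches[] = $name;
    }
  }
  return $matches;
}

$names = array(
  'core.site_information',
  'entity.node.page',
  'entity.node.article',
  'entity.node',
);
$node_types = example_config_get_names_with_prefix('entity.node.', $names);
```

Because the prefix ends in a dot, the bare name 'entity.node' does not match, which is consistent with config('entity.node') having nothing to return.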

Declaring your own configuration options
In the past, configuration was declared in a mish-mash of ways. system_settings_form() would automagically save all form elements as variables in the variable table. Modules like Flag and Views would employ a hook_X_default() type of pattern, and so on.

Under the new system, declaring your own “variable” style configuration options happens in a module.$modulename.json.php file shipped with your module. You declare the defaults for these variables in just this one place, as opposed to every time you try to retrieve them.

An example file might look like:

module.book.json.php:

<?php die();
{
    "book_child_type": "book",
    "book_block_mode": "all pages",
    "book_allowed_types": [
        "book"
    ]
}

During module installation, the configuration file will be copied to a user’s sites/$sitename/config directory, and from then on the site-specific settings will be stored there.

Defining default views, flags, etc. could be done in much the same way, with the data ultimately stored in the active store and .json.php files.

But my configuration data is way more complex than that!
Because we are using JSON as a storage format, configuration objects can be arbitrarily complex. There is, however, one catch. JSON does not differentiate between an object and a hash (what PHP calls an associative array, and other languages a dictionary), so we are only able to support one of them. After some discussion we determined that mapping all such data to associative arrays would provide the most robust structure: arrays allow explicit ordering of properties, and numeric keys work in an array whereas $object->1 is invalid PHP.

<?php
$config = config('module.mymodule');
$config->alternate_name; // A simple property
$config->instances[0]['name']; // An array
// You can't do this…
$config->foo->bar->baz = 1;
// But you can do this instead…
$config->foo['bar']['baz'] = 1;
?>
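
The array mapping described above matches what PHP’s json_decode() does when its second argument is TRUE. Whether the prototype decodes files exactly this way is an assumption; the snippet only demonstrates the standard function, using made-up sample data.

```php
<?php
$json = '{"foo": {"bar": {"baz": 1}}, "instances": [{"name": "first"}], "1": "numeric key"}';

// Passing TRUE maps every JSON object to a PHP associative array.
$data = json_decode($json, TRUE);

// Nested objects become nested arrays.
$baz = $data['foo']['bar']['baz'];
$name = $data['instances'][0]['name'];

// Numeric keys are reachable with ordinary array syntax.
$numeric = $data[1];

// Keys keep the order in which they appear in the JSON text.
$keys = array_keys($data);
```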

If your configuration data is especially complex, you’re probably a fancy-pants developer and want to use a custom configuration class.

I’m a super-fancy-pants developer and need something more powerful than this. How do I override it?
The $config object will be an instance of a common class that provides the ->save() logic. For advanced use cases, it will be possible to override it, like so:

<?php
class MyAdvancedConfig extends DrupalConfig {
  function someHelperMethod() {
    $this->pants = 'fancy';
    echo "Ma'am, you have my compliments on your fancy pants.";
  }
}
$config = config('core.site', 'MyAdvancedConfig');
$config->someHelperMethod();
$config->save();
?>

Migrating configuration from dev to prod
The overall workflow here would be as follows:

On your development server, perform whatever configuration changes are needed through the UI. Create Views, check checkboxes, etc. These changes will get written to both the database table and the file system so the two are in sync (it will also re-generate the digital signatures of the underlying files).
When finished with your edits, review the changes in the sites/$sitename/config directory with $vcs diff (or your tool of choice) to confirm they look as expected.
If everything looks good, move the changed files to your production server in the usual way (SFTP, $vcs add/commit/push, $vcs update/pull). Nothing will immediately change, as your site will still be reading from the active store.
Finally, go to admin/configuration/system/config or run drush config-update (note: final location/command likely different; making these up right now). This will outline the differences between on-disk and the active store. If the changes seem good, go ahead and confirm to overwrite the content of the active store with the on-disk configuration. The admin tool will offer to regenerate the file signatures if necessary.
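
The comparison in the last step could be computed roughly as follows. This is a sketch only: example_config_diff() is a hypothetical name, both inputs are assumed to be arrays mapping configuration names to already-decoded data, and the configuration names other than module.book are invented.

```php
<?php
/**
 * Hypothetical comparison of on-disk configuration with the active store.
 * Both arguments map configuration names to decoded data arrays.
 */
function example_config_diff(array $disk, array $active): array {
  $diff = array('added' => array(), 'changed' => array(), 'removed' => array());
  foreach ($disk as $name => $data) {
    if (!array_key_exists($name, $active)) {
      $diff['added'][] = $name;
    }
    // Strict comparison: differing key order or value types count as a change.
    elseif ($active[$name] !== $data) {
      $diff['changed'][] = $name;
    }
  }
  foreach (array_keys($active) as $name) {
    if (!array_key_exists($name, $disk)) {
      $diff['removed'][] = $name;
    }
  }
  return $diff;
}

$disk = array(
  'module.book' => array('book_child_type' => 'page'),
  'module.example' => array('enabled' => TRUE),
);
$active = array(
  'module.book' => array('book_child_type' => 'book'),
  'module.old' => array(),
);
$diff = example_config_diff($disk, $active);
```

An admin page or drush command could present these three lists for confirmation before overwriting the active store.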
Handling local overrides
It’s often handy, for things like API keys or $db_url, to have site-specific configuration that is not checked into version control.

In Drupal 7 and below, you’d do the following in settings.php (or settings.local.php):

<?php
$conf['site_name'] = 'This is the dev site';
?>

Under the proposed system, there would be a special file where all local overrides are kept (this file would not be checked into version control):

/sites/default/config/local.json.php

This file would contain the particular keys and values that you’re overriding, on a per-value basis:

<?php die(); 723fd490de3fb7203c3a408abee8c0bf3c2d302392[snip]
{
  "core.site_information": {"site_name": "This is the dev site"}
}
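
One plausible way to apply such a file is to merge it over the loaded configuration, key by key. The sketch below assumes that behaviour; example_config_apply_overrides() is a hypothetical name, and the site_mail key is invented for the example.

```php
<?php
/**
 * Hypothetical merge of local overrides into one configuration object.
 *
 * $local is the decoded local.json.php payload, keyed by configuration name.
 */
function example_config_apply_overrides(string $name, array $data, array $local): array {
  if (!isset($local[$name]) || !is_array($local[$name])) {
    // No overrides for this configuration object.
    return $data;
  }
  // Values in the override replace those in $data, key by key.
  return array_replace_recursive($data, $local[$name]);
}

$data = array('site_name' => 'My Awesome Site', 'site_mail' => 'admin@example.com');
$local = json_decode('{"core.site_information": {"site_name": "This is the dev site"}}', TRUE);
$effective = example_config_apply_overrides('core.site_information', $data, $local);
```

Keys that are not overridden keep their stored values. Note that array_replace_recursive() also replaces list entries by index, which may or may not be the desired behaviour for array-valued settings.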

Security considerations
There were a number of discussions with greggles, coltrane, and chx from the Drupal security team about the security of the web UI writing files directly to disk. Since these files may contain extremely sensitive data, such as database passwords, it’s imperative that they be neither readable by outside eyes nor writable by rogue processes.

Here’s what was figured out:

Within the root /sites/sitename/config/ there will be a .htaccess that does “Deny from all” (and IIS web.config that does the equivalent) so that these configuration files cannot be read at all, under normal conditions. Attempting to access the files directly will result in a 403 error.
However, “just in case” this protection fails (or for web servers that do not support these override systems), all configuration files end in a .php extension. In this case, if you access them directly you will just get a PHP syntax error. There will also be a status check under admin/reports/status that throws flaming red warnings if .htaccess protection is turned off.
In the root of /sites/sitename/ there will be a read-only key.php file, written during installation, that contains a site-specific private key. This private key will be used to verify that the contents of configuration files have not been tampered with on disk. An example key.php would be:
<?php die(); c3a408abee8c0bf3c2d302392

The first line of every configuration file will consist of an automatically generated digital signature, based on the private key:
<?php die(); 723fd490de3fb7203c3a408abee8c0bf3c2d302392[snip]
// All JSON data follows here.

If there’s a mismatch between this hash and the file contents, the configuration system will refuse to load the file unless the administrator explicitly approves it. This will prevent someone from writing mischievous values to the file through some other exploit.
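
The signing scheme itself is not specified here. As an illustration only, the sketch below assumes the signature is an HMAC-SHA256 of the JSON body keyed with the private key; the function names are hypothetical, while hash_hmac() and hash_equals() are standard PHP functions.

```php
<?php
/**
 * Hypothetical signing step. Assumption: the signature is an HMAC-SHA256
 * of the JSON body, keyed with the site's private key.
 */
function example_config_sign(string $json, string $private_key): string {
  return hash_hmac('sha256', $json, $private_key);
}

/**
 * Hypothetical verification step, run before a file is loaded.
 */
function example_config_verify(string $json, string $signature, string $private_key): bool {
  // hash_equals() compares in constant time to avoid timing leaks.
  return hash_equals(example_config_sign($json, $private_key), $signature);
}

$key = 'c3a408abee8c0bf3c2d302392';
$json = '{"boolean_value": false}';
$signature = example_config_sign($json, $key);

$untouched = example_config_verify($json, $signature, $key);
$tampered = example_config_verify('{"boolean_value": true}', $signature, $key);
```

With a keyed hash, an attacker who can write to the config directory but cannot read key.php cannot produce a matching signature for altered contents.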

The goal here is “defense in depth”, so an attacker would need to break through multiple layers in order to break anything undetected.
