Code Vigorous


navigation
home
github
mozillians
email
about
Dustin J. Mitchell

Introducing CI-Admin

22 Aug 2018

A major focus of recent developments in Firefox CI has been putting control of the CI process in the hands of the engineers working on the project. For the most part, that means putting configuration in the source tree. However, some kinds of configuration don’t fit well in the tree. Notably, configuration of the trees themselves must reside somewhere else.

CI-Configuration

This information is collected in the ci-configuration repository. This is a code-free library containing YAML files describing various aspects of the configuration – currently the available repositories (projects.yml) and actions.

This repository is designed to be easy to modify by anyone who needs to modify it, through the usual review processes. It is even Phabricator-enabled!

CI-Admin

Historically, we’ve managed this sort of configuration by clicking around in the Taskcluster Tools. The situation is analogous to clicking around in the AWS console to set up a cloud deployment – it works, and it’s quick and flexible. But it gets harder as the configuration becomes more complex, it’s easy to make a mistake, and it’s hard to fix that mistake. Not to mention, the tools UI shows a pretty low-level view of the situation that does not make common questions (“Is this scope available to cron jobs on the larch repo?”) easy to answer.

The devops world has faced down this sort of problem, and the preferred approach is embodied in tools like Puppet or Terraform:

  • write down the desired configuration in a human-parsable text files
  • check it into a repository and use the normal software-development processes (CI, reviews, merges..)
  • apply changes with a tool that enforces the desired state

This “desired state” approach means that the tool examines the current configuration, compares it to the configuration expressed in the text files, and makes the necessary API calls to bring the current configuration into line with the text files. Typically, there are utilities to show the differences, partially apply the changes, and so on.

The ci-configuration repository contains those human-parsable text files. The tool to enforce that state is ci-admin. It has some generic resource-manipulation support, along with some very Firefox-CI-specific code to do weird things like hashing .taskcluster.yml.

Making Changes

The current process for making changes is a little cumbersome. In part, that’s intentional: this tool controls the security boundaries we use to separate try from release, so its application needs to be carefully controlled and subject to significant human review. But there’s also some work to do to make it easier (see below).

The process is this:

  • make a patch to either or both repos, and get review from someone in the “Build Config - Taskgraph” module
  • land the patch
  • get someone with the proper access to run ci-admin apply for you (probably the reviewer can do this)

Future Plans

Automation

We are in the process of setting up some automation around these repositories. This includes Phabricator, Lando, and Treeherder integration, along with automatic unit test runs on push.

More specific to this project, we also need to check that the current and expected configurations match. This needs to happen on any push to either repo, but also in between pushes: someone might make a change “manually”, or some of the external data sources (such as the Hg access-control levels for a repo) might change without a commit to the ci-configuration repo. We will do this via a Hook that runs ci-admin diff periodically, notifying relevant people when a difference is found. These results, too, will appear in Treeherder.

Grants

One of the most intricate and confusing aspects of configuration for Firefox CI is the assignment of scopes to various jobs. The current implementation uses a cascade of role inheritance and * suffixes which, frankly, no human can comprehend. The new plan is to “grant” scopes to particular targets in a file in ci-configuration. Each grant will have a clear purpose, with accompanying comments if necessary. Then, ci-admin will gather all of the grants and combine them into the appropriate role definitions.

Worker Configurations

At the moment, the configuration of, say, aws-provsioner-v1/gecko-t-large is a bit of a mystery. It’s visible to some people in the AWS-provisioner tool, if you know to look there. But that definition also contains some secret data, so it is not publicly visible like roles or hooks are.

In the future, we’d like to generate these configurations based on ci-configuration. That both makes it clear how a particular worker type is configured (instance type, capacity configuration, regions, disk space, etc.), and allows anyone to propose a modification to that configuration – perhaps to try a new instance type.

Terraform Provider

As noted above, ci-admin is fairly specific to the needs of Firefox CI. Other users of Taskcluster would probably want something similar, although perhaps a bit simpler. Terraform is already a popular tool for configuring cloud services, and supports plug-in “providers”. It would not be terribly difficult to write a terraform-provider-taskcluster that can create roles, hooks, clients, and so on.

This is left as an exercise for the motivated user!

Links