Writing a Custom Resource for Concourse: Detecting Pull Request Close/Merge Events
Recently I’ve been playing around with the excellent Concourse CI project. It’s a really cool project that’s been gaining a lot of traction as an open source CI/CD solution. Take a look at this page for why Concourse is so great and how it compares to other CI/CD solutions.
Inside of Concourse, all interactions are done through resources and jobs. Its model is functional in the sense that pipelines are composed of stateless jobs with well-defined inputs and outputs, which are modeled by resources. Everything runs inside of its own container, which prevents any pollution of the automation environment between tasks. In order for any steps within a job to share anything, the inputs and outputs must be explicitly defined. This can be a little tedious at times, but it keeps everything happening inside of the pipelines very explicit. All of these characteristics make Concourse ideal as a CI/CD solution because it prevents a lot of the more mysterious and frustrating issues that cause builds to break.
Main Idea and Motivation
Implementing CI/CD using any kind of automation system requires the system to interact with the events happening inside of source control solutions like Github. In particular, one of the more common events that trigger a CI/CD system is a pull request. The community as well as the maintainers of Concourse at Pivotal have been really good about providing resources for most cases. There is already an excellent github-pullrequest-resource available for dealing with Github pull requests, but one case it doesn’t really handle is detecting a pull request that has been merged or closed.
This can be useful when modeling CI/CD in such a way that an open PR creates an environment for the application source code to deploy onto, while a closed or merged PR initiates a cleanup of that environment to reclaim resources. This type of flow is especially useful on Cloud platforms where cost is billed on a per-use basis, so that the platform only uses resources while there are open pull requests. This was a use case I really needed in my current projects using Concourse, so I decided to take this opportunity to dive a little deeper, write my own custom resource, and document the process so that it can be useful to others. For the tl;dr, please check out the project repo.
Understanding Concourse Resources
Before we start, we should first try to understand how to implement a custom resource for Concourse. I don’t want to go too deep into the spec here, but basically a Concourse resource is just a container that implements three scripts:

- /opt/resource/check : checks for new versions of the resource
- /opt/resource/in : pulls a version of the resource down
- /opt/resource/out : idempotently pushes a version up
All resources should implement these 3 scripts, but they don’t all have to do something. For operations that don’t fit the semantics of the resource, the script can be a noop. In the case of our resource, we really only need to implement the check and in scripts, because we aren’t updating anything from closed or merged pull requests, merely fetching information about them and triggering downstream jobs.
Understanding the check script
Now that we understand which scripts we need to implement for this resource, let’s dive a little deeper into the spec to understand what check should be doing for this resource. Breaking down the spec:
- A resource type’s check script is invoked to detect new versions of the resource.
- It is given the source configuration and current version on stdin.
- source is an arbitrary JSON object which specifies the location of the resource, including any credentials. This is passed verbatim from the pipeline configuration.
- version is a JSON object with string fields, used to uniquely identify an instance of the resource. This will be omitted from the first request, in which case the resource should return the current version (not every version since the resource’s inception).
- It must print the array of new versions in chronological order to stdout, including the requested version if it’s still valid.
The above spec is basically saying the Concourse runtime will be running the script using something like the following command:
echo {...source config json...} | /opt/resource/check
For the first check invocation, the input to the script will only include the source config, but in subsequent requests, the check script will also be passed the current version, which tells the resource to return the next valid version objects.
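Concretely, the JSON piped to check on a subsequent invocation has this envelope (the keys inside source and version are resource-specific; the values below are placeholders, not the shapes we design later):

```json
{
  "source": {
    "uri": "https://example.com/some-resource",
    "some_credential": "abc123"
  },
  "version": {
    "id": "42"
  }
}
```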
Given this spec, we can start to design what our source configuration and version objects should look like.
Defining the version and source config
Since we want to be returning information about closed and merged pull requests from Github, let’s try to understand what kind of information is available. Because Github provides an excellent GraphQL API we can do some exploration using the explorer to see what kind of data we can fetch about pull requests. After some experimentation, I came up with the following GraphQL query:
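Reconstructed here from the source config and version fields shown below (the exact query in the repo may differ slightly), the query is along these lines, using arguments from GitHub’s public GraphQL schema:

```graphql
query {
  repository(owner: "shinmyung0", name: "fixture-repo") {
    pullRequests(
      baseRefName: "master"
      states: [CLOSED, MERGED]
      first: 10
      orderBy: { field: UPDATED_AT, direction: ASC }
    ) {
      edges {
        cursor
        node {
          id
          number
          url
          baseRefName
          headRefName
          state
          mergedAt
        }
      }
    }
  }
}
```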
Since our query will be what is required to run the check script, we can pretty much use the input parameters (with some additional fields for credentials and API endpoint) as the fields for our source config:
source:
  graphql_api: https://api.github.com/graphql
  access_token: ((github-access-token))
  base_branch: master
  owner: ((github-owner))
  repo: ((repo-name))
  first: ((num-to-fetch))
  states:
    - closed
    - merged
This query will return a payload that looks something like this:
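A rough reconstruction of that payload (built from the version fields below; the overall shape follows GitHub’s GraphQL response format):

```json
{
  "data": {
    "repository": {
      "pullRequests": {
        "edges": [
          {
            "cursor": "Y3Vyc29yOnYyOpK5MjAxOC0wMi0yNVQxMjozNDo0NC0wODowMM4KNQ/Z",
            "node": {
              "id": "MDExOlB1bGxSZXF1ZXN0MTcxMjQ5NjI1",
              "number": 1,
              "url": "https://github.com/shinmyung0/fixture-repo/pull/1",
              "baseRefName": "master",
              "headRefName": "test-merged-branch",
              "state": "MERGED",
              "mergedAt": "2018-02-25T20:34:44Z"
            }
          }
        ]
      }
    }
  }
}
```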
Based on this information, we can try to return an array of version objects that look something like this:
{
"id": "MDExOlB1bGxSZXF1ZXN0MTcxMjQ5NjI1",
"cursor": "Y3Vyc29yOnYyOpK5MjAxOC0wMi0yNVQxMjozNDo0NC0wODowMM4KNQ/Z",
"number": "1",
"url": "https://github.com/shinmyung0/fixture-repo/pull/1",
"baseBranch": "master",
"headBranch": "test-merged-branch",
"state": "MERGED",
"timestamp": "2018-02-25T20:34:44Z"
}
Some important points to highlight are fields like the cursor, which can be passed in to our GraphQL query’s after field to only return pull requests after a particular cursor. Since the “current” version object is passed into the check script to fetch “new” version objects, this is something that would be useful to have included.
Implementing the check script
Now that we have a clear idea of what our check script should do, we can start to actually implement it. Since we know that Concourse resources can be written in any language as long as they satisfy the spec, we can pick the best language for what we are doing. Because we are using GraphQL, I decided to implement this using JS. It’s easy to deal with asynchronous network calls, easy to deal with JSON, and there are plenty of client libraries out there for GraphQL. A library I really enjoy using for GraphQL in JS is Apollo. Given all these choices, the pseudocode for the check script would look something like this:
#!/usr/bin/env node
// the shebang allows this file to be directly executable

async function check() {
  // read stdin
  // parse and validate input json
  // use configuration to run GraphQL query to fetch PRs
  // convert response payload to version objects
  // output to stdout
}

check()
For the actual implementation check the source code here.
Understanding the in script
Let’s take a deep dive into the spec for the in script.
- The in script is passed a destination directory as $1. The script must fetch the resource and place it in the given directory.
- The script is given on stdin the configured source and a precise version of the resource to fetch.
- The script must emit the fetched version, and may emit metadata as a list of key-value pairs.
Based on this spec, the Concourse runtime will basically be invoking the in script in something like the following manner:
echo {... some json ...} | /opt/resource/in outputdir
Because our pull request resource is simply fetching info about a pull request, we don’t need to fetch anything beyond outputting the version object as a file. So the above execution of the in script will result in an outputdir/pull_request file that contains the version object that was passed in on stdin. It will also emit the current version to stdout.
Implementing the in script
Based on our understanding of what the in script should do, the pseudocode for the in script would look something like the following:
#!/usr/bin/env node

async function doIn() {
  // read stdin, parse, and validate
  // extract given .version key
  // output version object to $1/pull_request file
  // emit version to stdout
}

doIn()
Downstream jobs can read the $1/pull_request file and extract information about the recently closed or merged pull request.
Writing Tests
Since we are using Node JS, we can use Jest to easily write some unit and integration tests. Unit tests are pretty straightforward to write, but in order to be able to run integration tests, we need to set up a fixture repo with some closed or merged pull requests to test actual API calls against.
Because making calls against this repo requires a Github Access token, we can pass this in as an environment variable to the integration test suite. A good thing to do would be to validate that an access token has been set within the test. Check out the test code to see in detail how this is set up.
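That validation could be sketched as follows (the environment variable name GITHUB_ACCESS_TOKEN is a hypothetical choice, not necessarily what the repo uses):

```javascript
// Guard integration tests on a required access token.
// GITHUB_ACCESS_TOKEN is a hypothetical variable name.
function requireAccessToken(env = process.env) {
  const token = env.GITHUB_ACCESS_TOKEN;
  if (!token) {
    throw new Error('GITHUB_ACCESS_TOKEN must be set to run the integration tests');
  }
  return token;
}
```

Failing fast like this gives a clear error instead of a confusing 401 from the GraphQL API halfway through the suite.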
CI/CD and Publishing to Docker Hub
We can use Travis CI to easily set up some CI/CD for this project. Basically, all the CI/CD job needs to do is run all the tests and then, if successful and the commit is tagged as a release, build a Docker image and publish it to the public Docker Hub. Pretty straightforward, so I won’t go into too much detail here. But if you’re curious, please check out the Travis CI config as well as the build script.
Conclusion
This has been a fun project to publish for this month. It gave me a really good opportunity to do a deep dive into Concourse which is something that I’ve been using for work quite a bit lately. I’m feeling more confident in my ability to open source things on a regular basis which is also something I’m committed to doing this year. Overall Concourse is a really awesome project that I highly recommend for any teams looking for a really nice CI/CD solution.