Simon Willison’s Weblog

Subscribe

Dynamic content for GitHub repository templates using cookiecutter and GitHub Actions

28th August 2021

GitHub repository templates were introduced a couple of years ago to provide a mechanism for creating a brand new GitHub repository starting with an initial set of files.

They have one big limitation: the repositories that they create share the exact same contents as the template repository. They’re basically a replacement for duplicating an existing folder and using that as the starting point for a new project.

I’m a big fan of the Python cookiecutter tool, which provides a way to dynamically create new folder structures from user-provided variables using Jinja templates to generate content.

This morning, inspired by this repo by Bruno Rocha, I finally figured out a neat pattern for combining cookiecutter with repository templates to compensate for that missing dynamic content ability.

The result: datasette-plugin-template-repository for creating new Datasette plugins with a single click, python-lib-template-repository for creating new Python libraries and click-app-template-repository for creating Click CLI tools.

Cookiecutter

I maintain three cookiecutter templates at the moment:

Having installed cookiecutter (pip install cookiecutter) each of these can be used like so:

% cookiecutter gh:simonw/datasette-plugin
plugin_name []: visualize counties
description []: Datasette plugin for visualizing counties
hyphenated [visualize-counties]: 
underscored [visualize_counties]: 
github_username []: simonw
author_name []: Simon Willison
include_static_directory []: y
include_templates_directory []: 

Cookiecutter prompts for some variables defined in a cookiecutter.json file, then generates the project by evaluating the templates.

The challenge was: how can I run this automatically when a new repository is created from a GitHub repository template? And where can I get those variables from?

Bruno’s trick: a self-rewriting repository

Bruno has a brilliant trick for getting this to run, exhibited by this workflow YAML. His workflow starts like this:

name: Rename the project from template

on: [push]

jobs:
  rename-project:
    if: ${{ github.repository != 'rochacbruno/python-project-template' }}
    runs-on: ubuntu-latest
    steps:
       # ...

This means that his workflow only runs on copies of the original repository—the workflow is disabled in the template repository itself by that if: condition.

Then at the end of the workflow he does this:

      - uses: stefanzweifel/git-auto-commit-action@v4
        with:
          commit_message: "Ready to clone and code"
          push_options: --force

This does a force push to replace the contents of the repository with whatever was generated by the rest of the workflow script!

This trick was exactly what I needed to get cookiecutter to work with repository templates.

Gathering variables using the GitHub GraphQL API

All three of my existing cookiecutter templates require the following variables:

  • A name to use for the generated folder
  • A one-line description to use in the README and in setup.py
  • The GitHub username of the owner of the package
  • The display name of the owner

I need values for all of these before I can run cookiecutter.

It turns out they are all available from the GitHub GraphQL API, which can be called from the initial workflow copied from the repository template!

Here’s the GitHub Actions step that does that:

- uses: actions/github-script@v4
  id: fetch-repo-and-user-details
  with:
    script: |
      const query = `query($owner:String!, $name:String!) {
        repository(owner:$owner, name:$name) {
          name
          description
          owner {
            login
            ... on User {
              name
            }
            ... on Organization {
              name
            }
          }
        }
      }`;
      const variables = {
        owner: context.repo.owner,
        name: context.repo.repo
      }
      const result = await github.graphql(query, variables)
      console.log(result)
      return result

Here I’m using the actions/github-script action, which provides a pre-configured, authenticated instance of GitHub’s octokit/rest.js JavaScript library. You can then provide custom JavaScript that will be executed by the action.

await github.graphql(query, variables) can then execute a GitHub GraphQL query. The query I’m using here gives me back the current repository’s name and description and the login and display name of the owner of that repository.

GitHub repositories can be owned by either a user or an organization—the ... on User / ... on Organization syntax provides the same result here for both types of nested object.

The output of this GraphQL query looks something like this:

{
  "repository": {
    "name": "datasette-verify",
    "description": "Verify that files can be opened by Datasette",
    "owner": {
      "login": "simonw",
      "name": "Simon Willison"
    }
  }
}

I assigned an id of fetch-repo-and-user-details to that step of the workflow, so that the return value from the script could be accessed as JSON in the next step.

Passing those variables to cookiecutter

Cookiecutter defaults to asking for variables interactively, but it also supports passing in those variables as command-line parameters.

Here’s part of my next workflow steps that executes cookiecutter using the variables collected by the GraphQL query:

- name: Rebuild contents using cookiecutter
  env:
    INFO: ${{ steps.fetch-repo-and-user-details.outputs.result }}
  run: |
    export REPO_NAME=$(echo $INFO | jq -r '.repository.name')
    # Run cookiecutter
    cookiecutter gh:simonw/python-lib --no-input \
      lib_name=$REPO_NAME \
      description="$(echo $INFO | jq -r .repository.description)" \
      github_username="$(echo $INFO | jq -r .repository.owner.login)" \
      author_name="$(echo $INFO | jq -r .repository.owner.name)"

The env: INFO: block exposes an environment variable called INFO to the step, populated with the output of the previous fetch-repo-and-user-details step—a string of JSON.

Then within the body of the step I use jq to extract out the details that I need—first the repository name:

export REPO_NAME=$(echo $INFO | jq -r '.repository.name')

Then I pass the other details directly to cookiecutter as arguments:

cookiecutter gh:simonw/python-lib --no-input \
  lib_name=$REPO_NAME \
  description="$(echo $INFO | jq -r .repository.description)" \
  github_username="$(echo $INFO | jq -r .repository.owner.login)" \
  author_name="$(echo $INFO | jq -r .repository.owner.name)"

jq -r ensures that the raw text value is returned by jq, as opposed to the JSON string value which would be wrapped in double quotes.

Cleaning up at the end

Running cookiecutter in this way creates a folder within the root of the repository that duplicates the repository name, something like this:

datasette-verify/datasette-verify

I actually want the contents of that folder to live in the root, so the next step I run is:

mv $REPO_NAME/* .
mv $REPO_NAME/.gitignore .
mv $REPO_NAME/.github .

Here’s my completed workflow.

This almost worked—but when I tried to run it for the first time I got this error:

![remote rejected] (refusing to allow an integration to create or update .github/workflows/publish.yml)

It turns out the credentials provided to GitHub Actions are forbidden from making modifications to their own workflow files!

I can understand why that limitation is in place, but it’s frustrating here. For the moment, my workaround is to do this just before pushing the final content back to the repository:

mv .github/workflows .github/rename-this-to-workflows

I leave it up to the user to rename that folder back again when they want to enable the workflows that have been generated for them.

Give these a go

I’ve set up three templates using this pattern now:

Each of these works the same way: enter a repository name and description, click “Create repository from template” and watch as GitHub copies the new repository and then, a few seconds later, runs the workflow to execute the cookiecutter template to replace the contents with the final result.

You can see examples of repositories that I created using these templates here:

This is Dynamic content for GitHub repository templates using cookiecutter and GitHub Actions by Simon Willison, posted on 28th August 2021.

Next: Building a desktop application for Datasette (and weeknotes)

Previous: Weeknotes: Getting my personal Dogsheep up and running again