I’m available for freelance work. Let’s talk »

Working with many git repos

Monday 12 October 2020

Some of my work on the Open edX team at edX requires working with the three dozen or so repos that form the backbone of the Open edX software. That often means doing the same thing to all of them (tagging, logs, etc).

To make it easier to work with a collection of repos, I have a shell function to run the same command on the git directories found under the current directory. It gets a little more complicated than that: I might have 100 repos in the current directory, but only the ones that have certain master branches should be included in an operation.

My function is called “gittreeif”: it takes a branch name and a command, and walks the current directory tree looking for git repos that have that branch. For each one, it executes the command:

$ gittreeif origin/juniper.master git status

I also define “gittree”, which runs on every repo regardless of its branches.

Here is the definition of gittreeif. Put it in your shell startup file (.bashrc, .zshrc, whatever):

# Run a command for every repo found somewhere beneath the current directory.
#
#   $ gittree git fetch --all --prune
#
# To only run commands in repos with a particular branch, use gittreeif:
#
#   $ gittreeif branch_name git fetch --all --prune
#
# If the command has subcommands that need to run in each directory, quote the
# entire command:
#
#   $ gittreeif origin/foo 'git log --format="%s" origin/foo ^$(git merge-base origin/master origin/foo)'
#
# The directory name is printed before each command.  Use -q to suppress this,
# or -r to show the origin remote url instead of the directory name.
#
#   $ gittreeif origin/foo -q git status
#
gittreeif() {
    local test_branch="$1"
    shift
    local show_dir=true show_repo=false
    if [[ $1 == -r ]]; then
        # -r means, show the remote url instead of the directory.
        shift
        local show_dir=false show_repo=true
    fi
    if [[ $1 == -q ]]; then
        # -q means, don't echo the separator line with the directory.
        shift
        local show_dir=false show_repo=false
    fi
    find . -name .git -type d -prune | while read d; do
        local d=$(dirname "$d")
        git -C "$d" rev-parse --verify -q "$test_branch" >& /dev/null || continue
        if [[ $show_dir == true ]]; then
            echo "---- $d ----"
        fi
        if [[ $show_repo == true ]]; then
            echo "----" $(git -C "$d" config --get remote.origin.url) "----"
        fi
        if [[ $# == 1 && $1 == *' ']]; then
            (cd "$d" && eval "$1")
        else
            (cd "$d" && "$@")
        fi
    done
}

gittree() {
    # @ is in every repo, so this runs on all repos
    gittreeif @ "$@"
}

Let’s say I want to summarize the changes between two tags. Here’s a convenient alias to put in your ~/.gitconfig:

[alias]
    relnotes = log --pretty='%h %ad %an: %s' --date=short --no-merges

The git command to show the changes between “old-commit” and “new-commit” is:

git log new-commit ^old-commit

Putting it all together: to see the changes between juniper.2 and juniper.3 in all the repos that have Juniper branches, using “relnotes” to get the summary style I like:

$ gittreeif \
    open-release/juniper.master \
    git relnotes open-release/juniper.3 ^open-release/juniper.2
---- ./ecommerce ----
 ca9cddb4 2020-08-05 Ned Batchelder: Upgrade Django to 2.2.15
---- ./devstack ----
 10f02ca 2020-08-17 Zachary Trabookis: Remove `xqueue` as `DEFAULT_SERVICES` for
 8ff8dd0 2020-08-17 Zachary Trabookis: Make additional adjustments to the docume
 57455fe 2020-08-10 Zachary Trabookis: Add `xqueue` to default services to provi
 3ca4c9d 2020-07-29 Zachary Trabookis: Make sure to pass in `DOCKER_COMPOSE_FILE
 cef4aa2 2020-07-28 Zachary Trabookis: Updated `README` to include necessary inf
 9415683 2020-07-27 Zachary Trabookis: Update `docker` commands to be `docker-co
 67c7c9b 2020-08-16 morenol: Do not use openedx release for registrar and edx-mk
 56312bc 2020-08-04 Guruprasad Lakshmi Narayanan: Remove duplicate section
 34a46a3 2020-07-24 Guruprasad Lakshmi Narayanan: Remove the non-release service
---- ./xqueue ----
 f004caa 2020-08-05 Ned Batchelder: Upgrade Django to 2.2.15
---- ./edx-e2e-tests ----
---- ./edx-platform ----
 d9e0ca5e70 2020-08-12 Ali-D-Akbar: This commit contains security fixes for the
 c8421f66fc 2020-08-07 uzairr: Fix xss vulnerabilities in templates
 47ab6af637 2020-08-06 Attiya Ishaque: [YONK-1759] Version bump of studio-fronte
 8dd78619c9 2020-08-05 Ned Batchelder: Upgrade Django to 2.2.15
 b295389e96 2020-07-23 Zachary Trabookis: Set `SESSION_COOKIE_SAMESITE=Lax` for
 91af099933 2020-07-23 uzairr: Fix xss in templates
 0e45ecb743 2020-07-22 Ali-D-Akbar: Sustaining xss fixes This commit contains xs
 3757f0d11e 2020-07-06 Florian Haas: Fix profile image URLs for image storage on
---- ./edx-analytics-pipeline ----
---- ./repo-tools/repo-tools ----
---- ./edx-notes-api ----
 ad53edd 2020-08-05 Ned Batchelder: Upgrade Django to 2.2.15
---- ./cs_comments_service ----
 3079804 2020-08-19 Samuel Walladge: Bump codecov to latest version
---- ./course-discovery ----
 e984f273 2020-08-05 Ned Batchelder: Upgrade Django to 2.2.15
---- ./credentials ----
 7a7aab55 2020-08-05 Ned Batchelder: Upgrade Django to 2.2.15
---- ./src/edx-analytics-configuration ----
---- ./src/edx-documentation ----
---- ./src/configuration ----
 05bb4edcf 2020-08-24 Feanil Patel: Improve sandboxing. (#5953) (#5960)
 860994c0d 2020-08-21 Feanil Patel: Timmc/codejail improvements (#5956)
---- ./src/enterprise-catalog ----
 f886da6 2020-08-05 Ned Batchelder: Upgrade Django to 2.2.15
---- ./src/blockstore ----
---- ./src/edx-analytics-data-api ----
 64b4c7f 2020-08-05 Ned Batchelder: Upgrade Django to 2.2.15
---- ./src/frontend-app-publisher ----
---- ./src/edx-app-android ----
---- ./src/notifier ----
---- ./src/edx-analytics-dashboard ----
 b8dfa559 2020-08-05 Ned Batchelder: Upgrade Django to 2.2.15
---- ./src/frontend-app-support-tools ----
---- ./src/edx-app-ios ----
---- ./src/edx-demo-course ----
---- ./src/ecommerce-worker ----
---- ./src/frontend-app-learning ----
---- ./src/edx-certificates ----
---- ./src/frontend-app-profile ----
---- ./src/license-manager ----
 85003a6 2020-08-05 Ned Batchelder: Upgrade to Django 2.2.15
---- ./src/testeng-ci ----
---- ./src/frontend-app-gradebook ----
---- ./src/edx-developer-docs ----
---- ./src/frontend-app-account ----

This is how I do it. There are probably other tools to do the same job. Maybe someone will point them out... :)

Comments

[gravatar]
This is very similar to how my 'allgit' tool works: https://github.com/inventhouse/allgit

Allgit has a bunch of other useful filters, like '-m' to run on repositories with locally modified files, and '-t' to run an arbitrary command or script to decide whether to include a repo.

It also has powerful workflows to checkout out a suite of branches if you have repositories with different branch naming schemes or branching schedules that need to be in a coherent state, and much more.

Check it out, and ask if you have any questions!
[gravatar]
I wrote a command line tool to manage multiple repos.

https://github.com/nosarthur/gita

It can batch delegate git commands/aliases from any working directory. It also has the concept of group and context so that you can run command to a group of repos.

Add a comment:

Ignore this:
Leave this empty:
Name is required. Either email or web are required. Email won't be displayed and I won't spam you. Your web site won't be indexed by search engines.
Don't put anything here:
Leave this empty:
Comment text is Markdown.