Django: Pinpoint upstream changes with Git


Django’s release notes are extensive and describe nearly all changes. Still, when upgrading between Django versions, you may encounter behaviour changes that are hard to relate to any particular release note.

To understand whether a change is expected or a regression, you can use Django’s Git repository to search the commits between versions. I often do this when upgrading client projects, at least the larger ones.

In this post, we’ll cover Django’s branching structure, determining and searching through those commits, a worked example, and advanced behavioural searching with git bisect.

Django’s branching structure

Most open source projects use a single main branch, tagged for release when appropriate. However, because Django maintains support for multiple feature versions simultaneously, its branching structure is more complicated. Here’s an example:

* main
|
| * stable/5.0.x
| |
⋮ ⋮
|/
|
*
⋮
* * stable/4.2.x
| |
⋮ ⋮
|/
|
*
⋮

There’s a main branch, representing the future version of Django, and stable/<version>.x branches representing released versions (at least, released in alpha). When it is time for an alpha release of a new version, a new stable/<version>.x branch is created from main.

Commits are always merged to main first. They may then be copied onto the relevant stable/<version>.x branches with git cherry-pick, a process known as backporting, if the merger deems it appropriate (mergers are typically the Django Fellows). Typically, only bug fixes are backported, in line with Django’s Supported Versions Policy.

(If you’re particularly interested, the backporting script is hosted in Django’s wiki.)
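Backported commits are easy to recognize in the log. Here's a minimal sketch that picks them apart, assuming the message conventions visible in Django's history: a `[5.0.x]`-style subject prefix and a "Backport of <sha> from main." line. The `Backport` type and `parse_backport()` helper are hypothetical names for illustration, not part of Django's tooling.

```python
import re
from typing import NamedTuple, Optional


class Backport(NamedTuple):
    branch: str  # e.g. "5.0.x", taken from the "[5.0.x]" subject prefix
    original: str  # SHA of the commit on main that was cherry-picked


# Assumed conventions: a "[<version>.x] " prefix and a "Backport of <sha> from main." line.
PREFIX_RE = re.compile(r"^\[(\d+\.\d+\.x)\] ")
TRAILER_RE = re.compile(
    r"^\s*Backport of ([0-9a-f]{7,40}) from main\.?\s*$", re.MULTILINE
)


def parse_backport(message: str) -> Optional[Backport]:
    """Return backport details, or None if the message doesn't look like one."""
    prefix = PREFIX_RE.match(message)
    trailer = TRAILER_RE.search(message)
    if prefix is None or trailer is None:
        return None
    return Backport(branch=prefix.group(1), original=trailer.group(1))


message = """[5.0.x] Refs #34380 -- Added FORMS_URLFIELD_ASSUME_HTTPS transitional setting.

This allows early adoption of the new default "https".

Backport of a4931cd75a1780923b02e43475ba5447df3adb31 from main.
"""
print(parse_backport(message))
```

The helper expects the raw commit message, such as the output of git log --format=%B, rather than the indented form that plain git log prints.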

Clone and update Django’s repository

Before inspecting inter-version history, ensure you have an up-to-date clone of Django’s repository. If you’re cloning fresh, you’re fine. But if you have an existing clone, you want to update any local stable/* branches so they include all backported commits. Here’s a short command using git for-each-ref and a while loop to run git pull on all local stable/ branches:

$ git for-each-ref 'refs/heads/stable/*' --format="%(refname:short)" | \
while read -r entry
do
  git switch "$entry"
  git pull
done

For example, here’s what I see when I run it when all branches are up-to-date:

Switched to branch 'stable/3.0.x'
Your branch is up to date with 'upstream/stable/3.0.x'.
Already up to date.
Switched to branch 'stable/3.1.x'
Your branch is up to date with 'upstream/stable/3.1.x'.
Already up to date.
...
Switched to branch 'stable/5.0.x'
Your branch is up to date with 'upstream/stable/5.0.x'.
Already up to date.

Find changes between versions

To see the changes between versions <old> and <new>, we want commits starting at the point <old> branched from main and ending at the tip of the <new> branch. This start point is what Git calls the merge base between <old> and main, as it’s the base that a merge between <old> and main would use for conflicts. Git’s git merge-base command can report the merge base between two branches. For example, the point that Django 4.2 branched from main is:

$ git merge-base main stable/4.2.x
9409312eef72d1263dae4b0303523260a54010c5

We can double-check with git show:

$ git show --stat 9409312eef72d1263dae4b0303523260a54010c5
commit 9409312eef72d1263dae4b0303523260a54010c5
Author: Mariusz Felisiak <...>
Date:   Sun Jan 15 19:12:57 2023 +0100

    Updated man page for Django 4.2 alpha.

 docs/man/django-admin.1 | 434 ++++++++++++++++++++++++++++++++++++++++++++++++++++----------------------------------------------------------------------------------------------------------
 1 file changed, 142 insertions(+), 292 deletions(-)

Yup, that looks right—it’s ex-Fellow Mariusz preparing for the Django 4.2 alpha release right before creating stable/4.2.x.
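If the merge base idea feels abstract, a toy model may help. The sketch below uses a made-up history (the commit names and the `PARENTS` mapping are invented, and it assumes Python 3.9+) to show how the merge base falls out of the commit graph, and which commits lie between that point and the tip of a newer branch. It illustrates the concept only; Git's real implementation is more sophisticated.

```python
# Toy history: each commit maps to its parents.
#
#   a - b - c - d - e - f     (main)
#        \           \
#         o1 - o2     n1      (old and new stable branches)
PARENTS = {
    "a": [],
    "b": ["a"],
    "c": ["b"],
    "d": ["c"],
    "e": ["d"],
    "f": ["e"],
    "o1": ["b"],
    "o2": ["o1"],
    "n1": ["e"],
}


def ancestors(commit: str) -> set[str]:
    """All commits reachable from `commit`, including itself."""
    seen: set[str] = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def merge_base(left: str, right: str) -> str:
    """The common ancestor that is not an ancestor of another common ancestor."""
    common = ancestors(left) & ancestors(right)
    best = [
        c
        for c in common
        if not any(c != other and c in ancestors(other) for other in common)
    ]
    return best[0]


def commits_between(start: str, end: str) -> set[str]:
    """Commits reachable from `end` but not from `start`."""
    return ancestors(end) - ancestors(start)


base = merge_base("f", "o2")  # where the old branch forked from main
print(base)
print(sorted(commits_between(base, "n1")))
```

Note how the result includes the main commits made after the old branch forked, plus the backport on the new branch, but none of the old branch's own backports.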

For the complete log between <old> and <new>, use git log with git merge-base to template the start point. Use the two-dot range syntax to select commits between two points. So, from Django 4.2 to 5.0:

$ git log --oneline $(git merge-base main stable/4.2.x)..stable/5.0.x

For the diff between versions, use git diff with its three-dot syntax between the versions. Note that Git is inconsistent here: three dots for diff mean the same as two dots for log.

$ git diff $(git merge-base main stable/4.2.x)...stable/5.0.x

Both commands give an overwhelming number of changes (go Django contributors!). They get more useful when you add some filtering options. For example, use git log -S to narrow commits to those that added or removed a given string:

$ git log -S FORMS_URLFIELD_ASSUME_HTTPS $(git merge-base main stable/4.2.x)..stable/5.0.x
commit 92af3d4d235448446e53e982275315bedcc4c204
Author: Mariusz Felisiak <...>
Date:   Tue Nov 28 20:04:21 2023 +0100

    [5.0.x] Refs #34380 -- Added FORMS_URLFIELD_ASSUME_HTTPS transitional setting.

    This allows early adoption of the new default "https".

    Backport of a4931cd75a1780923b02e43475ba5447df3adb31 from main.

Or use git diff with a pathspec to limit the diff to particular files:

$ git diff $(git merge-base main stable/4.2.x)...stable/5.0.x -- 'django/contrib/admin/*.css'
diff --git django/contrib/admin/static/admin/css/base.css django/contrib/admin/static/admin/css/base.css
index 72f4ae169b3..44f2fc8802e 100644
--- django/contrib/admin/static/admin/css/base.css
+++ django/contrib/admin/static/admin/css/base.css
@@ -22,11 +22,11 @@ :root {

     --breadcrumbs-fg: #c4dce8;
     --breadcrumbs-link-fg: var(--body-bg);
-    --breadcrumbs-bg: var(--primary);
+    --breadcrumbs-bg: #264b5d;
...

There are many other useful git log and git diff options for narrowing down changes. I cover my top picks in Boost Your Git DX.
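If you find yourself retyping these commands, a small wrapper can help. This is only a sketch, assuming Python 3.9+ and git on your PATH; `merge_base()` and `log_command()` are hypothetical helper names, and the options they pass are the same ones used in the commands above.

```python
import subprocess
from typing import Optional, Sequence


def merge_base(repo: str, old_branch: str, main: str = "main") -> str:
    """Ask Git where `old_branch` forked from `main`."""
    result = subprocess.run(
        ["git", "-C", repo, "merge-base", main, old_branch],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def log_command(
    base: str,
    new_branch: str,
    pickaxe: Optional[str] = None,
    paths: Sequence[str] = (),
) -> list[str]:
    """Build a `git log` command for the commits in base..new_branch."""
    command = ["git", "log", "--oneline", f"{base}..{new_branch}"]
    if pickaxe is not None:
        command += ["-S", pickaxe]  # only commits that add or remove this string
    if paths:
        command += ["--", *paths]  # pathspec limiting
    return command


# Usage sketch, with a placeholder where the real merge base would go:
print(log_command("<base>", "stable/5.0.x", pickaxe="FORMS_URLFIELD_ASSUME_HTTPS"))
```

Keeping the command builder separate from the function that calls Git makes it easy to print and check the command before running it with subprocess.run().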

A worked example

I’ve recently been working on upgrading a client project from Django 4.1 to 4.2. One change I found was that an admin unit test started failing its assertNumQueries() assertion. The test looked like this:

class BookAdminTest(TestCase):
    ...

    def test_save_change_list_view_num_of_queries(self) -> None:
        ...  # Some setup

        with self.assertNumQueries(7):
            """
            1. SAVEPOINT ...
            2. SELECT "django_session" ...
            3. SELECT "core_user" ...
            4. SELECT COUNT(*) AS "__count" FROM "library_book"
            5. SELECT COUNT(*) AS "__count" FROM "library_book"
            6. SELECT "library_book" ...
            7. RELEASE SAVEPOINT ...
            """
            self.client.post("/admin/library/book/", ...)

And the failure message looked like this:

E   AssertionError: 9 != 7 : 9 queries executed, 7 expected
E   Captured queries were:
E   1. SAVEPOINT ...
E   2. SELECT "django_session" ...
E   3. SELECT "core_user" ...
E   4. SELECT COUNT(*) AS "__count" FROM "library_book"
E   5. SELECT COUNT(*) AS "__count" FROM "library_book"
E   6. SELECT "library_book" ...
E   7. SAVEPOINT ...
E   8. RELEASE SAVEPOINT ...
E   9. RELEASE SAVEPOINT ...

There were two extra queries. Thankfully, the test author had diligently copied simplified versions of the queries into a comment. I could easily compare and see the new queries were #7, a SAVEPOINT, and #8, a RELEASE SAVEPOINT. These are SQL for a nested transaction, and in Django we typically use transaction.atomic() to create transactions, nested or not.
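For longer query lists, comparing by eye gets tedious. One rough way to automate the comparison is with collections.Counter, using the simplified queries from the test's comment. This sketch ignores ordering, so it tells you which queries are extra but not where they ran.

```python
from collections import Counter

expected = [
    "SAVEPOINT ...",
    'SELECT "django_session" ...',
    'SELECT "core_user" ...',
    'SELECT COUNT(*) AS "__count" FROM "library_book"',
    'SELECT COUNT(*) AS "__count" FROM "library_book"',
    'SELECT "library_book" ...',
    "RELEASE SAVEPOINT ...",
]
# The captured queries: the same first six, then the new nested transaction.
actual = expected[:6] + [
    "SAVEPOINT ...",
    "RELEASE SAVEPOINT ...",
    "RELEASE SAVEPOINT ...",
]

# Counter subtraction keeps only queries that appear more often in `actual`.
extra = Counter(actual) - Counter(expected)
print(dict(extra))
```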

I didn’t immediately spot any relevant release note for this extra transaction, so I checked Django’s history. I checked Django’s Git log for commits that:

  1. Were between version 4.1 and 4.2.
  2. Added or removed the string “atomic”, with git log -S.
  3. Affected django/contrib/admin, with pathspec limiting.

The combined command found precisely the responsible commit straight away:

$ git log $(git merge-base main stable/4.1.x)..stable/4.2.x -S atomic -- django/contrib/admin
commit 7a39a691e1e3fe13588c8885a222eaa6a4648d01
Author: Shubh1815 <...>
Date:   Sat Sep 24 15:42:28 2022 +0530

    Fixed #32603 -- Made ModelAdmin.list_editable use transactions.

Looking at the commit, it turned out it added a pretty clear release note:

$ git show 7a39a691e1e3fe13588c8885a222eaa6a4648d01
commit 7a39a691e1e3fe13588c8885a222eaa6a4648d01
Author: Shubh1815 <shubhparmar14@gmail.com>
Date:   Sat Sep 24 15:42:28 2022 +0530

    Fixed #32603 -- Made ModelAdmin.list_editable use transactions.

...
diff --git docs/releases/4.2.txt docs/releases/4.2.txt
index 5a849cbbe5..5774bfef7b 100644
--- docs/releases/4.2.txt
+++ docs/releases/4.2.txt
@@ -51,6 +51,9 @@ Minor features
 * The ``admin/base.html`` template now has a new block ``nav-breadcrumbs``
   which contains the navigation landmark and the ``breadcrumbs`` block.

+* :attr:`.ModelAdmin.list_editable` now uses atomic transactions when making
+  edits.
+
 :mod:`django.contrib.admindocs`
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

...

I must have skimmed over it, whoops! Well, at least I had verified that the behaviour change was expected. I opted to update the test’s expected query count and comment:

class BookAdminTest(TestCase):
    ...

    def test_save_change_list_view_num_of_queries(self) -> None:
        ...  # Some setup

        with self.assertNumQueries(9):
            """
            1. SAVEPOINT ...
            2. SELECT "django_session" ...
            3. SELECT "core_user" ...
            4. SELECT COUNT(*) AS "__count" FROM "library_book"
            5. SELECT COUNT(*) AS "__count" FROM "library_book"
            6. SELECT "library_book" ...
            7. SAVEPOINT ...
            8. RELEASE SAVEPOINT ...
            9. RELEASE SAVEPOINT ...
            """
            self.client.post("/admin/library/book/", ...)

Search by behaviour with git bisect

Searching through the log and diff is hard when you don’t know which files to look at or strings to search for. If you’re observing a behaviour change and don’t know its cause, try using git bisect to find the responsible commit.

See my git bisect basics post for an introduction to the command. Here, we’ll discuss the specifics of bisecting Django in your project.
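As a quick intuition for why bisecting scales so well, here's a toy model of the search. It assumes a linear history where the behaviour flips exactly once; the commit names and the `first_new()` helper are made up for illustration, and real Git histories (with merges and skipped commits) need more care.

```python
from typing import Callable, Sequence


def first_new(commits: Sequence[str], is_new: Callable[[str], bool]) -> str:
    """Find the first commit showing the new behaviour.

    Assumes commits[0] is "old", commits[-1] is "new", and the behaviour
    flips exactly once in between.
    """
    low, high = 0, len(commits) - 1  # low is known old, high is known new
    while high - low > 1:
        middle = (low + high) // 2
        if is_new(commits[middle]):
            high = middle
        else:
            low = middle
    return commits[high]


history = [f"c{n}" for n in range(1, 1001)]  # 1,000 made-up commits
tested = []


def check(commit: str) -> bool:
    """Stand-in for running your behaviour test on one commit."""
    tested.append(commit)
    return int(commit[1:]) >= 638  # pretend c638 introduced the change


print(first_new(history, check), len(tested))
```

Each test halves the remaining range, so a thousand commits need only about ten tests.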

First, use an editable install of your local Django repository within the target project. With Pip, use pip install -e:

$ python -m pip install -e ~/Projects/django

Replace ~/Projects/django with the path to your clone of the Django repository.
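To confirm that your project now imports Django from the clone, you can check where the module resolves from. Here's a minimal sketch using the standard library's importlib.util.find_spec(); `module_origin()` is a hypothetical helper name. Run it with your project's Python interpreter.

```python
import importlib.util
from typing import Optional


def module_origin(name: str) -> Optional[str]:
    """Return the file a top-level module would be imported from, if any."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# With the editable install active, this should point inside your clone,
# for example a path ending in django/__init__.py under ~/Projects/django.
print(module_origin("django"))
```

If the printed path sits inside site-packages instead, the editable install probably isn't the one being picked up.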

Second, ensure you can run your behaviour test on both Django versions. That behaviour test might be loading a page under ./manage.py runserver, a unit test with ./manage.py test or pytest, or some other command. Your project must work sufficiently on both Django versions before you can bisect between them.

Switch your Django repository to the older version:

$ git switch stable/4.2.x

Then, run the behaviour test in your project:

$ ./manage.py runserver
...
[24/Apr/2024 05:25:59] "GET / HTTP/1.1" 200 1797

Similarly, repeat on the newer version:

$ git switch stable/5.0.x
$ ./manage.py runserver
...
[24/Apr/2024 05:26:25] "GET / HTTP/1.1" 200 1761

If your test does not run smoothly on both versions, modify the project until it does. Typically, this means acting on deprecation warnings from the old version. But in the worst case, you may need to fork code between the old and new versions, especially if you’re trying to upgrade many versions. Try this pattern for forking based on Django’s version tuple:

import django

if django.VERSION >= (5, 0):
    # do the new thing
    ...
else:
    # do the old thing
    ...

Third, run the bisect. In the Django repository, start the bisect and label the old and new versions like so:

$ git bisect start
$ git bisect old $(git merge-base main stable/4.2.x)
$ git bisect new stable/5.0.x

Do the usual thing of iterating with the old and new subcommands until you find the responsible commit. Remember to finish up with git bisect reset.
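If your behaviour test is a command, you may be able to automate the iteration with git bisect run, which marks each commit based on a script's exit code: 0 for old, 1 to 127 (except 125) for new, and 125 to skip. Below is a rough sketch of such a script's core. The function names are hypothetical, and you'd supply your own project path and test command.

```python
import subprocess
from typing import Sequence


def bisect_exit_code(test_returncode: int) -> int:
    """Translate a test command's return code for `git bisect run`.

    Exit code 0 marks the current commit as old, and 1 marks it as new.
    """
    return 0 if test_returncode == 0 else 1


def run_check(project_dir: str, test_command: Sequence[str]) -> int:
    """Run the behaviour test inside the project and return a bisect exit code."""
    try:
        result = subprocess.run(list(test_command), cwd=project_dir)
    except OSError:
        # The command could not even start: 125 asks Git to skip this commit.
        return 125
    return bisect_exit_code(result.returncode)
```

Saved as, say, check.py with import sys and a final line like sys.exit(run_check("/path/to/project", ["./manage.py", "test"])), it could drive the search with git bisect run python /path/to/check.py, using your project's Python. Beware that any failure counts as "new" here, including ones unrelated to the behaviour you're chasing, so a focused test works best.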

Finally, roll back your project to use the non-editable install of Django. For example, with Pip:

$ python -m pip install -r requirements.txt

Fin

See you in the Django Git log,

—Adam

