Python: Diffing unit tests to keep copy-pasted code in sync

2024-04-26 Snip snip, undo, cut, copy, paste.

Copy-paste-tweaking library code feels like a dirty but inevitable programming practice. Often driven by deadlines or other constraints, it seems all projects end up with something copy-pasted in and tweaked for one specific use case.

When we find ourselves doing this, it’s essential to consider the long-term maintenance of those copies. After all, “software engineering is programming integrated over time” (see previously). We want to add a defence that alerts us to any relevant upstream changes. But since that is hard to do robustly, it is often omitted.

One approach is to maintain a fork, but that is heavy-handed and requires per-release maintenance. In this post, we’ll cover an alternative I recently tried, using a unit test. This test asserts that the diff between the upstream code and our project’s copy-pasted version is constant. The test fails if either version changes, smoothing upgrades and ensuring we consider any further tweaks.

A Djangoey example

I recently worked on a Django project that heavily extends Django’s admin site. Most of these extensions were done as usual, extending classes or templates as required. However, one case needed a copy-paste-tweak of the upstream “fieldset” template used to render form fields. That tweak looks something like this:

                 {% if field.is_readonly %}
                     <div class="readonly">{{ field.contents }}</div>
                 {% else %}
-                    {{ field.field }}
+                    {% block field %}
+                        {{ field.field }}
+                    {% endblock %}
                 {% endif %}
             {% endif %}
         </div>

The extra {% block %} allows extending templates to modify the rendering of select fields.

When upgrading to a later Django version, the upstream template and corresponding CSS changed. That caused the tweaked template to render incorrectly since it still had the old base. In particular, the fields stopped stacking horizontally, leading to some unusably lengthy pages.

The fix was to integrate the upstream changes into the copied template. Doing so revealed that some smaller changes had also been missed from previous Django versions. I added a diffing unit test like the one below to ensure future upstream changes will not be missed.

import difflib
import re
from pathlib import Path
from textwrap import dedent

import django
from django.conf import settings
from django.test import SimpleTestCase


class CopiedTemplateTests(SimpleTestCase):
    """
    Tests to check synchronization of templates that we’ve copy-paste-tweaked
    from Django. These tests fail when either version changes, so we may need
    to integrate upstream changes before regenerating the included diffs.

    Get updated diffs on failure by using pytest --pdb and print(diff).
    """

    def test_admin_includes_fieldset(self):
        # read_text() opens and closes each file, avoiding leaked file handles.
        upstream_version = (
            (
                Path(django.__path__[0])
                / "contrib/admin/templates/admin/includes/fieldset.html"
            )
            .read_text()
            .splitlines(keepends=True)
        )
        our_version = (
            (settings.BASE_DIR / "templates/admin/includes/fieldset.html")
            .read_text()
            .splitlines(keepends=True)
        )
        diff = "".join(
            difflib.unified_diff(
                upstream_version, our_version, fromfile="upstream", tofile="ours"
            )
        )
        diff = re.sub(r"^ \n", "\n", diff, flags=re.MULTILINE)
        expected_diff = dedent(
            """\
            --- upstream
            +++ ours
            @@ -17,7 +17,9 @@
                                             {% if field.is_readonly %}
                                                 <div class="readonly">{{ field.contents }}</div>
                                             {% else %}
            -                                    {{ field.field }}
            +                                    {% block field %}
            +                                        {{ field.field }}
            +                                    {% endblock %}
                                             {% endif %}
                                         {% endif %}
                                     </div>
            """
        )
        assert diff == expected_diff

Here’s how the test works:

The two template files are read into lists of lines using pathlib. The path for the upstream version is computed from the Django module’s __path__ attribute, which will be inside the project’s virtual environment. The project version uses Django’s BASE_DIR setting, which points at the project root.
The diff between the two versions is computed using Python’s difflib.unified_diff(). It’s neat this is built-in!
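The diff is modified with a regular expression to strip the whitespace on blank lines. A unified diff renders a blank context line as a single space followed by a newline, whereas the expected diff has completely empty lines there (and textwrap.dedent() reduces whitespace-only lines to bare newlines anyway), so this step makes the two compatible.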
The diff is compared with its expected version. To keep the expected diff inside the test without weird indentation, its multiline string is dedented with textwrap.dedent().
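
To see those steps in isolation, here is a minimal sketch that runs without Django. The two lists of lines are hypothetical stand-ins for the template files, and make_diff() is a name invented for this illustration rather than part of the project's test:

```python
import difflib
import re
from textwrap import dedent


def make_diff(upstream_lines, our_lines):
    """Return a unified diff string with blank context lines left empty."""
    diff = "".join(
        difflib.unified_diff(
            upstream_lines, our_lines, fromfile="upstream", tofile="ours"
        )
    )
    # A blank context line comes out as " \n"; reduce it to "\n" so the
    # expected diff can be stored without trailing whitespace.
    return re.sub(r"^ \n", "\n", diff, flags=re.MULTILINE)


# Hypothetical stand-ins for the upstream and project template files.
upstream = ["<div>\n", "\n", "  {{ field.field }}\n", "</div>\n"]
ours = [
    "<div>\n",
    "\n",
    "  {% block field %}\n",
    "    {{ field.field }}\n",
    "  {% endblock %}\n",
    "</div>\n",
]

expected_diff = dedent(
    """\
    --- upstream
    +++ ours
    @@ -1,4 +1,6 @@
     <div>

    -  {{ field.field }}
    +  {% block field %}
    +    {{ field.field }}
    +  {% endblock %}
     </div>
    """
)

assert make_diff(upstream, ours) == expected_diff
```

Changing any line in either list makes the final assertion fail, which is the behaviour the real test relies on.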

When the test fails, under pytest, it looks like this:

>       assert diff == expected_diff
E       AssertionError: assert '--- upstream...     </div>\n' == '--- upstream...     </div>\n'
E
E         Skipping 326 identical leading characters in diff, use -v to show
E         - lock fields %}
E         ?           -
E         + lock field %}
E           +                                        {{ field.field }}
E           +                                    {% endblock %}...
E
E         ...Full output truncated (3 lines hidden), use '-vv' to show

This “diff of diffs” isn’t the easiest to read, but it at least gives an idea of where the unexpected differences lie. Unfortunately, the failure can’t differentiate whether the upstream or project version changed, but that should be obvious in most situations.

Per the docstring, the updated diff can be retrieved by running pytest with its --pdb option and print(diff):

$ pytest --pdb example/tests.py
========================= test session starts =========================
...
example/tests.py:55: AssertionError
>>>>>>>>>>>>>>>>>>>>>>>>>>>> entering PDB >>>>>>>>>>>>>>>>>>>>>>>>>>>>>

>>>>>>>>>>>>>> PDB post_mortem (IO-capturing turned off) >>>>>>>>>>>>>>
> /.../example/tests.py(55)test_admin_includes_fieldset()
-> assert diff == expected_diff
(Pdb) print(diff)
--- upstream
+++ ours
@@ -17,7 +17,9 @@
                                 {% if field.is_readonly %}
                                     <div class="readonly">{{ field.contents }}</div>
                                 {% else %}
-                                    {{ field.field }}
+                                    {% block field %}
+                                        {{ field.field }}
+                                    {% endblock %}
                                 {% endif %}
                             {% endif %}
                         </div>

(Pdb)

This can then be copy-pasted back into the test file.
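
If dropping into pdb is inconvenient, for example on CI, one possible variation is to put the current diff into the failure message itself. The sketch below assumes a hypothetical helper called assert_diff_unchanged(), which is not part of the test above, used in place of the bare assert:

```python
def assert_diff_unchanged(diff, expected_diff):
    """Fail with the full current diff in the message, ready to copy-paste.

    Hypothetical helper: a drop-in replacement for `assert diff == expected_diff`.
    """
    if diff != expected_diff:
        raise AssertionError(
            "Diff has changed. Check upstream for changes to integrate, "
            "then update the expected diff to:\n\n" + diff
        )


# Usage inside a test method, replacing the final assert:
#     assert_diff_unchanged(diff, expected_diff)
```

The trade-off is that pytest no longer shows its "diff of diffs", so this suits cases where the full replacement text is more useful than the pinpointed difference.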

With this test in place, I am confident that the project will merge future upstream changes to this template.

Diffing classes and functions

This approach can be adapted to copy-paste-tweaked classes or functions by using Python’s inspect module to gather their source code. Whilst I’d normally recommend subclassing, or patching with patchy, diffing could be helpful when edits to the middle of a function are required. Below is an imagined example with a modified copy of Django’s timesince() function, which backs the timesince template filter.

import difflib
import inspect
import re
from textwrap import dedent

from django.test import SimpleTestCase
from django.utils.timesince import timesince as upstream_timesince

from example.timesince import timesince as our_timesince


class CopiedFunctionTests(SimpleTestCase):
    """
    Tests to check synchronization of functions that we’ve copy-paste-tweaked.
    These tests fail when either version changes, so we may need to integrate
    upstream changes before regenerating the included diffs.

    Get updated diffs on failure by using pytest --pdb and print(diff).
    """

    def test_timesince(self):
        upstream_version = inspect.getsource(upstream_timesince).splitlines(
            keepends=True
        )
        our_version = inspect.getsource(our_timesince).splitlines(keepends=True)
        diff = "".join(
            difflib.unified_diff(
                upstream_version, our_version, fromfile="upstream", tofile="ours"
            )
        )
        diff = re.sub(r"^ \n", "\n", diff, flags=re.MULTILINE)
        expected_diff = dedent(
            """\
            --- upstream
            +++ ours
            @@ -45,6 +45,10 @@
                 if reversed:
                     d, now = now, d
                 delta = now - d
            +
            +    # Return “Now” for small differences.
            +    if -10 <= delta.total_seconds() <= 10:
            +        return "Now"

                 # Ignore microseconds.
                 since = delta.days * 24 * 60 * 60 + delta.seconds
            """
        )
        assert diff == expected_diff

This test works similarly to the one before. The difference is that each function's source code is retrieved using inspect.getsource().
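
To try the inspect.getsource() step without Django, here is a small self-contained sketch. It writes two hypothetical modules to a temporary directory, since inspect.getsource() needs real source files to read from, then diffs a function from each. The module names, the greet() function, and the load_module() and function_diff() helpers are all invented for this illustration:

```python
import difflib
import importlib.util
import inspect
import tempfile
from pathlib import Path

# Hypothetical "upstream" function and our tweaked copy of it.
UPSTREAM_SOURCE = '''\
def greet(name):
    message = "Hello " + name
    return message
'''

OUR_SOURCE = '''\
def greet(name):
    message = "Hello " + name
    message += "!"
    return message
'''


def load_module(name, path):
    """Import a module from a file path so its functions have source files."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def function_diff(upstream_func, our_func):
    """Diff two functions' source code, as retrieved by inspect.getsource()."""
    upstream_lines = inspect.getsource(upstream_func).splitlines(keepends=True)
    our_lines = inspect.getsource(our_func).splitlines(keepends=True)
    return "".join(
        difflib.unified_diff(
            upstream_lines, our_lines, fromfile="upstream", tofile="ours"
        )
    )


with tempfile.TemporaryDirectory() as tmp:
    upstream_path = Path(tmp) / "upstream_mod.py"
    our_path = Path(tmp) / "our_mod.py"
    upstream_path.write_text(UPSTREAM_SOURCE)
    our_path.write_text(OUR_SOURCE)
    upstream_mod = load_module("upstream_mod", upstream_path)
    our_mod = load_module("our_mod", our_path)
    # Compute the diff while the source files still exist on disk.
    diff = function_diff(upstream_mod.greet, our_mod.greet)

print(diff)
```

Note that inspect.getsource() returns only the function's own lines, so changes elsewhere in the upstream module, such as to helpers the function calls, would not show up in the diff.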

Fin

Let me know if you try this technique and how well it works.

Never split the difference,

—Adam

Tags: django, python