Django’s Field Choices Don’t Constrain Your Data


This post is a PSA on the somewhat unintuitive way Field.choices works in Django.

Take this Django model definition:

from django.db import models


class Status(models.TextChoices):
    UNPUBLISHED = "UN", "Unpublished"
    PUBLISHED = "PB", "Published"


class Book(models.Model):
    status = models.CharField(
        max_length=2,
        choices=Status.choices,
        default=Status.UNPUBLISHED,
    )

    def __str__(self):
        return f"{self.id} - {Status(self.status).label}"

If we open up manage.py shell, we can easily create a Book with a given status choice:

In [1]: from core.models import Status, Book

In [2]: Book.objects.create(status=Status.UNPUBLISHED)
Out[2]: <Book: 1 - Unpublished>

The choices list constrains the value of status during model validation in Python:

In [3]: book = Book.objects.get(id=1)

In [4]: book.status = 'republished'

In [5]: book.full_clean()
---------------------------------------------------------------------------
ValidationError                           Traceback (most recent call last)
<ipython-input-7-e64237e0a92a> in <module>
----> 1 book.full_clean()

.../django/db/models/base.py in full_clean(self, exclude, validate_unique)
   1220
   1221         if errors:
-> 1222             raise ValidationError(errors)
   1223
   1224     def clean_fields(self, exclude=None):

ValidationError: {'status': ["Value 'republished' is not a valid choice."]}

This is great for ModelForms and other cases using validation. Users can’t select invalid choices, and they get a message explaining what’s wrong.

Unfortunately, it’s still easy for us, as developers, to write this invalid data to the database:

In [6]: book.save()

Whoops!

It’s also possible to update all our instances to an invalid status in one line:

In [8]: Book.objects.update(status="republished")
Out[8]: 1

So, what gives? Why does Django let us declare the set of choices we want the field to take, but then let us easily circumvent that?

Well, Django’s model validation is designed mostly for forms. It trusts that other code paths in your application “know what they’re doing.”

If we want to prevent this, the most general solution is to get the database itself to reject bad data. Not only will this make our Django code more robust, but any other applications using the database will be held to the same constraints.
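
To make “the database itself rejects bad data” concrete, here is a minimal sketch using Python’s built-in sqlite3 module directly, without Django. The table definition is hand-written for illustration and is an assumption; the SQL that Django generates for the model will differ in its details.

```python
import sqlite3

# In-memory database with a hand-written stand-in for the Book table.
# Assumption: this DDL only approximates what Django would create.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE book (
        id INTEGER PRIMARY KEY,
        status VARCHAR(2) NOT NULL,
        CONSTRAINT status_valid CHECK (status IN ('UN', 'PB'))
    )
    """
)

# A valid status is accepted.
conn.execute("INSERT INTO book (status) VALUES ('UN')")

# An invalid status is refused by the database, whoever is asking.
try:
    conn.execute("INSERT INTO book (status) VALUES ('republished')")
    rejected = False
    message = ""
except sqlite3.IntegrityError as exc:
    rejected = True
    message = str(exc)

row_count = conn.execute("SELECT COUNT(*) FROM book").fetchone()[0]
```

Because the check lives in the table definition, it applies to every client that writes to the table, not only to code that remembers to run validation.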

We can add such constraints using the CheckConstraint class, added in Django 2.2. For our model, we need to define and name a single CheckConstraint in Meta.constraints:

class Book(models.Model):
    status = models.CharField(
        max_length=2,
        choices=Status.choices,
        default=Status.UNPUBLISHED,
    )

    def __str__(self):
        return f"{self.id} - {Status(self.status).label}"

    class Meta:
        constraints = [
            models.CheckConstraint(
                name="%(app_label)s_%(class)s_status_valid",
                check=models.Q(status__in=Status.values),
            )
        ]

The Q object represents a single expression we’d pass into Model.objects.filter(). Constraints can have any amount of logic on the fields in the current model. This includes all kinds of lookups, comparisons between fields, and database functions.

Running makemigrations, we get a migration that looks like this:

from django.db import migrations, models


class Migration(migrations.Migration):

    dependencies = [
        ("core", "0001_initial"),
    ]

    operations = [
        migrations.AddConstraint(
            model_name="book",
            constraint=models.CheckConstraint(
                check=models.Q(status__in=["UN", "PB"]),
                name="%(app_label)s_%(class)s_status_valid",
            ),
        ),
    ]

If we try to apply this while the database contains invalid data, it will fail:

$ python manage.py migrate
Operations to perform:
  Apply all migrations: core
Running migrations:
  Applying core.0002_book_status_valid...Traceback (most recent call last):
...
  File "/.../django/db/backends/sqlite3/base.py", line 396, in execute
    return Database.Cursor.execute(self, query, params)
django.db.utils.IntegrityError: CHECK constraint failed: status_valid
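
The failure comes from the existing rows rather than from the constraint definition. SQLite cannot add a CHECK constraint to an existing table with ALTER TABLE, so the table has to be rebuilt and its rows copied across. The sketch below imitates that rebuild by hand with the sqlite3 module. The table names and the copy statement are assumptions for illustration, not the statements Django actually runs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The old table has no constraint and already holds an invalid row.
conn.execute(
    "CREATE TABLE book (id INTEGER PRIMARY KEY, status VARCHAR(2) NOT NULL)"
)
conn.execute("INSERT INTO book (status) VALUES ('republished')")

# Hypothetical replacement table that carries the CHECK constraint.
conn.execute(
    """
    CREATE TABLE new_book (
        id INTEGER PRIMARY KEY,
        status VARCHAR(2) NOT NULL,
        CONSTRAINT status_valid CHECK (status IN ('UN', 'PB'))
    )
    """
)

# Copying the existing rows trips the constraint on the invalid row.
try:
    conn.execute("INSERT INTO new_book (id, status) SELECT id, status FROM book")
    copy_failed = False
except sqlite3.IntegrityError:
    copy_failed = True

copied_rows = conn.execute("SELECT COUNT(*) FROM new_book").fetchone()[0]
```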

If we clean that data up manually and try again, it will pass:

$ python manage.py migrate
Operations to perform:
  Apply all migrations: core
Running migrations:
  Applying core.0002_book_status_valid... OK
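
The manual cleanup could look something like the following sketch, again using sqlite3 directly. Resetting invalid rows to "UN" is an assumption made for illustration; the right replacement value depends on your data.

```python
import sqlite3

# The same values as Status.values in the model above.
VALID_STATUSES = ("UN", "PB")

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE book (id INTEGER PRIMARY KEY, status VARCHAR(2) NOT NULL)"
)
conn.executemany(
    "INSERT INTO book (status) VALUES (?)",
    [("UN",), ("republished",), ("PB",)],
)

# One "?" placeholder per valid status, for use inside NOT IN (...).
placeholders = ", ".join("?" for _ in VALID_STATUSES)

# First look at which rows would violate the constraint.
invalid_before = conn.execute(
    f"SELECT id, status FROM book WHERE status NOT IN ({placeholders})",
    VALID_STATUSES,
).fetchall()

# Assumption: "UN" (unpublished) is a safe value to reset invalid rows to.
conn.execute(
    f"UPDATE book SET status = ? WHERE status NOT IN ({placeholders})",
    ("UN", *VALID_STATUSES),
)

remaining = conn.execute(
    f"SELECT COUNT(*) FROM book WHERE status NOT IN ({placeholders})",
    VALID_STATUSES,
).fetchone()[0]
```

In the ORM, the offending rows can be found with Book.objects.exclude(status__in=Status.values), and the same queryset can be passed through update() once we have decided on a replacement value.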

From that point on, the database won’t allow us to insert invalid rows, or update the valid rows to be invalid:

In [4]: book.save()
---------------------------------------------------------------------------
...
/.../django/db/backends/sqlite3/base.py in execute(self, query, params)
    394             return Database.Cursor.execute(self, query)
    395         query = self.convert_query(query)
--> 396         return Database.Cursor.execute(self, query, params)
    397
    398     def executemany(self, query, param_list):

IntegrityError: CHECK constraint failed: status_valid

In [5]: Book.objects.update(status='republished')
---------------------------------------------------------------------------
...
/.../django/db/backends/sqlite3/base.py in execute(self, query, params)
    394             return Database.Cursor.execute(self, query)
    395         query = self.convert_query(query)
--> 396         return Database.Cursor.execute(self, query, params)
    397
    398     def executemany(self, query, param_list):

IntegrityError: CHECK constraint failed: status_valid

Great!

Currently Django doesn’t have a way of showing these IntegrityErrors to users in model validation. Nothing will catch and turn them into ValidationErrors which can carry user-facing messages. As per the documentation:

In general constraints are not checked during full_clean(), and do not raise ValidationErrors.

There’s an open ticket #30581 to improve this.

In our case, since we are still using choices, this is okay. Validation already won’t allow users to select invalid statuses.

For more complex constraints, we might want to duplicate the logic in Python with a custom validator.
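
As a rough sketch of what duplicating the logic might look like, take a hypothetical rule that published books must have a publication date. The published_on field, the validate_publication function, and the BookValidationError class are all invented for this example. In a real project the function would raise django.core.exceptions.ValidationError, and a matching CheckConstraint would enforce the same rule in the database.

```python
import enum
from datetime import date
from typing import Optional


class Status(str, enum.Enum):
    # Plain-Python stand-in for the TextChoices class in the post.
    UNPUBLISHED = "UN"
    PUBLISHED = "PB"


class BookValidationError(ValueError):
    """Hypothetical stand-in for django.core.exceptions.ValidationError."""


def validate_publication(status: str, published_on: Optional[date]) -> None:
    """Mirror a hypothetical constraint: published books need a date."""
    if status == Status.PUBLISHED.value and published_on is None:
        raise BookValidationError("Published books need a publication date.")


# Valid combinations pass silently.
validate_publication("UN", None)
validate_publication("PB", date(2020, 1, 1))

# The invalid combination produces a user-facing message.
try:
    validate_publication("PB", None)
    error_message = ""
except BookValidationError as exc:
    error_message = str(exc)
```

The downside is that the rule now lives in two places, so the Python check and the database constraint need to be kept in sync by hand.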


Fin

Check constraints are really neat. Having the data constrained at the lowest level possible gives us the strongest guarantees of its quality.

I hope this post helps you consider using them,

—Adam

