Django Event Logger

Aman Mathur
Technology at upGrad
Jul 29, 2018


How often has a Product Manager walked up to you months after a feature release, demanding data on a particular workflow? As we go deeper into the realm of data-driven decision making, dealing with such requirements has become the norm rather than the exception. Read on to see how we created a Django add-on to tackle this problem.

One of Computer Science’s earliest and most fundamental models of computation is the Finite State Machine (FSM). Many logical entities behave in a fashion analogous to a finite state machine: an automated teller machine (ATM), certain disease-causing pathogens, and even the ghosts in Pac-Man can be represented as FSMs. The transitions between the various states of such FSMs can be influenced by a mix of deterministic and probabilistic factors.

The FSM for a Ghost in Pac-Man

Let’s take a more relatable example of an FSM that needs to be modelled in a typical e-commerce application: the life-cycle of an order. The happy path here is for an order to be created and then paid for. After this, the seller confirms the payment, packages the order and dispatches it for delivery. Finally, the order is delivered to the customer, who confirms the delivery. It is easy to see how this life-cycle can be complicated by processes like returns and refunds, post-delivery payment, delivery address updates, feedback on order delivery, etc. Moreover, there are states such as "order cancelled" that can be reached from many of the other states, which challenges our linear view of this system and forces us to model these transitions as a graph rather than a simple chain.

Modelling FSMs in Databases

Relational databases, the most popular form of persistence, do not make it easy to model graphs. Hence this dynamic “status” attribute is usually kept as just an additional column in the database, which gets updated for a particular row whenever a transition occurs. The check for whether a transition is valid happens at the application layer, and the database field is either free text or some enumerated type. This allows us to model the FSM behaviour of the object in a relational database. For our example, let’s assume we add a “status” column to our orders table which gets updated via the relevant APIs. Libraries such as Django-FSM (for Django) and Spring Statemachine (for Spring) support this style of FSM implementation.
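The application-layer check described above can be sketched in a few lines of plain Python. This is only an illustration: the state names and the ALLOWED_TRANSITIONS table are hypothetical, the order is a plain dictionary standing in for a database row, and this is not how Django-FSM or Spring Statemachine are implemented.

```python
# Hypothetical transition graph for the order life-cycle.
# State names are illustrative and not taken from any real schema.
ALLOWED_TRANSITIONS = {
    "created": {"paid", "cancelled"},
    "paid": {"payment_confirmed", "cancelled"},
    "payment_confirmed": {"packaged", "cancelled"},
    "packaged": {"dispatched", "cancelled"},
    "dispatched": {"delivered", "cancelled"},
    "delivered": set(),
    "cancelled": set(),
}


class InvalidTransition(Exception):
    """Raised when a status change is not an edge in the transition graph."""


def change_status(order, new_status):
    """Validate the transition in application code, then update the status."""
    current = order["status"]
    if new_status not in ALLOWED_TRANSITIONS.get(current, set()):
        raise InvalidTransition(f"{current} -> {new_status} is not allowed")
    order["status"] = new_status
    return order


# The dict stands in for a row of the orders table.
order = {"id": 1, "status": "created"}
change_status(order, "paid")
print(order["status"])
```

Note that "cancelled" appears as a target of several states, which is what makes the structure a graph rather than a chain. Also note that only the latest status survives the update; the previous value is lost.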

A frequent analytical requirement in these cases concerns the state transitions themselves, such as the time spent in a particular state or the number of transitions from one specific state to another. In our running example this translates to business questions such as “What is the turn-around-time (TAT) in the order packaging state?” or “How many customers cancel an order once it is dispatched?”. Because the status column only holds the current state, our current implementation is ill-equipped to answer these queries. A log of all transitions, along with some relevant meta-information, is required to address such questions.
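To see why a transition log is enough to answer both questions, here is a minimal sketch in plain Python. The log rows, their tuple layout and the state names are made up for illustration and do not reflect the actual table written by the Django Event Logger.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical log rows: (object_id, old_status, new_status, timestamp).
LOG = [
    (1, "created", "paid", datetime(2018, 7, 1, 10, 0)),
    (1, "paid", "packaged", datetime(2018, 7, 1, 12, 0)),
    (1, "packaged", "dispatched", datetime(2018, 7, 1, 15, 0)),
    (2, "created", "paid", datetime(2018, 7, 2, 9, 0)),
    (2, "paid", "packaged", datetime(2018, 7, 2, 10, 0)),
    (2, "packaged", "dispatched", datetime(2018, 7, 2, 11, 0)),
    (2, "dispatched", "cancelled", datetime(2018, 7, 3, 11, 0)),
]


def transition_counts(log):
    """Count how often each (old_status, new_status) pair occurs."""
    return Counter((old, new) for _, old, new, _ in log)


def time_in_state(log, state):
    """Return one duration per completed visit of any object to `state`."""
    entered = {}    # object_id -> timestamp at which it entered `state`
    durations = []
    for obj_id, old, new, ts in sorted(log, key=lambda row: (row[0], row[3])):
        if old == state and obj_id in entered:
            durations.append(ts - entered.pop(obj_id))
        if new == state:
            entered[obj_id] = ts
    return durations


# "How many customers cancel an order once it is dispatched?"
cancelled_after_dispatch = transition_counts(LOG)[("dispatched", "cancelled")]

# "What is the TAT in the order packaging state?"
packaging = time_in_state(LOG, "packaged")
average_tat = sum(packaging, timedelta()) / len(packaging)
print(cancelled_after_dispatch, average_tat)
```

In practice these aggregations would more likely be written as SQL over the log table; the point is only that both questions reduce to simple scans over before/after values and timestamps.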

This is where the Django Event Logger is helpful. Without the overhead of writing code to log each update on a column, one can specify the fields whose changes need to be logged in a simple JSON format. Changes to those fields are then logged to a new table in the database along with the before and after values, the object id, the object type and the timestamp of the update operation. We deliberately chose to log changes in the same database as the application so that if a SQL transaction is rolled back, our logs are rolled back with it.
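The benefit of keeping the log in the same database can be illustrated with Python's built-in sqlite3 module. This is a sketch under assumptions: the table and column names are hypothetical, and the update_status helper stands in for whatever the application does inside one transaction. It is not the Event Logger's own code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
# Hypothetical log table holding the attributes mentioned above.
conn.execute(
    "CREATE TABLE event_log ("
    "object_type TEXT, object_id INTEGER, field TEXT, "
    "old_value TEXT, new_value TEXT, "
    "logged_at TEXT DEFAULT CURRENT_TIMESTAMP)"
)
conn.execute("INSERT INTO orders (id, status) VALUES (1, 'created')")
conn.commit()


def update_status(conn, order_id, new_status, fail=False):
    """Update the order and write the log row in one transaction."""
    with conn:  # commits on success, rolls back if an exception escapes
        (old_status,) = conn.execute(
            "SELECT status FROM orders WHERE id = ?", (order_id,)
        ).fetchone()
        conn.execute(
            "UPDATE orders SET status = ? WHERE id = ?", (new_status, order_id)
        )
        conn.execute(
            "INSERT INTO event_log (object_type, object_id, field, old_value, new_value) "
            "VALUES ('order', ?, 'status', ?, ?)",
            (order_id, old_status, new_status),
        )
        if fail:
            raise RuntimeError("simulated failure later in the same transaction")


try:
    update_status(conn, 1, "paid", fail=True)  # update and log row both roll back
except RuntimeError:
    pass

update_status(conn, 1, "paid")  # update and log row both commit

print(conn.execute("SELECT status FROM orders WHERE id = 1").fetchone())
print(conn.execute("SELECT COUNT(*) FROM event_log").fetchone())
```

Had the log been written to a separate store, the failed attempt would have left behind a log entry for a transition that never actually happened.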

Features

Sometimes logging just the transition is insufficient. One may need more information about the event, such as who triggered it or which process it was triggered under: for example, which user or process was responsible for verifying the payment. In terms of APIs, this translates to the user who made the API call and the API which triggered the transition. Since such parameters are fairly specific to the workflow of a Django app, the Event Logger supports adding custom code in the configuration to access these variables.

The event logger is also generic and independent of whether the column is modelled as an FSM or not. Even simple columns can be configured, and any updates to them can also be easily logged in the table. Hence, it is possible to log any changes to the “delivery address” attribute of an order as well, along with who made the update.

Another feature of the event logger is that we can specify conditional logic for when logging should be enabled. This is useful for avoiding unnecessary data and keeping our database from bloating. For example, suppose we wish to log changes to delivery-related fields such as packaging, dispatch and receiver confirmation only for orders where the user opts for in-house delivery rather than a delivery partner. The ability to enable logging for an object based on its other related attributes allows us to do just that without writing complex conditional statements in SQL triggers.
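The three features above (configurable fields, extra metadata and conditional logging) can be pictured with the following plain-Python sketch. Everything in it is an assumption made for illustration: the CONFIG layout, the CONDITIONS registry, the log_changes function and the field names are hypothetical and are not the Django Event Logger's actual configuration format or API.

```python
import json
from datetime import datetime, timezone

# Hypothetical configuration; the real Event Logger format may differ.
CONFIG = json.loads("""
{
  "order": {
    "fields": ["status", "delivery_address"],
    "condition": "in_house_delivery"
  }
}
""")

# Named predicates that the configuration can refer to (illustrative only).
CONDITIONS = {
    "in_house_delivery": lambda obj: obj.get("delivery_mode") == "in_house",
}


def log_changes(object_type, before, after, context, sink):
    """Append one entry to `sink` per configured field whose value changed."""
    rules = CONFIG.get(object_type)
    if rules is None:
        return  # this object type is not configured for logging
    condition = CONDITIONS.get(rules.get("condition"), lambda obj: True)
    if not condition(after):
        return  # conditional logging: skip objects that do not qualify
    for field in rules["fields"]:
        if before.get(field) != after.get(field):
            sink.append({
                "object_type": object_type,
                "object_id": after["id"],
                "field": field,
                "old_value": before.get(field),
                "new_value": after.get(field),
                "timestamp": datetime.now(timezone.utc).isoformat(),
                **context,  # extra metadata, e.g. the user and API involved
            })


events = []
before = {"id": 7, "status": "packaged", "delivery_mode": "in_house"}
after = {**before, "status": "dispatched"}
log_changes("order", before, after,
            {"user": "alice", "api": "/orders/7/dispatch"}, events)
print(events)
```

An order with delivery_mode set to anything other than "in_house" would produce no entries, which is how the condition keeps the log table from filling up with data nobody asked for.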

Shortcomings

This method of logging, for all its flexibility, has certain shortcomings and restrictions. Firstly, since the logging happens at the Django application level, we miss updates made by raw SQL queries executed directly on the database. Moreover, under the hood the logger uses Django signals to get notified of updates, so it also misses bulk updates: operations such as QuerySet.update() do not send Django's pre_save or post_save signals.

Another overhead of adding the Django Event Logger to your app is that you have to manage its migrations manually. As mentioned above, the logging schema is flexible and is derived from a configuration JSON, which makes it infeasible to generate migrations automatically. After changing the configuration JSON, one has to generate the corresponding migration by hand.

Django Event Logger @ UpGrad

The Django Event Logger in action at UpGrad

At UpGrad, we use the Django Event Logger to log all updates in the life cycle of a student’s assignment submission, which contains steps like creation, assignment to a grader, completion of grading, publishing of the evaluation and other interim states. The app also allows us to conveniently log changes made to sensitive fields like a learner’s submission, the assigned grader, etc. This data empowers the Data Analytics team to create dashboards that measure the efficiency of grading and easily identify bottlenecks in the process, enabling us to make data-driven decisions about our grading processes.

Want to work with the awesome Tech team at UpGrad? Check out the positions now open!
