Common Pitfalls Using Django Signals
by Oleg Pesok, Instawork Engineering
There are some strong opinions on Django’s built-in signals and whether they should be used at all, but we’ll skip that debate here and instead focus on how best to use them and how to avoid some common pitfalls.
For the uninitiated, signals are synchronously firing events (this will be important later) that trigger handlers. Django ships with a few common built-in ones; we’ll be discussing the post_save and pre_save signals, which are triggered whenever a Django model is saved.
It’s surprisingly easy to write an infinite loop using signals. Consider the following three possible implementations of a counter that increments every time our model is saved.
from django.db.models.signals import post_save, pre_save
from django.dispatch import receiver

# VERY BAD
@receiver(post_save, sender=MyModel)
def update_save_count(sender, instance, created, **kwargs):
    instance.save_count = instance.save_count + 1
    instance.save()  # fires post_save again -> infinite loop

# OK
@receiver(pre_save, sender=MyModel)
def update_save_count(sender, instance, **kwargs):
    # mutate the instance before it is written; no extra save() needed
    instance.save_count = instance.save_count + 1

# BETTER
class MyModel(Model):
    def save(self, *args, **kwargs):
        self.save_count = self.save_count + 1
        super().save(*args, **kwargs)
The first time a model is saved, the first signal handler will run away into an infinite loop. Since it’s a post_save signal, a save() has just happened. We update a value and call save() again, which triggers the signal handler, which triggers another save(), which will…continue indefinitely.
The second example will work just fine, but it separates some of the logic of what’s happening to a model’s data from the model code. There are legitimate reasons to do this, but for most needs the best approach is the last one: it unambiguously keeps any manipulation of the model’s data directly in the model class itself.
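As an aside, if you ever do need to write back to the database from inside a post_save handler, one way to avoid re-triggering the signal is a queryset update(), since update() bypasses save() and emits no signals. A minimal sketch (the F() expression also makes the increment atomic at the database level):

from django.db.models import F

@receiver(post_save, sender=MyModel)
def update_save_count(sender, instance, created, **kwargs):
    # .update() issues a single UPDATE statement and fires no signals,
    # so this handler will not recurse
    MyModel.objects.filter(pk=instance.pk).update(save_count=F("save_count") + 1)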
Now study the following three examples, where we fire an asynchronous task from a signal to do some processing on model data.
from django.db import transaction

@task
def task_do_stuff(instance_id):
    instance = MyModel.objects.get(id=instance_id)
    if instance.save_count > 10:
        # do some stuff
        pass

# BAD
@receiver(post_save, sender=MyModel)
def do_stuff(sender, instance, created, **kwargs):
    task_do_stuff.delay(instance.id)

# ALSO BAD
@receiver(post_save, sender=MyModel)
def do_stuff(sender, instance, created, **kwargs):
    # calls .delay() immediately and passes its *result* to on_commit
    transaction.on_commit(task_do_stuff.delay(instance.id))

# GOOD
@receiver(post_save, sender=MyModel)
def do_stuff(sender, instance, created, **kwargs):
    transaction.on_commit(lambda: task_do_stuff.delay(instance.id))
Let’s walk through the above three cases to see why things might go wrong.
In the first case, we naively queue the async task in the post_save handler. It seems like this should be fine, and most of the time it will be. But what if the task is queued, picked up by a worker, and executed before the surrounding database transaction commits? The post_save signal fires when save() is called, not when the transaction commits, so a fast worker will still read the OLD data from the database rather than the new data that triggered the signal in the first place.
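To make the timing concrete, here’s a sketch of the failure window, assuming the save happens inside an explicit atomic block (the same applies with ATOMIC_REQUESTS, where Django wraps each request in a transaction):

from django.db import transaction

with transaction.atomic():
    instance = MyModel.objects.get(id=1)
    instance.save_count += 1
    instance.save()  # post_save fires HERE, and the BAD handler queues the task
    # ...more work inside the transaction...
    # a fast worker can pick up the task in this window and read the OLD row,
    # because this transaction has not committed yet
# only after the commit here can other connections see the new save_count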
What can we do to avoid this race condition? We have to wait for Django to commit the changes to the database and only then queue our Celery task. So why is the second example bad? Instead of queuing the task only after the data has been committed, it calls .delay() immediately, queuing the task right away, and then hands .delay()’s return value to transaction.on_commit, when on_commit expects a callable to invoke after the commit.
The right answer is to give on_commit a callable that queues the task: either a separate named function or a lambda, as seen in the third example.
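A functools.partial works just as well as the lambda and reads a little more explicitly; a minimal sketch:

from functools import partial

from django.db import transaction

@receiver(post_save, sender=MyModel)
def do_stuff(sender, instance, created, **kwargs):
    # partial(...) builds a zero-argument callable; on_commit invokes it
    # only after the transaction commits successfully
    transaction.on_commit(partial(task_do_stuff.delay, instance.id))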
Another scenario we see with signals is running queries against the model we just updated, from within that model’s signal handler, and getting unexpected results.
import logging

logger = logging.getLogger(__name__)

# Bad
@receiver(post_save, sender=MyModel)
def do_stuff(sender, instance, created, **kwargs):
    # runs inside the not-yet-committed transaction
    model_count = MyModel.objects.filter(a=b, c=d).count()
    logger.info("Created model, new count: {}".format(model_count))

# Good
@receiver(post_save, sender=MyModel)
def do_stuff(sender, instance, created, **kwargs):
    transaction.on_commit(log_model_count)

def log_model_count():
    model_count = MyModel.objects.filter(a=b, c=d).count()
    logger.info("Created model, new count: {}".format(model_count))
The bad example above can produce misleading results for reasons similar to the race conditions in the previous section: the query runs while the transaction is still open, so it reports state that other connections can’t see yet and that may still be rolled back. The way around it is much the same as before: run your queries in a transaction.on_commit handler.
One last pitfall with signals is that handlers execute synchronously: if a handler runs as part of an API request, it must fully complete before a response is returned to the user. It’s rare that whatever you’re doing in a signal handler truly needs to be computed before returning the response. If it does, then so be it, but if not, there’s a better way. Consider these examples.
# Bad
@receiver(post_save, sender=MyModel)
def do_stuff(sender, instance, created, **kwargs):
    # runs synchronously, blocking the request until it finishes
    model_count = MyModel.objects.filter(a=b, c=d).count()
    logger.info("Created model, new count: {}".format(model_count))

# Good
@receiver(post_save, sender=MyModel)
def do_stuff(sender, instance, created, **kwargs):
    transaction.on_commit(lambda: task_do_stuff.delay())

@task
def task_do_stuff():
    model_count = MyModel.objects.filter(a=b, c=d).count()
    logger.info("Created model, new count: {}".format(model_count))
If you’ve been following along, you probably wouldn’t make this mistake anyway, but it’s easy to accidentally start shoving lots of logic into a signal handler. Resist that temptation: write an asynchronous task to do the dirty work, put all of that logic there, and keep the handler itself to a few lines.
Testing signal handlers that are written the right way, with transaction.on_commit, can lead to either broken tests or incredibly slow ones. If you naively use the standard TestCase, you’ll discover that nothing inside your commit handlers actually gets called.
To have the code in a commit handler execute, you can’t use Django’s TestCase; you need the much heavier and slower TransactionTestCase, which actually commits data to the database rather than opening a transaction at the start and rolling it back when the test finishes.
Luckily, this problem has been solved in newer versions of Django: since 3.2, TestCase provides captureOnCommitCallbacks(). If you’re not fortunate enough to be running a recent version, there’s a third-party package available, django-capture-on-commit-callbacks, which backports the same behavior. To summarize its usage, here are some examples.
from unittest import mock

from django.test import TestCase, TransactionTestCase
from django_capture_on_commit_callbacks import capture_on_commit_callbacks

# Bad (will fail)
class TestSignal(TestCase):
    @mock.patch("myapp.signals.task_do_stuff")  # dotted path is illustrative
    def test_signal(self, mock_task):
        MyModel.objects.create()
        # fails: TestCase never commits, so the on_commit callback never runs
        mock_task.delay.assert_called_once()

# Bad (very slow)
class TestSignal(TransactionTestCase):
    @mock.patch("myapp.signals.task_do_stuff")
    def test_signal(self, mock_task):
        MyModel.objects.create()  # commits for real, so the callback does run
        mock_task.delay.assert_called_once()

# Good
class TestSignal(TestCase):
    @mock.patch("myapp.signals.task_do_stuff")
    def test_signal(self, mock_task):
        with capture_on_commit_callbacks(execute=True):
            MyModel.objects.create()
        mock_task.delay.assert_called_once()
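On Django 3.2 or later you don’t need the package at all; the same pattern works with the built-in TestCase.captureOnCommitCallbacks() method (a sketch, using the same illustrative dotted path as above):

from unittest import mock

from django.test import TestCase

class TestSignal(TestCase):
    @mock.patch("myapp.signals.task_do_stuff")
    def test_signal(self, mock_task):
        # captureOnCommitCallbacks is built into TestCase since Django 3.2
        with self.captureOnCommitCallbacks(execute=True):
            MyModel.objects.create()
        mock_task.delay.assert_called_once()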
To conclude: as simple as Django’s signals may seem, there are quite a few mistakes that are very easy to make, even for seasoned developers. Be careful and follow the guidelines above to make sure you write fast, bug-free Django code. Have you found other challenges or solutions around using signals? Let us know.