django-planet
July 3, 2023

Organizing Database Queries: Managers vs. QuerySets

in blog Django Beats
original entry Organizing Database Queries: Managers vs. QuerySets

In this article Mariusz Felisiak introduces concepts of Django’s Managers and QuerySet to organize common database queries. Django on Fly.io is pretty sweet! Check it out: you can be up and running on Fly.io in just minutes.

I’m always focusing on writing readable and maintainable code. When our project becomes bigger and more complex, the number of models and complexity of data access control can increase significantly. We’re starting to ask ourselves questions: How to organize data structure? How to control access to our data in a reliable way? How to share common logic that protects our data? Where to keep data filters? etc. This article is trying to answer some of these questions by introducing concepts of Django’s Managers and QuerySets. We will describe the biggest differences between them and how they can be used to organize common database queries.

Problem

In all projects there exists logic behind retrieving objects from the database and annotating them. Annotations are really helpful for dealing with calculated/virtual fields (not stored in the database) as they allow moving calculations to the database engine and make them flexible to use. We’re talking about things like:

python3 manage.py shell
>>> from django.db.models import Value
>>> from django.db.models.functions import Concat
>>> from my_app.models import Person
>>>
>>> persons = Person.objects.annotate(
...     full_name=Concat("first_name", Value(" "), "last_name"),
... )
>>> persons.first().full_name
"Joe Doe"

Django doesn’t support virtual fields, but they may be implemented in the near future because Django contributors are actively working on this 🚀.

It’s not uncommon to use the same filters in various places. Django has a concept of Managers and QuerySets which allow us to organize those elements in a reliable way. Let’s explore what they are and how to deal with them 🔎.

Managers and QuerySets

Manager is the interface that defines database operations for our models. Django adds a default manager available through the objects attribute, if we do not define a custom one. As a consequence, each non-abstract model has at least one manager that is a gateway to the model’s data. Have you ever asked yourself what is this objects that we use so often? 🤔 Now you know, it’s a Manager! Django allows us to define multiple managers for the same model. Moreover, methods in the manager can return everything that works for us, they are not limited to the QuerySets. Let’s look at an example:

# models.py

from datetime import date, timedelta

from django.db import models
from django.db.models.functions import Concat


class PersonManager(models.Manager):
    def with_extra_fields(self):
        return self.annotate(
            full_name=Concat("first_name", models.Value(" "), "last_name"),
        )

    def experienced(self):
        return self.filter(
            join_date__lt=date.today() - timedelta(days=1000)
        )


class PersonAnalyticsManager(models.Manager):
    def number_of_unique_names(self):
        return self.aggregate(
            count=models.Count("last_name", distinct=True),
        )["count"]


class Person(models.Model):
    first_name = models.CharField(max_length=255)
    last_name = models.CharField(max_length=255)
    join_date = models.DateField(default=date.today)

    # We redefine the default manager.
    objects = PersonManager()
    # We add a second manager with analytic methods.
    analytics = PersonAnalyticsManager()

    def __str__(self):
        return f"{self.first_name} {self.last_name}"

With managers you can freely share annotations, aggregations, reuse common filters, and more:

python3 manage.py shell
>>> from .models import Person
>>> Person.objects.with_extra_fields().order_by("full_name")
<QuerySet [<Person: Catherine Smith>, <Person: Joe Doe>, <Person: Omega Smith>]>
>>> Person.objects.experienced()
<PersonQuerySet [<Person: Joe Doe>]>
>>> Person.analytics.number_of_unique_names()
2

Unfortunately, there are some inconveniences to using managers. The biggest one is that manager methods cannot be chained 😞, which is really powerful when working with QuerySets.

>>> Person.objects.experienced().with_extra_fields()
...
AttributeError: 'QuerySet' object has no attribute 'with_extra_fields'

How to deal with this? There is a way! We can take the best of both worlds of Managers and QuerySets by defining a custom QuerySet and using it as a manager. For this we will use the as_manager() method. Let’s rewrite our example:

# models.py

from datetime import date, timedelta

from django.db import models
from django.db.models.functions import Concat


class PersonQuerySet(models.QuerySet):  # Changed to a custom QuerySet.
    def with_extra_fields(self):
        return self.annotate(
            full_name=Concat("first_name", models.Value(" "), "last_name"),
        )

    def experienced(self):
        return self.filter(
            join_date__lt=date.today() - timedelta(days=1000)
        )

    def number_of_unique_names(self):
        return self.aggregate(
            count=models.Count("last_name", distinct=True),
        )["count"]


class Person(models.Model):
    first_name = models.CharField(max_length=255)
    last_name = models.CharField(max_length=255)
    join_date = models.DateField(default=date.today)
    # QuerySet used as a Manager.
    objects = PersonQuerySet.as_manager()

    def __str__(self):
        return f"{self.first_name} {self.last_name}"

and check if it works 🥳:

python3 manage.py shell
>>> Person.objects.experienced()
<PersonQuerySet [<Person: Joe Doe>]>
>>> Person.objects.experienced().with_extra_fields().get().full_name
'Joe Doe'
>>> Person.objects.experienced().number_of_unique_names()
1
>>> Person.objects.number_of_unique_names()
2 

In the next section we will cover even more handy usages of custom QuerySets.

Full example

We need an existing or new Django project. Here are some great resources for getting started with Django or deploying your Django app to Fly.io.

With a project ready, let’s get started!

In our example, we’ll use the following models:

# models.py

from datetime import date

from django.db import models

from .querysets import PersonQuerySet
from .managers import OrderManager


class Person(models.Model):
    first_name = models.CharField(max_length=255)
    last_name = models.CharField(max_length=255)
    join_date = models.DateField(default=datetime.date.today)
    # We redefine the default manager.
    objects = PersonQuerySet.as_manager()


class Order(models.Model):
    seller = models.ForeignKey("Person", on_delete=models.CASCADE)
    sale_date = models.DateField(default=date.today)
    deleted = models.BooleanField(default=False)
    # We redefine the default manager.
    objects = OrderManager()

Let’s start by implementing soft-delete for orders.

Soft-delete with a custom Manager

We want to be able to delete orders but at the same time we want to keep historical data to have a track of all orders, including deleted ones. For this, we will use the deleted flag on the Order model. The questions are:

  • How to inject this feature into our existing project without rewriting all the QuerySets related to orders?
  • How to make sure deleted orders won’t leak in some places?

Overriding the default manager (objects) is ideal in such cases as it allows us to change the underlying base QuerySet by defining a special method called get_queryset(). This is how we can do this:

# managers.py

from django.db.models import Manager


class OrderManager(Manager):
    def get_queryset(self):
        return super().get_queryset().filter(deleted=False)

If we want to have backdoor access to all orders in our Django project (including deleted ones), we can define a second manager bypassing this restriction:

from django.db import models

class Order(models.Model):
    seller = models.ForeignKey("Person", on_delete=models.CASCADE)
    sale_date = models.DateField(default=datetime.date.today)
    deleted = models.BooleanField(default=False)

    objects = OrderManager()
    historical = models.Manager()

Let’s find out how it works in practice:

>>> from order.models import Order, Person
>>> from django.db.models import Count
>>> Order.objects.count()  # Deleted orders are not included.
7
>>> Order.objects.filter(deleted=True).count()
0
>>> Order.historical.count()  # All orders are included.
8
>>> Order.historical.filter(deleted=True).count()
1
>>> # Deleted orders are not included when using related managers.
>>> persons = Person.objects.with_extra_fields().annotate(Count("order"))
>>> for person in persons:
...     print(person.full_name, person.order__count)
... 
Joe Doe 1
Omega Smith 3
Catherine Smith 4

Deleted orders will not leak to any QuerySet unless we will explicitly use the historical manager. We can add more methods to our QuerySet and mix them freely as we will see in the next section.

Sharing and chaining more annotations, aggregations, and filters

Let’s add one more method to our QuerySet and check out how handy it is to work with different data requirements. Suppose sellers are considered active when they have at least 5 sales in the last 30 days, we can filter them out with the active() method:

from datetime import date, timedelta

from django.db.models import Count, Value, Q, QuerySet
from django.db.models.functions import Concat


class PersonQuerySet(QuerySet):
    def with_extra_fields(self):
        return self.annotate(
            full_name=Concat("first_name", Value(" "), "last_name"),
        )

    def experienced(self):
        return self.filter(
            join_date__lt=date.today() - timedelta(days=1000)
        )

    def active(self):  # <- Our new method.
        # Sellers with more then 5 orders in the last 30 days are
        # considered active.
        count_from_date = date.today() - timedelta(days=30)
        return self.annotate(
            # Deleted orders are not taken into account because we
            # filtered them out in OrderManager.get_queryset().
            recent_sales_counter=Count(
                "order", filter=Q(order__sale_date__gte=count_from_date)
            ),
        ).filter(recent_sales_counter__gte=5)

    def number_of_unique_names(self):
        return self.aggregate(
            count=Count("last_name", distinct=True),
        )["count"]

Let’s give it a spin:

python3 manage.py shell
>>> from my_app.models import Order, Person
>>> # Active sellers with more then 5 orders in the last 30 days.
>>> Person.objects.active()
<PersonQuerySet [<Person: Joe Doe>, <Person: Catherine Smith>]>
>>> # Active sellers with the number or recent orders, ordered
>>> # by the number of orders. We're chaining 4 QuerySet
>>> # methods, two standard and two custom.
>>> Person.objects.with_extra_fields().active().order_by(
...     "-recent_sales_counter",
... ).values("full_name", "recent_sales_counter")
<PersonQuerySet [
  {'full_name': 'Joe Doe', 'recent_sales_counter': 10},
  {'full_name': 'Catherine Smith', 'recent_sales_counter': 5}
]>
>>> # Active experienced sellers.
>>> Person.objects.with_extra_fields().active().experienced()
<PersonQuerySet [<Person: Joe Doe>]>
>>> # We can exclude experienced sellers.
>>> from django.db.models import Exists, OuterRef
>>> Person.objects.active().exclude(
...     Exists(Person.objects.experienced().filter(pk=OuterRef("pk"))),
... )
<PersonQuerySet [<Person: Catherine Smith>]>
>>> # and so on ... 

We can do whatever the heart desires 💃🕺, the Django documentation has many more examples.

Closing Thoughts

Here we made an introduction to extremely useful concepts of Managers and QuerySets in Django that allow us to keep queries under control. Check the Django documentation if you want to dive deeper into all possibilities. If you’ve never used them, it’s time to help your future self and make your work with the ORM smoother, easier to maintain, and more secure!