in blog | Django Beats |
---|---|
original entry | Organizing Database Queries: Managers vs. QuerySets |
In this article Mariusz Felisiak introduces concepts of Django’s Managers and QuerySet to organize common database queries. Django on Fly.io is pretty sweet! Check it out: you can be up and running on Fly.io in just minutes.
I’m always focusing on writing readable and maintainable code. When our project becomes bigger and more complex, the number of models and complexity of data access control can increase significantly. We’re starting to ask ourselves questions: How to organize data structure? How to control access to our data in a reliable way? How to share common logic that protects our data? Where to keep data filters? etc. This article is trying to answer some of these questions by introducing concepts of Django’s Managers and QuerySets. We will describe the biggest differences between them and how they can be used to organize common database queries.
In all projects there exists logic behind retrieving objects from the database and annotating them. Annotations are really helpful for dealing with calculated/virtual fields (not stored in the database) as they allow moving calculations to the database engine and make them flexible to use. We’re talking about things like:
python3 manage.py shell
>>> from django.db.models import Value
>>> from django.db.models.functions import Concat
>>> from my_app.models import Person
>>>
>>> persons = Person.objects.annotate(
... full_name=Concat("first_name", Value(" "), "last_name"),
... )
>>> persons.first().full_name
"Joe Doe"
Django doesn’t support virtual fields, but they may be implemented in the near future because Django contributors are actively working on this 🚀.
It’s not uncommon to use the same filters in various places. Django has a concept of Managers and QuerySets which allow us to organize those elements in a reliable way. Let’s explore what they are and how to deal with them 🔎.
Manager is the interface that defines database operations for our models. Django adds a default manager available through the objects
attribute, if we do not define a custom one. As a consequence, each non-abstract model has at least one manager that is a gateway to the model’s data. Have you ever asked yourself what is this objects
that we use so often? 🤔 Now you know, it’s a Manager! Django allows us to define multiple managers for the same model. Moreover, methods in the manager can return everything that works for us, they are not limited to the QuerySets. Let’s look at an example:
# models.py
from datetime import date, timedelta
from django.db import models
from django.db.models.functions import Concat
class PersonManager(models.Manager):
def with_extra_fields(self):
return self.annotate(
full_name=Concat("first_name", models.Value(" "), "last_name"),
)
def experienced(self):
return self.filter(
join_date__lt=date.today() - timedelta(days=1000)
)
class PersonAnalyticsManager(models.Manager):
def number_of_unique_names(self):
return self.aggregate(
count=models.Count("last_name", distinct=True),
)["count"]
class Person(models.Model):
first_name = models.CharField(max_length=255)
last_name = models.CharField(max_length=255)
join_date = models.DateField(default=date.today)
# We redefine the default manager.
objects = PersonManager()
# We add a second manager with analytic methods.
analytics = PersonAnalyticsManager()
def __str__(self):
return f"{self.first_name} {self.last_name}"
With managers you can freely share annotations, aggregations, reuse common filters, and more:
python3 manage.py shell
>>> from .models import Person
>>> Person.objects.with_extra_fields().order_by("full_name")
<QuerySet [<Person: Catherine Smith>, <Person: Joe Doe>, <Person: Omega Smith>]>
>>> Person.objects.experienced()
<PersonQuerySet [<Person: Joe Doe>]>
>>> Person.analytics.number_of_unique_names()
2
Unfortunately, there are some inconveniences to using managers. The biggest one is that manager methods cannot be chained 😞, which is really powerful when working with QuerySets.
>>> Person.objects.experienced().with_extra_fields()
...
AttributeError: 'QuerySet' object has no attribute 'with_extra_fields'
How to deal with this? There is a way! We can take the best of both worlds of Managers and QuerySets by defining a custom QuerySet and using it as a manager. For this we will use the as_manager()
method. Let’s rewrite our example:
# models.py
from datetime import date, timedelta
from django.db import models
from django.db.models.functions import Concat
class PersonQuerySet(models.QuerySet): # Changed to a custom QuerySet.
def with_extra_fields(self):
return self.annotate(
full_name=Concat("first_name", models.Value(" "), "last_name"),
)
def experienced(self):
return self.filter(
join_date__lt=date.today() - timedelta(days=1000)
)
def number_of_unique_names(self):
return self.aggregate(
count=models.Count("last_name", distinct=True),
)["count"]
class Person(models.Model):
first_name = models.CharField(max_length=255)
last_name = models.CharField(max_length=255)
join_date = models.DateField(default=date.today)
# QuerySet used as a Manager.
objects = PersonQuerySet.as_manager()
def __str__(self):
return f"{self.first_name} {self.last_name}"
and check if it works 🥳:
python3 manage.py shell
>>> Person.objects.experienced()
<PersonQuerySet [<Person: Joe Doe>]>
>>> Person.objects.experienced().with_extra_fields().get().full_name
'Joe Doe'
>>> Person.objects.experienced().number_of_unique_names()
1
>>> Person.objects.number_of_unique_names()
2
In the next section we will cover even more handy usages of custom QuerySets.
We need an existing or new Django project. Here are some great resources for getting started with Django or deploying your Django app to Fly.io.
With a project ready, let’s get started!
In our example, we’ll use the following models:
# models.py
from datetime import date
from django.db import models
from .querysets import PersonQuerySet
from .managers import OrderManager
class Person(models.Model):
first_name = models.CharField(max_length=255)
last_name = models.CharField(max_length=255)
join_date = models.DateField(default=datetime.date.today)
# We redefine the default manager.
objects = PersonQuerySet.as_manager()
class Order(models.Model):
seller = models.ForeignKey("Person", on_delete=models.CASCADE)
sale_date = models.DateField(default=date.today)
deleted = models.BooleanField(default=False)
# We redefine the default manager.
objects = OrderManager()
Let’s start by implementing soft-delete for orders.
We want to be able to delete orders but at the same time we want to keep historical data to have a track of all orders, including deleted ones. For this, we will use the deleted
flag on the Order
model. The questions are:
Overriding the default manager (objects
) is ideal in such cases as it allows us to change the underlying base QuerySet by defining a special method called get_queryset()
. This is how we can do this:
# managers.py
from django.db.models import Manager
class OrderManager(Manager):
def get_queryset(self):
return super().get_queryset().filter(deleted=False)
If we want to have backdoor access to all orders in our Django project (including deleted ones), we can define a second manager bypassing this restriction:
from django.db import models
class Order(models.Model):
seller = models.ForeignKey("Person", on_delete=models.CASCADE)
sale_date = models.DateField(default=datetime.date.today)
deleted = models.BooleanField(default=False)
objects = OrderManager()
historical = models.Manager()
Let’s find out how it works in practice:
>>> from order.models import Order, Person
>>> from django.db.models import Count
>>> Order.objects.count() # Deleted orders are not included.
7
>>> Order.objects.filter(deleted=True).count()
0
>>> Order.historical.count() # All orders are included.
8
>>> Order.historical.filter(deleted=True).count()
1
>>> # Deleted orders are not included when using related managers.
>>> persons = Person.objects.with_extra_fields().annotate(Count("order"))
>>> for person in persons:
... print(person.full_name, person.order__count)
...
Joe Doe 1
Omega Smith 3
Catherine Smith 4
Deleted orders will not leak to any QuerySet unless we will explicitly use the historical
manager. We can add more methods to our QuerySet and mix them freely as we will see in the next section.
Let’s add one more method to our QuerySet and check out how handy it is to work with different data requirements. Suppose sellers are considered active when they have at least 5 sales in the last 30 days, we can filter them out with the active()
method:
from datetime import date, timedelta
from django.db.models import Count, Value, Q, QuerySet
from django.db.models.functions import Concat
class PersonQuerySet(QuerySet):
def with_extra_fields(self):
return self.annotate(
full_name=Concat("first_name", Value(" "), "last_name"),
)
def experienced(self):
return self.filter(
join_date__lt=date.today() - timedelta(days=1000)
)
def active(self): # <- Our new method.
# Sellers with more then 5 orders in the last 30 days are
# considered active.
count_from_date = date.today() - timedelta(days=30)
return self.annotate(
# Deleted orders are not taken into account because we
# filtered them out in OrderManager.get_queryset().
recent_sales_counter=Count(
"order", filter=Q(order__sale_date__gte=count_from_date)
),
).filter(recent_sales_counter__gte=5)
def number_of_unique_names(self):
return self.aggregate(
count=Count("last_name", distinct=True),
)["count"]
Let’s give it a spin:
python3 manage.py shell
>>> from my_app.models import Order, Person
>>> # Active sellers with more then 5 orders in the last 30 days.
>>> Person.objects.active()
<PersonQuerySet [<Person: Joe Doe>, <Person: Catherine Smith>]>
>>> # Active sellers with the number or recent orders, ordered
>>> # by the number of orders. We're chaining 4 QuerySet
>>> # methods, two standard and two custom.
>>> Person.objects.with_extra_fields().active().order_by(
... "-recent_sales_counter",
... ).values("full_name", "recent_sales_counter")
<PersonQuerySet [
{'full_name': 'Joe Doe', 'recent_sales_counter': 10},
{'full_name': 'Catherine Smith', 'recent_sales_counter': 5}
]>
>>> # Active experienced sellers.
>>> Person.objects.with_extra_fields().active().experienced()
<PersonQuerySet [<Person: Joe Doe>]>
>>> # We can exclude experienced sellers.
>>> from django.db.models import Exists, OuterRef
>>> Person.objects.active().exclude(
... Exists(Person.objects.experienced().filter(pk=OuterRef("pk"))),
... )
<PersonQuerySet [<Person: Catherine Smith>]>
>>> # and so on ...
We can do whatever the heart desires 💃🕺, the Django documentation has many more examples.
Here we made an introduction to extremely useful concepts of Managers and QuerySets in Django that allow us to keep queries under control. Check the Django documentation if you want to dive deeper into all possibilities. If you’ve never used them, it’s time to help your future self and make your work with the ORM smoother, easier to maintain, and more secure!