published by | Danielle Knudson |
---|---|
in blog | Instawork Engineering |
original entry | Improving UI performance with Django’s Prefetch |
As an engineer at Instawork, it’s my responsibility to make sure our web UI loads quickly to provide a great user experience. It can be hard to keep a page fast as we add features and display more information. I recently worked on such a page, which gave me an opportunity to learn about Django’s tools for optimizing database query performance.
A core feature of Instawork is connecting our business partners with qualified hospitality professionals. To make the connections easy, each professional has a standalone profile page that displays their experience, references, and credentials. Businesses can also browse the same profiles in a searchable list.
Each professional’s profile page only makes a handful of database queries to get their information. This doesn’t result in any noticeable performance problems. For example, in the markup below we will make a total of three database queries.
<h1>{{professional.name}}</h1>
<h2>Working in {{professional.city}}</h2>
<h3>Work Experience</h3>
{% for experience in professional.work_experiences %}
<p>{{experience.job_title}}</p>
<p>{{experience.company}}</p>
<p>{{experience.start_date}} - {{experience.end_date}}</p>
{% endfor %}
<h3>References</h3>
{% for reference in professional.references %}
<p>{{reference.body}}</p>
<p>{{reference.name}}</p>
<p>{{reference.relation}}</p>
<p>{{reference.date}}</p>
{% endfor %}
Behind the scenes, Django makes the following queries to the database:
# Get the professional's information
SELECT *
FROM professionals
WHERE id = <professional_id>;
# Get the professional's work experiences
SELECT *
FROM work_experiences
JOIN professionals ON work_experiences.professional_id = professionals.id
WHERE professionals.id = <professional_id>;
# Get the professional's references
SELECT *
FROM references
JOIN professionals ON references.professional_id = professionals.id
WHERE professionals.id = <professional_id>;
If you want to access the same location, work experience, and reference data for a list of professionals, that’s 2 additional database queries for every professional. If we have 10 professionals, that’s a total of 21 queries — the first to retrieve all the professionals and then 2 queries for each professional to their work experiences and references. This is well-known as the “N+1 query problem.”
If we list more than 10 professionals or try to access other professional information that is stored in another related model, our number of queries will increase linearly.
Conveniently, Django provides a way to easily improve database performance out of the box. Django’s prefetch_related() method reduces the number of queries made to your database by combining all related model lookups into one additional query.
In the professionals list example, we can use prefetch_related() to get the professionals and their related work experiences and references in 3 queries instead of 21.
class FavoriteProfessionalsView(ListView):
context_object_name = 'professionals'
def get_queryset(self):
# Find the business user's favorite professionals
qs = Professional.objects.filter(
companypreference__state='favorite',
companypreference__company=self.request.user.company_id,
).prefetch_related(
'workexperience_set',
'reference_set',
)
return qs
Django will then execute the 3 queries below to find the professionals and their work experiences and references.
# Get professionals
SELECT *
FROM professionals
INNER JOIN professional_preferences ON professionals.id = professional_preferences.professional_id
WHERE professional_preferences.state = 'favorite'
AND professsional_preference.company_id = <company_id> LIMIT 10;
# Get work_experiences for professionals found in the query above
SELECT *
FROM work_experiences WHERE work_experiences.professional_id IN (1, 2, 3, 4, 5, 6, 6, 7, 8, 9, 10);
# Get references for professionals found in the query above
SELECT *
FROM references WHERE references.professional_id IN (1, 2, 3, 4, 5, 6, 6, 7, 8, 9, 10);
No matter how many professionals we add to the list, there will only be a total of 3 queries. It only makes sense to optimize the list view here because the minimum number of queries for professional’s profile page view will always be 3, whereas the list view will increase linearly if prefetch_related() isn’t used.
Django also provides the Prefetch() object. This object is used to control the behavior of the prefetch_related() lookup. This came in handy when I added the current list of positions each professional is currently verified to work to the searchable list.
To get the appropriate information without having more logic in our templates, I modified the query in our FavoriteProfessionalsListView.
class FavoriteProfessionalsView(ListView):
context_object_name = 'professionals'
def get_queryset(self):
# Find the business user's favorite professionals
qs = Professional.objects.filter(
companypreference__state='favorite',
companypreference__company=self.request.user.company_id,
).prefetch_related(
'workexperience_set',
'Reference_set',
Prefetch(
lookup='professionalposition_set',
queryset=ProfessionalPosition.objects.filter(approved=True),
to_attr='prefetched_positions',
),
)
return qs
I specified the related lookup attribute on our queryset, provided the exact queryset we wanted to be used in the lookup, and set the results to the prefetch_positions attribute on each user. It’s then possible to access the prefetched data via the attribute defined in the Prefetch().
{% for professional in professionals %}
…
{% for approved_position in professional.prefetched_positions %}
<p>{{approved_position.name}}</p>
{% endfor %}
{% endfor %}
I was able to add this feature to our list of professionals without sacrificing database performance — only an additional query is needed for a total of 4 queries to render the list.
These methods are great for improving database performance which can help create a snappier UI. But what happens if you have many views which share UI components?
Recently, I found we were duplicating the same professionals query across our views and inconsistently naming prefetch lookups. Each template had to know the correct attribute to use to access the prefetched model data on our professional models.
To better organize and DRY up our code, I created a method on our ProfessionalManager class to take a queryset of professionals and prefetch the needed data on those professionals. Now this query lives in a single place, making it easy for to modify or add to it without having to do that for each query in every view.
class ProfessionalManager(models.Manager):
def favorites_for_company(self, company_id):
return self.filter(
companypreference__state='favorite',
companypreference__company=company_id,
).prefetch_related(
'workexperience_set',
'Reference_set',
Prefetch(
lookup='professionalposition_set',
queryset=ProfessionalPosition.objects.filter(approved=True),
to_attr='prefetched_positions',
),
)
When accessing related models for a single instance, you don’t get the same performance improvements with prefetch_related() because the same number of queries needs to be made. The advantage is when you’re working with multiple instances and trying to reference their related models. If you’re using the same data across views, it’s best to put your queries in your model layer so that your views can reference the same method without having to implement the queries themselves — drying up your code and keeping your views lean.
Are you already using Django’s prefetch_related() or the Prefetch object? I’d like to hear how you’re using it in your own project. Please comment below!
Improving UI performance with Django’s Prefetch was originally published in Instawork Engineering on Medium, where people are continuing the conversation by highlighting and responding to this story.