Rob Oakes
Aug 12, 2020

Wagtail Snippets: Querying an Abstract Page Model

It's commonly accepted wisdom that Django's abstract model classes provide a powerful way to share code, at the cost of not being able to execute queries across child model subclasses. But is that always the case? This post looks at one mechanism to query children of abstract parents in the context of Wagtail Page models.

When creating models in a Django application, there's often a need for models to share properties or code. You might be building a website, for example, that is going to have standalone content pages and blog pages. Other than the page layout, these two types of content may be almost identical. They're both likely to have a title, a summary, a cover image, and a body.

Throughout the site, it would be awfully convenient to utilize a common set of template tags or helper methods that leverage the same interface rather than writing code that targets each model separately. Likewise, it would be very handy if you could have the two models share common field definitions and associated methods.

Django provides a number of ways to allow you to share code and properties through the use of model inheritance. There are three primary types:

  • Abstract Base Classes. Abstract models allow for common fields to be placed in a parent class, but tables are only created for derived models. This allows for multiple models to share code and fields, but lets each child model have its own table. There is no overhead from extra tables and joins.
  • Multi-table Inheritance. When using multi-table inheritance, tables are created for both parent and child models, with the two linked through a OneToOneField. Because each model has its own table, it is possible to query either the parent or the child. This can be an enormously flexible way of sharing some fields between two models (such as a content page and a blog page), while still allowing each to have its own unique fields, templates, or other code.
  • Proxy Models. When using a proxy model, a table is only created for the original (parent) model; but there is an opportunity to have an alias of the model fields with different Python behavior.

Each of the three approaches brings with it significant benefits, but also some drawbacks:

  • An abstract base class cannot be used in isolation. It has no table of its own, so it cannot be saved or queried directly.
  • Multi-table inheritance has significant overhead and incurs a performance penalty on each query of a child model, since a join is required between child and parent tables.
  • Due to the way in which proxy models work, it is not possible to change the model's fields.
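
To make the table layouts concrete, here is a minimal sketch that uses Python's built-in sqlite3 module rather than Django itself. The table and column names are illustrative assumptions that only imitate the shape of what each inheritance style produces; the tables Django actually generates will be named differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Multi-table inheritance: parent and child each get a table,
    -- linked one-to-one through the child's primary key.
    CREATE TABLE page (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE blogpost (
        page_ptr_id INTEGER PRIMARY KEY REFERENCES page (id),
        body TEXT NOT NULL
    );
    INSERT INTO page (id, title) VALUES (1, 'Hello');
    INSERT INTO blogpost (page_ptr_id, body) VALUES (1, 'First post');

    -- Abstract base class: no parent table exists. The shared columns
    -- are copied into each child table instead.
    CREATE TABLE note (id INTEGER PRIMARY KEY, title TEXT, body TEXT);
    INSERT INTO note (id, title, body) VALUES (1, 'Hello', 'First note');
""")

# Reading a full multi-table child row needs a join back to the parent.
joined_row = conn.execute("""
    SELECT page.title, blogpost.body
    FROM blogpost JOIN page ON page.id = blogpost.page_ptr_id
""").fetchone()

# The abstract-style table holds everything in one place, so no join.
flat_row = conn.execute("SELECT title, body FROM note").fetchone()

print(joined_row, flat_row)
```

The join in the first query is the per-query cost mentioned above. The second layout avoids it, but there is no shared table left to query across children.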

Abstract Models Are the Way to Go

While there are many rules of thumb about the "best" way to manage model inheritance, in general, the community advises that abstract models are the way to go. Sometimes, though, the inability to query across child classes can cause severe problems. I recently ran across one such case while working with the Page models of the popular content management system Wagtail.

Wagtail Page Models Use Multi-Table Inheritance

In order to make the CMS work in a uniform manner, the authors of Wagtail chose to create a single Page object from which all other types of pages in the site must inherit. While this doesn't follow the Django community's best practice notions, it is remarkably handy. Pages share quite a lot of code, and it is very convenient to interact with components of the site in a uniform manner.

The example in the listing shows what extending the model to create blog and content pages might look like in practice. In my implementation, I've created models with three fields:

  • body: providing the HTML text of the page
  • author: which links to the account of the user who wrote the post
  • feature: a boolean indicating whether the article is "special" and should appear in a particular location in page templates

As implemented, the three models use multi-table inheritance with common attributes -- such as title, URL stub/path, etc. -- defined on the parent model, and fields specific to the page types defined on the child. This approach is beneficial because queries across the main Page model can include both ContentPage and BlogPost while more specialized queries on the child can limit results to a specific model type.

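from django.contrib.auth import get_user_model
from django.db import models

from wagtail.core.fields import RichTextField
from wagtail.core.models import Page


class ContentPage(Page):
    # HTML text of the page
    body = RichTextField()
    # Account of the user who wrote the page. on_delete is required since
    # Django 2.0; SET_NULL is one reasonable choice for a nullable field.
    author = models.ForeignKey(
        get_user_model(), null=True, blank=True, on_delete=models.SET_NULL)
    # Marks the page as "special" for use in page templates
    feature = models.BooleanField(default=False)


class BlogPost(Page):
    body = RichTextField()
    author = models.ForeignKey(
        get_user_model(), null=True, blank=True, on_delete=models.SET_NULL)
    feature = models.BooleanField(default=False)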

As the two models grow in complexity, though, it can also result in a lot of code duplication. Once additional fields get added (perhaps a summary and a cover image), it might make sense to introduce an abstract model that includes the shared fields. The two page types can then inherit from that model (while still retaining the benefits described earlier). The code listing shows how this might be implemented.

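from django.contrib.auth import get_user_model
from django.db import models

from wagtail.core.fields import RichTextField
from wagtail.core.models import Page


class AbstractSiteContent(Page):
    # Fields shared by every content-bearing page type
    body = RichTextField()
    author = models.ForeignKey(
        get_user_model(), null=True, blank=True, on_delete=models.SET_NULL)
    feature = models.BooleanField(default=False)

    class Meta:
        # No table is created for this class; each child gets its own
        abstract = True


class ContentPage(AbstractSiteContent):
    pass


class BlogPost(AbstractSiteContent):
    pass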

This pattern works great for preventing code duplication. It is also a convenient way to filter pages with similar fields. The Wagtail page query manager includes a method called type which can be used to filter pages by their page type. If you pass an abstract model, it will include all descendants in the results.

So where does the problem I alluded to earlier come into play?

Problem: Query on Common Fields in Child Models

The difficulty comes when you (inevitably) need to run a single query whose results include both content pages and blog posts. For example, maybe you are creating an endpoint that needs to fetch feature articles from all child models of AbstractSiteContent for a mobile application or single page application. What is the best way to do this?

Conventional wisdom says that you can't. While the models share configuration and code, they don't share tables. For that reason, the only way to retrieve the two model types is to fetch each individually and then join the resulting structures in Python code.
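
As a rough illustration of that conventional workaround, the sketch below again uses the built-in sqlite3 module in place of Django. The two tables and their rows are made-up stand-ins for the child models, and the featured helper is a hypothetical name introduced only for this example.

```python
import sqlite3
from itertools import chain

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contentpage (id INTEGER PRIMARY KEY, title TEXT, feature INTEGER);
    CREATE TABLE blogpost (id INTEGER PRIMARY KEY, title TEXT, feature INTEGER);
    INSERT INTO contentpage VALUES (1, 'About', 1), (2, 'Contact', 0);
    INSERT INTO blogpost VALUES (1, 'Launch day', 1), (2, 'Changelog', 0);
""")


def featured(table):
    # One database round trip per child table. The table name comes from
    # the fixed calls below, never from user input.
    return conn.execute(
        f"SELECT title FROM {table} WHERE feature = 1").fetchall()


# Fetch each child separately, then merge and sort the rows in Python.
merged = sorted(chain(featured("contentpage"), featured("blogpost")))
titles = [title for (title,) in merged]
print(titles)
```

Every additional child model adds another query, and any ordering or pagination has to be redone in Python after the merge.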

In Wagtail, though, we get a little bit of wiggle room. Even though the models each have their own sets of columns, there is a way to query across both children. Despite directly inheriting from an abstract model, blog and content pages share a concrete (multi-table inheritance) parent in Page. The code in the listings shows how you can use the apps registry and a Subquery to find all feature pages on child classes of AbstractSiteContent.

Here's how it works:

  1. First, you use the Django apps registry to locate and retrieve all concrete model classes which are children of the abstract class. This is implemented in model_subclasses.
  2. After the model class reference has been retrieved, create a subquery that retrieves the primary keys of pages that match the condition. Because this is constructed as a subquery and uses the values method, the entire statement will execute in a single database query.
  3. Execute the query filter created in abstract_page_query_filter as part of a larger query using the parent model, Page.

from django.apps import apps
from django.db.models import Q, Subquery

from wagtail.core.models import Page


def model_subclasses(mclass):
    '''Retrieve all concrete model subclasses of the provided class.

    apps.get_models() only returns registered (non-abstract) models.
    '''
    return [m for m in apps.get_models() if issubclass(m, mclass)]


def abstract_page_query_filter(mclass, filter_params, pk_attr='page_ptr'):
    '''Create a filter query that will be applied to all children of the
    provided abstract model class. Returns None if a query filter cannot
    be created.

    @returns Q or None
    '''
    if not mclass._meta.abstract:
        raise ValueError('Provided model class must be abstract')

    pclasses = model_subclasses(mclass)
    if not pclasses:
        return None

    # OR together one pk__in subquery per concrete child model
    qf = Q(pk__in=Subquery(
        pclasses[0].objects.filter(**filter_params).values(pk_attr)))
    for c in pclasses[1:]:
        qf |= Q(pk__in=Subquery(
            c.objects.filter(**filter_params).values(pk_attr)))

    return qf


# Example: Retrieve all page models descended from AbstractSiteContent
# where feature is True
qf = abstract_page_query_filter(AbstractSiteContent, {'feature': True})
# The helper returns None when no concrete subclasses exist, so guard for it
qs_features = Page.objects.filter(qf) if qf is not None else Page.objects.none()
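
To see why this runs as a single statement, it can help to look at the rough shape of the SQL involved. The sketch below reproduces that shape with the built-in sqlite3 module. The table names, column names, and rows are simplified assumptions, and the SQL Django actually generates will differ in naming and quoting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Shared concrete parent, as with Wagtail's Page
    CREATE TABLE page (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    -- One table per concrete child of the abstract model
    CREATE TABLE contentpage (
        page_ptr_id INTEGER PRIMARY KEY REFERENCES page (id),
        feature INTEGER NOT NULL
    );
    CREATE TABLE blogpost (
        page_ptr_id INTEGER PRIMARY KEY REFERENCES page (id),
        feature INTEGER NOT NULL
    );
    INSERT INTO page VALUES
        (1, 'Home'), (2, 'About'), (3, 'Contact'),
        (4, 'Launch day'), (5, 'Changelog');
    INSERT INTO contentpage VALUES (2, 1), (3, 0);
    INSERT INTO blogpost VALUES (4, 1), (5, 0);
""")

# One statement: each child contributes an IN (subquery) clause, and the
# clauses are ORed together against the parent table's primary key.
featured_pages = conn.execute("""
    SELECT id, title FROM page
    WHERE id IN (SELECT page_ptr_id FROM contentpage WHERE feature = 1)
       OR id IN (SELECT page_ptr_id FROM blogpost WHERE feature = 1)
    ORDER BY id
""").fetchall()
print(featured_pages)
```

Because the result rows come from the parent table, ordering and pagination can stay in the database instead of being redone in Python.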