Shallow Dive into Django ORM

A Closer Look at the Django ORM and Many-To-Many Relationships

In the last post I did some work on the data model for the KidsTasks app and discovered that a plain many-to-many relationship would not allow multiple copies of the same task to exist in a given schedule. Further reading showed me, without much explanation, that using a “through” parameter on the relationship definition fixed that. In this post I want to take a closer look at what’s going on in that Django model magic.

Django Shell

As part of my research for this topic, I was led to a quick description of the Django shell, which is great for testing out ideas and playing with the models you’re developing. I found a good description here (which also gives a look at filters and QuerySets).

For anyone wanting to play along at home, the following sequence of commands was quite helpful to have handy when testing different models.

 $ rm tasks/migrations db.sqlite3 -rf
 $ ./manage.py makemigrations tasks
 $ ./manage.py migrate
 $ ./manage.py shell
 Python 3.4.3 (default, Oct 14 2015, 20:33:09)
 [GCC 4.8.4] on linux
 Type "help", "copyright", "credits" or "license" for more information.
 (InteractiveConsole)

Many To Many without an Intermediate Class

I’ll start by examining what happened with my original model design where a DayOfWeekSchedule had a ManyToMany relationship with Task.

Simple Solution Code

The simplified model I’ll use here looks like this.

from django.db import models


class Task(models.Model):
    name = models.CharField(max_length=256)
    required = models.BooleanField()

    def __str__(self):
        return self.name


class DayOfWeekSchedule(models.Model):
    tasks = models.ManyToManyField(Task)
    name = models.CharField(max_length=20)

    def __str__(self):
        return self.name

Note that the ManyToMany field directly accesses the Task class. (Also note that I retained the __str__ methods to make the shell output more meaningful.)

Experiment

In the shell experiment shown in the listing below, I set up a few Tasks and
a couple of DayOfWeekSchedules and then add “first task” and “second
task” to one of the schedules. Once this is done, I attempt to add “first
task” to the schedule again, and we see that it does not have the desired
effect.

>>> # import our models
>>> from tasks.models import Task, DayOfWeekSchedule
>>>
>>> # populate our database with some simple tasks and schedules
>>> Task.objects.create(name="first task", required=False)
<Task: first task>
>>> Task.objects.create(name="second task", required=True)
<Task: second task>
>>> Task.objects.create(name="third task", required=False)
<Task: third task>
>>> DayOfWeekSchedule.objects.create(name="sched1")
<DayOfWeekSchedule: sched1>
>>> DayOfWeekSchedule.objects.create(name="sched2")
<DayOfWeekSchedule: sched2>
>>> Task.objects.all()
<QuerySet [<Task: first task>, <Task: second task>, <Task: third task>]>
>>> DayOfWeekSchedule.objects.all()
<QuerySet [<DayOfWeekSchedule: sched1>, <DayOfWeekSchedule: sched2>]>
>>>
>>> # add a task to a schedule
>>> s = DayOfWeekSchedule.objects.get(name='sched2')
>>> t = Task.objects.get(name='first task')
>>> s.tasks.add(t)
>>> s.tasks.all()
<QuerySet [<Task: first task>]>
>>>
>>> # add other task to that schedule
>>> t = Task.objects.get(name='second task')
>>> s.tasks.add(t)
>>> s.tasks.all()
<QuerySet [<Task: first task>, <Task: second task>]>
>>>
>>> # attempt to add the first task to the schedule again
>>> s = DayOfWeekSchedule.objects.get(name='sched2')
>>> t = Task.objects.get(name='first task')
>>> s.tasks.add(t)
>>> s.tasks.all()
<QuerySet [<Task: first task>, <Task: second task>]>

Note that at the end, we still only have a single copy of “first task” in the schedule.

Many To Many with an Intermediate Class

Now we’ll retry the experiment with the “through=” intermediate class specified in the ManyToMany relationship.

Not-Quite-As-Simple Solution Code

The model code for this is quite similar.  Note the addition of the “through=” option and of the DayTask class.

from django.db import models


class Task(models.Model):
    name = models.CharField(max_length=256)
    required = models.BooleanField()

    def __str__(self):
        return self.name


class DayOfWeekSchedule(models.Model):
    tasks = models.ManyToManyField(Task, through='DayTask')
    name = models.CharField(max_length=20)

    def __str__(self):
        return self.name


class DayTask(models.Model):
    # on_delete is optional in Django 1.x (CASCADE is the default there)
    # and required from Django 2.0 on.
    task = models.ForeignKey(Task, on_delete=models.CASCADE)
    schedule = models.ForeignKey(DayOfWeekSchedule, on_delete=models.CASCADE)

Experiment #2

This script is as close as possible to the first one.  The only difference is the extra steps we need to take to add the ManyToMany relationship.  We need to manually create a DayTask object, initializing it with the Task and Schedule objects, and then save it.  While this is slightly more cumbersome in the code, it does produce the desired result: two copies of “first task” are present in the schedule at the end.

>>> # import our models
>>> from tasks.models import Task, DayOfWeekSchedule, DayTask
>>>
>>> # populate our database with some simple tasks and schedules
>>> Task.objects.create(name="first task", required=False)
<Task: first task>
>>> Task.objects.create(name="second task", required=True)
<Task: second task>
>>> Task.objects.create(name="third task", required=False)
<Task: third task>
>>> DayOfWeekSchedule.objects.create(name="sched1")
<DayOfWeekSchedule: sched1>
>>> DayOfWeekSchedule.objects.create(name="sched2")
<DayOfWeekSchedule: sched2>
>>> Task.objects.all()
<QuerySet [<Task: first task>, <Task: second task>, <Task: third task>]>
>>> DayOfWeekSchedule.objects.all()
<QuerySet [<DayOfWeekSchedule: sched1>, <DayOfWeekSchedule: sched2>]>
>>>
>>> # add a task to a schedule
>>> s = DayOfWeekSchedule.objects.get(name='sched2')
>>> t = Task.objects.get(name='first task')
>>> # cannot simply add directly, must create intermediate object see
>>> # https://docs.djangoproject.com/en/1.9/topics/db/models/#extra-fields-on-many-to-many-relationships
>>> # s.tasks.add(t)
>>> d1 = DayTask(task=t, schedule=s)
>>> d1.save()
>>> s.tasks.all()
<QuerySet [<Task: first task>]>
>>>
>>> # add other task to that schedule
>>> t = Task.objects.get(name='second task')
>>> dt2 = DayTask(task=t, schedule=s)
>>> dt2.save()
>>> # s.tasks.add(t)
>>> s.tasks.all()
<QuerySet [<Task: first task>, <Task: second task>]>
>>>
>>> # attempt to add the first task to the schedule again
>>> s = DayOfWeekSchedule.objects.get(name='sched2')
>>> t = Task.objects.get(name='first task')
>>> dt3 = DayTask(task=t, schedule=s)
>>> dt3.save()
>>> s.tasks.all()
<QuerySet [<Task: first task>, <Task: second task>, <Task: first task>]>

But…Why?

The short answer is that I’m not entirely sure why the intermediate class is needed to allow multiple instances.  It’s fairly clear that it is tied to how the Django code manages those relationships.  Evidence confirming that can be seen in the migration script generated for each of the models.

The first model generates these operations:

operations = [
    migrations.CreateModel(
        name='DayOfWeekSchedule',
        fields=[
            ('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
            ('name', models.CharField(max_length=20)),
        ],
    ),
    migrations.CreateModel(
        name='Task',
        fields=[
            ('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
            ('name', models.CharField(max_length=256)),
            ('required', models.BooleanField()),
        ],
    ),
    migrations.AddField(
        model_name='dayofweekschedule',
        name='tasks',
        field=models.ManyToManyField(to='tasks.Task'),
    ),
]

Notice the final AddField call which adds “tasks” to the “dayofweekschedule” model directly.

The second model (shown above) generates a slightly different set of migration operations:

operations = [
    migrations.CreateModel(
        name='DayOfWeekSchedule',
        fields=[
            ('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
            ('name', models.CharField(max_length=20)),
        ],
    ),
    migrations.CreateModel(
        name='DayTask',
        fields=[
            ('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
            ('schedule', models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, to='tasks.DayOfWeekSchedule')),
        ],
    ),
    migrations.CreateModel(
        name='Task',
        fields=[
            ('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
            ('name', models.CharField(max_length=256)),
            ('required', models.BooleanField()),
        ],
    ),
    migrations.AddField(
        model_name='daytask',
        name='task',
        field=models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, to='tasks.Task'),
    ),
    migrations.AddField(
        model_name='dayofweekschedule',
        name='tasks',
        field=models.ManyToManyField(through='tasks.DayTask', to='tasks.Task'),
    ),
]

This time it adds task to the daytask model and tasks to the dayofweekschedule model.  I have to admit here that I really wanted this to show the DayTask object being used in the DayOfWeekSchedule class as a proxy, but that’s not the case.

Examining the databases generated by these two models showed no significant differences there, either.
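For anyone who wants to repeat that comparison mechanically rather than by eye, here is a minimal sketch using only the standard library's sqlite3 module. The function names and the idea of keeping one db.sqlite3 copy per model version are assumptions for illustration, not something from the KidsTasks repo. One detail worth checking with it is whether either join table declares the (schedule, task) pair unique, since a constraint like that is easy to miss when reading schemas side by side.

```python
import sqlite3


def dump_schema(db_path):
    """Return {object_name: CREATE statement} for a SQLite database file."""
    conn = sqlite3.connect(db_path)
    try:
        # sqlite_master lists every table and index; auto-generated indexes
        # have no SQL text, so they are skipped here.
        rows = conn.execute(
            "SELECT name, sql FROM sqlite_master "
            "WHERE sql IS NOT NULL ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return dict(rows)


def diff_schemas(path_a, path_b):
    """Return the sorted names whose CREATE statements differ."""
    a, b = dump_schema(path_a), dump_schema(path_b)
    return sorted(n for n in set(a) | set(b) if a.get(n) != b.get(n))
```

Calling diff_schemas() on the two database files (hypothetical names such as plain.sqlite3 and through.sqlite3) returns the tables and indexes that are not defined identically, and dump_schema() shows the CREATE statements themselves.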

A Quick Look at the Source

One of the beauties of working with open source software is the ability to dive in and see for yourself what’s going on.  Looking at the Django source, you can find the code that adds a relationship in django/db/models/fields/related_descriptors.py (at line 918 in the version I checked out).

        def add(self, *objs):
            # ... stuff deleted ...
            self._add_items(self.source_field_name,
                            self.target_field_name, *objs)

(Actually, _add_items can be called twice, once for a forward and once for a reverse relationship.)  Looking at _add_items (line 1041 in my copy), we see, after it builds the list of new_ids to insert, this chunk of code:

                db = router.db_for_write(self.through, 
                                         instance=self.instance)
                vals = (self.through._default_manager.using(db)
                        .values_list(target_field_name, flat=True)
                        .filter(**{
                            source_field_name: self.related_val[0],
                            '%s__in' % target_field_name: new_ids,
                        }))
                new_ids = new_ids - set(vals)

which I suspect accounts for the difference.  This code gets the list of current values in the relation table and removes that set from the set of new_ids.  I believe that the filter here will respond differently if we have an intermediate class defined.  NOTE: I did not run this code live to test this theory, so if I’m wrong, feel free to point out how and where in the comments.
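To make the set arithmetic in that snippet concrete, here is a plain-Python sketch of the same idea. It is not Django's code: filter_new_ids and the list of (source, target) tuples are hypothetical stand-ins for the query against the relation table.

```python
def filter_new_ids(existing_rows, source_id, candidate_ids):
    """Drop candidate target ids that are already linked to source_id.

    existing_rows is a list of (source_id, target_id) pairs standing in
    for the rows of the relation table.
    """
    new_ids = set(candidate_ids)
    # Stand-in for the values_list(...).filter(...) query: target ids that
    # are already linked to this source and are among the candidates.
    vals = {
        target for source, target in existing_rows
        if source == source_id and target in new_ids
    }
    return new_ids - vals


rows = [(2, 1), (2, 2)]               # schedule 2 already has tasks 1 and 2
print(filter_new_ids(rows, 2, [1]))   # set(): task 1 is already there
print(filter_new_ids(rows, 2, [3]))   # {3}: task 3 is new
```

With this kind of filtering in front of the insert, a second add() of “first task” has nothing left to write, which matches what the first experiment showed.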

Even if this is not quite correct, after walking through some code, I’m satisfied that the intermediate class definitely causes some different behavior internally in Django.

Next time I’ll jump back into the KidsTasks code.

Thanks for reading!

KidsTasks – Working on Models

This is part two in the KidsTasks series where we’re designing and implementing a django app to manage daily task lists for my kids. See part 1 for details on requirements and goals.

Model Design Revisited

As I started coding up the models and the corresponding admin pages for the design I presented in the last section it became clear that there were several bad assumptions and mistakes in that design. (I plan to write up a post about designs and their mutability in the coming weeks.)

The biggest conceptual problem I had was the difference between “python objects” and “django models”. Django models correspond to database tables and thus do not map easily to things like “I want a list of Tasks in my DayOfWeekSchedule”.

After building up a subset of the models described in part 1, I found that the CountedTask model wasn’t going to work the way I had envisioned. Creating it as a direct subclass of Task caused unexpected (initially, at least) behavior in that all CountedTasks were also Tasks and thus showed up in all lists where Tasks could be added. While this behavior makes sense, it doesn’t fit the model I was working toward. After blundering with a couple of other ideas, it finally occurred to me that the main problem was the fundamental design. If something seems really cumbersome to implement it might be pointing to a design error.

Stepping back, it occurred to me that the idea of a “Counted” task was putting information at the wrong level. An individual task shouldn’t care if it’s one of many similar tasks in a Schedule, nor should it know how many there are. That information should be part of the Schedule models instead.

Changing this took more experimenting than I wanted, largely due to a mismatch between my thinking and how django models work. The key to working through this confusion was figuring out how to add multiple Tasks of the same type to a Schedule. That led me to this Stack Overflow question which describes using an intermediate model to relate the two items. This does exactly what I’m looking for, allowing me to say that Kid1 needs to Practice Piano twice on Tuesdays without the need for a CountedTask model.
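The shift in thinking can be sketched in plain Python, outside of Django entirely. Everything here other than Task and CountedTask is a hypothetical name, and storing a count on the entry is only one way to express "twice on Tuesdays" (repeating the entry would be another). It is meant to show where the information lives, not how the real models are written.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    required: bool = False


@dataclass
class CountedTask(Task):
    # Original idea: the task itself carries the number of occurrences.
    count: int = 1


@dataclass
class ScheduleEntry:
    # Revised idea: the link between schedule and task carries the count.
    task: Task
    count: int = 1


@dataclass
class Schedule:
    name: str
    entries: list = field(default_factory=list)


piano = Task("Practice Piano", required=True)
tuesday = Schedule("Tuesday", [ScheduleEntry(piano, count=2)])

# A CountedTask is still a Task, which is why it showed up in every Task list.
print(isinstance(CountedTask("Clean room", count=3), Task))  # True
print(tuesday.entries[0].count)  # 2
```

The ScheduleEntry class plays the role the intermediate model plays in Django: the Task stays ignorant of how often it is scheduled.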

Changing this created problems for our current admin.py, however. I found ideas for how to clean that up here, which describes how to use inlines as part of the admin pages.

Using inlines and intermediate models, I was able to build up a schedule for a kid in a manner similar to my initial vision.  The next steps will be to work on views for this model and see where the design breaks!

Wrap Up

I’m going to stop this session here but I want to add a few interesting points and tidbits I’ve discovered on the way:

  • If you make big changes to the model and you don’t yet have any significant data, you can wipe out the database easily and start over with the following steps:
$ rm db.sqlite3 tasks/migrations/ -rf
$ ./manage.py makemigrations tasks
$ ./manage.py migrate
$ ./manage.py createsuperuser
$ ./manage.py runserver
  • For models, it’s definitely worthwhile to add a __str__ method (__unicode__ if you’re on Python 2) and a Meta class to each one.  The __str__ method controls how instances are described, at least in the admin pages. The Meta class allows you to control the ordering when items of this model are listed. Cool!
  • I found (and forgot to note where) in the official docs an example of using a single char to store the day of the week while displaying the full day name. It looks like this:
    day_of_week_choices = (
        ('M', 'Monday'),
        ('T', 'Tuesday'),
        ('W', 'Wednesday'),
        ('R', 'Thursday'),
        ('F', 'Friday'),
        ('S', 'Saturday'),
        ('N', 'Sunday'),
    )
	...
    day_name = models.CharField(max_length=1, choices=day_of_week_choices)
  • NOTE that we’re going to have to tie the “name” fields in many of these models to the kid with which they’re associated. I’m considering whether the kid can be combined into the schedule, but I don’t think that’s quite right. Certainly changes are coming to that part of the design.
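As a small illustration of the day-of-week choices from the list above, here is a plain-Python sketch of the code-to-name lookup. The display_day helper is a hypothetical name; inside Django, a field declared with choices gets a generated get_<field>_display() method (here, get_day_name_display()) that does the equivalent lookup for you.

```python
day_of_week_choices = (
    ('M', 'Monday'),
    ('T', 'Tuesday'),
    ('W', 'Wednesday'),
    ('R', 'Thursday'),
    ('F', 'Friday'),
    ('S', 'Saturday'),
    ('N', 'Sunday'),
)

# Map each stored one-character code to its display name.
day_names = dict(day_of_week_choices)


def display_day(code):
    """Return the full day name for a stored one-character code."""
    return day_names[code]


print(display_day('R'))  # Thursday
```

The odd-looking codes 'R' and 'N' exist only to keep every code unique within a single character.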

That’s it! The state of the code at the point I’m writing this can be found here:

git@github.com:jima80525/KidTasks.git
git checkout blog/02-Models-first-steps

Thanks for reading!

KidsTasks: The first application – Data Models

This post is the first in a series where I will document the process of creating a new django app from the ground up. In this post I’ll talk a bit about the high-level requirements of the app we’ll be building, and I’ll follow up next with a quick run-down on how to get the first steps of the app running.

Since this is a blog for experienced developers, I’m going to make a few assumptions here:

  • You know how to use git, at least at a rudimentary level
  • You have and know how to use a good editor
  • You’ve already worked through the Django tutorial in the official documentation

Note that I tend to keep diagrams and requirements informal as my goal here is not to teach/learn UML or rigid design practices, but rather to focus on learning full-stack development. I am using graphviz tied to the django-extensions tools to allow me to automatically export my app’s models and diagram them. This is quite handy. The install is here and the instructions are here.

KidsTasks

My first django app will be a replacement for the daily/weekly task list my kids are using currently. They have a paper chart on which they move magnets to indicate that a task is complete. Each day the list is reset for a new day.

Requirements

Here’s a quick enumeration of requirements. I’m sure this isn’t all of them, but it’s sufficient to start doing the data design for the project.

  • The list of tasks for each day consists of three sets:
    1) tasks due on each day of the week (e.g. Tuesday is bath day)
    2) tasks due on a particular date (e.g. get present for sister on 9/3)
    3) tasks due sometime this week, but not necessarily today (e.g. clean room). Weekly tasks can have multiple occurrences.
  • Each individual task can be “required” or not
  • The kids should be easily able to mark a task as completed (or uncompleted in case of mistakes).
  • Mom should be able to also mark individual tasks in addition to being able to add, remove and reschedule tasks.
  • A history of completed tasks should be viewable over different time frames (yesterday, last week, last month, custom date range).
  • Kids should be able to view all tasks, completed and uncompleted, as well as filter the list to show only one kind.

Data Model

[Figure: UML diagram of data model.]

As you can see from the diagram, the fundamental data object here is a task. There are two types of tasks: Task and CountedTask with the latter sub-classing and adding a number of occurrences needed for the weekly tasks.

Tasks are brought together into DayOfWeekSchedules which allow us to list what tasks are due on Mondays, and DateSchedules, which allow us to list what’s due on the 3rd of October.

CountedTasks form WeekSchedules and HistoricDates (more on history later).

The different Schedules are combined into a Schedule object which is then owned by Kids.

Kids also have Histories which are comprised of HistoricDates which, in turn, are CountedTasks coupled with a date.
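Since the relationships are easier to follow in code than in prose, here is a rough plain-Python sketch of the structure described above. The class names come from the design; every field name is an assumption made for illustration, and these are ordinary dataclasses rather than Django models.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class Task:
    name: str
    required: bool = False


@dataclass
class CountedTask(Task):
    count: int = 1  # occurrences needed, used for the weekly tasks


@dataclass
class DayOfWeekSchedule:
    day: str  # e.g. "Monday"
    tasks: List[Task] = field(default_factory=list)


@dataclass
class DateSchedule:
    due: date
    tasks: List[Task] = field(default_factory=list)


@dataclass
class WeekSchedule:
    tasks: List[CountedTask] = field(default_factory=list)


@dataclass
class HistoricDate:
    day: date
    tasks: List[CountedTask] = field(default_factory=list)


@dataclass
class Schedule:
    day_of_week: List[DayOfWeekSchedule] = field(default_factory=list)
    dates: List[DateSchedule] = field(default_factory=list)
    week: WeekSchedule = field(default_factory=WeekSchedule)


@dataclass
class Kid:
    name: str
    schedule: Schedule = field(default_factory=Schedule)
    history: List[HistoricDate] = field(default_factory=list)


kid = Kid("Kid1")
kid.schedule.day_of_week.append(DayOfWeekSchedule("Tuesday", [Task("Bath")]))
kid.schedule.week.tasks.append(CountedTask("Clean room", count=2))
print(kid.schedule.week.tasks[0].count)  # 2
```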

This is the basic data structure with which we’ll start our adventure. Next time I’ll do more Django-specific work, talking about how to get from the last stage (basic Django install and app creation) to the point where we can view and manipulate these objects on the admin page.

Starting a blog with SimpleProgrammer

I’m going to take a quick diversion from learning django in this post and share my impressions of the course I’ve taken that helped and encouraged me to start writing.

The course is How to Create a Blog by John Sonmez at https://simpleprogrammer.com.

It’s a free course and had some really good pointers about the soft skills required for creating a successful blog.  There’s not a lot of technical how-to (which is fine, the web is full of that information), but it focuses much more on things like:

  • Finding a focused niche to write about
  • Being persistent and consistent in writing
  • Examining ways to grow an audience by growing a network of connections in your community.

These are all topics that are near and dear to me.  Much of John’s writing focuses on helping younger developers starting out and how they can build new careers, but I found ideas that not only resonated with my own thinking but also prompted me to change some behaviors and focus on my on-line network.

I’ve not yet read John’s book on Soft Skills, but it’s on my list and I’m looking forward to seeing what new ideas I can use from that.

Thanks John, for the interesting course!

We now return you to your regularly scheduled Django-talk.

Setting up for Django development

Let’s start with a quick note on tools and environment.  I’m developing on Linux Mint (17) so the details will be geared toward that system.  I am not planning on doing many posts that are system-specific, so this generally shouldn’t be an issue, but this post is about getting set up and contains linux/ubuntu specific details.  If you’re working on a Windows system, you’ll need to do some translation here.  I’m not the person to ask about that, however.

Python versions

Mint is currently shipping with both Python 2.7 and 3.4.3 installed.  The work I’m doing here, and, if you’re starting new projects, the work you will be doing should be in Python 3.  Most of my python experience to this point is in 2.7, but that’s largely due to a key library (Cheetah templates) not supporting Python 3 when I started with it.  For new code, 3.x is the way to go.  (We’ll see if I still say that after coding in it for a while.)

virtualenv

Unfortunately, Mint ships with 2.7 as the default.  My immediate thought on learning this was to try to change it.  (NOTE: do NOT try to uninstall python2.7 from Mint.  You’ll just end up re-installing Mint.)  The recommended way to work with Python 3 instead is to use virtualenv.  Details on how to use virtualenv can be found here:

http://docs.python-guide.org/en/latest/dev/virtualenvs/

but the install is quite simple:

$ sudo pip install virtualenv

I’ll give examples below showing how to invoke it.

Installing django 1.10

Again, this information is covered in greater detail on other sites.  My intention here is to provide the quick method for install on a similar system for developers who have an idea of what’s going on.  I think the basic instructions for this install are pretty good here:

http://tutorial.djangogirls.org/en/django_installation/

but the basic premise is shown in the following capture:

$ virtualenv -p /usr/bin/python3 venv
Running virtualenv with interpreter /usr/bin/python3
Using base prefix '/usr'
New python executable in /home/jima/coding/server1/venv/bin/python3
Also creating executable in /home/jima/coding/server1/venv/bin/python
Installing setuptools, pip, wheel...done.

$ source venv/bin/activate
$ pip install django~=1.10
Collecting django~=1.10
  Downloading Django-1.10-py2.py3-none-any.whl (6.8MB)
    100% |████████████████████████████████| 6.8MB 105kB/s
Installing collected packages: django
Successfully installed django-1.10

Note that I’m installing django into the virtualenv rather than on my machine.  I’m doing this to allow myself to play with different versions as there’s a small project I’m thinking about that may involve converting from previous django versions.
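Because it is easy to forget the activate step and install into the system Python by accident, a quick check from inside the interpreter can help. This is a best-effort sketch using only the standard library; in_virtualenv is a hypothetical helper name, and the check relies on how virtualenv and venv adjust sys.prefix rather than on any official "am I in a virtualenv" API.

```python
import sys


def in_virtualenv():
    """Best-effort check for an active virtual environment."""
    # Older virtualenv releases set sys.real_prefix; venv and newer
    # virtualenv releases make sys.prefix differ from sys.base_prefix.
    base = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or sys.prefix != base


print(sys.executable)    # path of the interpreter actually running
print(in_virtualenv())
```

Run after `source venv/bin/activate`, the first line should point inside the venv directory.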

Starting a new project

Finally, to wrap up this quick getting-started page, here’s how to start a project and test that your install is correct, from creating the virtualenv to running the development server.

$ virtualenv -p /usr/bin/python3 venv
Running virtualenv with interpreter /usr/bin/python3
Using base prefix '/usr'
New python executable in /home/jima/coding/testing/venv/bin/python3
Also creating executable in /home/jima/coding/testing/venv/bin/python
Installing setuptools, pip, wheel...done.

$ source venv/bin/activate
$ pip install django~=1.10
Collecting django~=1.10
  Using cached Django-1.10.1-py2.py3-none-any.whl
Installing collected packages: django
Successfully installed django-1.10.1
$ django-admin startproject myproject
$ cd myproject/
$ python manage.py runserver
Performing system checks...

System check identified no issues (0 silenced).

You have 13 unapplied migration(s). Your project may not work properly until you apply the migrations for app(s): admin, auth, contenttypes, sessions.
Run 'python manage.py migrate' to apply them.

September 02, 2016 - 18:47:17
Django version 1.10.1, using settings 'myproject.settings'
Starting development server at http://127.0.0.1:8000/
Quit the server with CONTROL-C.

At that point you should be able to point your browser at the development server and see the “welcome” page.

Follow Along!

The first series of this blog will be developing a web app to manage tasks for my kids.  I’ll be walking through this in steps as I learn.  If you’d like to follow along, the project is on github at

git@github.com:jima80525/KidTasks.git

and I’ll be creating branches for each blog post in which I work on it.  For this post, simple as it is, the branch can be found at

git checkout blog/01-InitialSetup

I’ll be adding more next week as I describe the basic requirements and start in on the data model!

Wrap up

That’s it! This post is much more recipe-based than I’m intending for the rest of this blog. Next time I’ll start discussing the design process and some ideas on the first app I’ll be developing.  Hopefully then we’ll start delving into things that are not just “how-to” lists.