martin056/fake_it.md

## fake_it.md

      
TL;DR This article is targeted at programmers who have very little or no experience with fakers and factories. If you are already skilled in the topic this article may not be that interesting to you.

Introduction

Django apps follow the Model-Template-View (MTV) structure, Django's variant of the pattern more widely known as MVC. To test an app's functionality we usually have to create some model instances and work with them and the database.
A nice and easy approach for doing so is to create fake data in our tests in the form of factories and fakers.
In this article we are going to look at some practical examples and techniques that will improve the readability and behavior of your tests.
What are we going to use?

In our Django projects we use faker to generate fake data and factory_boy to create factories of our models.
Why and Where can I use them?

Why?


We usually want to generate some random strings and numbers.
We usually want to generate some unique instances of our models that have no common fields.

Where?

Have you ever hard coded "asdf" or 1 in your tests where you needed something random?
Furthermore, let's say we have the model User and every user has a name field. We want to test some functionality connected to it. If our tests need just one user we can easily name it "Adam" and everything will work fine (except that all of our tests will have the user "Adam").
That seems OK, but what happens when we want to test anything connected to a list of 10, 100 or 1000 users? Do we have to think of 1000 unique names? Or should we name them with randomly typed characters?
All of this is possible but will surely slow down our workflow and bring us trouble.
That's where our faker and factories come in!
Simple Django project

For better illustration of my examples, I've created a simple Django app. You can find it here.
The project is called MyLibrary and has the following model schema:

That is how it looks in our models.py file:
# in models.py

import uuid

from djmoney.models.fields import MoneyField

from django.db import models
from django.utils import timezone

from my_library.users.models import User


class Library(models.Model):
    address = models.CharField(max_length=255)
    librarian = models.OneToOneField(User,
                                     related_name='library',
                                     on_delete=models.CASCADE)


class Book(models.Model):
    library = models.ForeignKey(Library,
                                related_name='books',
                                on_delete=models.CASCADE)

    public_id = models.UUIDField(unique=True, default=uuid.uuid4, editable=False)
    title = models.CharField(max_length=255)
    description = models.CharField(max_length=500)


class BookBorrow(models.Model):
    book = models.ForeignKey(Book,
                             related_name='book_borrows',
                             on_delete=models.CASCADE)
    user = models.ForeignKey(User,
                             related_name='book_borrows',
                             on_delete=models.CASCADE)
    start_date = models.DateTimeField(default=timezone.now)
    end_date = models.DateTimeField(null=True, blank=True)
    charge = MoneyField(max_digits=10, decimal_places=2, default_currency='GBP')
    returned = models.BooleanField(default=False)

    class Meta:
        unique_together = ('book', 'user', 'start_date', )
Functionality

Let's say we want to have a view where we show all books borrowed by a single user. This is how such a simple view would look:
# in views.py
from django.shortcuts import render
from django.http import HttpResponseNotAllowed

from .models import BookBorrow


def book_borrow_list(request, user_id):
    if request.method == 'GET':
        object_list = BookBorrow.objects.filter(user__id=user_id)

        return render(request, 'book_borrow_list.html', locals())

    return HttpResponseNotAllowed(['GET'])
Now we have some custom logic in our system. That's right! It's time to add some tests.
Plan our tests

Let's take a look at our view. It should list all books that are borrowed by a user. This leads us directly to two test cases:

Test if all books borrowed by user X are listed;
Test if the listed borrowed books belong only to user X (not to user Y, nor to user Z and so on).

Test it

Let's add our first test. For this example we want to test the behavior of our system:

If we have 5 books borrowed by a single user we have to ensure that the view lists them all.
If we have 5 books borrowed by a single user and 5 books borrowed by another user we have to ensure that the view lists only the books of the user whose ID is given in the URL.

# in tests/test_views.py
from test_plus import TestCase

from my_library.users.models import User

from ..models import Book, BookBorrow, Library


class BookBorrowListViewTests(TestCase):
    def setUp(self):
        self.librarian = User.objects.create(name='Librarian',
                                             email='librarian@test.com',
                                             is_superuser=True)
        self.library = Library.objects.create(address='Test address',
                                              librarian=self.librarian)
        self.user = User.objects.create(name='Tester', email='tester@test.com')
        self.url = self.reverse('library:book_borrow_list',
                                user_id=self.user.id)

    def test_with_several_books_borrowed_by_one_user(self):
        book1 = Book.objects.create(
            library=self.library,
            title='Test Title 1',
            description='asdf'
        )
        book2 = Book.objects.create(
            library=self.library,
            title='Test Title 2',
            description='asdf'
        )
        book3 = Book.objects.create(
            library=self.library,
            title='Test Title 3',
            description='asdf'
        )
        book4 = Book.objects.create(
            library=self.library,
            title='Test Title 4',
            description='asdf'
        )
        book5 = Book.objects.create(
            library=self.library,
            title='Test Title 5',
            description='asdf'
        )

        BookBorrow.objects.create(user=self.user, book=book1, charge=1)
        BookBorrow.objects.create(user=self.user, book=book2, charge=1)
        BookBorrow.objects.create(user=self.user, book=book3, charge=1)
        BookBorrow.objects.create(user=self.user, book=book4, charge=1)
        BookBorrow.objects.create(user=self.user, book=book5, charge=1)

        response = self.get(self.url)

        self.assertEqual(200, response.status_code)
        self.assertContains(response, book1.title)
        self.assertContains(response, book2.title)
        self.assertContains(response, book3.title)
        self.assertContains(response, book4.title)
        self.assertContains(response, book5.title)

    def test_with_several_books_borrowed_by_one_user_and_another_user(self):
        another_user = User.objects.create(name='Another user', email='another@test.com')

        user1_book1 = Book.objects.create(
            library=self.library,
            title='Test Title 1',
            description='asdf'
        )
        user1_book2 = Book.objects.create(
            library=self.library,
            title='Test Title 2',
            description='asdf'
        )
        user1_book3 = Book.objects.create(
            library=self.library,
            title='Test Title 3',
            description='asdf'
        )
        user1_book4 = Book.objects.create(
            library=self.library,
            title='Test Title 4',
            description='asdf'
        )
        user1_book5 = Book.objects.create(
            library=self.library,
            title='Test Title 5',
            description='asdf'
        )

        user2_book1 = Book.objects.create(
            library=self.library,
            title='Test Title A',
            description='asdf'
        )
        user2_book2 = Book.objects.create(
            library=self.library,
            title='Test Title B',
            description='asdf'
        )
        user2_book3 = Book.objects.create(
            library=self.library,
            title='Test Title C',
            description='asdf'
        )
        user2_book4 = Book.objects.create(
            library=self.library,
            title='Test Title D',
            description='asdf'
        )
        user2_book5 = Book.objects.create(
            library=self.library,
            title='Test Title F',
            description='asdf'
        )

        BookBorrow.objects.create(user=self.user, book=user1_book1, charge=1)
        BookBorrow.objects.create(user=self.user, book=user1_book2, charge=1)
        BookBorrow.objects.create(user=self.user, book=user1_book3, charge=1)
        BookBorrow.objects.create(user=self.user, book=user1_book4, charge=1)
        BookBorrow.objects.create(user=self.user, book=user1_book5, charge=1)

        BookBorrow.objects.create(user=another_user, book=user2_book1, charge=1)
        BookBorrow.objects.create(user=another_user, book=user2_book2, charge=1)
        BookBorrow.objects.create(user=another_user, book=user2_book3, charge=1)
        BookBorrow.objects.create(user=another_user, book=user2_book4, charge=1)
        BookBorrow.objects.create(user=another_user, book=user2_book5, charge=1)

        response = self.get(self.url)

        self.assertEqual(200, response.status_code)
        self.assertContains(response, user1_book1.title)
        self.assertContains(response, user1_book2.title)
        self.assertContains(response, user1_book3.title)
        self.assertContains(response, user1_book4.title)
        self.assertContains(response, user1_book5.title)

        self.assertNotContains(response, user2_book1.title)
        self.assertNotContains(response, user2_book2.title)
        self.assertNotContains(response, user2_book3.title)
        self.assertNotContains(response, user2_book4.title)
        self.assertNotContains(response, user2_book5.title)
Urgh...

I think that's what most of you thought when you saw these 2 tests. Yes, they work correctly, but imagine generating 100 objects instead of 5 or having more than 2 tests for a view (which you most likely will).
Let's review what is wrong (even though all of you noticed it):

The tests are so long that you need to scroll several times to get through them. That practically guarantees nobody will read them (yes, just like you skipped them a couple of seconds ago).
Imagine adding a new required field to the model that you use to generate objects: every create call would have to be updated.
We have only 2 tests, but in real-life projects you will have way more for your views.
Typing random strings and numbers in your tests is never a good idea.
R.I.P. DRY

Let's start refactoring

Now that we have summarized the bad practices in our current tests, we are ready to refactor them step by step.
Use faker instead of hard coded values

First, what is faker? Faker is a Python package that generates fake data for you.
The first step is to replace our hard coded values with values generated by faker. This is how our tests are going to look:
from test_plus import TestCase
from faker import Factory

from my_library.users.models import User
from my_library.library.models import Book, BookBorrow, Library


faker = Factory.create()


class BookBorrowListViewTests(TestCase):
    def setUp(self):
        self.librarian = User.objects.create(name=faker.name(),
                                             email=faker.email(),
                                             is_superuser=True)
        self.library = Library.objects.create(address=faker.street_address(),
                                              librarian=self.librarian)
        self.user = User.objects.create(name=faker.name(), email=faker.email())
        self.url = self.reverse('library:book_borrow_list',
                                user_id=self.user.id)

    def test_with_several_books_borrowed_by_one_user(self):
        book1 = Book.objects.create(
            library=self.library,
            title=faker.word(),
            description=faker.text()
        )
        book2 = Book.objects.create(
            library=self.library,
            title=faker.word(),
            description=faker.text()
        )
        book3 = Book.objects.create(
            library=self.library,
            title=faker.word(),
            description=faker.text()
        )
        book4 = Book.objects.create(
            library=self.library,
            title=faker.word(),
            description=faker.text()
        )
        book5 = Book.objects.create(
            library=self.library,
            title=faker.word(),
            description=faker.text()
        )

        BookBorrow.objects.create(user=self.user, book=book1, charge=faker.random_number())
        BookBorrow.objects.create(user=self.user, book=book2, charge=faker.random_number())
        BookBorrow.objects.create(user=self.user, book=book3, charge=faker.random_number())
        BookBorrow.objects.create(user=self.user, book=book4, charge=faker.random_number())
        BookBorrow.objects.create(user=self.user, book=book5, charge=faker.random_number())

        response = self.get(self.url)

        self.assertEqual(200, response.status_code)
        self.assertContains(response, book1.title)
        self.assertContains(response, book2.title)
        self.assertContains(response, book3.title)
        self.assertContains(response, book4.title)
        self.assertContains(response, book5.title)

    def test_with_several_books_borrowed_by_one_user_and_another_user(self):
        another_user = User.objects.create(name=faker.name(), email=faker.email())

        user1_book1 = Book.objects.create(
            library=self.library,
            title=faker.word(),
            description=faker.text()
        )
        user1_book2 = Book.objects.create(
            library=self.library,
            title=faker.word(),
            description=faker.text()
        )
        user1_book3 = Book.objects.create(
            library=self.library,
            title=faker.word(),
            description=faker.text()
        )
        user1_book4 = Book.objects.create(
            library=self.library,
            title=faker.word(),
            description=faker.text()
        )
        user1_book5 = Book.objects.create(
            library=self.library,
            title=faker.word(),
            description=faker.text()
        )

        user2_book1 = Book.objects.create(
            library=self.library,
            title=faker.word(),
            description=faker.text()
        )
        user2_book2 = Book.objects.create(
            library=self.library,
            title=faker.word(),
            description=faker.text()
        )
        user2_book3 = Book.objects.create(
            library=self.library,
            title=faker.word(),
            description=faker.text()
        )
        user2_book4 = Book.objects.create(
            library=self.library,
            title=faker.word(),
            description=faker.text()
        )
        user2_book5 = Book.objects.create(
            library=self.library,
            title=faker.word(),
            description=faker.text()
        )

        BookBorrow.objects.create(user=self.user, book=user1_book1, charge=faker.random_number())
        BookBorrow.objects.create(user=self.user, book=user1_book2, charge=faker.random_number())
        BookBorrow.objects.create(user=self.user, book=user1_book3, charge=faker.random_number())
        BookBorrow.objects.create(user=self.user, book=user1_book4, charge=faker.random_number())
        BookBorrow.objects.create(user=self.user, book=user1_book5, charge=faker.random_number())

        BookBorrow.objects.create(user=another_user, book=user2_book1, charge=faker.random_number())
        BookBorrow.objects.create(user=another_user, book=user2_book2, charge=faker.random_number())
        BookBorrow.objects.create(user=another_user, book=user2_book3, charge=faker.random_number())
        BookBorrow.objects.create(user=another_user, book=user2_book4, charge=faker.random_number())
        BookBorrow.objects.create(user=another_user, book=user2_book5, charge=faker.random_number())

        response = self.get(self.url)

        self.assertEqual(200, response.status_code)
        self.assertContains(response, user1_book1.title)
        self.assertContains(response, user1_book2.title)
        self.assertContains(response, user1_book3.title)
        self.assertContains(response, user1_book4.title)
        self.assertContains(response, user1_book5.title)

        self.assertNotContains(response, user2_book1.title)
        self.assertNotContains(response, user2_book2.title)
        self.assertNotContains(response, user2_book3.title)
        self.assertNotContains(response, user2_book4.title)
        self.assertNotContains(response, user2_book5.title)

Check faker documentation for more useful methods.

What did we do?

To be honest, we have the same test functionality as before, but now there are no 'asdf', 'Test Title F', 1, etc. Faker generates such values randomly, so we no longer have to invent them ourselves.
Now we have no hard coded strings and integers in our code. This is a real improvement over the previous implementation, but there is still a lot of repeated code.
The next step is to create some factories.
What is actually a Factory?

Factories are Python classes that wrap our Django models: calling a factory creates a model instance and saves it to the database, just like <Model>.objects.create(...) would. The thing we love about them is that they give us a lot of automation.
In other words, instead of doing <Model>.objects.create(...) we are going to call ModelFactory() and it will do the magic for us.
Let's create factories for our models:
# in factories.py

import factory
from faker import Factory

from my_library.users.models import User
from my_library.library.models import (
    Book,
    Library,
    BookBorrow,
)


faker = Factory.create()


class UserFactory(factory.django.DjangoModelFactory):
    class Meta:
        model = User

    name = faker.name()
    email = faker.email()


class LibraryFactory(factory.django.DjangoModelFactory):
    class Meta:
        model = Library

    address = faker.street_address()
    librarian = factory.SubFactory(UserFactory)


class BookFactory(factory.django.DjangoModelFactory):
    class Meta:
        model = Book

    library = factory.SubFactory(LibraryFactory)
    title = faker.word()
    description = faker.text()


class BookBorrowFactory(factory.django.DjangoModelFactory):
    class Meta:
        model = BookBorrow

    book = factory.SubFactory(BookFactory)
    user = factory.SubFactory(UserFactory)
    charge = faker.random_number()

We usually like to store our factories in a separate app called seed. That way you can easily share a single faker instance. A second benefit is that you have all your factories in one place (since models can have foreign keys to models in other apps), which makes SubFactories easier to write.

Little clarification

In simple words:

In class Meta define the model of your factory
Use faker to generate random values
factory.SubFactory(MyModelFactory) == models.ForeignKey(MyModel)

Use them

Now that we have our factories, let's use them in the tests:
from test_plus import TestCase

from my_library.seed.factories import (
    UserFactory,
    BookFactory,
    LibraryFactory,
    BookBorrowFactory,
)


class BookBorrowListViewTests(TestCase):
    def setUp(self):
        self.librarian = UserFactory(is_superuser=True)
        self.library = LibraryFactory(librarian=self.librarian)
        self.user = UserFactory()
        self.url = self.reverse('library:book_borrow_list',
                                user_id=self.user.id)

    def test_with_several_books_borrowed_by_one_user(self):
        book1 = BookFactory(library=self.library)
        book2 = BookFactory(library=self.library)
        book3 = BookFactory(library=self.library)
        book4 = BookFactory(library=self.library)
        book5 = BookFactory(library=self.library)

        BookBorrowFactory(user=self.user, book=book1)
        BookBorrowFactory(user=self.user, book=book2)
        BookBorrowFactory(user=self.user, book=book3)
        BookBorrowFactory(user=self.user, book=book4)
        BookBorrowFactory(user=self.user, book=book5)

        response = self.get(self.url)

        self.assertEqual(200, response.status_code)
        self.assertContains(response, book1.title)
        self.assertContains(response, book2.title)
        self.assertContains(response, book3.title)
        self.assertContains(response, book4.title)
        self.assertContains(response, book5.title)

    def test_with_several_books_borrowed_by_one_user_and_another_user(self):
        another_user = UserFactory()

        user1_book1 = BookFactory(library=self.library)
        user1_book2 = BookFactory(library=self.library)
        user1_book3 = BookFactory(library=self.library)
        user1_book4 = BookFactory(library=self.library)
        user1_book5 = BookFactory(library=self.library)

        user2_book1 = BookFactory(library=self.library)
        user2_book2 = BookFactory(library=self.library)
        user2_book3 = BookFactory(library=self.library)
        user2_book4 = BookFactory(library=self.library)
        user2_book5 = BookFactory(library=self.library)

        BookBorrowFactory(user=self.user, book=user1_book1)
        BookBorrowFactory(user=self.user, book=user1_book2)
        BookBorrowFactory(user=self.user, book=user1_book3)
        BookBorrowFactory(user=self.user, book=user1_book4)
        BookBorrowFactory(user=self.user, book=user1_book5)

        BookBorrowFactory(user=another_user, book=user2_book1)
        BookBorrowFactory(user=another_user, book=user2_book2)
        BookBorrowFactory(user=another_user, book=user2_book3)
        BookBorrowFactory(user=another_user, book=user2_book4)
        BookBorrowFactory(user=another_user, book=user2_book5)

        response = self.get(self.url)

        self.assertEqual(200, response.status_code)
        self.assertContains(response, user1_book1.title)
        self.assertContains(response, user1_book2.title)
        self.assertContains(response, user1_book3.title)
        self.assertContains(response, user1_book4.title)
        self.assertContains(response, user1_book5.title)

        self.assertNotContains(response, user2_book1.title)
        self.assertNotContains(response, user2_book2.title)
        self.assertNotContains(response, user2_book3.title)
        self.assertNotContains(response, user2_book4.title)
        self.assertNotContains(response, user2_book5.title)
Now we don't really interact directly with the database from the tests and they....
Oops!

Yes, if you have already tried it, you will know that our tests started failing! The reason lies in how we combined the factories with faker.
Why? A call like title = faker.word() in a class body is executed only once, when Python defines the class, not every time the factory is used. As a result every BookFactory() gets the same title, and the assertNotContains checks in our second test fail because the other user's books have exactly the same titles as ours.
In other words, we have to call faker.<provider> lazily, every time we need a new value. Thankfully, factory_boy has an easy solution for this - LazyAttribute!
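The underlying Python behavior can be reproduced without faker or factory_boy. In this sketch a simple counter stands in for faker, and `EagerBook` and `LazyBook` are hypothetical classes used only for illustration.

```python
import itertools

# A counter stands in for faker: every next() call yields a new value.
counter = itertools.count(1)


class EagerBook:
    # Runs once, while Python executes the class body.
    title = f"title-{next(counter)}"


class LazyBook:
    def __init__(self):
        # Runs on every instantiation.
        self.title = f"title-{next(counter)}"


eager_titles = [EagerBook().title for _ in range(3)]
lazy_titles = [LazyBook().title for _ in range(3)]

print(eager_titles)  # ['title-1', 'title-1', 'title-1']
print(lazy_titles)   # ['title-2', 'title-3', 'title-4']
```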
Become Lazy!

This is how our factories' fields become lazy where needed:
import factory
from faker import Factory

from my_library.users.models import User
from my_library.library.models import (
    Book,
    Library,
    BookBorrow,
)


faker = Factory.create()


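class UserFactory(factory.django.DjangoModelFactory):
    class Meta:
        model = User

    # Lazy as well, so every user gets their own name and email
    name = factory.LazyAttribute(lambda _: faker.name())
    email = factory.LazyAttribute(lambda _: faker.email())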


class LibraryFactory(factory.django.DjangoModelFactory):
    class Meta:
        model = Library

    address = factory.LazyAttribute(lambda _: faker.street_address())
    librarian = factory.SubFactory(UserFactory)


class BookFactory(factory.django.DjangoModelFactory):
    class Meta:
        model = Book

    library = factory.SubFactory(LibraryFactory)
    title = factory.LazyAttribute(lambda _: faker.word())
    description = factory.LazyAttribute(lambda _: faker.text())


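class BookBorrowFactory(factory.django.DjangoModelFactory):
    class Meta:
        model = BookBorrow

    book = factory.SubFactory(BookFactory)
    user = factory.SubFactory(UserFactory)
    # digits=3 keeps the charge well inside max_digits=10
    charge = factory.LazyAttribute(lambda _: faker.random_number(digits=3))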

factory.LazyAttribute expects a function as its argument and calls it, passing the object being built, every time the factory creates an instance. lambda ❤️

After this fix our tests pass again!
DRY is still broken

Although we are completely random and cool in our tests at the moment, we still repeat a lot of code - we call <Model>Factory(...) once for every single object we need.
Some people still underestimate code repetition in their tests, but in my opinion, if you follow the DRY principle in your project, you have to follow it in the tests too!
Factories love DRY

Fortunately, factory_boy gives us an easy way out of this bad practice - create_batch:
from test_plus import TestCase

from my_library.seed.factories import (
    UserFactory,
    BookFactory,
    LibraryFactory,
    BookBorrowFactory,
)


class BookBorrowListViewTests(TestCase):
    def setUp(self):
        self.librarian = UserFactory(is_superuser=True)
        self.library = LibraryFactory(librarian=self.librarian)
        self.user = UserFactory()
        self.url = self.reverse('library:book_borrow_list',
                                user_id=self.user.id)

    def test_with_several_books_borrowed_by_one_user(self):
        books = BookFactory.create_batch(5, library=self.library)

        for book in books:
            BookBorrowFactory(user=self.user, book=book)

        response = self.get(self.url)

        self.assertEqual(200, response.status_code)
        for book in books:
            self.assertContains(response, book.title)

    def test_with_several_books_borrowed_by_one_user_and_another_user(self):
        another_user = UserFactory()

        user1_books = BookFactory.create_batch(5, library=self.library)
        user2_books = BookFactory.create_batch(5, library=self.library)

        for book in user1_books:
            BookBorrowFactory(user=self.user, book=book)

        for book in user2_books:
            BookBorrowFactory(user=another_user, book=book)

        response = self.get(self.url)

        self.assertEqual(200, response.status_code)
        for book in user1_books:
            self.assertContains(response, book.title)
        for book in user2_books:
            self.assertNotContains(response, book.title)
This is how we create a whole batch of objects without repeating our code!
The objects that our factories return are regular model instances from the ORM, so you can easily read, change and overwrite their values.
Conclusion

Faker and factories are beneficial to have in our tests. If we are smart enough we can use them in other places - while debugging, creating seeds, etc.
This article was just an introduction to factories. If you are interested in the topic or just want to check how we handle problems that occur during testing, write your questions in the comments!