@Morendil
Last active March 24, 2019 15:42

When we talk about "entities", we are talking about two different realities:

  • first, we talk in the abstract about what sort of "things" exist in our model - people, families, tax households - and what sort of relationships exist; for instance, people in a family are either parents or children
  • but also, when we start loading data, we talk about concrete people and families; how many people, how many families, how many parents and children are in the first and second family, and so on.

This is one of the strengths of OpenFisca, conceptually. The implementation… left a lot to be desired.

Previously we had this:

Figure 1

We had two "root" classes, Entity and GroupEntity.

If you wanted to add entities to your model, you had to make subclasses of Entity (for your "persons" entity) or GroupEntity (for everything else: families, and so on). You couldn't do that in the usual Python way, by writing class Household(GroupEntity). You had to use the special function build_entities from Core, which used some Python dark magic to create classes at runtime. These subclasses didn't add anything of their own, such as extra behaviour.
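To make "creating classes at runtime" concrete, here is a minimal sketch of the general technique using Python's built-in type(). It is not the actual build_entities code: the helper build_entity_class, its parameters and the key attribute are hypothetical, and Entity and GroupEntity are empty stand-ins for the Core classes.

```python
# Toy stand-ins for illustration; these are NOT the OpenFisca Core classes.
class Entity:
    """Stand-in root class for the "persons" entity."""


class GroupEntity:
    """Stand-in root class for group entities such as families."""


def build_entity_class(key, is_person=False):
    """Hypothetical helper: create an entity subclass at runtime."""
    base = Entity if is_person else GroupEntity
    # type(name, bases, namespace) builds a new class object dynamically,
    # instead of a `class Household(GroupEntity): ...` statement in source.
    return type(key.capitalize(), (base,), {"key": key})


Household = build_entity_class("household")
Person = build_entity_class("person", is_person=True)
```

Classes built this way never appear as class statements in the source, which is part of why they are awkward to subclass or mock in tests.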

This often got confusing, for instance when working on SimulationBuilder, because we used the same term, "entities", for all of this. So we had to be extra careful to distinguish between "entity classes" and "entity instances".

It also got awkward when unit testing, because we couldn't subclass Entity to do basic stuff like mocking.

The code for Entities was also hard to maintain because it held a reference to a Simulation, but a lot of the time it didn't care about that, and only used the Simulation's TaxBenefitSystem object. Except when it didn't and actually needed the Simulation.

What we really needed was a distinction between the "model level" and the "population level". TaxBenefitSystem is a "model level" class. It's abstract: what it describes is true whether you have one person or one million. Simulation is a "population level" class. It's concerned with a specific number of people, families and so on.

So this PR introduces Population and GroupPopulation. The hierarchy parallels that of Entity and GroupEntity.

Now we have this:

Figure 2

It's easier to talk about what we mean, because we have different words to talk about different levels. Simulations care about Populations. TaxBenefitSystems care about Entities.

Also, the Entities in a TaxBenefitSystem are now regular instances of the Entity class, just like the Populations of a Simulation are regular instances of the Population class. Likewise, groups at the model level are instances of GroupEntity, and groups at the population level are instances of GroupPopulation.
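The split between the two levels can be sketched with a few toy classes. This is an illustration only, not the Core implementation: the constructor arguments and the attributes key, roles, count, members_group_index and members_role are assumptions made for the example.

```python
# Toy classes for illustration only; not the OpenFisca Core implementation.
class Entity:
    """Model level: describes a kind of thing, independent of any data."""

    def __init__(self, key):
        self.key = key


class GroupEntity(Entity):
    """Model level: a kind of group, with the roles its members can hold."""

    def __init__(self, key, roles):
        super().__init__(key)
        self.roles = roles


class Population:
    """Population level: a concrete number of one kind of entity."""

    def __init__(self, entity, count):
        self.entity = entity
        self.count = count


class GroupPopulation(Population):
    """Population level: concrete groups, plus who belongs to which one."""

    def __init__(self, entity, members_group_index, members_role):
        # One group per distinct index found among the members.
        super().__init__(entity, count=len(set(members_group_index)))
        self.members_group_index = members_group_index
        self.members_role = members_role


# Model level: true whether there is one person or one million.
person = Entity("person")
family = GroupEntity("family", roles=["parent", "child"])

# Population level: five concrete people spread over two concrete families.
persons = Population(person, count=5)
families = GroupPopulation(
    family,
    members_group_index=[0, 0, 0, 1, 1],
    members_role=["parent", "parent", "child", "parent", "child"],
)
```

Because person and family are plain instances of plain classes, a test can build one directly or replace it with a stub, without any runtime class creation.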

Breaking changes

  • The syntax some_entity.SOME_ROLE to access roles is no longer supported. Use the standard SomeEntity.SOME_ROLE instead. (For instance, Household.PARENT.)
  • Code that relied excessively on internal implementation details of Entity may break, and should be updated to access methods of Entity/Population instead.
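The first breaking change can be pictured with the same kind of toy classes. Again, this is a sketch under assumptions and not the Core code: Role, the role_keys argument and the way roles are attached as upper-case attributes are all hypothetical. The point it shows is that roles belong to the model-level object (Household), not to the population-level object (household) that population-level code works with.

```python
# Toy classes for illustration only; not the OpenFisca Core implementation.
class Role:
    """Hypothetical model-level description of a role within a group."""

    def __init__(self, key):
        self.key = key


class GroupEntity:
    """Model level: exposes each role as an upper-case attribute."""

    def __init__(self, key, role_keys):
        self.key = key
        for role_key in role_keys:
            setattr(self, role_key.upper(), Role(role_key))


class GroupPopulation:
    """Population level: refers to its entity, but does not expose roles."""

    def __init__(self, entity):
        self.entity = entity


Household = GroupEntity("household", ["parent", "child"])
household = GroupPopulation(Household)

# Supported style: go through the model-level entity.
parent_role = Household.PARENT

# Unsupported style: in this sketch the population has no PARENT attribute.
population_has_role = hasattr(household, "PARENT")
```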