@josevalim / Secret
Last active Mar 10, 2017

Introducing Ecto changesets

Note: This proposal has been implemented but some details in it may be outdated. Check the official Ecto.Changeset documentation for more information.

Hello everyone,

This is a proposal to introduce the idea of changesets into Ecto. They will allow us to:

  1. Cast, filter and validate data before it is applied to the model

  2. Keep track of changes applied to the model as a way to do dirty tracking and more performant updates (so we don't always resend all the data on updates)

  3. Keep track of associations and the changes made to them, so that all operations can be performed at once inside a transaction


When you receive external data and you want to apply it to a model, there are many steps the data needs to go through:

  1. Receive the parameters

  2. Reject any bad or malicious parameters (for example, we don't want someone to be able to set role="admin" externally)

  3. Cast the parameters

  4. Validate the parameters themselves (were all required fields given? do they have the proper length? etc)

  5. Validate business rules (for example, you may not be able to add someone as a manager to a project if the manager is already allocated in three other projects)

  6. Persist the data

Different web stacks perform those operations in a slightly different order. For example, early Rails versions injected the data into the model right after step 1, and filtering, casting and validating were all model responsibilities (often hidden from the developer).

On the other hand, Django Forms insert the data into a model only after it has been cast and validated (after step 4).

Changesets are inspired by Django Forms, but without the form-specific concerns, as those don't apply to Ecto. Changesets will copy the data into the model only after it has been filtered, cast and validated.
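
To make that ordering concrete, here is a minimal sketch using plain maps instead of real Ecto structs. The `ApplySketch` module and its `apply_changes/1` function are hypothetical names used only for illustration; they are not part of this proposal or of Ecto's API.

```elixir
defmodule ApplySketch do
  # Hypothetical helper, not Ecto's API: the model is only touched once
  # the changeset has been marked valid.
  def apply_changes(%{valid?: true, model: model, changes: changes}) do
    {:ok, Map.merge(model, changes)}
  end

  def apply_changes(%{valid?: false, errors: errors}) do
    {:error, errors}
  end
end

model = %{name: "José", age: 30}

valid = %{valid?: true, model: model, changes: %{age: 31}, errors: []}
invalid = %{valid?: false, model: model, changes: %{}, errors: [age: "is invalid"]}

# The valid changeset yields an updated copy of the model.
{:ok, updated} = ApplySketch.apply_changes(valid)

# The invalid changeset never reaches the model; only the errors come back.
{:error, errors} = ApplySketch.apply_changes(invalid)
```

In this sketch the invalid data has no way to end up in the model, which is the property the Django Forms comparison above is after.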


Before we go further into specifics, here is an example of how we can use changesets in a Phoenix controller:

plug :scrub_params, "user" when action in [:create, :update]
plug :action

def new(conn, %{"id" => id}) do
  changeset = User.changeset %User{}
  render conn, "new.html", changeset: changeset
end

def create(conn, %{"id" => id, "user" => params}) do
  changeset = User.changeset %User{}, params

  if changeset.valid? do
    # Persist the changeset and go back to the index
    user = Repo.insert(changeset)
    redirect conn, to: user_path(conn, :index)
  else
    # Re-render the form with the invalid changeset
    render conn, "new.html", changeset: changeset
  end
end

The User model will have:

defmodule User do
  use Ecto.Model

  schema "users" do
    field :name
    field :email
    field :age, :integer
  end

  def changeset(user, params \\ nil) do
    user
    |> cast(params, ~w(name email), ~w(age))
    |> validate_format(:email, ~r/@/)
    |> validate_number(:age, more_than: 18)
    |> validate_unique(:email)
  end
end

The changeset/2 function receives the user model and its parameters and proceeds to filter, cast and validate the given data.

Notice cast/4 receives the model, the parameters and the lists of required and optional fields, and returns a changeset. Casting is done based on the type given to each field in the schema. Custom types will allow users to provide custom casting rules.

Any parameter that was not explicitly listed in the required or optional fields list will be ignored. Furthermore, if a parameter is given as required but it does not exist in the model nor in the parameter list, it will be marked with an error and the changeset is deemed invalid.
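
The filtering and required-field rules above can be sketched with plain maps. This is only an illustration under stated assumptions: the `CastSketch` module, its explicit `types` argument (standing in for the schema) and the error messages are all hypothetical, and the model-first argument order follows the `User.changeset/2` example rather than any confirmed Ecto signature.

```elixir
defmodule CastSketch do
  # Hypothetical stand-in for the proposed cast function; not Ecto's code.
  # `types` plays the role of the schema, e.g. %{age: :integer}.
  def cast(model, params, types, required, optional) do
    allowed = required ++ optional

    # Only explicitly listed keys are looked up; anything else is ignored.
    {changes, errors} =
      Enum.reduce(allowed, {%{}, []}, fn key, {changes, errors} ->
        case Map.fetch(params, Atom.to_string(key)) do
          {:ok, value} ->
            case cast_value(Map.fetch!(types, key), value) do
              {:ok, casted} -> {Map.put(changes, key, casted), errors}
              :error -> {changes, [{key, "is invalid"} | errors]}
            end

          :error ->
            {changes, errors}
        end
      end)

    # A required field must be present in the changes or in the model.
    missing =
      for key <- required,
          is_nil(Map.get(changes, key)) and is_nil(Map.get(model, key)),
          not Keyword.has_key?(errors, key),
          do: {key, "can't be blank"}

    errors = errors ++ missing

    %{model: model, params: params, changes: changes,
      errors: errors, valid?: errors == []}
  end

  defp cast_value(:string, value) when is_binary(value), do: {:ok, value}
  defp cast_value(:integer, value) when is_integer(value), do: {:ok, value}

  defp cast_value(:integer, value) when is_binary(value) do
    case Integer.parse(value) do
      {int, ""} -> {:ok, int}
      _ -> :error
    end
  end

  defp cast_value(_type, _value), do: :error
end

types = %{name: :string, email: :string, age: :integer}
model = %{name: nil, email: nil, age: nil}
params = %{"name" => "José", "age" => "30", "role" => "admin"}

changeset = CastSketch.cast(model, params, types, [:name, :email], [:age])
```

Here `"role"` is dropped because it is not listed, `"age"` is cast from a string to the integer `30`, and the missing required `email` leaves the changeset invalid with a single error on `:email`.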

After casting, the changeset will be passed to validate_*/2 functions that will validate only the changed fields. In other words: if a field did not change, we won't validate it at all. For example, if the request above is changing only the e-mail, only the e-mail validation will be triggered, and the age one won't run.
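
A minimal sketch of that "validate only what changed" rule, again on plain maps. The `ValidateSketch` module and the two checker functions are hypothetical; the real validation helpers may differ in arity and in the error format.

```elixir
defmodule ValidateSketch do
  # Hypothetical helper, not Ecto's implementation: run `fun` only when
  # `field` has a non-nil change. `fun` returns :ok or {:error, message}.
  def validate_field(changeset, field, fun) do
    case Map.get(changeset.changes, field) do
      nil ->
        # No change for this field, so the validation is skipped.
        changeset

      value ->
        case fun.(value) do
          :ok ->
            changeset

          {:error, message} ->
            %{changeset | errors: [{field, message} | changeset.errors], valid?: false}
        end
    end
  end
end

# Only the e-mail is being changed in this request.
changeset = %{changes: %{email: "jose.example.com"}, errors: [], valid?: true}

check_email = fn email ->
  if email =~ ~r/@/, do: :ok, else: {:error, "has invalid format"}
end

check_age = fn age ->
  if age > 18, do: :ok, else: {:error, "must be more than 18"}
end

changeset =
  changeset
  |> ValidateSketch.validate_field(:email, check_email)
  |> ValidateSketch.validate_field(:age, check_age)
```

Since there is no `:age` change, `check_age` is never invoked; the resulting changeset carries only the e-mail format error and has `valid?` set to `false`.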

Finally, params is given a default of nil in the User.changeset/2 function. In case there are no parameters, the changeset is simply returned without running any validations (as there aren't any changes).

Notice there is no DSL involved in casting, filtering and validating a changeset. One of the benefits of this approach is that it is easier to provide different changeset contexts. For example, one could write:

def changeset(user, :create, params) do
  # Changeset on create
end

def changeset(user, :update, params) do
  # Changeset on update
end

The changeset

In order to show what a changeset actually is, here is a sketch of its module, with documentation:

defmodule Ecto.Changeset do
  @doc """
  The fields are:

  * `valid?`      - Stores if the changeset is valid
  * `model`       - The changeset root model
  * `params`      - The parameters as given on changeset creation
  * `changes`     - The `changes` from parameters that were approved in casting
  * `validations` - All validations performed in the changeset
  * `errors`      - All errors from validations
  """
  defstruct valid?: false, model: nil, params: nil, changes: %{},
            validations: [], errors: []

  @doc """
  Converts the given `params` into a changeset for `model`
  keeping only the set of `required` and `optional` keys.

  This function receives the `params` and casts them according
  to the schema information from `model`. All fields that are
  not listed in `required` or `optional` are ignored.

  If casting of all fields is successful and all required fields
  are present either in the model or in the given params, the
  changeset is returned as valid.

  If params are nil, no casting nor validation is performed, and
  the map of changes is kept empty. The changeset is still kept
  as invalid though.
  """
  def cast(model, params, required, optional)

  @doc """
  Validates the given `field`.

  It invokes the given `function` to perform the validation
  only if a change for the given `field` exists and the change
  value is not nil. The function must return `:ok` or
  `{:error, message}`.

  In case of an error, it will be stored in the `errors` field
  of the changeset and the `valid?` flag will be set to false.
  Furthermore, the `validation_metadata` will be stored in
  the `validations` list.
  """
  def validate_field(changeset, field, validation_metadata, function)

  # Many validation functions will be built on top of `validate_field/4`:
  # * validate_format
  # * validate_unique
  # * validate_number
  # * etc

  @doc """
  Adds an error to the changeset.

  ## Examples

      changeset = add_error(changeset, :name, "is invalid")

  """
  def add_error(changeset, key, error)

  ## Change manipulation functions

  @doc """
  Updates a change.

  The `function` is invoked with the change value only if there is
  a change for the given `key`. Notice though the value of the change
  can still be nil.
  """
  def update_change(changeset, key, function)

  @doc """
  Puts a change with the given key and value.
  """
  def put_change(changeset, key, value)

  @doc """
  Deletes a change with the given key.
  """
  def delete_change(changeset, key)
end
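
As a rough illustration of how the change manipulation functions could behave, here is a sketch on plain maps. `ChangeSketch` is a hypothetical module, not the proposed `Ecto.Changeset`; it assumes `put_change` takes the new value as its third argument and `delete_change` takes only the changeset and the key.

```elixir
defmodule ChangeSketch do
  # Hypothetical plain-map versions of the change manipulation functions
  # described in the sketch above; not Ecto's implementation.

  # Invokes `fun` only if a change for `key` exists (its value may be nil).
  def update_change(changeset, key, fun) do
    case Map.fetch(changeset.changes, key) do
      {:ok, value} -> put_change(changeset, key, fun.(value))
      :error -> changeset
    end
  end

  # Stores `value` under `key`, overwriting any existing change.
  def put_change(changeset, key, value) do
    %{changeset | changes: Map.put(changeset.changes, key, value)}
  end

  # Drops the change for `key`, if any.
  def delete_change(changeset, key) do
    %{changeset | changes: Map.delete(changeset.changes, key)}
  end
end

changeset = %{changes: %{email: "JOSE@EXAMPLE.COM", age: 30}}

changeset =
  changeset
  |> ChangeSketch.update_change(:email, &String.downcase/1)
  # No :name change exists, so this call leaves the changeset untouched.
  |> ChangeSketch.update_change(:name, &String.trim/1)
  |> ChangeSketch.put_change(:role, "member")
  |> ChangeSketch.delete_change(:age)
```

After the pipeline, `changeset.changes` holds the downcased e-mail and the new `:role`, and no longer contains `:age`.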

Repo changes

Repo.insert/2 and Repo.update/2 will be modified to accept changesets. On insert, all fields will be sent to the database, including the ones in the model and the changes made. On update, only the changes will be sent. Modifications done directly to the model won't be persisted. For this reason, callbacks should also receive the changeset rather than models.
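
The difference between the two operations can be sketched as follows. `PersistSketch` and its two functions are hypothetical helpers that only compute which fields would be sent; they do not talk to a database and are not part of Ecto.

```elixir
defmodule PersistSketch do
  # Hypothetical helpers (not part of Ecto) showing which fields would be
  # sent to the database for each operation.

  # On insert: every model field, with the changes applied on top.
  def fields_for_insert(%{model: model, changes: changes}) do
    Map.merge(model, changes)
  end

  # On update: only the changes tracked by the changeset.
  def fields_for_update(%{changes: changes}) do
    changes
  end
end

changeset = %{
  model: %{name: "José", email: "old@example.com", age: 30},
  changes: %{email: "new@example.com"}
}

# All three fields, with the e-mail replaced by the changed value.
insert_fields = PersistSketch.fields_for_insert(changeset)

# Only the :email key.
update_fields = PersistSketch.fields_for_update(changeset)
```

A field edited directly on the model, without going through the changeset, would not appear in `update_fields`, which is why such modifications would not be persisted on update.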


How associations will tie into changesets still needs to be explored but the main idea is that, since changesets provide a tree of changes, we can traverse associations applying those changes inside a transaction. We will need some configuration around this mechanism (for example, what happens when an association is removed from the tree), but this can be done in the schema configuration. Furthermore, we should also extend the validation mechanism to validate association fields, transforming association changes into their own changesets.


Please send any feedback or questions to the mailing list as everyone will receive updates. :)
