Skip to content

Instantly share code, notes, and snippets.

@stephanbuys
Created August 1, 2017 09:10
Show Gist options
  • Star 1 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save stephanbuys/2fa2860c4526a9cc37dbf5af898ecc56 to your computer and use it in GitHub Desktop.
Save stephanbuys/2fa2860c4526a9cc37dbf5af898ecc56 to your computer and use it in GitHub Desktop.

Introduction

Diesel (http://diesel.rs) is an ORM (Object-relational mapper) and Query Builder written in Rust. It supports Postgresql, Mysql and SQLite. It makes use of Rust's custom derive functionality to generate all the code you need in order to get all the power of Rust's type system when interacting with a database. This means that you get compile-time validation of your code, thus eliminating possible runtime errors.

Diesel is very powerful, but by just following the examples might still leave you scratching your head, asking questions like "where do these types come from?", "what should I add here", "how does this work?". This article is meant to shed a little bit of light on the topic from the perspective of a intermediate-level Rust developer.

There are many topics that this article will not cover, but the hope is that you will be able to "grok" enough of Diesel so that the rest becomes obvious and that you understand how it works, to the extent that new territory that you explore in this crate also becomes simpler to understand.

First Steps

You should first run through the Getting Started guide on the official Diesel website. This article focusses on Postgresql, and by far the easiest way to get a Postgresql server running for your tests is by using Docker (www.docker.com)... the following one-liner should do the trick:

https://gist.github.com/3b2fd93a8f151a354406f4694367b6dd

This is not for production use, the whole container will be removed when the container is stopped (use docker stop postgres), but it is super handy for the getting started guide.

A word about the format of the rest of this article

This articles assumes that you have worked through the Getting Started guide; it deliberately glosses over a whole set of features.This is intentional and done in order to keep the article as concise as possible.

Core Components And Community

At its core Diesel consists of 4 main components:

The diesel crate, diesel_codegen crate (the code generator) and the diesel cli, which by now you should have been introduced to.

The last major component, and 'secret weapon' or 'unofficial guide' (according to me) is the the test suite in the official repo (https://github.com/diesel-rs/diesel/tree/master/diesel_tests). It served as my guide whenever I got stuck, although it will surely be superseded by documentation as the project continues to evolve.

Environment

When I started off with Diesel I was baffled by the infer_schema! macro in schema.rs (more on this below), which in turn used "dotenv:DATABASE_URL". The dotenv crate is just a convenience library that allows you to put your DATABASE_URL in a hidden environment file, ala .env (dot env, get it?).

It follows that you can also just specify DATABASE_URL in your environment, which is particularly handy when you run your Diesel-using Rust binary in a docker container ala 12-Factor apps.

The Diesel cli tool also reads the .env file, and if it's not available you will need to define DATABASE_URL in the environment or pass it the --database-url parameter.

Definitely check out the examples/sqlite/getting_started_step_3/README.md file in order to learn how to configure the DATABASE_URL for SQLite (it doesn't use a URI format).

Basic Flow

The core workflow for creating an app built with Diesel can be broken down as follows:

Design A Schema

This is very obvious to seasoned SQL veterans, luckily for the rest of us the diesel cli's migration subcommand allows us to easily iterate on our design and even evolve it over time. It is, however, very useful to have a clear idea of what you want your database to look like up front, and it should be noted that Diesel only works with tables that have a primary key.

Create Migrations

Follow the pattern in the Getting Started guide, and note that you can add more migrations at any time using the diesel migration generate subcommand. Migrations will be run in order whenever you run diesel migration run. You can rerun the last migration by issuing diesel migration redo and if you truly get stuck, you can run the following command, but please never do so on a production database, diesel database reset.

When designing your tables you should put plurals of your table names. As an example, Diesel will take the model User and search for the table users. You can define custom table names, but knowing the Diesel developers' assumptions will probably spare you some confusion.

Diesel will take PascalCase Rust structs (which will probably describe single objects, or rows in your database table) and translate them into snake_case table names with an s tucked at the end to "pluralise" it. For example, AFancyNamedObject will be assumed to map to a table named a_fancy_named_objects.

Infer The Schema From The Database

Diesel has the ability to inspect your actual database and infer a schema for use in Rust, this will be used to create the necessary DSL (domain specific language) that allows you to interact with your database in a safe, lightning fast, and strongly typed fashion.

The Getting Started guide and examples use the infer_schema! macro. The major disadvantage of using this macro is, firstly, that you have no idea how Diesel actually interprets your database's types (this matters for your models, which will be revealed later) and secondly it requires a bootstrapped database instance during compile time, which can be a bit of a pain when compiling as part of a pipeline that might not have your database and compiler toolchain available to each other.

It is recommended that you use of the diesel print-schema subcommand and simply copy and paste the inferred schema into a file in your project (in the Getting Started guide this is schema.rs but it could just as easily be pasted into lib.rs or main.rs).

What you will see is a table! macro similar to:

https://gist.github.com/e6d5195b3bbe25318c6f58c224e4e89c

You want to grab the datatypes (Integer and VarChar in this case) and immediately run over to docs.rs/diesel and plug it into the doc search bar. This will take you straight to diesel::types::Foo (also explore diesel::types) and allows you to inspect the implemented ToSql and FromSql traits for each type. To take one example, Integer maps to i32 in Rust. This is super useful when implementing your models, or wrestling with compile-time errors.

Create Models

As with the schema, you don't have to put your models in the models.rs file. It is recommended that you split your models out using the Rust modules system, and something like the following might assist, especially when dealing with lots of models.

https://gist.github.com/1e494dd16aeb60f91cc997eed73d43d0

I initially struggled to split the "magic" used to drive Diesel from idiomatic Rust. It turns out Diesel is just good old familiar Rust once you know how it is structured and now how to deal with the generated code.

An example model expressed as a Rust struct is as follows, plucked straight from the Getting Started guide:

https://gist.github.com/8d85109ff7b6edd5ba21fa401ae9a0d7

Models are normal Rust structs that seem to map to your tables. This is usually the case for simple models, however, it is very important to note that Queryable structs, as the name implies, actually map to the results you want to obtain from a SQL query.

A Queryable struct is a single object or row (partial or complete) that you want to retrieve from a SQL query. This User object will be queryable from the users table, or any query that returns the correct types, as discussed in the Queryable section below.

Note that diesel's SQL types don't support unsigned integers in Postgresql (this is a Postgresql limitation), so it would be worthwhile looking at the appropriate diesel::*::types in the documentation to see what your database supports.

Derives, aka "The Codegen Magic"

Diesel's code generator is primarily used to embue your Rust structs with SQL magic without you having to handcode a ton of functionality. It also turns the Rust type system into the magic ingredient that can be used to construct the fast and reliable SQL that gets transmitted to the database.

In order to imbue your structs with the SQL goodness, you use Rust's custom derive functionality, for example:

https://gist.github.com/2f3899a32d7ec3dc172b604fef532f00

A more complex example:

I loved stumbling across the following in the tests... it truly made me smile at the genius of the authors, but it also made me scratch my head, "where on earth are these things used and how do they work?"

https://gist.github.com/ca823ec4b0df619ac6d884f76d6e5505

First, a 'pro tip' from a novice, you should install the cargo subcommand expand:

https://gist.github.com/0304305b9d2264d9b47dec390bdbbb93

This allows you to run, the following command and actually see what is generated (warning, this guide assumes that you know Rust, but the following might make even a seasoned Rust developer's eyes water, so feel free to skip it or use as bedtime reading).

https://gist.github.com/7eda3bf3d69f0910400472e9e4042db5

The full glory of the generated DSL is revealed... of particular interest is the columns for each model. Parsing the output is left as an exercise for the bored reader, or the readers who like a challenge.

Let's look at the derives in more detail.

PartialEq, Eq, Debug, Clone

These are your standard Rust derives that most developers rely on when appropriate. You probably want at least Debug if you like println! (or the shiny new eprintln!) for debugging.

#![deny(missing_debug_implementations)] is a very useful compiler directive as it will error on any structs that do not have Debug implementations, especially useful if you are developing a library.

Queryable

The first thing that most developers usually want to do with their databases is query the data within it. In order to do this, you need to decorate your struct as follows:

https://gist.github.com/dab57eb06e241058cabd4058caca2223

This will cause diesel_codegen to generate the query DSL needed to deserialize a query result of type (i32, String, String, i32) which maps to (Integer, Text, Text, Integer) in Postgresql, more examples of this towards the end of the article.

In SQLite a date/time type can be deserialized to String too.

A primary key column is mandatory to work with Diesel, but is probably good practice anyway.

How to use the DSL to construct queries is addressed later in the article.

Insertable

Unless you are coming from an existing system, you probably want to insert data into your database too. A typical "insertable" object would look like this:

https://gist.github.com/b1c11170529bf6e76e03115afdecf60f

Note that we have dropped the id field as the SQL server will handle this for us (this might change for advanced use-cases).

Also note that we now explicitly name the table, users, this is also needed for AsChangeset, and inferred for Identifiable

Identifiable

Inevitably, you are going to use SQL joins to construct results from more than one table at a time. In order for the join to successfully resolve the exact object in your target table, this struct needs to be annotated as follows:

https://gist.github.com/681f042be4f5f25e7b0a92b2e9c61a67

By default Diesel will assume your primary key is called id, and if not, you can override it as follows:

https://gist.github.com/e2cd06d5c2cf7b4dde583740293e7ce1

Associations

The tables that you want to enrich with your Identifiable data needs to be annotated as follows:

https://gist.github.com/53f4b65c1c9f100dd3fb65b88279532e

This will allow you to join the User data to this table by looking up the user, defined by the field user_id, corresponding to the id field in the User table. In the event where your foreign key field is not specified in the pattern type_id, you will need to manually map it:

https://gist.github.com/fec73ac24fe57ff9bb679318a1d3af15

AsChangeSet

#[derive(AsChangeset)] is used to updates, more details can be found in this work-in-progress guide: https://github.com/diesel-rs/diesel/blob/master/guide_drafts/all-about-updates.md

Using Diesel

At this point, you are ready to actually do something with Diesel, probably insert some data and make some queries. This is covered by multiple examples in the Getting Started guide. There are, however, some items that need to be unpacked to, hopefully, make the lightbulbs go on.

First and foremost, at this point, it is super important to underscore that you are now entering the normal Rust world. What I mean by this is that the codegen and magic bits of Diesel now take a backseat.

This implies that when you look at example code, there is no more slight of hand. You need only trust the complier and take time to unpack the statements and expressions like you would normally do. You'll find that you are calling normal methods, passing normal data-structures (or references to them) and that you can println!, debug, step, re-order and organise like you would with any other Rust code. It took me an unfortunately long time to comprehend this, but it is an important insight and will allow you to code without fear.

Connecting to the database

The postgresql example uses the function pub fn establish_connection() -> PgConnection to wrap the connection into a convenient function call for re-use in the rest of the example code.

PgConnection encapsulates the handle to posgresql for you, keep this object around or recreate it on demand. When needed, look at the r2d2 crate to create a pool of database connections.

Everything you execute using Diesel will depend on the connection object being available.

Inserting data

The guide specifies the following function in src/lib.rs. Let's step through it:

https://gist.github.com/4b2c874b3731e7f93b2b4d021f8ddcff

First note the reference to the connection &PgConnection. As is done for some of the other files in src/bin/, a new PgConnection could also be instantiated by calling something like:

https://gist.github.com/159e040b6f474331a12b87ef5f1f2eb8

The next line, use schema::posts;, stumped me for a long time, as this is using generated code. In the diesel::insert expression, we see the use of posts::table.

At this point, it might be a good idea to take the output of cargo expand and have a look at it in an IDE of sorts. Try the following:

https://gist.github.com/0f9c3c824574cdd402f29ed34d7c388c

If you search for mod schema under the output you will also see mod posts, and there you'll find a table struct (empty, but with a bunch of Traits impl'ed).

Next we have a new object (which is Insertable) called new_post.

Lastly we have the actual Diesel method insert. If you consult the documentation you will see that the Diesel module has 5 functions, insert, delete, insert_default_values, select and update at its core.

If you look at the insert function you'll see it accepts records: T (in this case our new user, but it can also be a Vec<T>) and returns an IncompleteInsertStatement object, which has one method called into() which accepts your table struct. You could also have called into(::schema::posts::table) and avoided the use statement.

In this case schema is the name of your module, due to the schema being generated in the schema.rs file.

Querying data

Again, we'll refer to a function in the guide, in this case the src/bin/show_posts.rs file:

https://gist.github.com/b0bf31af6d2fd091793100a17de8127b

Stepping through it from the top, first we import our own crate (the example Cargo.toml specifies the crate name as diesel_demo). The we import the Diesel crate.

use self::diesel_demo::*; gives us access to the establish_connection() function, use self::diesel_demo::models::*; gives us access to the actual structs that we defined in our models.rs files, in this case the Post struct.

use self::diesel::prelude::*; is a necessary import that bring a whole bunch of Diesel Traits and types into scope. It is needed for Diesel to work and beyond the scope of this article to dive into.

In the main() function is where we encounter the use of some magic again, specifically the use diesel_demo::schema::posts::dsl::*; line. When I first started using Diesel I was stumped between the difference of schema::tablename::* imports as used above when inserting code and schema::posts::dsl::* dsl imports used here.

A look at the output of cargo expand allows for some clarification:

https://gist.github.com/bb839c14bd7eba19d0dbeebcdd51cf3d

In short, schema::posts::dsl::* brings the columns of your table into scope. Each column type that is generated for you has a collection of expression_methods implemented on it. To rephrase this, the dsl (domain specific language) allows us to use the columns in our table's names (as defined by the schema.rs generated code) and apply logic to it in order to construct our SQL queries. (see http://docs.diesel.rs/diesel/expression_methods/global_expression_methods/trait.ExpressionMethods.html#method.desc)

Extra credit if you spotted the convenience import of table into the dsl module (I think this is for convenience).

Building queries

In SQL we construct a select expression to return values from our database, and in Diesel we use Rust's type system to construct type-checked, safe versions of those. This is great for anyone who has ever struggled with the fragility of SQL queries, and its implied security risks... only valid SQL should compile successfully.

The next expression posts.filter(published.eq(true)) reflects that we want to run the filter method on the posts table (conveniently also imported into our context by the use schema::posts::dsl::* statement). filter takes a constructed filter as its input. A filter is constructed by combining columns and their expression methods.

To inspect the results of this you can rewrite the relevant code as:

https://gist.github.com/10bdce8038b1c9c27aa30f6a51d15b23

If you look at the output to the terminal you will see a SelectStatement object is returned. You can also use println!("SQL: {}", debug_sql!(posts_with_sql)); to look at the SQL that would be generated (make use you import diesel using #[macro_use] extern crate diesel;).

The main takeaway from this section is that you first build up the relevant SQL query by using your table and columns imported from the dsl module, and that this is introspect-able by println! and debug_sql.

You should familiarize yourself as much as possible with the different expression_methods you can call on columns - it will get you well on your way to building the SQL queries you want to use in your applications.

Getting results

The last item we'll handle in this article is actually reading your results. In the show_posts.rs binary this is achieved by the lines:

https://gist.github.com/b26e4f2bfca659052fac387131a12a80

As can be seen we are calling the .load() method of the SelectStatement struct that we inspected a bit closer above. The .load() method is generic, so we need to give the compiler some hints as to what type we want to return. The parameter for the load function is a reference to (or borrow of) the connection object returned by establish_connection.

load returns a result object, which in the case of this executable we deal with by just unwrapping the result with the expect() method. We then pass the result, if we were successful, to the results binding.

See diesel::prelude::LoadDsl or diesel::prelude::FirstDsl in the docs for some alternatives to load. As will be seen the methods return a QueryResult which is just a Result<T, Error>; type alias.

In all the methods just mentioned, either return a single object (QueryResult<T>) or a vector of objects (QueryResult<Vec<T>>). In other words, one row or multiple rows of the selected columns in the database table.

Wrestling with results

If we wanted to get all the results back, we could have used the code:

https://gist.github.com/3b949a040511b4cee74623a2fa9f53e9

load and get_results are equivalent. That is, they both return a Vec<T> result (QueryResult). Note that the code was reformatted to illustrate the return-type hint given to the compiler in the binding, let results: Vec<Post> = ....

Lastly, sometimes you will use a select method or join method (not illustrated in this article, but do check out the 'unofficial guide' aka diesel_tests mentioned above) which will return rows that do not map directly to the fields in your models. Use a tuple to collect your results from your load method in that case. This may look like:

https://gist.github.com/f322b02a8d0165cd40d6a9f962c59b1a

Above, we expressed the Post model or struct as a tuple of its constituent parts.

What we didn't cover and what's next

There is a lot that wasn't covered in this article, even though it is quite a large volume of information to consume in its own right. As mentioned initially, the goal was to explain enough of Diesel and its structure so that it would become possible for interested developers to "grok" or understand it, and then help themselves.

I hope you made it this far and that you enjoyed the journey, please send feedback and enjoy Rust and Diesel.

Special thanks to Sean Griffin, the author of Diesel, for the fact-checking he did on this article.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment