@matthewmueller
Last active April 3, 2020 15:03
Sneak peek of Prisma's Datamodel 2.0. We'd love to hear your thoughts and feedback. Chat with us in the #datamodel-2 channel in https://prisma.slack.com

Datamodel 2.0

Today we want to give you a sneak peek at some upcoming changes to Prisma's Datamodel. We've spent the last couple of months iterating on a design that we're excited to share here today.

The goals of this design are the following:

  • Readable: Anybody in your organization should be able to look at your datamodel for the first time and roughly understand your data's structure and relationships.
  • Complete: Your datamodel should be the source of truth for your data sources. You should be able to copy your datamodel from one computer to another, run prisma deploy, and get an exact replica of your data sources.
  • Flexible: Your datamodel should support all kinds of workflows and environments. Your datamodel should feel small to an indie hacker, but be powerful enough to scale up to the needs of large organizations.
  • Multilingual: As we move further into the world of polyglot persistence, your datamodel should be able to span and thread many different data sources without losing the unique features that make each data source great.

Today I want to focus on readability. In the past, we've relied on GraphQL's SDL syntax for our Datamodel.

GraphQL SDL

type Report @db(name: "reports") {
  createdAt: DateTime! @db(name: "created_at")
  id: Int! @id(strategy: SEQUENCE) @sequence(name: "reports_id_seq", initialValue: 1, allocationSize: 1)
  standup: Standup! @db(name: "standup_id")
  status: ReportStatus!
  updatedAt: DateTime! @db(name: "updated_at")
  user: User! @db(name: "user_id")
}

enum ReportStatus {
  ASKED
  COMPLETE
  PENDING
  SKIP
}

type Standup @db(name: "standups") {
  channelId: String! @db(name: "channel_id")
  createdAt: DateTime! @db(name: "created_at")
  id: Int! @id(strategy: SEQUENCE) @sequence(name: "standups_id_seq", initialValue: 1, allocationSize: 1)
  isThreaded: Boolean! @db(name: "is_threaded") @default(value: false)
  posts: [Post]
  questions: [Question]
  reports: [Report]
  standupsUsers: [StandupsUser]
  team: Team! @db(name: "team_id")
  timezone: String!
  updatedAt: DateTime! @db(name: "updated_at")
}

type StandupsUser @db(name: "standups_users") {
  createdAt: DateTime! @db(name: "created_at")
  isStandupOwner: Boolean! @db(name: "is_standup_owner") @default(value: false)
  standup: Standup! @db(name: "standup_id")
  status: StandupUserStatus!
  updatedAt: DateTime! @db(name: "updated_at")
  user: User! @db(name: "user_id")
}

enum StandupUserStatus {
  ACTIVE
  INACTIVE
  INVITED
}

type Team @db(name: "teams") {
  botAccessToken: String! @db(name: "bot_access_token") @unique
  botSlackId: String! @db(name: "bot_slack_id")
  costPerUser: Int! @db(name: "cost_per_user") @default(value: 100)
  createdAt: DateTime! @db(name: "created_at")
  id: Int! @id(strategy: SEQUENCE) @sequence(name: "teams_id_seq", initialValue: 1, allocationSize: 1)
  minimumMonthlyCost: Int! @db(name: "minimum_monthly_cost") @default(value: 0)
  standups: [Standup]
  stripeId: String @db(name: "stripe_id")
  teamAccessToken: String! @db(name: "team_access_token") @unique
  teamName: String! @db(name: "team_name")
  teamSlackId: String! @db(name: "team_slack_id") @unique
  trialEnds: DateTime! @db(name: "trial_ends")
  updatedAt: DateTime! @db(name: "updated_at")
  users: [User]
}

type User @db(name: "users") {
  avatarUrl: String @db(name: "avatar_url")
  createdAt: DateTime! @db(name: "created_at")
  email: String
  firstName: String @db(name: "first_name")
  id: Int! @id(strategy: SEQUENCE) @sequence(name: "users_id_seq", initialValue: 1, allocationSize: 1)
  isTeamOwner: Boolean! @db(name: "is_team_owner") @default(value: false)
  lastName: String @db(name: "last_name")
  reports: [Report]
  reviews: [Review]
  slackId: String! @db(name: "slack_id")
  standupsUsers: [StandupsUser]
  team: Team! @db(name: "team_id")
  timezone: String!
  updatedAt: DateTime! @db(name: "updated_at")
  username: String!
}

GraphQL SDL Growing Pains

This foundation has served us well, but over time a couple of issues became clearer in our space:

  • Default optional makes sense in GraphQL, but is a bad default for databases. Database fields should be required by default.
  • We need to be able to support many model-level attributes. Database indexes, constraints, even primary keys can span multiple fields. GraphQL doesn't have this problem, but for us it meant too much clutter up front.
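To make the first point concrete, here is a minimal TypeScript sketch of how a scalar SDL type reference would read once fields are required by default. The helper name toRequiredByDefault is hypothetical and not part of Prisma; it only handles scalar references such as String! and String, not lists.

```typescript
// Hypothetical helper (not a Prisma API): rewrite a scalar GraphQL SDL type
// reference in the required-by-default style used by the examples in this gist.
function toRequiredByDefault(sdlType: string): string {
  const trimmed = sdlType.trim();
  // "String!" (explicitly non-null in SDL) becomes plain "String".
  if (trimmed.endsWith("!")) {
    return trimmed.slice(0, -1);
  }
  // "String" (nullable by default in SDL) becomes "String?" (explicitly optional).
  return `${trimmed}?`;
}

// Example: the two spellings seen on the User type in this gist.
console.log(toRequiredByDefault("String!"));
console.log(toRequiredByDefault("String"));
```

With this rule, the common case (a required column) carries no extra punctuation, and only the optional fields stand out.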

Additionally, we wanted to take some insights from auto-formatters like gofmt and prettier to make it easier to read large datamodels, temper distracting syntax debates, and reduce visual noise in pull requests.

Without further ado, here's what we came up with. I hope you'll like it as much as we do.

Datamodel 2.0 Example

model Report {
  id         Int           @id @pg.serial("reports_id_seq")
  createdAt  DateTime
  standup    Standup
  status     ReportStatus
  updatedAt  DateTime
  user       User
}

enum ReportStatus {
  Asked     = "ASKED"
  Complete  = "COMPLETE"
  Pending   = "PENDING"
  Skip      = "SKIP"
}

model Standup {
  id             Int             @id @pg.serial("standups_id_seq")
  channelId      String
  createdAt      DateTime
  isThreaded     Boolean         @default(false)
  name           Citext
  posts          Post[]
  questions      Question[]
  reports        Report[]
  standupsUsers  StandupsUser[]
  team           Team
  time           Time
  timezone       String
  updatedAt      DateTime
}

enum StandupUserStatus {
  Active    = "ACTIVE"
  Inactive  = "INACTIVE"
  Invited   = "INVITED"
}

model StandupsUser {
  @@id([ standup, user ])
  
  createdAt       DateTime
  isStandupOwner  Boolean            @default(false)
  standup         Standup
  status          StandupUserStatus
  time            Time
  updatedAt       DateTime
  user            User
}

model Team {
  id                  Int         @id @pg.serial("teams_id_seq")
  botAccessToken      String      @unique
  botSlackId          String
  costPerUser         Int         @default(100)
  createdAt           DateTime
  minimumMonthlyCost  Int         @default(0)
  standups            Standup[]
  stripeId            String?
  teamAccessToken     String      @unique
  teamName            String
  teamSlackId         String      @unique
  trialEnds           DateTime
  updatedAt           DateTime
  users               User[]
}

model User {
  id             Int             @id @pg.serial("users_id_seq")
  avatarUrl      String?
  createdAt      DateTime
  email          String?
  firstName      String?
  isTeamOwner    Boolean         @default(false)
  lastName       String?
  reports        Report[]
  reviews        Review[]
  slackId        String
  standupsUsers  StandupsUser[]
  team           Team
  timezone       String
  updatedAt      DateTime
  username       String
}

At this point, you may be asking yourself, do I have to manually space the columns to make it look nice like that? We're going to be shipping with an auto-formatter on day 1, so you'll get nice readability without breaking your spacebar.
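As a rough illustration of what such column alignment involves, here is a minimal TypeScript sketch. It is not Prisma's formatter: alignColumns is a hypothetical function, and it assumes every line has the shape "name type attributes" with no whitespace inside the name or the type.

```typescript
// Hypothetical sketch (not Prisma's formatter): align field lines into columns.
// Assumption: each line is "name type [attributes...]", separated by whitespace.
function alignColumns(lines: string[]): string[] {
  const rows = lines.map((line) => {
    const [name = "", type = "", ...attrs] = line.trim().split(/\s+/);
    return { name, type, attrs: attrs.join(" ") };
  });
  // Each column is as wide as its longest entry.
  const nameWidth = Math.max(...rows.map((r) => r.name.length));
  const typeWidth = Math.max(...rows.map((r) => r.type.length));
  // Pad the name and type columns, separate columns with two spaces,
  // and drop trailing whitespace on lines without attributes.
  return rows.map((r) =>
    `${r.name.padEnd(nameWidth)}  ${r.type.padEnd(typeWidth)}  ${r.attrs}`.trimEnd()
  );
}

// Example: three unaligned fields from the Report model.
const formatted = alignColumns([
  "id Int @id",
  "createdAt DateTime",
  "status ReportStatus",
]);
console.log(formatted.join("\n"));
```

Because the output depends only on the field contents, two people formatting the same model get identical text.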

If you'd like to take a look at more examples, we've open-sourced a repository with many more examples and syntax variations: https://github.com/prisma/database-schema-examples

We'd love to hear your thoughts and feedback. You can find us in the #datamodel-2 channel in https://prisma.slack.com. In future posts, we'll be covering more of the syntax and design decisions in greater detail.

Stay sharp!

@dpetrick

dpetrick commented Jun 6, 2019

  1. As someone who hasn't spent much time with the v1.x datamodel, I only roughly know what we did in the past. I would suggest that the SDL example and the new DM2 example show the same datamodel, if possible, to make comparisons easier.
  2. Did this slip lines, or do we really support default [] on strings? (expected the default to be one line lower)
scope               String        @default([])
standups          Standup[]

@divyenduz

I understand that all fields would be required by default now! But I couldn't find a way to make them optional.

@dpetrick

dpetrick commented Jun 6, 2019

@divyenduz note the ? at the end of types, e.g. Post?.

@matthewmueller
Author

matthewmueller commented Jun 6, 2019

@dpetrick, good points!

  1. I'm just showing a slice of DM1.1 because I thought it might be overwhelming to see all of it. I'll adjust it and see how it looks.

  2. Good catch! That's actually a bug with rendering. We should support an array of strings at some point in the future though. In the above case scope maps to text[] in Postgres.

Updated. Any more feedback? :-)

@dpetrick

dpetrick commented Jun 6, 2019

I like it a lot better, nice!

@lastmjs

lastmjs commented Jun 7, 2019

I think moving from GraphQL SDL to a custom DSL is a bad idea. The points mentioned for why do not outweigh the added complexity and fragmentation of a new DSL IMO. The new DSL seems to only be slightly different from GraphQL SDL, and formatting GraphQL SDL automatically could make it look very similar to the new DSL. The biggest material change seems to be required by default... doesn't seem important enough to create an entirely new language and burden developers with it.

@matthewmueller
Author

matthewmueller commented Jun 7, 2019

That's a very fair point about not introducing new syntax without a good reason. A couple additional factors that I didn't mention above weighed in on this tough call:

GraphQL is designed for APIs, while Prisma is designed for data modeling

While on the surface they seem similar, when you dig deeper they have different needs. In an MVC architecture, GraphQL is the C in MVC. Prisma will operate at a lower level as a library. Prisma is the M in MVC.

Visually, the architecture looks like this:

[Diagram: mvc]

They sit on different layers, so there wouldn't be fragmentation with existing GraphQL implementations.

We want to merge our configuration into our datamodel syntax

Right now we use YAML for configuration and GraphQL SDL for data modeling. At the end of the day, it's all static configuration so we wanted a syntax that would allow you to have everything in a single file. While we haven't 100% settled on the syntax yet, we're thinking it will look something like this:

datasource pg {
  connector = "postgres"
  url       = env("POSTGRES_URL")
  default   = true
}

datasource mgo {
  connector = "mongo"
  url       = env("MONGO_URL")
}

datasource mgo2 {
  connector = "mongo"
  url       = env("MONGO2_URL")
}

model User {
  id             Int             @id @pg.serial("users_id_seq")
  avatarUrl      String?
  createdAt      DateTime
  email          String?
  firstName      String?
  isTeamOwner    Boolean         @default(false)
  lastName       String?
  reports        Report[]
  reviews        Review[]
  slackId        String
  standupsUsers  StandupsUser[]
  team           Team
  timezone       String
  updatedAt      DateTime
  username       String
}
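The env("POSTGRES_URL") values in the datasource blocks above suggest that connection URLs are read from the environment at load time. The gist does not define how env(...) is evaluated, so the following TypeScript sketch is only one plausible reading; resolveEnv and its env parameter are hypothetical names.

```typescript
// Hypothetical sketch: resolve a config value that is either a literal or an
// env("NAME") reference. The lookup table is passed in explicitly so the
// function does not depend on any particular runtime.
function resolveEnv(
  expr: string,
  env: Record<string, string | undefined>
): string {
  const match = /^env\("([A-Za-z_][A-Za-z0-9_]*)"\)$/.exec(expr.trim());
  if (match === null) {
    // Not an env(...) reference: treat it as a literal and return it unchanged.
    return expr;
  }
  const [, name = ""] = match;
  const value = env[name];
  if (value === undefined) {
    // Assumption: a missing variable is an error rather than an empty string.
    throw new Error(`missing environment variable ${name}`);
  }
  return value;
}

// Example with a made-up URL.
console.log(resolveEnv('env("POSTGRES_URL")', { POSTGRES_URL: "postgres://localhost/app" }));
```

Keeping the reference in the file and the secret in the environment is what would let the same datamodel be copied between machines unchanged.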

We want to support embedded models

In the future, we're also going to introduce an embed block type. It may vary how the underlying data is stored depending on the data type, and it may transparently join data from different data sources:

model User {
  id             Int             @id @pg.serial("users_id_seq")
  email          String?
  firstName      String?

  photos embed {
    height  Int
    width   Int
  }[]
}

That being said, none of this is set in stone and we'd love your feedback on all of it! ☺️

@mfix22

mfix22 commented Jun 7, 2019

Hey all — just wanted to post my initial thoughts. First of all, you knocked it out in terms of readability — super easy to pick up.

I think it would be really cool if instead of:

model StandupUser {
  @@id([ standup, user ])
  standup         Standup
  user            User
  // ...
}

you used

model StandupUser {
  standup         Standup  @id
  user            User     @id
  // ...
}

and inferred the compound key, which looks doable.
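A minimal TypeScript sketch of that inference, assuming a parsed model is available as a list of fields with their attributes. The Field shape and inferPrimaryKey are hypothetical and not part of Prisma.

```typescript
// Hypothetical parsed representation of one model field.
interface Field {
  name: string;
  attributes: string[];
}

// Collect every field marked with @id, in declaration order.
// One entry means a simple primary key; more than one means a compound key.
function inferPrimaryKey(fields: Field[]): string[] {
  return fields
    .filter((field) => field.attributes.includes("@id"))
    .map((field) => field.name);
}

// Example: the StandupUser model from the suggestion above.
const standupUserFields: Field[] = [
  { name: "standup", attributes: ["@id"] },
  { name: "user", attributes: ["@id"] },
  { name: "createdAt", attributes: [] },
];
console.log(inferPrimaryKey(standupUserFields));
```

One trade-off of this approach is that the order of the key columns follows field order, whereas @@id([ standup, user ]) states it explicitly.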

Second, I really like the idea of embedded models, but it would be great if you removed the embed keyword and just used the braces themselves for block delimiting. E.g.

model User {
  // ...
  photos {
    height  Int
    width   Int
  }[]
}

Finally, there were a couple things that I asked @schickling about offline, namely how optional arrays would work ([User] vs [User!] vs [User]! vs [User!]!). The syntax Johannes shared, User?[]? etc., is great, I just think it is worth adding that example to the proposal above. I also think it is worth adding an example for named versus positional arguments in directives.
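For reference, here is how I read that mapping, as a small TypeScript sketch. Only User?[]? was actually shared; the other three forms are extrapolated on the assumption that a ? before [] makes the item optional and a ? after [] makes the list optional. listToPostfix is a hypothetical name.

```typescript
// Hypothetical sketch: convert a GraphQL SDL list type into the postfix form.
// Assumed reading: "?" before "[]" = optional item, "?" after "[]" = optional list.
function listToPostfix(sdl: string): string {
  const match = /^\[(\w+)(!?)\](!?)$/.exec(sdl.trim());
  if (match === null) {
    throw new Error(`not a list type: ${sdl}`);
  }
  const [, item, itemRequired, listRequired] = match;
  // SDL marks "required" with "!"; the postfix form marks "optional" with "?".
  const itemMark = itemRequired === "!" ? "" : "?";
  const listMark = listRequired === "!" ? "" : "?";
  return `${item}${itemMark}[]${listMark}`;
}

// The four combinations from the question above.
for (const sdl of ["[User]", "[User!]", "[User]!", "[User!]!"]) {
  console.log(`${sdl} -> ${listToPostfix(sdl)}`);
}
```

Note that the examples in the gist write [Post] as Post[], so the final rules for lists may well differ from this strict translation.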

Overall, looking really nice 👍

@samburgers

samburgers commented Jun 8, 2019

Please please can this auto-spaced formatting be optional and non-default in a Prisma project. It greatly decreases readability of each line (rows are the primary collection, not the columns), and will cause unnecessary friction with version-control.

@jasonkuhrt

jasonkuhrt commented Jun 20, 2019

Here's some sketch-y feedback, nothing too organized :)

I like this direction. It also makes a lot of sense that sooner or later the needs of the DB layer and the (up to) Gateway layer would diverge. The GraphQL SDL spec has a much more diverse set of stakeholders than the comparatively focused ORM audience of Prisma, and as such will evolve more slowly (take, for example, input unions, which have been discussed over multiple years and have had cogent design proposals for a long time, IIRC) and include/discuss features not relevant to Prisma's model language needs (e.g. everything input-related).

About enums:

enum ReportStatus {
  Asked     = "ASKED"
  Complete  = "COMPLETE"
  Pending   = "PENDING"
  Skip      = "SKIP"
}

I wonder if it would be simpler if we had the concept of unions instead:

union ReportStatus { ASKED | COMPLETE | PENDING | SKIP }

and if aliases (labels? not sure what the right term is) are really needed, support them with directives like:

union ReportStatus { 
  | "ASKED"       @alias("Asked")
  | "COMPLETE"    @alias("Complete")
  | "PENDING"     @alias("Pending")
  | "SKIP"        @alias("Skip")
}

From a DX perspective my inspiration is algebraic data types, and how nicely they compose. E.g. take TypeScript for an example that most of Prisma's audience would relate to probably.
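For comparison, this is roughly what the same status type looks like as a TypeScript union of string literals. It is only a sketch of the idea; labelFor is a hypothetical helper standing in for the alias lookup.

```typescript
// The four statuses as a union of string literal types.
type ReportStatus = "ASKED" | "COMPLETE" | "PENDING" | "SKIP";

// Hypothetical helper: map each member to its display alias.
// The `never` assignment makes the compiler reject the switch if a new
// member is added to the union without a matching case.
function labelFor(status: ReportStatus): string {
  switch (status) {
    case "ASKED":
      return "Asked";
    case "COMPLETE":
      return "Complete";
    case "PENDING":
      return "Pending";
    case "SKIP":
      return "Skip";
    default: {
      const unreachable: never = status;
      return unreachable;
    }
  }
}

console.log(labelFor("PENDING"));
```

The appeal is that the values themselves are the type, so there is no separate layer of enum member names to keep in sync.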

This would also give us, without adding more concepts, the ability to set up unions of models (a concept which some DBs may support at varying levels):

union SearchHistory { Team | User | Org }

@mfix22 had great points, would second them all.

I wonder if model composition could serve any use e.g.:

model User {
  id             Int
  username       String
}

model Patient {
  ...User
  a String
  b String
  c String
}

model Doctor {
  ...User
  a String
  b String
  c String
}

I would also be curious about the idea of packaging models, e.g.:

import { Foo } from user/package

model Bar {
  ...Foo
}

model Bar {
  ...Foo @only("a", "b", "c")
}

I would prefer to see a community standard based on snake_case rather than camelCase

model Team {
  id                    Int         @id @pg.serial("teams_id_seq")
  bot_access_token      String      @unique
  bot_slack_id          String
  cost_per_user         Int         @default(100)
  created_at            DateTime
  minimum_monthly_cost  Int         @default(0)
  standups              Standup[]
  stripe_id             String?
  team_access_token     String      @unique
  team_name             String
  team_slack_id         String      @unique
  trial_ends            DateTime
  updated_at            DateTime
  users                 User[]
}

@jasonkuhrt

Another thought, about the ability for the DSL to scale up/down with the developer. This is more about ergonomics and high-fidelity materiality; it wouldn't fundamentally unlock anything here.

Allow unions to be defined inline

model Report {
  ReportStatus     "ASKED" | "COMPLETE" | "PENDING" | "SKIP"
}

Allow relations (or in document terms: nesting) to be defined inline:

model Team {
  users {
    foo  Int
    bar  String
    qux  DateTime
  }[]
}
