voodootikigod/griesser.md

## griesser.md

      
    Raw
  

              griesser.md
            
          
    Thanks. So yeah, I'm really excited to be here, talking about making
relational cool again, and the subtitle is... Or JavaScript on ACID. And like
I said, my name is Tim Griesser, Tgriesser on Github, and I'm here really
excited to be talking about SQL. Or is it Ess-kyoo-ell? I know it's an issue
that really divides programmers. So I Googled it, and the first result I came
across was this English language and use in Stack Exchange. Which I thought
was fitting, given that we're at a programming conference. And the first thing
you'll notice is I'm not a member of the English language and usage, so if I
make any mistakes up here, you'll know why. But the answer was that it was in
fact first called the structured English query language, and so that acronym
was SEQUEL. And then that was abbreviated as SQL. Now it's a matter of
preference. And there's an urban legend that it came after the structured
query language, which is the worst urban legend I ever heard. So I learned
something new, and now I'm a member. It was a big day. But it's appropriate
that we're talking about relational databases at a JavaScript conference,
because I think there's a lot of similar features between the two. At a
language level, maybe not so much. Relational, SQL is static and strongly
typed, declarative, and in some ways procedural, and JavaScript is as loosely
typed as you can get while still being a programming language, and it's object
oriented, which leads to some interesting debates as to how to write the code
correctly, and it shares some of the same operators, but we know in JavaScript
the operators don't do what we want some of the time. So what do I mean when I
say is theres a lot of similarities between SQL and JavaScript? They both sort
of share this similar quality that is not often attained by all languages,
which is what I like to term -- languages that will probably never die. As
much as people try to kill off JavaScript with different other languages that
will supersede it, it sort has this monopoly. It's the language of the web.
It's the language that runs in web browsers. And similarly, SQL has so much
momentum. It's been around since the '70s and it's been around forever. This
is a slide that we often show when we're talking about JavaScript, just sort
of illustrating that it runs everywhere. In the browsers, but also your
television and your server and your nodebots and copters and also on your
phones, and I think the same could be sort of said about SQL. Is that it
powers the largest websites in the world. It's behind Facebook and Wikipedia,
and also down to SQL-lite, flat file storage, which is in more places than you
would imagine on your phones. So coming to node.js a couple years ago, I was
sort of surprised. It seems like these two would be such a great fit. They're
both everywhere. Why is there not more support for it? And I think the reason
is that node starts with no, and no is the beginning of NoSQL, and around the
time that node came out, it was like -- let's rethink the entire way that
we're writing server side applications with this evented IO model. And at the
time, it was sort of like -- let's rethink the standard relational database,
because we're hitting web scale and we need to get past that.
And so around the time that sort of the height of the NoSQL phase or moment --
it's still going, of course -- is when node came out, and that's why there was
so much of this. So all the tutorials I felt like for node were node and
Mongo, getting started -- they have literally taken the software and web
industries by storm. You've got Mongo express angular node, which everybody
has heard of, and then my favorite, the LeBron stack, level DB slam dunk. And
what happened to relational algebra? Right? This is proven and time tested.
And SQL works great for a lot of things. And this stack overflow post that I
came across sort of summed up what I was thinking, coming to node from
outside.
Are relational databases a poor fit? And this is sort of long, so let me just
pick out a few things. There's a degree of antipathy toward relational
databases, is an observation you could make of the node community. And that
they're poorly supported, compared to non-relational databases. I sort of
agree with this. Listening to the node up podcasts, listening to couch DB and
Mongo, I guess it was like -- relational databases aren't cool anymore. I dug
a little and found there actually are really great libraries for working with
SQL in node. All the popular Open Source, and Oracle came out with another
one. But there's not a common API all of them. Node didn't assume that a
relational database was a given. In all other language systems, there's a
standard for how you're interacting with common patterns of a database. How
you connect, how you disconnect. That doesn't exist. So what that lends to is
that all the different clients have a different way of dealing with these
things that should be pretty common. And there's also, I guess -- the next
thing I noticed was that the higher level abstractions were lacking a little
bit.
And what I mean by that is that a lot of them were database-specific. So there
was a great query builder, but it works for postgres, specifically. Another
great one for another language, and no way to use the same for both. Different
APIs. Or there's a mix of an ORM layer and a query layer, so there's no
ability to drop down and use just vanilla SQL if you need to. One that was big
for me is the lack of transaction APIs. So there wasn't any mention of
transactions. In any of the higher level libraries that I had seen at the
time. And that is sort of what the second subtitle of the talk is that I'll
get to in a little bit. And then there was the jack of all trades, where it's
sort of like -- hey, you can use the same save API to save to Redis or Mongo
or relational database or the file system. It's like... If you have the same
API for all these different specialized data stores, you're really not taking
advantage of the best use cases for each of them. Or there's sort of what I
call the DIY. Like, it gets you most of the way, but modularity. So you have
to assemble all these. The pooling, the construction, the abstracting the
database API. So the ecosystem -- I would definitely agree with the Stack
Overflow post, that there's a need there, that needs to be filled.
But I really wanted this to happen. I really love writing JavaScript. So I was
like... Maybe we can do something better. And at the time, I had been writing
a lot of PHP. This was back in 2012, in a framework called Laravel, which some
might have heard of. And it makes writing PHP pretty tolerable. And I was also
using a lot of backbones. So I was thinking... Maybe I can take some of the
ideas from both of these, and I sort of translated the query builder from
Laravel into this project called Knex.JS. And this sort of illustrates what
pieces were taken from each, but the eloquent ORM from Laravel and some of the
ideas from backbone into bookshelf, and mix into promises -- and so I'm going
to talk about these two libraries that I have put together, connects and
bookshelf. The first is Knex. If you're not familiar with where the name
comes from, it's this toy that was popular in the '90s. It's sort of like a
cheaper version of Legos. You can assemble all these pieces together and build
pretty cool-looking things. And actually, it's really great if you're
procrastinating, making conference slides. I decided to make this little
motorcycle here. All right, JSConf. But the reason that I think... What Knex
tries to do is standardize some of the inconsistencies in SQL. So SQL is a
language -- an ANSI language specification. But it's not like it is, what
we're used to with, like, ES5 or ES6 and beyond. Where we sort of have this
compatibility table, and we see that Babel is in ahead of everything else, in
terms of implementing all the different features. But it feels more like the
can I use. Sort of like a language guideline. There are features that exist in
some dialects but not others. It doesn't fail, or it fails sometimes, or --
different inconsistencies that Knex tries to paper over. This is an
illustration of the different pieces as they're coming over internally. Have
things like connection pooling, to make sure you're getting connections that
you don't have to continually reconnect to the database every time you reissue
a query, mixes in different grammars from the different dialects, and creates
a client that's used in the schema and normal query building, as well as
transactions. Which I sort of break out entirely, because I'm going to spend a
little bit of time on those, and on top of that, we have higher level APIs for
doing some familiar migrations and seeding as you would see for something like
Rails.
So how you get started with Knex is just give it a connection string. If you
want to configure the pool, you can do that. And it works against a number of
different databases. Including Oracle, which is pretty cool. Somebody just
opened a pull request. And it's like -- oh, I don't use Oracle, but it's
awesome that you opened a pull request. Yay, Open Source. And it also supports
web SQL, but I don't really recommend that, because that's deprecated.
Browsers killed that off. So you can select all from accounts, where activated
is one, and then you call then on it, and it has the familiar promise chaining
API, so it issues that query and returns the result of the query and then you
can catch the error if that happens in the standard promise -- it's actually
great that I don't have to argue too much in favor of promise, now that
they're actually in the spec. But when I was starting out, that was a thing,
whether it was -- it was still up for debate whether you should use that.
Also it does joins. I don't have to read it out too much, but joins with
multiple clauses, and then subqueries, so anywhere -- so here we have a where
in clause, and anywhere you might want to use a subquery, the general rule of
thumb is that you can pass a function and then use the context of that closure
as a new subquery. And then from there, I'm not going to dive into all the
features, because that's what documentation is for. But you can do raw
queries, like if you have a specialized query that you don't want to... That
maybe does some things that aren't supported by Knex, you can issue those,
aggregates, subquery aliasing, and it tries hard not to let you screw up. It
tries to catch different errors or paper over things for you, so you don't
make too many mistakes. And so the subtitle of the talk, as I mentioned, is
JavaScript on ACID. So this is not the ACID test, which... Some are probably
familiar with.
I've mentioned this to someone, and they're like -- oh, you mean the browser
suite? No. Let me look that up, though. It's good to know that Chrome is
passing the ACID 3 test, as of today. But I mean atomicity, consistency,
isolation, and durability. I'm not going to go into each of them, because that
could be a separate talk about each of the individual terms. But transactions
-- I sort of sum up that. What they do is give you the ability to have a
snapshot state of the world as you're working with your database. So what it
allows you to do is roll back to a certain point in time, and also prevent
other connections that might be working against your databases, from altering
rows as you're working within a single atomic unit of state. And this is kind
of difficult to do in node. You have to think about this API in advance,
because in order for this to work, you have to have the same database
connection passed to every single query that you're working against. So it's
not like in synchronous languages, where you could just sort of say -- okay,
start transaction here, and then all of these will run right here, and then
end it right here. Because you sort of... With the event I/O model, you sort
of lose that context as you go with the callback. You don't really know what's
happening. So the connection explicitly has to be passed to every query. So
let's think about a situation where we would actually want to use
transactions.
So here's a simple user flow situation, where someone is registering for a
website. And they register, and you give them an ID and you have to send that
to a third party service and then you have to create maybe some other rows for
that user, and then it's all done. They're all registered. So let's think
about what could go wrong in these four steps. So first you might depend on a
module which doesn't follow semantic versioning. So something breaks, and it's
completely out of your control, or the third party API that you're working
against sort of shuts down, and your data gets into a bad state. So in order
to deal with that, then, after step 2, if that fails, then you would have to
undo what happened in step 1. Or in step 3, where you're creating additional
rows to deal with that new user, maybe one or a few of them failed, so then
you have to make sure the ones that did go through get rolled back, and then
you have to unregister the user, and then say -- hey, something messed up. And
that ends up pushing a lot of logic into your application code that you have
to deal with. All of these potential bad states.
Or you just don't deal with it, and then you have potentially bad data in your
application. Because you want this whole register user flow to happen as one
atomic event. Or here's another situation that can't be handled by the last
one. Your servers just catch on fire halfway through registering the user. And
actually... I created these libraries so I would one day have the ability to
use this graphic in a keynote. No. But... So what Knex tries to do is you can
say Knex transaction, and then it creates an instance that you treat as a
normal query builder instance, but it knows what it's supposed to be on and
the fact that it's inside an transaction. So you can pass that to other
functions that can utilize it and work with it as if it's a normal API, but
behind the scenes, you have a transaction that you're dealing with. And
originally, the API was that you would just have to pass this explicitly to
every single query that you built. So you had the transaction object, and then
you could call commit or rollback at the very end. But that seemed to be kind
of error-prone, that users would sort of forget to pass it to every single
query necessary, and then they would have sort of queries not working on the
same connection, and it caused all sorts of errors. So this is the new API.
And like I said, Knex tries to make it hard to screw up, I guess. And so this
is a little boilerplate here -- every time we call commit and roll back, this
is a promise, and then we return the promise into the transaction, and we know
if the entire promise chain fulfills, the transaction should be committed, and
if it fails, the transaction should be rolled back. So that's pretty simple.
Nesting transactions is new. And it's the idea that you shouldn't have to
worry about whether the client is already inside a transaction. So if you call
Knex transaction on something that is already a transaction Knex, it should
just create a save point. And so a save point is sort of like a save point in,
like, a video game. Where you get to a certain point, and if something fails
beyond that, you don't go back to the very beginning, but you go back to just
right here.
And so it does this transparently. You know, for you. And you don't have to
worry about it too much. So Knex likes to take what I call the batteries
included approach, where it does a lot of this for you, and also provides a
lot of different interfaces, like callbacks and streams and events, and two
string -- the different things you would want to work with in a nice manner.
So that's pretty much just a high level overview of Knex. And now I'm going to
jump into bookshelf. Which is an ORM, which stands for object relational
mapper. And in short what that tries to do is take care of standard SQL
queries for you, especially in common app operations. So when you're building
an app, there's a lot of things that are pretty standard. Like insert, return,
fetch, save, dealing with -- relating different rows of data. And you don't
want to have to write all of this by hand. So it sort of abstracts that a
little bit for you, and takes a little bit of the flexibility away, but gives
you a nicer, higher level piece to work with. So the different association
types -- pretty familiar if you've ever worked with an ORM in another
language. One to one, one to many, many to many. Polymorphic -- I don't know
if these are necessarily a great idea, but sometimes they can be useful, I
guess. So it supports those. And it builds on top of Knex. So that's where I
was talking about having a separation between a query building layer and an
ORM layer. You should be able to write raw SQL when you want and have
something that works on top of that and also allows you to drop back into it
when you need. So this is sort of what it looks like, to create a few models
with associations. And you can also sort of filter with... So here we have
comments, which is has many to a comment, but we also have moderated comments,
and we can add in the where-moderated is true. And it also supports loading
where you're trying to add data to a collection of data, if you have 26 items,
you have 26 extra queries, so it tries to do one and then another query for
the extra related results. An example here is to find an account with all the
posts in the account, and then all the comments under those posts and all the
accounts that actually made the comments on the posts for the account. So
that's what that it would look like, to provide with-related, and this is all
in the documentation. So I'm going to jump through this a little bit.
Also, as you constrain your loads, you can dynamically constrain relations,
and then load things after the fact. So if you want to fetch one row or a
collection of rows and then only load onto one, it allows you to do that as
well. And then as I was mentioning earlier, it allows you to tap into the
query chain, to dynamically add things that are maybe a little more
SQL-specific under the hood, as you're building the select statement or other
statements for the model. And so transactions in bookshelf aren't quite where
they are in Knex, where you still have to pass the explicit object to each of
the async calls. Turns out it's a little harder to retrofit that, where you
have a transactional bookshelf object. That's where it's going in the future.
But transactions are absolutely supported in bookshelf. And just a little bit
of where bookshelf sort of came from -- I mentioned that it came from some of
the ideas from backbone models and collections. And ultimately the idea was to
see if we could reuse some of the same models on the server and client. And
this was back in, like, 2012, back before I even knew there was a term for
doing this. Which I'm not going to say. Because he already covered it last
year.
And I actually sort of got there. I was able to swap out bookshelf models and
collections for, like, the backbone TodoMVC and use it targeting webSQL.
Which was kind of cool. I don't know. I built a to-do list. And that's pretty
special. But really, I think, shared models sound great, but it's not really
as great in practice, because you have to know a lot more -- or there's sort
of a limit to what you can do. You have to load all the data, and lose out on
a lot. Because you don't need as much of what you would need on the server on
the client, if you're just displaying it. So it sounds great. Not as great in
practice. And it's something that I'm sort of moving away from, some of the
model and collection conventions from backbone, based on some of the things
that I've learned, having been out there for a little while. So you can read
more in the docs. I'm not going to go too much further into it. But I think
one thing I would like to point out is that Knex and bookshelf are not the
first to do this. So I mentioned earlier there was already higher level SQL
abstractions in JavaScript that I saw, and I was like -- oh, I'll build these
instead.
And hopefully it's not the last to do this. So a few other projects --
openrecord is a really great project that's using Knex under the hood, and it
provides more of an active record syntax. And SQLize is the big one, and is
still a big one. And it didn't have support for transactions. It had a
really... The APIs were sort of rough. And since then, it's come a long way.
I've been really impressed with a lot of the development happening with
SQLize. So if you haven't looked at it for a while, take another look. They're
doing cool things, especially around the new postgres features that are coming
out. A new one that I just saw recently, by someone who had been frequently
commenting on bookshelf and Knex projects, and I guess maybe had some opinions
that were different, created this Zuul.JS. And it's really impressive. It does
transactions, doesn't have quite the separation between the query building and
the ORM layer, but does a lot for you at the ORM layer. That's worth checking
out. SQL bricks is pretty similar to Knex, but with different opinions.
NodeSQL, by the guy who does the postgres driver. Even the drivers have come a
long way since I looked at them in 2012. The node my SQL driver has a ton of
documentation and a lot of features added. It's really impressive how far it's
come in the last few years. Knex query lab allows you to demo this in the
browser and have it spit out SQL. Bookends does nested data loading and
querying really well. It's on top of bookshelf. And there's endpoints, a JSON
API compliant library, which has Knex and bookshelf under the hood, and larger
stack implementations like Sails, which has their own ORM, but they were
potentially talking about using Knex as the query builder under the hood. So
they're targeting all the different SQL drivers. That's just some of what's
out there. But I really don't think this is enough. So... We're still sort of
like... At this stage where node is a really young ecosystem. And a lot of
times, you'll hear -- don't reinvent the wheel. Like it's already been done.
But I sort of disagree with this. I think that people should absolutely be
reinventing the wheel. With an asterisk. Just not on the client side. Because
React already did this. No, I'm kidding there.
But, like, we should have more experiments out there, because I think the
types of conversations we're having since React are really different, about...
Just the web in general. Because they questioned a lot of opinions. So I think
that -- take a look at other libraries. Take a look at other languages. Which
is what -- you know, a lot of what JavaScript is about. JavaScript is sort of
late to the party of being on the server. And a lot of the great ideas coming
out around async and ES7 and ES8 are looking at the best of what's around in
other languages. So I think it's really important -- like, as I mentioned,
these were adapted from the Laravel query builder in PHP, and I would like to
move a lot further toward SQL Alchemy and query builder. But I would love to
see node become more of a target for traditional web applications. Prove that
Stack Overflow question wrong, and get more awareness out there, that you can
do this type of boring traditional website development in node, and I think
the more that that becomes something that's well known, the more people will
be writing JavaScript, and then more goes into the ecosystem. So that was sort
of the goal of creating these two libraries, to sort of say -- hey, you can do
this. And hopefully there's a lot more that comes out of this.