Skip to content

Instantly share code, notes, and snippets.

@voodootikigod
Created June 22, 2015 14:39
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save voodootikigod/36d5eb51686805459bfa to your computer and use it in GitHub Desktop.
Save voodootikigod/36d5eb51686805459bfa to your computer and use it in GitHub Desktop.
JSConf US 2015 Track A Transcript for Tim Griesser: Making Relational Cool Again (or: JavaScript on ACID)

Thanks. So yeah, I'm really excited to be here, talking about making relational cool again, and the subtitle is... Or JavaScript on ACID. And like I said, my name is Tim Griesser, Tgriesser on Github, and I'm here really excited to be talking about SQL. Or is it Ess-kyoo-ell? I know it's an issue that really divides programmers. So I Googled it, and the first result I came across was this English language and use in Stack Exchange. Which I thought was fitting, given that we're at a programming conference. And the first thing you'll notice is I'm not a member of the English language and usage, so if I make any mistakes up here, you'll know why. But the answer was that it was in fact first called the structured English query language, and so that acronym was SEQUEL. And then that was abbreviated as SQL. Now it's a matter of preference. And there's an urban legend that it came after the structured query language, which is the worst urban legend I ever heard. So I learned something new, and now I'm a member. It was a big day. But it's appropriate that we're talking about relational databases at a JavaScript conference, because I think there's a lot of similar features between the two. At a language level, maybe not so much. Relational, SQL is static and strongly typed, declarative, and in some ways procedural, and JavaScript is as loosely typed as you can get while still being a programming language, and it's object oriented, which leads to some interesting debates as to how to write the code correctly, and it shares some of the same operators, but we know in JavaScript the operators don't do what we want some of the time. So what do I mean when I say is theres a lot of similarities between SQL and JavaScript? They both sort of share this similar quality that is not often attained by all languages, which is what I like to term -- languages that will probably never die. As much as people try to kill off JavaScript with different other languages that will supersede it, it sort has this monopoly. It's the language of the web. It's the language that runs in web browsers. And similarly, SQL has so much momentum. It's been around since the '70s and it's been around forever. This is a slide that we often show when we're talking about JavaScript, just sort of illustrating that it runs everywhere. In the browsers, but also your television and your server and your nodebots and copters and also on your phones, and I think the same could be sort of said about SQL. Is that it powers the largest websites in the world. It's behind Facebook and Wikipedia, and also down to SQL-lite, flat file storage, which is in more places than you would imagine on your phones. So coming to node.js a couple years ago, I was sort of surprised. It seems like these two would be such a great fit. They're both everywhere. Why is there not more support for it? And I think the reason is that node starts with no, and no is the beginning of NoSQL, and around the time that node came out, it was like -- let's rethink the entire way that we're writing server side applications with this evented IO model. And at the time, it was sort of like -- let's rethink the standard relational database, because we're hitting web scale and we need to get past that.

And so around the time that sort of the height of the NoSQL phase or moment -- it's still going, of course -- is when node came out, and that's why there was so much of this. So all the tutorials I felt like for node were node and Mongo, getting started -- they have literally taken the software and web industries by storm. You've got Mongo express angular node, which everybody has heard of, and then my favorite, the LeBron stack, level DB slam dunk. And what happened to relational algebra? Right? This is proven and time tested. And SQL works great for a lot of things. And this stack overflow post that I came across sort of summed up what I was thinking, coming to node from outside.

Are relational databases a poor fit? And this is sort of long, so let me just pick out a few things. There's a degree of antipathy toward relational databases, is an observation you could make of the node community. And that they're poorly supported, compared to non-relational databases. I sort of agree with this. Listening to the node up podcasts, listening to couch DB and Mongo, I guess it was like -- relational databases aren't cool anymore. I dug a little and found there actually are really great libraries for working with SQL in node. All the popular Open Source, and Oracle came out with another one. But there's not a common API all of them. Node didn't assume that a relational database was a given. In all other language systems, there's a standard for how you're interacting with common patterns of a database. How you connect, how you disconnect. That doesn't exist. So what that lends to is that all the different clients have a different way of dealing with these things that should be pretty common. And there's also, I guess -- the next thing I noticed was that the higher level abstractions were lacking a little bit.

And what I mean by that is that a lot of them were database-specific. So there was a great query builder, but it works for postgres, specifically. Another great one for another language, and no way to use the same for both. Different APIs. Or there's a mix of an ORM layer and a query layer, so there's no ability to drop down and use just vanilla SQL if you need to. One that was big for me is the lack of transaction APIs. So there wasn't any mention of transactions. In any of the higher level libraries that I had seen at the time. And that is sort of what the second subtitle of the talk is that I'll get to in a little bit. And then there was the jack of all trades, where it's sort of like -- hey, you can use the same save API to save to Redis or Mongo or relational database or the file system. It's like... If you have the same API for all these different specialized data stores, you're really not taking advantage of the best use cases for each of them. Or there's sort of what I call the DIY. Like, it gets you most of the way, but modularity. So you have to assemble all these. The pooling, the construction, the abstracting the database API. So the ecosystem -- I would definitely agree with the Stack Overflow post, that there's a need there, that needs to be filled.

But I really wanted this to happen. I really love writing JavaScript. So I was like... Maybe we can do something better. And at the time, I had been writing a lot of PHP. This was back in 2012, in a framework called Laravel, which some might have heard of. And it makes writing PHP pretty tolerable. And I was also using a lot of backbones. So I was thinking... Maybe I can take some of the ideas from both of these, and I sort of translated the query builder from Laravel into this project called Knex.JS. And this sort of illustrates what pieces were taken from each, but the eloquent ORM from Laravel and some of the ideas from backbone into bookshelf, and mix into promises -- and so I'm going to talk about these two libraries that I have put together, connects and bookshelf. The first is Knex. If you're not familiar with where the name comes from, it's this toy that was popular in the '90s. It's sort of like a cheaper version of Legos. You can assemble all these pieces together and build pretty cool-looking things. And actually, it's really great if you're procrastinating, making conference slides. I decided to make this little motorcycle here. All right, JSConf. But the reason that I think... What Knex tries to do is standardize some of the inconsistencies in SQL. So SQL is a language -- an ANSI language specification. But it's not like it is, what we're used to with, like, ES5 or ES6 and beyond. Where we sort of have this compatibility table, and we see that Babel is in ahead of everything else, in terms of implementing all the different features. But it feels more like the can I use. Sort of like a language guideline. There are features that exist in some dialects but not others. It doesn't fail, or it fails sometimes, or -- different inconsistencies that Knex tries to paper over. This is an illustration of the different pieces as they're coming over internally. Have things like connection pooling, to make sure you're getting connections that you don't have to continually reconnect to the database every time you reissue a query, mixes in different grammars from the different dialects, and creates a client that's used in the schema and normal query building, as well as transactions. Which I sort of break out entirely, because I'm going to spend a little bit of time on those, and on top of that, we have higher level APIs for doing some familiar migrations and seeding as you would see for something like Rails.

So how you get started with Knex is just give it a connection string. If you want to configure the pool, you can do that. And it works against a number of different databases. Including Oracle, which is pretty cool. Somebody just opened a pull request. And it's like -- oh, I don't use Oracle, but it's awesome that you opened a pull request. Yay, Open Source. And it also supports web SQL, but I don't really recommend that, because that's deprecated. Browsers killed that off. So you can select all from accounts, where activated is one, and then you call then on it, and it has the familiar promise chaining API, so it issues that query and returns the result of the query and then you can catch the error if that happens in the standard promise -- it's actually great that I don't have to argue too much in favor of promise, now that they're actually in the spec. But when I was starting out, that was a thing, whether it was -- it was still up for debate whether you should use that.

Also it does joins. I don't have to read it out too much, but joins with multiple clauses, and then subqueries, so anywhere -- so here we have a where in clause, and anywhere you might want to use a subquery, the general rule of thumb is that you can pass a function and then use the context of that closure as a new subquery. And then from there, I'm not going to dive into all the features, because that's what documentation is for. But you can do raw queries, like if you have a specialized query that you don't want to... That maybe does some things that aren't supported by Knex, you can issue those, aggregates, subquery aliasing, and it tries hard not to let you screw up. It tries to catch different errors or paper over things for you, so you don't make too many mistakes. And so the subtitle of the talk, as I mentioned, is JavaScript on ACID. So this is not the ACID test, which... Some are probably familiar with.

I've mentioned this to someone, and they're like -- oh, you mean the browser suite? No. Let me look that up, though. It's good to know that Chrome is passing the ACID 3 test, as of today. But I mean atomicity, consistency, isolation, and durability. I'm not going to go into each of them, because that could be a separate talk about each of the individual terms. But transactions -- I sort of sum up that. What they do is give you the ability to have a snapshot state of the world as you're working with your database. So what it allows you to do is roll back to a certain point in time, and also prevent other connections that might be working against your databases, from altering rows as you're working within a single atomic unit of state. And this is kind of difficult to do in node. You have to think about this API in advance, because in order for this to work, you have to have the same database connection passed to every single query that you're working against. So it's not like in synchronous languages, where you could just sort of say -- okay, start transaction here, and then all of these will run right here, and then end it right here. Because you sort of... With the event I/O model, you sort of lose that context as you go with the callback. You don't really know what's happening. So the connection explicitly has to be passed to every query. So let's think about a situation where we would actually want to use transactions.

So here's a simple user flow situation, where someone is registering for a website. And they register, and you give them an ID and you have to send that to a third party service and then you have to create maybe some other rows for that user, and then it's all done. They're all registered. So let's think about what could go wrong in these four steps. So first you might depend on a module which doesn't follow semantic versioning. So something breaks, and it's completely out of your control, or the third party API that you're working against sort of shuts down, and your data gets into a bad state. So in order to deal with that, then, after step 2, if that fails, then you would have to undo what happened in step 1. Or in step 3, where you're creating additional rows to deal with that new user, maybe one or a few of them failed, so then you have to make sure the ones that did go through get rolled back, and then you have to unregister the user, and then say -- hey, something messed up. And that ends up pushing a lot of logic into your application code that you have to deal with. All of these potential bad states.

Or you just don't deal with it, and then you have potentially bad data in your application. Because you want this whole register user flow to happen as one atomic event. Or here's another situation that can't be handled by the last one. Your servers just catch on fire halfway through registering the user. And actually... I created these libraries so I would one day have the ability to use this graphic in a keynote. No. But... So what Knex tries to do is you can say Knex transaction, and then it creates an instance that you treat as a normal query builder instance, but it knows what it's supposed to be on and the fact that it's inside an transaction. So you can pass that to other functions that can utilize it and work with it as if it's a normal API, but behind the scenes, you have a transaction that you're dealing with. And originally, the API was that you would just have to pass this explicitly to every single query that you built. So you had the transaction object, and then you could call commit or rollback at the very end. But that seemed to be kind of error-prone, that users would sort of forget to pass it to every single query necessary, and then they would have sort of queries not working on the same connection, and it caused all sorts of errors. So this is the new API. And like I said, Knex tries to make it hard to screw up, I guess. And so this is a little boilerplate here -- every time we call commit and roll back, this is a promise, and then we return the promise into the transaction, and we know if the entire promise chain fulfills, the transaction should be committed, and if it fails, the transaction should be rolled back. So that's pretty simple. Nesting transactions is new. And it's the idea that you shouldn't have to worry about whether the client is already inside a transaction. So if you call Knex transaction on something that is already a transaction Knex, it should just create a save point. And so a save point is sort of like a save point in, like, a video game. Where you get to a certain point, and if something fails beyond that, you don't go back to the very beginning, but you go back to just right here.

And so it does this transparently. You know, for you. And you don't have to worry about it too much. So Knex likes to take what I call the batteries included approach, where it does a lot of this for you, and also provides a lot of different interfaces, like callbacks and streams and events, and two string -- the different things you would want to work with in a nice manner. So that's pretty much just a high level overview of Knex. And now I'm going to jump into bookshelf. Which is an ORM, which stands for object relational mapper. And in short what that tries to do is take care of standard SQL queries for you, especially in common app operations. So when you're building an app, there's a lot of things that are pretty standard. Like insert, return, fetch, save, dealing with -- relating different rows of data. And you don't want to have to write all of this by hand. So it sort of abstracts that a little bit for you, and takes a little bit of the flexibility away, but gives you a nicer, higher level piece to work with. So the different association types -- pretty familiar if you've ever worked with an ORM in another language. One to one, one to many, many to many. Polymorphic -- I don't know if these are necessarily a great idea, but sometimes they can be useful, I guess. So it supports those. And it builds on top of Knex. So that's where I was talking about having a separation between a query building layer and an ORM layer. You should be able to write raw SQL when you want and have something that works on top of that and also allows you to drop back into it when you need. So this is sort of what it looks like, to create a few models with associations. And you can also sort of filter with... So here we have comments, which is has many to a comment, but we also have moderated comments, and we can add in the where-moderated is true. And it also supports loading where you're trying to add data to a collection of data, if you have 26 items, you have 26 extra queries, so it tries to do one and then another query for the extra related results. An example here is to find an account with all the posts in the account, and then all the comments under those posts and all the accounts that actually made the comments on the posts for the account. So that's what that it would look like, to provide with-related, and this is all in the documentation. So I'm going to jump through this a little bit.

Also, as you constrain your loads, you can dynamically constrain relations, and then load things after the fact. So if you want to fetch one row or a collection of rows and then only load onto one, it allows you to do that as well. And then as I was mentioning earlier, it allows you to tap into the query chain, to dynamically add things that are maybe a little more SQL-specific under the hood, as you're building the select statement or other statements for the model. And so transactions in bookshelf aren't quite where they are in Knex, where you still have to pass the explicit object to each of the async calls. Turns out it's a little harder to retrofit that, where you have a transactional bookshelf object. That's where it's going in the future. But transactions are absolutely supported in bookshelf. And just a little bit of where bookshelf sort of came from -- I mentioned that it came from some of the ideas from backbone models and collections. And ultimately the idea was to see if we could reuse some of the same models on the server and client. And this was back in, like, 2012, back before I even knew there was a term for doing this. Which I'm not going to say. Because he already covered it last year.

And I actually sort of got there. I was able to swap out bookshelf models and collections for, like, the backbone TodoMVC and use it targeting webSQL. Which was kind of cool. I don't know. I built a to-do list. And that's pretty special. But really, I think, shared models sound great, but it's not really as great in practice, because you have to know a lot more -- or there's sort of a limit to what you can do. You have to load all the data, and lose out on a lot. Because you don't need as much of what you would need on the server on the client, if you're just displaying it. So it sounds great. Not as great in practice. And it's something that I'm sort of moving away from, some of the model and collection conventions from backbone, based on some of the things that I've learned, having been out there for a little while. So you can read more in the docs. I'm not going to go too much further into it. But I think one thing I would like to point out is that Knex and bookshelf are not the first to do this. So I mentioned earlier there was already higher level SQL abstractions in JavaScript that I saw, and I was like -- oh, I'll build these instead.

And hopefully it's not the last to do this. So a few other projects -- openrecord is a really great project that's using Knex under the hood, and it provides more of an active record syntax. And SQLize is the big one, and is still a big one. And it didn't have support for transactions. It had a really... The APIs were sort of rough. And since then, it's come a long way. I've been really impressed with a lot of the development happening with SQLize. So if you haven't looked at it for a while, take another look. They're doing cool things, especially around the new postgres features that are coming out. A new one that I just saw recently, by someone who had been frequently commenting on bookshelf and Knex projects, and I guess maybe had some opinions that were different, created this Zuul.JS. And it's really impressive. It does transactions, doesn't have quite the separation between the query building and the ORM layer, but does a lot for you at the ORM layer. That's worth checking out. SQL bricks is pretty similar to Knex, but with different opinions. NodeSQL, by the guy who does the postgres driver. Even the drivers have come a long way since I looked at them in 2012. The node my SQL driver has a ton of documentation and a lot of features added. It's really impressive how far it's come in the last few years. Knex query lab allows you to demo this in the browser and have it spit out SQL. Bookends does nested data loading and querying really well. It's on top of bookshelf. And there's endpoints, a JSON API compliant library, which has Knex and bookshelf under the hood, and larger stack implementations like Sails, which has their own ORM, but they were potentially talking about using Knex as the query builder under the hood. So they're targeting all the different SQL drivers. That's just some of what's out there. But I really don't think this is enough. So... We're still sort of like... At this stage where node is a really young ecosystem. And a lot of times, you'll hear -- don't reinvent the wheel. Like it's already been done. But I sort of disagree with this. I think that people should absolutely be reinventing the wheel. With an asterisk. Just not on the client side. Because React already did this. No, I'm kidding there.

But, like, we should have more experiments out there, because I think the types of conversations we're having since React are really different, about... Just the web in general. Because they questioned a lot of opinions. So I think that -- take a look at other libraries. Take a look at other languages. Which is what -- you know, a lot of what JavaScript is about. JavaScript is sort of late to the party of being on the server. And a lot of the great ideas coming out around async and ES7 and ES8 are looking at the best of what's around in other languages. So I think it's really important -- like, as I mentioned, these were adapted from the Laravel query builder in PHP, and I would like to move a lot further toward SQL Alchemy and query builder. But I would love to see node become more of a target for traditional web applications. Prove that Stack Overflow question wrong, and get more awareness out there, that you can do this type of boring traditional website development in node, and I think the more that that becomes something that's well known, the more people will be writing JavaScript, and then more goes into the ecosystem. So that was sort of the goal of creating these two libraries, to sort of say -- hey, you can do this. And hopefully there's a lot more that comes out of this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment