WEBVTT | |
00:00.000 --> 00:07.260 | |
what's up hi Joe hello so we're hanging out at the forward data conference in | |
00:07.260 --> 00:13.320 | |
wonderful Paris so good to finally hang out yeah Joe it's amazing I mean I | |
00:13.320 --> 00:16.800 | |
didn't think that we had to come to Paris to finally meet and I was | |
00:16.800 --> 00:20.700 | |
expecting somewhere in the US but hey here we are well I was actually in your | |
00:20.700 --> 00:25.660 | |
neck of the woods last week in Amsterdam but we didn't yeah somehow | |
00:25.660 --> 00:30.520 | |
managed to miss each other but I forgot to send the fax yeah so oh god the fax | |
00:30.520 --> 00:34.420 | |
machines yeah this is a this is a you know something embarrassing for the | |
00:34.420 --> 00:38.560 | |
people of Germany to still be using fax machines they still use them yeah yeah | |
00:38.560 --> 00:42.700 | |
for what I think if you want to talk to some official thing like a like a you | |
00:42.700 --> 00:46.780 | |
know government agency they they like faxes still although I have recently read | |
00:46.780 --> 00:51.280 | |
that, what is it, I think the tax people are now going to stop accepting faxes | |
00:51.280 --> 00:57.580 | |
finally oh yeah it's coming along I know someone said digital technology | |
00:57.580 --> 01:01.360 | |
could be used for this purpose I don't know it's it'll make sense when they're | |
01:01.360 --> 01:06.280 | |
older yeah no it's a real pleasure to meet you finally after I've been | |
01:06.280 --> 01:10.180 | |
obviously I have a copy of your book oh you do yeah I do I use it in my class | |
01:10.180 --> 01:14.980 | |
actually I teach you know data engineering oh yeah well I mean the | |
01:14.980 --> 01:20.140 | |
university was like is there a textbook and I was like oh yes there is one | |
01:20.140 --> 01:24.460 | |
so cool is that one well if you ever need a guest lecturer I'm happy to come on | |
01:24.460 --> 01:28.720 | |
if I happen to be in the area I can pop by if not we'll do it over zoom oh that | |
01:28.720 --> 01:32.440 | |
would be interesting actually yeah yeah thanks for that yeah yeah so the | |
01:32.440 --> 01:36.320 | |
students will be thrilled yes well they'll drop the class they don't | |
01:36.320 --> 01:39.040 | |
obviously believe me when I say something about data engineering but they | |
01:39.040 --> 01:46.120 | |
will believe you that's cool I guess to kick things off for people who don't | |
01:46.120 --> 01:49.900 | |
know who you are do you want to give a quick intro sure so my name is | |
01:49.900 --> 01:54.280 | |
Hannes Mühleisen I'm from Germany but I live in the Netherlands I have | |
01:54.280 --> 02:00.040 | |
been living there for 12 years I am one of the two creators of the database | |
02:00.040 --> 02:05.440 | |
system called DuckDB and I'm also the co-founder and CEO of DuckDB Labs the | |
02:05.440 --> 02:10.300 | |
company that employs most of the DuckDB contributors I'm also a professor of data | |
02:10.300 --> 02:15.280 | |
engineering at the wonderful University in Nijmegen which is a small town that's | |
02:15.280 --> 02:23.440 | |
super cool yeah DuckDB I could make a very strong argument it's getting up | |
02:23.440 --> 02:27.660 | |
there with being a very widely used database I think in terms of mindshare | |
02:27.660 --> 02:31.900 | |
at least in the analytics community I would say it's probably the | |
02:31.900 --> 02:35.500 | |
hottest database in the world right now in my view I think that's true | |
02:35.500 --> 02:40.080 | |
like we do track a bunch of these vanity metrics not too seriously | |
02:40.080 --> 02:44.400 | |
because like what do they even mean but there are things like the you know | |
02:44.400 --> 02:49.340 | |
DB-Engines ranking and there are things like you know the number of | |
02:49.340 --> 02:55.980 | |
downloads but I like the metrics that are sort of not gamed by scripts | |
02:55.980 --> 03:02.160 | |
right because DuckDB of course is, and I should maybe explain, a database as a | |
03:02.160 --> 03:06.840 | |
library right a data warehouse as a library if you want and that means that | |
03:06.840 --> 03:10.560 | |
people run it in all sorts of creative places and it very often gets | |
03:10.560 --> 03:14.780 | |
installed like just you know to spin up a lambda or something like that so that | |
03:14.780 --> 03:19.300 | |
of course that would skew your download numbers but there are metrics that are not | |
03:19.300 --> 03:23.700 | |
that skewed one of them I'm really impressed by is the number of | |
03:23.700 --> 03:28.380 | |
unique visitors to our website okay so we accidentally made one of the more | |
03:28.380 --> 03:33.480 | |
popular websites of our country right by just having documentation and our blog and | |
03:33.480 --> 03:38.220 | |
all sorts of things like that like it's more than a million unique visitors each | |
03:38.220 --> 03:42.840 | |
month for the website yeah it's totally wild I don't know I didn't expect that | |
03:42.840 --> 03:47.220 | |
how many downloads have you had so far downloads that depends a bit | |
03:47.220 --> 03:52.500 | |
on the platform so there's Python we have a big Python | |
03:52.500 --> 03:56.340 | |
distribution so you can pip install it from PyPI and it's like I think 7 million per | |
03:56.340 --> 03:59.760 | |
month at the moment okay I actually have no idea what the integral of all of | |
03:59.760 --> 04:05.880 | |
this is okay there's a sum in that sense anyway there's a bunch of | |
04:05.880 --> 04:12.060 | |
other platforms that get downloads but Python I think is the biggest npm also | |
04:12.060 --> 04:16.860 | |
has a bunch for our client there are direct downloads from the website for the CLI | |
04:16.860 --> 04:20.280 | |
then things like Homebrew which goes through them and we can't | |
04:20.280 --> 04:26.000 | |
necessarily track it so we don't actually have a great way of tracking | |
04:26.000 --> 04:30.080 | |
downloads but what we do have is these extensions plugins and those go | |
04:30.080 --> 04:35.420 | |
through our download server okay and that's currently on the order of I | |
04:35.420 --> 04:41.660 | |
think 300 terabytes each month just in extension downloads so that's just | |
04:41.660 --> 04:45.320 | |
somebody installing a DuckDB extension and that adds up to 300 terabytes. | |
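For context, this is what an extension install looks like from the user's side. A minimal sketch in Python, using the real httpfs extension (which adds HTTP and S3 support) as the example:

```python
import duckdb

con = duckdb.connect()      # in-memory database
con.sql("INSTALL httpfs")   # fetches the extension binary -- the download traffic discussed above
con.sql("LOAD httpfs")      # loads it into the running process
```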
04:45.320 --> 04:52.680 | |
By the way, we're grateful to Cloudflare for sponsoring us thank you that would be I mean | |
04:52.680 --> 04:55.760 | |
it's not that Cloudflare charges you money but it's something that gives us like the | |
04:55.760 --> 04:59.780 | |
confidence that we can pull this off for the next 10 years to come wow yeah it's | |
04:59.780 --> 05:05.240 | |
it's quite wild somebody Jordan has said at some point that if what | |
05:05.240 --> 05:08.900 | |
you're doing is exponentially growing every day is the craziest day of your | |
05:08.900 --> 05:15.980 | |
life uh-huh and it has definitely been that is that Jordan Tigani of MotherDuck yeah okay | |
05:15.980 --> 05:20.780 | |
not like Michael Jordan no no no no that's the other one yeah so that's really | |
05:20.780 --> 05:25.340 | |
something that that has surprised us obviously if you make an open source | |
05:25.340 --> 05:29.240 | |
project your default is that no one will care mm-hmm right that's true for | |
05:29.240 --> 05:36.680 | |
99.9% of GitHub repos nobody cares about them yep and when we started | |
05:36.680 --> 05:41.000 | |
building DuckDB obviously we thought we had a bit of an angle on what could | |
05:41.000 --> 05:46.780 | |
you know prove to be popular but obviously you have no idea without it | |
05:46.780 --> 05:50.860 | |
happening right so you don't know it's still unlikely to happen so it's it's | |
05:50.860 --> 05:55.480 | |
it was an interesting experience I think the it's interesting if you wonder | |
05:55.480 --> 06:00.780 | |
like I think we knew that we were onto something when the VCs started calling oh | |
06:00.780 --> 06:06.700 | |
right tell me more well okay so we had open-sourced it in summer of | |
06:06.700 --> 06:15.280 | |
2019 yeah and then we spun off the company in about 21 but before that | |
06:15.280 --> 06:22.540 | |
actually in early '21 I think there was a Hacker News post which was a | |
06:22.540 --> 06:27.440 | |
terrible like it was a terrible article it was like here's DuckDB it's like | |
06:27.440 --> 06:32.700 | |
Postgres yes that was I think I don't | |
06:32.700 --> 06:39.360 | |
remember who wrote it not us it's like SQLite I remember it was | |
06:39.360 --> 06:43.680 | |
here's DuckDB it's a database it's like SQLite but with Postgres features | |
06:43.680 --> 06:47.700 | |
which is a very bad characterization of what DuckDB is but that went viral on | |
06:47.700 --> 06:53.000 | |
Hacker News and that sort of I think was what pushed us over the | |
06:53.000 --> 06:58.020 | |
thousand stars in GitHub or something like that and that was when you know the | |
06:58.020 --> 07:01.340 | |
curve started to take its current form I think we now have something on the | |
07:01.340 --> 07:06.160 | |
order of 25,000 stars on GitHub or something which is not a lot well not a lot | |
07:06.160 --> 07:11.300 | |
if you have a JavaScript library true but for a data system it's pretty it's | |
07:11.300 --> 07:14.840 | |
pretty crazy yeah yeah it's been a wild ride I have to say that's | |
07:14.840 --> 07:19.440 | |
interesting walk me through the beginnings and so and actually we're | |
07:19.440 --> 07:23.660 | |
talking about it earlier so yeah how you came to the database how you named | |
07:23.660 --> 07:28.080 | |
it all those kind of fun things I think it's pretty hilarious I mean the | |
07:28.080 --> 07:32.280 | |
database name to be brief is called DuckDB because I used to have a pet duck so I | |
07:32.280 --> 07:38.760 | |
live on a boat with my family and the neighbors' cats kept drowning and so we | |
07:38.760 --> 07:44.820 | |
decided to not have a cat because yeah they fall in the water it's really sad so | |
07:44.820 --> 07:48.720 | |
we thought instead of having a cat we'll have a duck instead I'll have | |
07:48.720 --> 07:54.120 | |
a duck instead of a cat and the duck can swim so we got this | |
07:54.120 --> 07:59.760 | |
little duckling called Wilbur I taught him how to swim I taught him how to fly it | |
07:59.760 --> 08:05.580 | |
was very sweet and he has since left and probably has started a ducky family | |
08:05.580 --> 08:11.340 | |
somewhere but in honor of little Wilbur the database is called DuckDB that's | |
08:11.340 --> 08:16.260 | |
yeah I don't know it was kind of obvious to me I didn't think about | |
08:16.260 --> 08:20.580 | |
it a whole lot but yeah that was very early on and | |
08:20.580 --> 08:26.640 | |
DuckDB is originally a product of Mark Raasveldt and me and Mark used to be my | |
08:26.640 --> 08:30.240 | |
PhD student because I come from this whole academic background from back in | |
08:30.240 --> 08:32.940 | |
Amsterdam at the Centrum Wiskunde &amp; Informatica which is like the | |
08:32.940 --> 08:38.520 | |
national research lab for mathematics and computer science it's by the way | |
08:38.520 --> 08:43.020 | |
where Python was invented oh really yeah they invented Python well Guido | |
08:43.020 --> 08:48.840 | |
invented Python while he was there wow and yeah in the same sort of research | |
08:48.840 --> 08:52.440 | |
institute we came up with this idea for DuckDB because we realized that | |
08:52.440 --> 08:57.360 | |
people were kind of sort of ignoring database technology for the wrong | |
08:57.360 --> 09:05.340 | |
reasons what do you mean well okay so obviously databases like | |
09:05.340 --> 09:09.420 | |
relational data transformation is something that's well understood I would | |
09:09.420 --> 09:15.060 | |
argue it is also a field with a long tradition and a significant body of work | |
09:15.060 --> 09:21.060 | |
and you know best practices but people were casting that aside for reasons | |
09:21.060 --> 09:24.480 | |
like oh yeah but it's super hard to get Postgres running I'd rather run like | |
09:24.480 --> 09:29.700 | |
MongoDB mmm which is vastly inferior from a technical perspective but it was | |
09:29.700 --> 09:33.660 | |
easy to get running so we took some inspiration from that and actually said | |
09:33.660 --> 09:40.200 | |
but what if we take the sort of body of knowledge the orthodoxy of what a | |
09:40.200 --> 09:45.780 | |
database engine should look like and just put it into a package that doesn't make | |
09:45.780 --> 09:49.920 | |
you you know hate everything and everyone around you I mean I once tried to | |
09:49.920 --> 09:54.780 | |
install Oracle on a box and at some point I realized that normally | |
09:54.780 --> 10:00.780 | |
consultants do this yeah because it is horrifying right and so with DuckDB | |
10:00.780 --> 10:05.340 | |
we really try to be like absolutely like minimalistic in terms of what you need to | |
10:05.340 --> 10:10.260 | |
install it there's no dependencies right zero you don't need root to install it | |
10:10.260 --> 10:19.680 | |
it's small-ish it's like you know tens of megabytes of binary size it's | |
10:19.680 --> 10:22.980 | |
just generally trying to be unobtrusive but still contain a state-of-the-art query engine. | |
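That minimalism is visible in how little it takes to get going. A sketch of the whole setup, assuming only Python is available:

```python
# pip install duckdb   <- the entire installation: no server, no config, no root
import duckdb

# An in-process analytical database, ready immediately.
print(duckdb.sql("SELECT 42 AS answer").fetchall())  # [(42,)]
```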
10:22.980 --> 10:27.660 | |
And since we've done that we've actually gone much further so we're still | |
10:27.660 --> 10:32.760 | |
doing research in the field of | |
10:32.760 --> 10:37.860 | |
databases on how we can you know make DuckDB better like we're doing | |
10:37.860 --> 10:41.820 | |
research for example we just wrote a paper on parsing we did a bunch of papers | |
10:41.820 --> 10:46.860 | |
on bigger-than-memory processing which is something where surprisingly not a lot | |
10:46.860 --> 10:53.160 | |
of work had been done in the field so we are still you know actually let's say | |
10:53.160 --> 10:56.320 | |
pushing the envelope of what a relational engine can be but at the same time we're | |
10:56.320 --> 11:02.580 | |
making it trivial to use and I think that's the interesting sort of combo yeah well | |
11:02.580 --> 11:06.060 | |
and we were talking last night too you even have somebody working on like the | |
11:06.060 --> 11:12.300 | |
CSV yeah yeah shout-out to Pedro he's I think the second | |
11:12.300 --> 11:17.100 | |
or the third the first or second sort of person besides Mark and me to | |
11:17.100 --> 11:23.940 | |
work on DuckDB and he was a postdoc at the CWI and he did a bunch of stuff he did | |
11:23.940 --> 11:28.800 | |
like the ART index a tree structure for indexing but then he found his true | |
11:28.800 --> 11:34.080 | |
calling which is the CSV reader a noble calling I know he's | |
11:34.080 --> 11:39.720 | |
been working on the CSV reader ever since and other things but it's | |
11:39.720 --> 11:45.900 | |
like his main project and it's super interesting to see you know what he's | |
11:45.900 --> 11:50.820 | |
done and I think the reason why we're spending so much sort of time on CSV | |
11:50.820 --> 11:54.420 | |
reading is because it is the first thing you do yeah when you're running a new | |
11:54.420 --> 11:57.960 | |
database first thing you do is you're not going to enter your data like with the | |
11:57.960 --> 12:00.540 | |
keyboard or anything like that you're not running insert statements you're going | |
12:00.540 --> 12:05.220 | |
to load some CSV files these days ideally it's going to be parquet files but like | |
12:05.220 --> 12:09.060 | |
yeah it's still gonna be CSV files and so I have spent so much time of my life | |
12:09.060 --> 12:14.520 | |
dealing with broken CSV readers out there and it's absolutely clear to us | |
12:14.520 --> 12:16.980 | |
that this needs to be absolutely top-notch we need to have the best | |
12:16.980 --> 12:22.020 | |
CSV reader in the business and I think we actually do so that's just to keep this | |
12:22.020 --> 12:27.060 | |
initial threshold of people using your system somewhat manageable like they | |
12:27.060 --> 12:30.600 | |
need to be like I think our goal is people can point this thing at | |
12:30.600 --> 12:34.440 | |
something it can be a CSV file it can be a parquet file it can be a bunch of JSON | |
12:34.440 --> 12:41.280 | |
files last week I worked on Avro files anything point it at it and it | |
12:41.280 --> 12:45.360 | |
will just be like yes sir here's your table. | |
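In practice, "point it at a file" looks like this. A sketch with hypothetical file names; DuckDB infers the format, schema, and types itself:

```python
import duckdb

# CSV: dialect, header, and column types are sniffed automatically.
duckdb.sql("SELECT * FROM 'events.csv' LIMIT 5").show()

# The same query syntax works for Parquet and JSON, including globs
# over whole folders of files (Avro support comes via an extension).
duckdb.sql("SELECT count(*) FROM 'events.parquet'").show()
duckdb.sql("SELECT * FROM read_json_auto('logs/*.json') LIMIT 5").show()
```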
12:45.360 --> 12:49.380 | |
All right yeah that's where we want to be I think we're pretty close so that's | |
12:49.380 --> 12:54.360 | |
and I think it comes back to this idea of like user experience I think I think | |
12:54.360 --> 12:59.220 | |
databases I always say they tend to be sold on golf courses mm-hmm because like | |
12:59.220 --> 13:03.060 | |
the CEO talks to the other CEO and they go and then they agree on a price shake | |
13:03.060 --> 13:08.900 | |
hands and then that's how the database was sold we don't do this | |
13:08.900 --> 13:14.400 | |
obviously because it's free and open source but we have a more bottom-up | |
13:14.400 --> 13:18.960 | |
strategy and to do that the experience needs to be good right people need to | |
13:18.960 --> 13:23.820 | |
just actually we try to amaze people a bit with okay it can just do this | |
13:23.820 --> 13:29.160 | |
and it works fine and yeah people seem to like it what can I say that's | |
13:29.160 --> 13:33.820 | |
interesting so do you have like an opinion on guardrails as well or is it | |
13:33.820 --> 13:37.680 | |
more that your philosophy is just to make everything as simple as possible no | |
13:37.680 --> 13:40.320 | |
matter what it is or do you have certain opinions about where those | |
13:40.320 --> 13:44.720 | |
limitations should be um what do you mean by guardrails guardrails like we were | |
13:44.720 --> 13:51.000 | |
talking last night about strings giving an example there where yeah if you want | |
13:51.000 --> 13:54.640 | |
to do something with a string go for it we don't really care oh yeah oh yeah | |
13:54.640 --> 14:02.040 | |
yeah that's interesting I think that is about schema well you have | |
14:02.040 --> 14:06.620 | |
obviously been working on schema stuff so we've been talking about that but yeah but | |
14:06.620 --> 14:13.100 | |
the let's say to be more forgiving I think databases traditionally have not | |
14:13.100 --> 14:17.900 | |
been very forgiving we try to be forgiving in DuckDB more so maybe than | |
14:17.900 --> 14:23.840 | |
other systems so we have things like we have this intermediate compression step | |
14:23.840 --> 14:29.240 | |
where during execution of a pipeline we will actually look at the types and the | |
14:29.240 --> 14:33.580 | |
statistics of the types that are in the columns in the data and then we | |
14:33.580 --> 14:36.760 | |
will actually insert intermediate compression decompression steps just to | |
14:36.760 --> 14:40.960 | |
lower the memory pressure on the way and the complexity of operations so it will | |
14:40.960 --> 14:44.380 | |
actually not make a huge difference whether your type is | |
14:44.380 --> 14:48.900 | |
declared as let's say a string but only contains integers between one | |
14:48.900 --> 14:54.760 | |
and a hundred you will still get a good sort of result in terms of performance same | |
14:54.760 --> 14:58.820 | |
for storing things right we have a bunch of optimizations for storing very short | |
14:58.820 --> 15:03.460 | |
strings for storing very regular strings we have yeah we have | |
15:03.460 --> 15:06.700 | |
integer compression we have like there's a lot of sort of stuff that happens | |
15:06.700 --> 15:10.480 | |
magically behind the scenes so you don't have to think about it like our on | |
15:10.480 --> 15:14.920 | |
disk compression representation you can use DuckDB to store a database | |
15:14.920 --> 15:21.480 | |
file on disk and there you can say which compression you want but by | |
15:21.480 --> 15:26.260 | |
default we will actually run a sort of exploration of like | |
15:26.260 --> 15:29.460 | |
okay let's try all our compression mechanisms which one is working best okay this | |
15:29.460 --> 15:35.580 | |
one great let's use this one. | |
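Both behaviors are visible from SQL. A hedged sketch (table and column names are hypothetical; the force_compression setting and the storage_info pragma exist in recent DuckDB versions):

```python
import duckdb

con = duckdb.connect("demo.db")  # hypothetical on-disk database file
con.sql("CREATE TABLE t AS SELECT (i % 100)::VARCHAR AS s FROM range(1000000) r(i)")
con.sql("CHECKPOINT")  # flush to disk so compression gets applied

# Inspect which compression DuckDB chose per column segment.
print(con.sql("SELECT DISTINCT column_name, compression FROM pragma_storage_info('t')"))

# Or override the automatic choice with a specific scheme.
con.sql("PRAGMA force_compression='dictionary'")
```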
15:35.580 --> 15:40.260 | |
We also have things like if you have an expression say select stuff from table | |
15:40.260 --> 15:47.640 | |
where I don't know this regex matches and this other value is | |
15:47.640 --> 15:52.980 | |
bigger than four okay look just if you visualize this I have a filter that says if | |
15:52.980 --> 15:58.080 | |
the regex matches and the other number is bigger than four then the row should | |
15:58.080 --> 16:02.280 | |
qualify okay so now maybe not everyone knows this but matching a regex is | |
16:02.280 --> 16:06.420 | |
way more expensive than running a bigger than four comparison so we actually | |
16:06.420 --> 16:10.980 | |
automatically will reorder this comparison so we first check the bigger-than-four | |
16:10.980 --> 16:16.620 | |
comparison and then check the regex. | |
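As a concrete sketch of that example (table, column, and pattern here are hypothetical), the query is written regex-first, but the optimizer evaluates the cheap numeric comparison first:

```python
import duckdb

duckdb.sql("CREATE TABLE people AS SELECT 'user_' || i::VARCHAR AS name, i AS n FROM range(10) r(i)")

# Written with the expensive regex first; DuckDB reorders so that `n > 4`
# filters out rows before regexp_matches ever runs on them.
q = "SELECT * FROM people WHERE regexp_matches(name, '^user_[0-9]+$') AND n > 4"
duckdb.sql(q).show()
duckdb.sql("EXPLAIN " + q).show()  # the plan shows the reordered filter
```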
16:16.620 --> 16:20.340 | |
We want to be forgiving in terms of performance we also don't want to create these crazy | |
16:20.340 --> 16:25.200 | |
performance cliffs that people generally hate databases for right where you change | |
16:25.200 --> 16:29.100 | |
one little thing you add one value to your rows and suddenly the plan | |
16:29.100 --> 16:32.580 | |
changes and it's over right like we want to be a bit more robust here not saying | |
16:32.580 --> 16:36.240 | |
we're entirely there yet but it's definitely a goal it's it's just I think | |
16:36.240 --> 16:41.400 | |
it's just trying to be friendly I think that's the general | |
16:41.400 --> 16:45.960 | |
choice the general goal if I want to be more strict do I have that option of | |
16:45.960 --> 16:50.520 | |
making it so it's interesting that you say that because we get the request | |
16:50.520 --> 16:56.280 | |
every now and then if you want to be more strict I don't think we do I can | |
16:56.280 --> 17:01.800 | |
mention the compression technique you can force a specific one I don't | |
17:01.800 --> 17:05.880 | |
think you can force a specific execution order for expressions I think we will | |
17:05.880 --> 17:10.380 | |
always optimize it but it is definitely something that as people productize | |
17:10.380 --> 17:16.380 | |
DuckDB more or put it more into you know products or tools that do | |
17:16.380 --> 17:21.180 | |
something else but also include DuckDB that we probably will | |
17:21.180 --> 17:28.560 | |
not get around having flags to make it more predictable more | |
17:28.560 --> 17:32.040 | |
deterministic just because you don't want it to change its mind three years | |
17:32.040 --> 17:35.400 | |
down the road right that's exactly the danger that you can | |
17:35.400 --> 17:39.880 | |
have there mm-hmm absolutely well especially if you're using DuckDB on the edge for | |
17:39.880 --> 17:43.900 | |
example yeah you can't update it as much then yeah yeah there are people that are | |
17:43.900 --> 17:50.980 | |
running DuckDB on all sorts of devices and that's something that in | |
17:50.980 --> 17:54.400 | |
the past I think that wasn't a very good idea since we released 1.0 this year I | |
17:54.400 --> 17:58.000 | |
think we are comfortable with the idea of somebody running that version of | |
17:58.000 --> 18:02.500 | |
DuckDB somewhere for five years and you know it working out fine uh-huh we do | |
18:02.500 --> 18:07.960 | |
regularly test for regressions against DuckDB 1.0 and so far we | |
18:07.960 --> 18:13.740 | |
haven't found anything dramatic so that's what's the craziest place you've | |
18:13.740 --> 18:18.700 | |
seen DuckDB deployed um so one thing that was really crazy when I first saw it was the | |
18:18.700 --> 18:23.680 | |
web browser okay so André Kohn from the Technical University of Munich | |
18:23.680 --> 18:29.800 | |
he made DuckDB Wasm which is this version of DuckDB that's compiled to run in | |
18:29.800 --> 18:35.460 | |
WebAssembly in the web browser or somewhere else I think when I saw that I was | |
18:35.460 --> 18:40.260 | |
deeply impressed because I didn't even consider that to be a possibility that | |
18:40.260 --> 18:43.680 | |
you could run DuckDB in a website and you can and lots of people use it it is one | |
18:43.680 --> 18:48.000 | |
of our biggest sort of deployment targets right now because | |
18:48.000 --> 18:51.240 | |
people use it yeah people put it in dashboards people put it in you know | |
18:51.240 --> 18:55.440 | |
people use it in like something behind the scenes in a website like we | |
18:55.440 --> 19:00.360 | |
have people in visualization that use the DuckDB Wasm version to just you know | |
19:00.360 --> 19:05.460 | |
drive a visualization you know to be reactive to user input live like right | |
19:05.460 --> 19:10.860 | |
there on the thing that I think was the craziest deployment target | |
19:10.860 --> 19:13.920 | |
there's of course always the you know the random weirdos we have somebody I'm | |
19:13.920 --> 19:19.380 | |
sorry valued contributors we have somebody that managed to compile DuckDB | |
19:19.380 --> 19:27.240 | |
for the IBM mainframes like IBM Z series which I didn't think was possible but it | |
19:27.240 --> 19:31.860 | |
worked and he's happy I think so that's always funny it's like what | |
19:31.860 --> 19:35.480 | |
would that be used for like I mean that's not such a terrible idea I mean | |
19:35.480 --> 19:38.400 | |
that yeah it's gonna be working I think pretty well and it's like okay you know | |
19:38.400 --> 19:42.960 | |
maybe not on 10,000 cores at the same time but these mainframes of course | |
19:42.960 --> 19:47.220 | |
allow you to use fewer cores for a problem so you're running there good memory | |
19:47.220 --> 19:50.040 | |
access I don't know I've never used one of these things I'm sorry yeah I don't we | |
19:50.040 --> 19:54.240 | |
don't have enough money for an IBM Z series maybe someday if you send us some | |
19:54.240 --> 19:58.080 | |
money so we can get an IBM mainframe we haven't taken any VC money | |
19:58.080 --> 20:01.620 | |
so I regrettably you know our cocaine budget is limited | |
20:01.620 --> 20:12.240 | |
dang it well you're in Amsterdam it's difficult IBM mainframes coke whatever boats | |
20:12.240 --> 20:16.620 | |
I don't know you have a lot of those in Amsterdam yeah that's true that's | |
20:16.620 --> 20:21.240 | |
interesting yeah but it's a fun ride I have to say I think you said that there's | |
20:21.240 --> 20:25.560 | |
space satellites running it there is we haven't gotten confirmation but we just went | |
20:25.560 --> 20:32.480 | |
through the official software procurement process with NASA recently which is | |
20:32.480 --> 20:36.660 | |
something they do it's just funny because they have this massive entitlement that | |
20:36.660 --> 20:41.000 | |
you will respond to them right so you are just an unsuspecting maintainer of a | |
20:41.000 --> 20:45.380 | |
popular open-source project and you get a you know 40-question questionnaire sent to | |
20:45.380 --> 20:48.540 | |
you by NASA and they say well if you don't fill this out we can't use your software and | |
20:48.540 --> 20:52.860 | |
you think what part of you know MIT license don't you understand right yeah | |
20:52.860 --> 20:57.660 | |
it's like it's you know not fit for any particular purpose but we of course | |
20:57.660 --> 21:01.740 | |
because it's NASA we did fill it out I don't know exactly what | |
21:01.740 --> 21:06.300 | |
mission this is planned to be used on but once it's there we'll know obviously | |
21:06.300 --> 21:10.620 | |
but because we're European we don't have any telemetry in DuckDB so we don't | |
21:10.620 --> 21:15.780 | |
really know where it's running so explain that part for the audience but like what | |
21:15.780 --> 21:21.240 | |
what do you mean by that okay so I think there are some cultural differences | |
21:21.240 --> 21:28.740 | |
between the US and Europe one of these differences regards the sort of | |
21:28.740 --> 21:35.940 | |
sensitivity to privacy things and I think it's pretty common for software | |
21:35.940 --> 21:40.140 | |
projects especially more modern ones that come out of the US to have some sort of | |
21:40.140 --> 21:45.720 | |
telemetry built-in basically they will report they will phone home in some way | |
21:45.720 --> 21:52.920 | |
and report you know hey I'm just running on this IP I'm this version of DuckDB this | |
21:52.920 --> 21:57.840 | |
is my you know Linux version or this is my glibc version like I don't know this is | |
21:57.840 --> 22:01.980 | |
usually hidden from the users as an auto-updater feature or something like that | |
22:01.980 --> 22:08.020 | |
which is or not hidden maybe obfuscated is the right word here but we are in | |
22:08.020 --> 22:11.620 | |
Europe and people care deeply about this sort of thing and we don't want to | |
22:11.620 --> 22:15.040 | |
leak any information and we don't want to collect information that we don't | |
22:15.040 --> 22:19.420 | |
strictly need we don't strictly need to know where DuckDB is running so we don't | |
22:19.420 --> 22:23.920 | |
collect it I mean I mentioned earlier we do see these summary statistics on the | |
22:23.920 --> 22:27.940 | |
extension installs but that's just because they go through Cloudflare and | |
22:27.940 --> 22:32.620 | |
they will be able to report like here you know you had this many IPs from | |
22:32.620 --> 22:36.960 | |
Germany and this many IPs from China and so on and so forth but it is it is | |
22:36.960 --> 22:40.560 | |
something that we're conscious of I think one of the strengths of DuckDB | |
22:40.560 --> 22:47.100 | |
being a local-first kind of system is also that you know you could have | |
22:47.100 --> 22:51.000 | |
an app imagine you have an app like Strava right they had some | |
22:51.000 --> 22:55.500 | |
privacy issues recently where people found out like where military bases were | |
22:55.500 --> 22:59.340 | |
where the like celebrities' houses are how to you know best catch them in the | |
22:59.340 --> 23:06.940 | |
forest a bit scary and why well because the Strava app uploads all | |
23:06.940 --> 23:14.320 | |
this stuff to the cloud and then they use some big data BS to process it | |
23:14.320 --> 23:18.580 | |
well but there's no reason there's not really a strong reason to do that they | |
23:18.580 --> 23:23.620 | |
could also just leave this data on your device run DuckDB locally do all these | |
23:23.620 --> 23:27.820 | |
analytics and then you know that would not be an issue so I think that's | |
23:27.820 --> 23:31.940 | |
something that is it's not my main concern but it's | |
23:31.940 --> 23:36.860 | |
something that I think is nice if you can you know leave data under the | |
23:36.860 --> 23:41.220 | |
control of the device that the people have I think that | |
23:41.220 --> 23:44.400 | |
would be that's a really cool thing about DuckDB that you can just deploy it | |
23:44.400 --> 23:50.560 | |
close to the user and we have people running it on phones actually on | |
23:50.560 --> 23:57.660 | |
iPhones we did an experiment last week a fun story we found that in | |
23:57.660 --> 24:01.260 | |
order to get the best database performance out of an iPhone you have | |
24:01.260 --> 24:07.380 | |
to put it into a box with dry ice oh yeah because it slows | |
24:07.380 --> 24:12.780 | |
itself down when it gets hot but if you put it in dry ice it doesn't do | |
24:12.780 --> 24:16.500 | |
that so you showed me a picture of that too yeah yes I think the ice kind of melted | |
24:16.500 --> 24:20.520 | |
around where the CPU is it wasn't melted the ice melted yeah the phone | |
24:20.520 --> 24:24.660 | |
wasn't melted just to be clear the phone died briefly after the | |
24:24.660 --> 24:28.680 | |
experiment but it thankfully came back to life after 10 minutes was that your | |
24:28.680 --> 24:33.240 | |
phone or somebody else's I have to admit that we bought it from Amazon and | |
24:33.240 --> 24:35.780 | |
returned it | |
24:36.900 --> 24:44.060 | |
sorry sorry whoever gets this phone you're a part of history you just won't know it | |
24:44.060 --> 24:50.640 | |
hopefully we didn't break it too much yeah but we wanted the latest | |
24:50.640 --> 24:56.680 | |
model because obviously and I want to add you know Apple stopped making the | |
24:56.680 --> 24:59.880 | |
iPhone mini so it's really Apple's fault that's what you have though right | |
24:59.880 --> 25:03.120 | |
it's what I want but they don't make it anymore oh yeah you're gonna have to get | |
25:03.120 --> 25:07.920 | |
this like big bugger here then yeah well it just means you have to have like | |
25:07.920 --> 25:14.100 | |
bigger pockets that's true but uh yeah I don't know I don't want the max though | |
25:14.100 --> 25:17.220 | |
because it feels like I have an iPad mini at that point maybe good as a | |
25:17.220 --> 25:21.360 | |
self-defense weapon though right like you can whack people with | |
25:21.360 --> 25:26.340 | |
it yes that seems nice there might be another test for you to do at | |
25:26.340 --> 25:30.360 | |
what force can I knock somebody out with my iPhone no we are more about | |
25:30.360 --> 25:32.720 | |
databases | |
25:32.720 --> 25:38.820 | |
DuckDB could be collecting analytics I'm not sure just kidding um oh yeah that's | |
25:38.820 --> 25:44.280 | |
interesting like um so you gave a talk today what was your talk about today I | |
25:44.280 --> 25:52.680 | |
talked about updating data which is this one weird trick that I don't know | |
25:52.680 --> 25:59.440 | |
that Hadoop doesn't know I don't know but this general idea that changing | |
25:59.440 --> 26:04.140 | |
data is a good thing and especially the transactional changes to data are a | |
26:04.140 --> 26:09.180 | |
good thing and that transactional changes to data are also a good idea | |
26:09.180 --> 26:13.860 | |
for analytics right and it's actually kind of interesting because there are a bunch | |
26:13.860 --> 26:20.220 | |
of popular analytics systems out there I won't name names that completely ignore | |
26:20.220 --> 26:26.460 | |
transactional semantics right so say you're loading a CSV file and something goes | |
26:26.460 --> 26:31.180 | |
wrong halfway through or something crashes or I don't know your internet goes down and | |
26:31.180 --> 26:38.800 | |
you know now basically the file will be half loaded and somehow that's | |
26:38.800 --> 26:46.720 | |
acceptable or we have systems that I also won't name names even | |
26:46.720 --> 26:53.980 | |
though you're wearing those socks that are not getting the | |
26:53.980 --> 26:59.940 | |
durability right where you know when you commit a transaction the you know | |
26:59.940 --> 27:04.980 | |
conventional again body of knowledge of database orthodoxy dictates that if you | |
27:04.980 --> 27:10.440 | |
commit a transaction you need to synchronize your changes to | |
27:10.440 --> 27:15.480 | |
the disk using something called fsync and there are systems out there that | |
27:15.480 --> 27:20.120 | |
simply ignore this because then they can say ah but I get 15 million transactions | |
27:20.120 --> 27:25.320 | |
per second or you don't right and this is true if you disable syncing you can do a | |
27:25.320 --> 27:29.720 | |
lot but that leads to the problem potentially that your database gets | |
27:29.720 --> 27:36.500 | |
corrupted it's not great so I was trying to make this point to say hey um we | |
27:36.500 --> 27:43.600 | |
really want to have sort of classical transactionality for analytics | |
27:43.600 --> 27:50.480 | |
pipelines so that's I mean you start with we want this okay and then you can say | |
27:50.480 --> 27:54.560 | |
okay you can want everything but is there an efficient implementation and I think | |
27:54.560 --> 27:58.880 | |
we've shown with DuckDB because we've written a bunch of blog posts on this and you | |
27:58.880 --> 28:04.760 | |
know we did again we innovated what databases can do we showed with DuckDB | |
28:04.760 --> 28:12.380 | |
that you can actually have full ACID-compliant transactional | |
28:12.380 --> 28:18.440 | |
semantics in an analytical system without punishing performance that's the | |
28:18.440 --> 28:25.340 | |
big sort of asterisk right and I was trying to show also that if you | |
28:25.340 --> 28:31.640 | |
have that you can do cool things like you can for example have you | |
28:31.640 --> 28:35.120 | |
know all-or-nothing kind of semantics on reading a whole folder of CSV files this is | |
28:35.120 --> 28:40.580 | |
pretty cool to have right you can have all-or-nothing semantics on schema | |
28:40.580 --> 28:46.120 | |
changes you know type changes on table creation table deletion that kind of | |
28:46.120 --> 28:50.840 | |
thing you always come from a consistent state and go to a consistent state. | |
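A sketch of what that all-or-nothing behavior means in practice (file and table names hypothetical): if anything fails mid-load, a rollback returns you to the previous consistent state.

```python
import duckdb

con = duckdb.connect("warehouse.db")  # hypothetical database file
con.sql("CREATE TABLE IF NOT EXISTS events (id INTEGER, payload VARCHAR)")

con.sql("BEGIN TRANSACTION")
try:
    # Load a whole folder of CSVs; either every file lands or none do.
    con.sql("INSERT INTO events SELECT * FROM read_csv_auto('incoming/*.csv')")
    con.sql("COMMIT")
except Exception:
    con.sql("ROLLBACK")  # back to the last consistent state
    raise
```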
28:50.840 --> 28:55.880 | |
You can have constraints I mentioned the ART index that Pedro was working on earlier you can | |
28:55.880 --> 29:02.000 | |
have you know say a primary key defined and by the way again I won't name names but it is | |
29:02.000 --> 29:06.960 | |
completely common for analytical data management systems to ignore things like | |
29:06.960 --> 29:10.720 | |
primary key constraints yeah it's common I know and you know why it's common | |
29:10.720 --> 29:16.700 | |
why well it's because it's expensive right yeah checking a primary key is | |
29:16.700 --> 29:22.940 | |
like okay you need to have some giant hash table or a B-tree or God knows what to actually ensure | |
29:22.940 --> 29:28.700 | |
that that primary key is unique or that the constraint holds. | |
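A minimal sketch of that constraint checking in DuckDB: the duplicate insert below is rejected and the table stays consistent.

```python
import duckdb

con = duckdb.connect()
con.sql("CREATE TABLE users (id INTEGER PRIMARY KEY, name VARCHAR)")
con.sql("INSERT INTO users VALUES (1, 'Wilbur')")

try:
    con.sql("INSERT INTO users VALUES (1, 'Wilbur II')")  # duplicate primary key
except duckdb.ConstraintException as e:
    print("rejected:", e)  # the uniqueness check the index exists to make fast
```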
29:28.700 --> 29:36.800 | |
And again we've worked very hard one of our team Tania she's our local index hero she's working very hard on | |
29:36.800 --> 29:44.840 | |
making an indexing structure that will be able to check these constraints efficiently you | |
29:44.840 --> 29:49.100 | |
know at transaction commit to make sure that your data goes from one consistent state to another | |
29:49.100 --> 29:53.720 | |
consistent state and that is really just something great to have so that's kind of I think | |
29:53.720 --> 29:59.180 | |
the main point I wanted to make and maybe one of the side notes | |
29:59.180 --> 30:07.700 | |
that makes this possible is that we used to trade away everything in data for scale yeah like | |
30:07.700 --> 30:14.720 | |
the entire NoSQL movement was essentially that right we said hey we need | |
30:14.720 --> 30:22.940 | |
web scale whatever that meant back in 2001 lol right before iPhones we need web scale therefore | |
30:22.940 --> 30:29.000 | |
we need to throw everything overboard that we've ever had like again we throw all the you know crap | |
30:29.000 --> 30:38.000 | |
that these old database people had invented overboard for scale and one of the things that I was | |
30:38.000 --> 30:42.080 | |
talking about databases like this you probably have did you talk to Jordan by the way before I | |
30:42.080 --> 30:46.880 | |
think you had yeah yeah and you talked about big data with him right yeah yeah so you had talked | |
30:46.880 --> 30:51.740 | |
about that he makes a very good point about big data not being as big as maybe you think it is | |
30:51.740 --> 30:58.220 | |
and my point as a follow-up on that is like yeah okay it's not that big which means we don't have to | |
30:58.220 --> 31:06.560 | |
trade everything away we can have things like transactional semantics in a not terrible way we can have you know | |
31:06.560 --> 31:15.780 | |
basically data warehouse technology this is a weird word we'll have to talk about that yeah in a non- | |
31:15.780 --> 31:22.880 | |
punishing way from a performance perspective but yeah and I think let's talk | |
31:22.880 --> 31:30.560 | |
about lakehouse formats very briefly because I don't like lakehouse formats let's talk about that full | |
31:30.560 --> 31:37.940 | |
disclosure so this is uh how should I say I think it's um it's interesting because I'm | |
31:37.940 --> 31:46.400 | |
okay maybe as a bit of background I really love file formats okay I'm really | |
31:46.400 --> 31:53.720 | |
obsessed with file formats I've personally implemented the parquet reader in DuckDB of course we worked as a | |
31:53.720 --> 32:02.180 | |
team we worked on our own file format this week I implemented the Avro reader we | |
32:02.180 --> 32:10.640 | |
actually did a paper about protocols Laurens my current PhD student he's doing papers | |
32:10.640 --> 32:19.040 | |
on serialization of data structures to disk I really like file formats because and I | |
32:19.040 --> 32:23.660 | |
thought about this recently and I think it's so cool because it's a dimensionality reduction | |
32:23.660 --> 32:31.100 | |
you have to somehow take a multi-dimensional structure a table is 2D right columns and rows and | |
32:31.100 --> 32:37.820 | |
you have to put that into this one-dimensional thing which is your file or your disk or your blob | |
32:37.820 --> 32:44.060 | |
store or whatever and somehow all the complexity inherent in that is | |
32:44.060 --> 32:52.100 | |
really beautiful so I spent a lot of time on parquet and then when Iceberg first came out I was like okay | |
32:52.100 --> 32:56.840 | |
I know everything about parquet there is to know because it turns out if you implement your own reader yeah | |
32:56.840 --> 33:02.540 | |
and writer you learn everything there is to learn about parquet in the process um so we did that and | |
33:02.540 --> 33:07.220 | |
so I thought okay I know this I can understand iceberg and so I looked at it and I thought okay | |
33:07.220 --> 33:15.480 | |
there's this JSON file on top and then there are these Avro files below that two layers because | |
33:15.480 --> 33:21.420 | |
why not and then underneath sit the parquet files with the actual data and it seemed extremely | |
33:21.420 --> 33:27.240 | |
cumbersome and at the time I could only really criticize the choice of | |
33:27.240 --> 33:34.680 | |
Avro which to this day I don't understand why you have two different metadata formats in one system but | |
33:34.680 --> 33:42.960 | |
I think my beef with lakehouse formats is actually something else it's more this | |
33:42.960 --> 33:55.140 | |
yeah this idea of bringing basically core data warehouse features back but in a really bad way it's hard | |
33:55.140 --> 34:05.580 | |
to explain in the worst way right because | |
34:05.580 --> 34:11.520 | |
you're making technical decisions based on sort of market force reasons and not really on technical | |
34:11.520 --> 34:17.340 | |
reasons so I think the reason people got fed up with data warehouse systems in general | |
34:17.340 --> 34:27.120 | |
it's because of the pricing model right I would argue yeah is that fair to say like if Oracle hadn't | |
34:27.120 --> 34:34.620 | |
charged per CPU but instead you know I don't know something else | |
34:34.620 --> 34:40.280 | |
would NoSQL ever have happened are you talking about the big tech companies that | |
34:40.280 --> 34:45.240 | |
came up with their own solutions like Hadoop yeah like other stuff if Oracle had like | |
34:45.240 --> 34:51.240 | |
I mean okay I don't want to you know hit only on Oracle but | |
34:51.240 --> 34:58.260 | |
if sort of big database in the late 90s hadn't had these pricing models that were essentially | |
34:58.260 --> 35:04.320 | |
that they still have that were let's say assuming your data has a high value per byte | |
35:04.320 --> 35:10.960 | |
mm-hmm I think that's fair to say right right if they had maybe had some other model | |
35:10.960 --> 35:15.120 | |
then I think a lot of the NoSQL movement would not have happened because people would | |
35:15.120 --> 35:21.120 | |
have you know happily installed I don't know SQL Server yeah or like Postgres or something like that | |
35:21.120 --> 35:30.120 | |
Postgres was free right yeah but Postgres was not where it is now but I think a lot of this has to do with | |
35:30.120 --> 35:36.120 | |
market forces and so people ended up hating data warehouse technology and there are of course other vendors that are | |
35:36.120 --> 35:42.120 | |
more on the analytics side but I think the same restrictions apply you had to have very deep pockets to again golf-course | |
35:42.120 --> 35:49.120 | |
technology too expensive yeah exactly and so I get that but then people made again | |
35:49.120 --> 35:57.120 | |
technical decisions like let's throw all this stuff overboard because the market sort of incumbents | |
35:57.120 --> 36:02.120 | |
weren't willing to compromise on their pricing I think it was a big factor in it I'd love to talk to | |
36:02.120 --> 36:08.120 | |
some of the friends who were at the big tech companies at the time yeah like I think it was that plus there was a | |
36:08.120 --> 36:13.120 | |
sense of just not scaling to quote web scale which is an entirely different discussion obviously | |
36:13.120 --> 36:19.120 | |
yeah but yeah I think the notion of just let's throw things on commodity servers | |
36:19.120 --> 36:23.120 | |
and figure out how we're gonna work with that was a big driving force for sure | |
36:23.120 --> 36:31.120 | |
but at the time too and for people who you know are listening the vendors for these | |
36:31.120 --> 36:38.120 | |
database companies Big Database I suppose they were aggressive right | |
36:38.120 --> 36:43.120 | |
I mean you know and I know with some of these companies if you decided to break the contract | |
36:43.120 --> 36:50.120 | |
there were also penalties for that so you know there's a lot going in and a lot going out yeah I get that | |
36:50.120 --> 36:56.120 | |
but now I think what we're seeing is market forces again I mean we see | |
36:56.120 --> 37:01.120 | |
again we see the incumbents I would say cloud data warehouse vendors there are three so I don't have to | |
37:01.120 --> 37:09.120 | |
name them and those again you know people whinging whining about I don't know complaining | |
37:09.120 --> 37:16.120 | |
about the pricing model you know really I don't really know about that but so now we | |
37:16.120 --> 37:24.120 | |
are and then that happened and we got the data lake right so people essentially said fine | |
37:24.120 --> 37:30.120 | |
we'll just dump everything into S3 you're talking about the original data lakes like the | |
37:30.120 --> 37:35.120 | |
HDFS ones yeah yeah we'll put everything in HDFS yeah exactly so we'll not give money to these evil | |
37:35.120 --> 37:42.120 | |
people we'll put everything in what is it called SequenceFiles yeah remember those what a horrible thing | |
37:42.120 --> 37:49.120 | |
we put everything in SequenceFiles on our HDFS and we'll run some you know Java concoction | |
37:49.120 --> 37:57.120 | |
on top of it and that's gonna be better than paying you know vendor X and that | |
37:57.120 --> 38:06.120 | |
went on for 10 years or so I think right now we've got parquet files better right but now we see the | |
38:06.120 --> 38:12.120 | |
swing back right now the swing back is happening people clearly want or need data warehouse | |
38:12.120 --> 38:17.120 | |
features like they want them or demand them or something like that I remember the conversations | |
38:17.120 --> 38:25.120 | |
back in the 2010s during the quote big data era yeah was that data warehousing and BI were going to go | |
38:25.120 --> 38:28.120 | |
away and data warehousing was dead | |
38:28.120 --> 38:35.120 | |
especially when Spark came out I think there was a lot of chatter that the days of data warehousing were | |
38:35.120 --> 38:43.120 | |
done you fast-forward to today and it's interesting because well Spark Databricks it's SQL it's basically the lakehouse you know | |
38:43.120 --> 38:50.120 | |
my friend Bill Inmon who came up with the data warehouse actually wrote a book about data lakehouses so it is interesting the | |
38:50.120 --> 38:58.120 | |
pendulum swings back and forth I do recall like the conversation was SQL's dead data warehousing is also dead you need to forget about | |
38:58.120 --> 39:03.120 | |
this stuff because we're moving on this is also around the time of data science though so I felt like there was a lot of in | |
39:03.120 --> 39:15.120 | |
retrospect hubris about the power of using data frames and all this stuff because it felt like the data frame was really | |
39:15.120 --> 39:20.120 | |
going to supplant everything there was a point in time when pandas was roaring ahead and then spark comes out with | |
39:20.120 --> 39:27.120 | |
distributed what was the RDD at first and then distributed data frames but I felt like that had a chance of | |
39:27.120 --> 39:32.120 | |
being the paradigm for like a split second and then it wasn't yeah yeah that's absolutely true and I think it's | |
39:32.120 --> 39:40.120 | |
interesting because I see parallels there to what we saw with NoSQL for transactional use absolutely | |
39:40.120 --> 39:47.120 | |
because people were saying you know key-value is all you need eventual consistency is all you need let the | |
39:47.120 --> 39:52.120 | |
application developer figure it out well your socks yeah | |
39:52.120 --> 39:58.120 | |
what ended up happening is that the application developers were not happy | |
39:58.120 --> 40:04.120 | |
dealing with eventual consistency and in effect couldn't deal with it the same happened with query languages people said | |
40:04.120 --> 40:09.120 | |
ah we don't need query languages we'll have the application developer deal with this do the join in the application | |
40:09.120 --> 40:15.120 | |
well turns out that was not great either and I think we are starting to see the same in analytics now | |
40:15.120 --> 40:20.120 | |
yeah where the data lake and data science movement basically put all the responsibility on the | |
40:20.120 --> 40:26.120 | |
individual data scientists to say you get to you know dig into this sort of giant data lake find the things that | |
40:26.120 --> 40:36.120 | |
are relevant to you decode them read them parse them put them into a data something and now the same effect is | |
40:36.120 --> 40:41.120 | |
happening we're saying oh well some of these data warehouse features are actually great like being able to make a | |
40:41.120 --> 40:47.120 | |
change to a table that sounds great I mean you can do it a bit if you have a zoo of like a folder of | |
40:47.120 --> 40:54.120 | |
parquet files you can throw another parquet file in there great that works but it doesn't work if you want to I don't | |
40:54.120 --> 41:00.120 | |
know add a column right it doesn't work well if you want to remove rows now you have to | |
41:00.120 --> 41:07.120 | |
basically invent your homegrown sort of homebrew invalidation system for it so then we're seeing | |
41:07.120 --> 41:13.120 | |
lakehouse formats which are essentially doing that and I think what we're also seeing is these catalog things | |
41:13.120 --> 41:29.120 | |
right and together if you took the user's S3 credentials away then a catalog and lakehouse formats are exactly the same as the old-school data warehouse right? | |
41:29.120 --> 41:36.120 | |
like okay the only difference that we have left over is that maybe you get to poke | |
41:36.120 --> 41:40.120 | |
around in the files yourself and then your question is do you actually want that? | |
41:40.120 --> 41:42.120 | |
is this a good idea? | |
41:42.120 --> 41:43.120 | |
right | |
41:43.120 --> 41:54.120 | |
I would argue that there are political reasons why you want to do this you want to be able to blackmail or pressure your database vendor | |
41:54.120 --> 42:02.120 | |
you want to threaten vendor X that you're going to leave them for vendor Y because they're both using the same formats | |
42:02.120 --> 42:05.120 | |
okay sure I get that that's a political reason again not a technical reason | |
42:05.120 --> 42:06.120 | |
right | |
42:06.120 --> 42:12.120 | |
and there are also just some serious limitations on lakehouse formats right like you know can | |
42:12.120 --> 42:18.120 | |
I don't think anybody can realistically show me a path to more than one transaction per second on an iceberg file | |
42:18.120 --> 42:25.120 | |
like I just don't see how right you have to stage all your parquet files you have to write the metadata files | |
42:25.120 --> 42:32.120 | |
you have to write the new root sort of metadata thing and then you have to do a commit in the catalog | |
42:32.120 --> 42:38.120 | |
I think that's the way they think about it to actually switch like that's okay one transaction per second | |
42:38.120 --> 42:45.120 | |
that's like 1960s style right and even then they could do more than one per second | |
42:45.120 --> 42:52.120 | |
so by the way I recently I learned a fun fact did you know that the booking code on your flight tickets | |
42:52.120 --> 42:59.120 | |
used to be a pointer so that was the actual pointer to the record | |
42:59.120 --> 43:05.120 | |
oh really yeah so they basically had tapes with all the bookings for flights | |
43:05.120 --> 43:09.120 | |
and that record locator which is why it's called a record locator was just a pointer | |
43:09.120 --> 43:13.120 | |
and with that pointer they could find your record by just you know going to that point on the tape anyways | |
43:13.120 --> 43:20.120 | |
it's interesting Bill Inmon actually told me on a similar note that plane tickets the paper ones used to be punch cards | |
43:20.120 --> 43:23.120 | |
yeah so probably these probably are related somehow | |
43:23.120 --> 43:29.120 | |
I didn't know that but to come back to the lake house formats so we're essentially building these | |
43:29.120 --> 43:35.120 | |
things that are inferior to you know the state of the art from 20 years ago | |
43:35.120 --> 43:41.120 | |
and somehow get excited about it and what I think is gonna happen is that somebody is gonna | |
43:41.120 --> 43:47.120 | |
like what happened with Hadoop basically and the MapReduce paper somebody's gonna build a | |
43:47.120 --> 43:58.120 | |
clone of a cloud disaggregated-storage data warehouse that you know works and I think once that happens | |
43:58.120 --> 44:05.120 | |
uh we're probably gonna forget about data lake formats quite quickly because then you have the entire | |
44:05.120 --> 44:11.120 | |
sort of feature set in one place again you have catalog you have query engine you have storage you have updates | |
44:11.120 --> 44:18.120 | |
you have uh you know authorization authorization is something where there is no story in lake house formats | |
44:18.120 --> 44:23.120 | |
on how to do that right if somebody has an S3 key for your files it's over | |
44:23.120 --> 44:28.120 | |
uh and this is actually one of the things where we're getting a lot of requests at the moment | |
44:28.120 --> 44:36.120 | |
like is there any way you crazy people at DuckDB can make row-level authorization possible for | |
44:36.120 --> 44:42.120 | |
lake house formats and I'm like I have to tell them there is just no way once you give somebody an S3 key | |
44:42.120 --> 44:48.120 | |
it's over right they can do anything which again is very interesting from a sort of | |
44:48.120 --> 44:55.120 | |
democratization of access perspective because I think one of the things that made data science in its heyday | |
44:55.120 --> 45:03.120 | |
uh so successful is because we had a data lake and there was just the wild west of parquet files and essentially | |
45:03.120 --> 45:11.120 | |
there was no governance right of any sort and your low-level analysts could just go and grab some parquet files | |
45:11.120 --> 45:18.120 | |
and that's I think another swing back that we're seeing that oh actually we have tons of regulation we have to follow now | |
45:18.120 --> 45:24.120 | |
we can't just do that anymore we need to do proper authorization and logging | |
45:24.120 --> 45:30.120 | |
and all that stuff and lo and behold we're back at sort of the full-scale data warehouse so I don't know | |
45:30.120 --> 45:35.120 | |
and one of the things I'm actually a bit concerned about in that space is like DuckDB one of the reasons | |
45:35.120 --> 45:42.120 | |
why DuckDB is popular is because there is the wild west right you can just download a parquet file from your | |
45:42.120 --> 45:50.120 | |
from your S3 or point directly to it right now and your IT department can't really stop you from doing that | |
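For context, the "wild west" access pattern being described is a one-liner; a minimal sketch using DuckDB's httpfs extension, where the bucket, path, and region are made up:

```python
import duckdb

con = duckdb.connect()
con.execute("INSTALL httpfs; LOAD httpfs;")  # enables s3:// and https:// reads
con.execute("SET s3_region = 'eu-west-1';")  # credentials go in s3_access_key_id etc.
print(con.execute(
    "SELECT count(*) FROM read_parquet('s3://some-bucket/events/*.parquet')"
).fetchone())
```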
45:50.120 --> 45:55.120 | |
right but if the pendulum swings back further and it's kind of what we're seeing now with these catalogs | |
45:55.120 --> 46:02.120 | |
where you do need credentials and they hand out you know they deal with all that stuff it might not be so easy in the future | |
46:02.120 --> 46:11.120 | |
and so for us this is actually a potential threat that the wild west disappears | |
46:11.120 --> 46:18.120 | |
even more that yeah we would just lose access to this stuff because in the end if you need to pay Snowflake anyway | |
46:18.120 --> 46:25.120 | |
sorry I mentioned a vendor if you have to pay a cloud data warehouse anyway to get access to your data | |
46:25.120 --> 46:32.120 | |
then there's no point in you using things like DuckDB or things like you know Polars or things like ClickHouse | |
46:32.120 --> 46:39.120 | |
or whichever it really doesn't matter like one of these more let's say guerrilla data tools because you're paying them anyway | |
46:39.120 --> 46:47.120 | |
you might as well use their stupid compute right I mean their valuable compute it is interesting to see | |
46:47.120 --> 46:54.120 | |
I'm really curious what will happen there yeah I think we'll see that sort of clone | |
46:54.120 --> 47:01.120 | |
like the Hadoop of the cloud data warehouse hopefully not in Java though I don't know | |
47:01.120 --> 47:08.120 | |
it's all coming back it's all coming back we're gonna have distributed real-time Java again yes yes | |
47:08.120 --> 47:15.120 | |
anyways you mentioned local first yeah that's an interesting movement right now I | |
47:15.120 --> 47:20.120 | |
I know Martin Kleppmann he's working on some really cool stuff with decentralized protocols | |
47:20.120 --> 47:33.120 | |
do you envision like a decentralized version of DuckDB it's interesting we do have a | |
47:33.120 --> 47:39.120 | |
current research project running on this actually I did get a grant uh for a research project | |
47:39.120 --> 47:45.120 | |
with responsible decentralized data architectures I think is the name um that is | |
47:45.120 --> 47:51.120 | |
imagining this idea that there is going to be a fleet of DuckDBs running they are gonna be | |
47:51.120 --> 47:59.120 | |
you know under users' control but the idea is that you can still uh for example run aggregations | |
47:59.120 --> 48:05.120 | |
over the whole fleet of systems with partial results being shipped back up um that's interesting | |
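A hedged sketch of that fleet idea, under made-up table and file names: each node computes a decomposable partial aggregate locally, and only the partial state is shipped up and merged:

```python
import duckdb

def node_partial(db_path: str):
    # runs on each fleet member against its local DuckDB database
    con = duckdb.connect(db_path)
    return con.execute(
        "SELECT region, sum(amount) AS s, count(*) AS n FROM sales GROUP BY region"
    ).fetchall()

def coordinator_merge(partials):
    # re-aggregate the shipped sums/counts into a global per-region average
    con = duckdb.connect()
    con.execute("CREATE TABLE p (region VARCHAR, s DOUBLE, n BIGINT)")
    con.executemany("INSERT INTO p VALUES (?, ?, ?)",
                    [row for rows in partials for row in rows])
    return con.execute(
        "SELECT region, sum(s) / sum(n) AS avg_amount FROM p GROUP BY region"
    ).fetchall()
```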
48:05.120 --> 48:12.120 | |
um the research project is mainly there because you know you need to build some | |
48:12.120 --> 48:15.120 | |
abstractions for this to be something that's not just a one-off that somebody hacks because you can | |
48:15.120 --> 48:20.120 | |
totally build that today right like nothing keeps you from building that today um you can ship | |
48:20.120 --> 48:25.120 | |
intermediates around there are some organizations that have built pretty wild | |
48:25.120 --> 48:31.120 | |
things around parquet files that are being sent around or Arrow buffers or anything like that but | |
48:31.120 --> 48:38.320 | |
I think we need some abstractions there to make this nice and efficient um I think that | |
48:38.320 --> 48:43.040 | |
MotherDuck is doing some interesting stuff so for those who don't know MotherDuck is a company that's | |
48:43.840 --> 48:50.560 | |
building DuckDB as a service and their execution model is this whole uh hybrid execution where | |
48:51.200 --> 48:55.040 | |
you have a DuckDB local you have a DuckDB on the server they talk to each other the query | |
48:55.040 --> 48:59.920 | |
gets split up and run partially there or there depending on you know where it makes more sense | |
48:59.920 --> 49:05.520 | |
depending on optimization and I think that's super interesting I didn't actually see that one coming | |
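From the client side, that hybrid setup looks roughly like this, assuming MotherDuck's documented `ATTACH 'md:'` syntax; the database, table, and file names are hypothetical and the token is expected in the motherduck_token environment variable:

```python
import duckdb

con = duckdb.connect()
con.execute("ATTACH 'md:my_db'")  # link this local DuckDB to MotherDuck
# one query spanning a local parquet file and a cloud-resident table; the
# planner decides which pieces run locally and which run server-side
print(con.execute("""
    SELECT c.segment, count(*) AS n
    FROM read_parquet('local_events.parquet') e
    JOIN my_db.customers c ON e.customer_id = c.id
    GROUP BY c.segment
""").fetchall())
```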
49:05.520 --> 49:11.200 | |
I thought okay they're just going to do DuckDB as a service done uh but no they actually have been | |
49:11.200 --> 49:16.720 | |
innovating in that space as well which I think is really cool yeah I remember when Jordan uh | |
49:17.840 --> 49:22.560 | |
first mentioned this around uh I think right before MotherDuck was announced and I was just | |
49:17.840 --> 49:22.560 | |
kicking myself because I asked can I put some money in that that's awesome he's like oh we're oversubscribed | |
49:22.560 --> 49:27.440 | |
like damn it yeah well thank you for the trust so um yeah I think | |
49:27.440 --> 49:34.400 | |
it's really exciting to see what happens at the moment I think from DuckDB Labs from DuckDB | |
49:40.800 --> 49:45.760 | |
the project perspective we are very much sort of focused on a single node yeah because that's still | |
49:45.760 --> 49:52.080 | |
sort of the space we inhabit right we do the best damn job we can do in that single | |
49:45.760 --> 49:52.080 | |
environment and uh then I think it's up to other people to you know build sort | |
49:52.560 --> 50:00.560 | |
of crazy combinations of this we don't have the resources on our team really to | |
50:00.560 --> 50:06.560 | |
do a lot we have a small team we're 20 people uh a wonderful team of you know database hackers uh | |
50:06.560 --> 50:15.360 | |
but with that size of team right we can only really focus | |
50:20.320 --> 50:26.720 | |
on one thing and it's gonna be like single node execution and what's your philosophy though in | |
50:26.720 --> 50:33.040 | |
terms of is the pendulum swinging back to single node are you saying or what yeah I think | |
50:33.040 --> 50:40.000 | |
that's an excellent question I think that on the distributed things uh there's a | |
50:40.000 --> 50:47.600 | |
wonderful uh talk at ICDE it's an academic database conference the International Conference on Data Engineering | |
50:47.600 --> 50:54.480 | |
you should go and speak I should go and speak um where he basically says | |
50:54.480 --> 51:00.640 | |
that database researchers have been solving the whales' problems for the last 20 years uh basically | |
51:00.640 --> 51:06.080 | |
solving the problems that Google has the problems of uh Google and who else | |
51:07.280 --> 51:13.760 | |
Netflix yeah the big ones right the ones with the blogs the big | |
51:13.760 --> 51:18.800 | |
tech blogs yeah but here's the problem so what I always see in this uh is you know the big tech | |
51:18.800 --> 51:24.800 | |
companies will publish the blogs here's what we're doing um Iceberg for example right uh built at Netflix | |
51:24.800 --> 51:32.000 | |
because it solves a Netflix-scale problem um and they build a lot of stuff there uh if you're a smaller | |
51:32.000 --> 51:36.880 | |
company like say you know I don't know you suddenly sell furniture or | |
51:36.880 --> 51:42.960 | |
something like that or ducks whatever boats boats yeah boats okay so then you | |
51:42.960 --> 51:46.880 | |
have your data warehouse right does it make any sense for you to do something you know like that | |
51:46.880 --> 51:51.440 | |
or do you so no it does not make sense yeah but then there are the blogs I mean everyone | |
51:51.440 --> 51:57.120 | |
looks at the blogs like oh we got to do that at our company too right yeah no this is exactly right | |
51:57.120 --> 52:01.600 | |
uh there was this one thing where Uber wrote about how they ditched Postgres for | |
52:01.600 --> 52:06.560 | |
MySQL yeah or maybe the other way around I don't remember which yeah who cares the | |
52:06.560 --> 52:13.280 | |
point is it made huge splashes in the data engineering community and it didn't matter | |
52:13.280 --> 52:19.280 | |
for everyone like it matters to Uber sure right um but I think this point | |
52:19.280 --> 52:23.920 | |
about people solving the whales' problems is really interesting because we have been neglecting | |
52:23.920 --> 52:30.400 | |
sort of 99% of people's data problems because we cannot tell you know your | |
52:30.400 --> 52:38.000 | |
you know boat sales company to go install Spark like there's no point ever for | |
52:38.000 --> 52:42.720 | |
them to run Spark to deal with their customer data yet we have been telling them that for the last 10 | |
52:42.720 --> 52:49.360 | |
years right oh you want to do data stuff go install Spark um like that's not | |
52:49.360 --> 52:54.400 | |
a great message well otherwise you're not a real data company yeah I'm saying that jokingly but it's just uh yeah | |
52:54.400 --> 53:02.000 | |
um but I think this is also where we see our role especially since we started as taxpayer | |
53:02.000 --> 53:07.360 | |
funded uh research as a taxpayer-funded research project we need to solve the real | |
53:07.360 --> 53:12.880 | |
data problems out there the problems of the 99% which is you know the people that run out of steam | |
53:12.880 --> 53:17.520 | |
with Excel right it does not make sense to solve Google's problem they | |
53:17.520 --> 53:23.360 | |
have clever people they can afford to pay those clever people and the clever people are capable of building a | |
53:23.360 --> 53:29.600 | |
solution that works for Google and no one else and I don't care yeah it's like uh it's really | |
53:29.600 --> 53:35.840 | |
a different ball game I think and uh yeah and I think it's | |
53:35.840 --> 53:44.000 | |
super interesting if you look at these uh studies that came out from uh Redshift and from uh who else | |
53:44.560 --> 53:50.240 | |
Redshift and the other one Snowflake had a private one but yeah I don't think they | |
53:50.240 --> 53:55.920 | |
released the full benchmarks yeah I know the Redshift one is well known the | |
53:56.480 --> 54:01.440 | |
um to see what you know data sizes actually look like in the real world and it's I mean Jordan has | |
54:01.440 --> 54:06.640 | |
been talking about this uh as well what he saw inside BigQuery and these are all data sizes that | |
54:06.640 --> 54:12.240 | |
are completely manageable um you could argue about the point whether you need disaggregated storage or not | |
54:12.240 --> 54:20.160 | |
right that's an interesting point um Jordan says yes I say probably not um but uh | |
54:20.160 --> 54:25.840 | |
because yeah you know you're writing 10 years of data maybe you want to have disaggregated | |
54:25.840 --> 54:30.720 | |
storage so you don't run out of disk space but again disks are gigantic you can get a you know 20 | |
54:30.720 --> 54:37.920 | |
terabyte SSD no problem it's pretty wild um so I think that fundamentally um yeah | |
54:37.920 --> 54:45.520 | |
our tech is getting better at a quicker rate than our data sets are getting bigger and | |
54:45.520 --> 54:50.400 | |
more challenging and therefore we're gonna see um actually much more of the sort of | |
54:50.400 --> 54:56.160 | |
single user workload number one um moving to single node and you know your laptop you have | |
54:56.160 --> 55:02.400 | |
your laptop hey let's make it work well the laptops are insane these days insane you have six gigabytes | |
55:02.400 --> 55:08.720 | |
per second I/O speed on a MacBook right like that's I don't know I don't even | |
55:08.720 --> 55:13.360 | |
know if we actually manage to get that busy with DuckDB but that's not the | |
55:13.360 --> 55:19.840 | |
point it's just unheard of so one thing you were talking about when we were | |
55:19.840 --> 55:28.240 | |
having uh lunch um I think somebody asked about memory uh versus DuckDB um maybe can you walk people | |
55:28.240 --> 55:33.200 | |
through sort of how to think about that yeah yeah there's a misconception that people | |
55:33.200 --> 55:39.760 | |
think that DuckDB is an in-memory database it is not uh it's not an in-memory database uh we can use | |
55:39.760 --> 55:46.720 | |
memory we like memory it's nice it's fast uh but we're not limited by it so again Laurens | |
55:46.720 --> 55:52.480 | |
my PhD student he's been working on the DuckDB larger-than-memory capabilities so we've always had | |
55:52.480 --> 55:57.760 | |
this thing that your input data could be bigger than memory because we read it in a sort of streaming | |
55:57.760 --> 56:02.720 | |
way like if you point DuckDB to a parquet file it will not first copy the parquet file into RAM | |
56:03.200 --> 56:09.440 | |
and then do things with it that would be dumb instead it says okay we'll start with the first | |
56:09.440 --> 56:16.720 | |
row group second row group and so on and so forth um however there are operators in relational sort of | |
56:17.920 --> 56:25.120 | |
analysis so-called blocking operators like join sort aggregate some window functions | |
56:26.160 --> 56:32.320 | |
top N that may have to actually materialize a large amount of the input in themselves so if you're | |
56:32.320 --> 56:36.720 | |
if you're loading a terabyte of data we might read this in a streaming fashion but if you aggregate on | |
56:36.720 --> 56:42.800 | |
a unique key we will have to put it into our hash table for the aggregation don't the | |
56:42.800 --> 56:50.160 | |
semantics of SQL just dictate this um and it's pretty common in analytical systems to actually fail or | |
56:50.160 --> 56:55.200 | |
do something very slow in this case we have some papers that show that but in DuckDB we actually | |
56:55.200 --> 57:01.760 | |
built something that's called graceful degradation for when we reach the memory limit uh which means that | |
57:01.760 --> 57:06.880 | |
we start using the disk and that's only really possible because we have these crazy SSDs now | |
57:06.880 --> 57:10.720 | |
because we can actually offload to disk at six gigabytes per second and we can read it back at six | |
57:10.720 --> 57:15.280 | |
gigabytes per second and we don't get slowed down by multiple threads doing the same | |
57:15.280 --> 57:19.680 | |
thing too much so it's improved quite a lot compared to the spinning rust kind of thing right | |
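A small way to exercise that graceful degradation, sketched with an arbitrary memory cap and spill path: the hash aggregate below has far more distinct groups than fit in the budget, so intermediate state is offloaded to disk instead of the query failing:

```python
import duckdb

con = duckdb.connect()
con.execute("SET memory_limit = '500MB'")                # deliberately small budget
con.execute("SET temp_directory = '/tmp/duckdb_spill'")  # where offloaded state goes
# 50M distinct keys: the aggregation state exceeds the 500MB limit and spills
print(con.execute("""
    SELECT count(*) FROM (
        SELECT range AS k, sum(range) FROM range(50000000) GROUP BY k
    )
""").fetchone())
```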
57:20.160 --> 57:26.160 | |
um and then for things like aggregations we can essentially offload part of the | |
57:26.160 --> 57:31.040 | |
intermediate result to disk and then basically resume and you will actually not feel a performance | |
57:31.040 --> 57:36.480 | |
cliff there because it will still use as much memory as you allow us to use | |
57:36.480 --> 57:41.600 | |
but it will also be able to use the disk this gets even more interesting if you have a join | |
57:41.600 --> 57:48.080 | |
because you can have multiple joins in the same sort of query right and then now we have | |
57:48.080 --> 57:52.080 | |
multiple operators running at the same time fighting for memory at the same time and now you have to think | |
57:52.080 --> 57:57.920 | |
about uh essentially a fair allocation strategy between those operators and again we have a paper | |
57:57.920 --> 58:03.920 | |
that Laurens wrote that's uh currently under revision at VLDB which is a large database conference | |
58:04.640 --> 58:10.400 | |
that describes how we've implemented um the strategy to deal with this in DuckDB and the result is | |
58:11.200 --> 58:18.240 | |
that you can set a fairly low memory limit in DuckDB run complex queries that everybody else | |
58:18.240 --> 58:23.760 | |
essentially blows up on on a single node again with your disk and still finish those queries in a | |
58:23.760 --> 58:30.000 | |
reasonable time we call it the Galaxy Quest principle have you seen Galaxy Quest no it's this | |
58:30.000 --> 58:36.800 | |
movie uh lampooning Star Trek it's very funny it's Tim Allen and Sigourney Weaver I | |
58:36.800 --> 58:44.560 | |
think um and uh in Galaxy Quest there's this fake TV show that's like Star Trek uh and they have | |
58:44.560 --> 58:49.680 | |
this catchphrase which is never give up never surrender right and so we took this Galaxy Quest | |
58:49.680 --> 58:55.840 | |
principle to uh to the query processing which means that we just never want to give up we never | |
58:55.840 --> 59:00.800 | |
want to surrender we will always be able to finish the query provided there is you know enough disk space | |
59:01.600 --> 59:07.200 | |
uh to do it you know we cannot control what users are doing if you cross product | |
59:07.200 --> 59:15.440 | |
a giant parquet file it will be the end we will abort uh probably but if it's in any way reasonable | |
59:15.440 --> 59:21.040 | |
we want to finish and I think that's something that is unheard of in like research prototype systems it's also | |
59:21.040 --> 59:26.560 | |
pretty unheard of in other database systems like you know if you have a cloud database if you | |
59:26.560 --> 59:32.480 | |
have distributed query processing you have to do these shuffles there are just gonna be no-win scenarios | |
59:32.480 --> 59:36.800 | |
where a whole partition has to fit on one worker and if it doesn't it's just game over | |
59:36.800 --> 59:42.320 | |
like Spark has this problem a lot uh but I think on a single node we can do pretty cool things there | |
59:42.320 --> 59:48.240 | |
and make sure that we finish that's interesting um how does ACID work in that environment where | |
59:48.240 --> 59:55.920 | |
which environment well in the situation where uh you have a small amount of memory yeah uh how do | |
59:55.920 --> 01:00:01.120 | |
you make sure that the transactions are committed or rolled back right uh so | |
01:00:01.120 --> 01:00:05.520 | |
this is independent of query processing right so when a query starts running you know we'll run it | |
01:00:05.520 --> 01:00:13.760 | |
well MVCC the multi-version concurrency control will read um a specific version of the | |
01:00:13.760 --> 01:00:19.840 | |
table okay and that's going to be consistent during the runtime of the query so we'll read this specific | |
01:00:19.840 --> 01:00:24.320 | |
version so that has nothing to do with memory it's just we run the query against that version | |
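A minimal sketch of that snapshot behaviour, with made-up file and table names: a transaction keeps reading its pinned version of the table even while another connection commits new rows:

```python
import duckdb

con1 = duckdb.connect("mvcc_demo.duckdb")
con1.execute("CREATE OR REPLACE TABLE t AS SELECT 1 AS x")
con1.execute("BEGIN TRANSACTION")  # pins a version of t
print(con1.execute("SELECT count(*) FROM t").fetchone())  # (1,)

con2 = duckdb.connect("mvcc_demo.duckdb")  # second connection, same database
con2.execute("INSERT INTO t VALUES (2)")   # autocommits a newer version

print(con1.execute("SELECT count(*) FROM t").fetchone())  # still (1,)
con1.execute("COMMIT")
print(con1.execute("SELECT count(*) FROM t").fetchone())  # now (2,)
```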
01:00:24.320 --> 01:00:32.720 | |
what is more interesting is uh how do we deal with changes made within a transaction that are bigger than RAM | |
01:00:32.720 --> 01:00:36.400 | |
right that's kind of where I was getting to exactly exactly yeah and then we're also able to offload | |
01:00:36.400 --> 01:00:43.440 | |
this so the write-ahead log um where basically changes go on transaction commit they um | |
01:00:43.440 --> 01:00:50.400 | |
uh will of course be written at the end of the transaction but uh we do speculatively write | |
01:00:50.400 --> 01:00:55.600 | |
big changes already to the database file okay this is because we had | |
01:00:55.600 --> 01:01:01.680 | |
this problem this actual problem let's say you have a table you're inserting a terabyte csv file | |
01:01:02.240 --> 01:01:07.520 | |
in a transaction and then you're committing okay traditionally what would happen is you're | |
01:01:07.520 --> 01:01:13.360 | |
writing that terabyte file in your writer headlock because you need to be transactionally safe then | |
01:01:13.360 --> 01:01:17.680 | |
a checkpoint you read it again and you actually write it to your persistent table at which point you | |
01:01:17.680 --> 01:01:22.720 | |
truncate your better headlock now you've basically written this terabyte file twice to disk and read it | |
01:01:22.720 --> 01:01:27.520 | |
twice because you read it once to read it in and then you wrote it to the val and then you read it | |
01:01:27.520 --> 01:01:32.080 | |
from the val again and you wrote it to the main persistent database that's not great right that's | |
01:01:32.080 --> 01:01:37.840 | |
both read and write amplification so in DuckDB what Mark has built is speculative writing | |
01:01:37.840 --> 01:01:44.080 | |
of large changes so we will already write into the database file and at commit | |
01:01:44.080 --> 01:01:51.760 | |
into the WAL we only write references to those blocks and only uh at checkpoint we will | |
01:01:51.760 --> 01:01:58.640 | |
essentially mark these blocks as being used um yeah in the main sort of header so that means | |
01:01:58.640 --> 01:02:04.720 | |
that basically in the default case where it works out we will write it once to the database file and | |
01:02:04.720 --> 01:02:10.400 | |
that's it we're done and then some metadata has to be updated later on um and in the failure case | |
01:02:10.400 --> 01:02:17.360 | |
in the case where we abort we'll have written some blocks to the database file that we then mark as | |
01:02:17.360 --> 01:02:24.160 | |
empty so the worst thing that happened is that we made our file a terabyte bigger but that's | |
01:02:24.160 --> 01:02:28.800 | |
uh space that we will be able to reuse later so we have thought about | |
01:02:28.800 --> 01:02:34.160 | |
this and this is exactly what I meant like we've thought really hard about how do we bring these transactional | |
01:02:34.160 --> 01:02:42.240 | |
semantics to analytical use cases without making it slow right because yeah you | |
01:02:42.240 --> 01:02:47.760 | |
assume that this write is going to work out well so the default case is going to be fast and | |
01:02:47.760 --> 01:02:53.840 | |
you know the worst case when it aborts is that we use some disk space oh no it's a good | |
01:02:53.840 --> 01:02:58.960 | |
trade-off I think from our perspective right yeah so these are the sort of things we do in DuckDB to make this work | |
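The scenario under discussion, as a minimal sketch ('big.csv' and the table are hypothetical): the bulk load participates in the transaction, so a rollback leaves the table untouched, and per the discussion the worst case is some reusable space in the database file:

```python
import duckdb

con = duckdb.connect("bulk_demo.duckdb")
con.execute("CREATE TABLE IF NOT EXISTS events (id BIGINT, payload VARCHAR)")
con.execute("BEGIN TRANSACTION")
try:
    con.execute("COPY events FROM 'big.csv' (HEADER)")  # may be far larger than RAM
    con.execute("COMMIT")
except Exception:
    con.execute("ROLLBACK")  # no partial load becomes visible
```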
01:02:58.960 --> 01:03:05.360 | |
that's interesting yeah it's a fun thing to do it's uh nobody | |
01:03:05.360 --> 01:03:11.520 | |
has been there before right it's really like nobody has tried to write a terabyte to a WAL like | |
01:03:11.520 --> 01:03:17.200 | |
that uh or rather not to write a terabyte to the WAL it's uh so it's really funny how the other guys do | |
01:03:17.200 --> 01:03:24.480 | |
this uh so most systems out there again not gonna name names have a sort of bypass of the transactional | |
01:03:25.280 --> 01:03:32.800 | |
uh sort of system for bulk loads so you have a special tool and then you're basically bypassing the | |
01:03:32.800 --> 01:03:37.840 | |
whole transactionality just for bulk loads because that doesn't work otherwise and yeah that's | |
01:03:37.840 --> 01:03:41.920 | |
of course not what you want for us reading a big file is a common operation so it needs to be | |
01:03:41.920 --> 01:03:47.280 | |
transactional are you saying like a copy command yeah it is like a copy command but then um there's | |
01:03:47.280 --> 01:03:52.880 | |
this like usually a separate command-line tool that just bypasses the | |
01:03:52.880 --> 01:03:59.360 | |
transaction log yeah it's not great because if that goes wrong then you're back to | |
01:03:59.360 --> 01:04:03.600 | |
you know things that are not to be done anyways it's a fun thing are you seeing many | |
01:04:03.600 --> 01:04:08.880 | |
streaming use cases with DuckDB um well we're not a streaming system in that sense so maybe by | |
01:04:08.880 --> 01:04:14.000 | |
definition we don't see a lot of demand for it I think there are really amazing streaming systems | |
01:04:14.000 --> 01:04:17.440 | |
out there like Materialize I don't know if you've talked to them already yeah Materialize yeah | |
01:04:18.720 --> 01:04:24.000 | |
so I think they have really amazing tech and it would be really weird for us to try to | |
01:04:24.000 --> 01:04:30.000 | |
make a knockoff like there's another unnamed database company or self-declared | |
01:04:30.000 --> 01:04:35.040 | |
database company that's uh taking their bulk system and slapping on some sort of lightweight | |
01:04:35.040 --> 01:04:39.920 | |
fake streaming thing uh we don't want to do that right like that's just embarrassing | |
01:04:40.800 --> 01:04:47.200 | |
uh and uh so I think the real way to do streaming is like what Materialize is doing like | |
01:04:47.200 --> 01:04:55.040 | |
they have Frank's magic that makes this really work um so I don't think we're gonna | |
01:04:55.040 --> 01:05:01.840 | |
go there in the near future what we might end up doing though is maybe um materialized | |
01:05:01.840 --> 01:05:09.200 | |
views that you know can incrementally update but that's not really streaming right that's more like | |
01:05:10.160 --> 01:05:16.640 | |
incrementally updated materialized views as for streaming itself um yeah if the | |
01:05:16.640 --> 01:05:21.920 | |
user wants to we have users that use DuckDB for streaming use cases where they just you know | |
01:05:21.920 --> 01:05:26.240 | |
do an insert and they do a delete at the same time and then they rerun the query | |
01:05:27.840 --> 01:05:33.040 | |
that works but it's not gonna work well it's really not yeah it seems kind | |
01:05:33.040 --> 01:05:43.120 | |
of expensive I think it's also what the unnamed vendor is doing so it's like yeah man you | |
01:05:43.120 --> 01:05:50.640 | |
see if you're in data engines long enough you see things right I'm sure oh my god it's uh it's like | |
01:05:50.640 --> 01:05:57.280 | |
this fsync thing it's really amazing I have to say so in my class at | |
01:05:57.280 --> 01:06:05.600 | |
some point years ago I had a student saying but the key-value store that sucks uh manages like so | |
01:06:05.600 --> 01:06:11.040 | |
the one that sucks he's referring to it starts with an M and there's a database I'll let you figure out the rest | |
01:06:11.040 --> 01:06:16.800 | |
um you know I don't know if I'm allowed to say it or not but uh there | |
01:06:16.800 --> 01:06:21.280 | |
was this thing where somebody said yeah but this system can do a million transactions I | |
01:06:21.280 --> 01:06:27.280 | |
forgot the actual numbers a million transactions per second and Postgres being old and stupid uh can | |
01:06:27.280 --> 01:06:30.800 | |
only do a hundred thousand transactions per second that's what they were saying that was the | |
01:06:30.800 --> 01:06:37.200 | |
blog post with the pen and everything and so I looked at this and I thought bullshit um and I looked | |
01:06:37.200 --> 01:06:42.240 | |
at it and in the next lecture I brought uh my laptop and I was like | |
01:06:42.240 --> 01:06:46.480 | |
right I've figured it out we'll run an experiment right here right now I'll show you once and for all | |
01:06:46.480 --> 01:06:51.760 | |
because it turns out if you enable the proper logging in the key-value store that was bragging | |
01:06:51.760 --> 01:06:58.160 | |
about performance lo and behold it went down to 100,000 transactions per second yeah and Postgres has a | |
01:06:58.160 --> 01:07:03.520 | |
flag to disable that and if you do lo and behold it went up to a million transactions per second | |
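A hedged reconstruction of that classroom experiment using synchronous_commit, the per-session cousin of the fsync setting; it assumes a local Postgres, psycopg 3, and a made-up DSN and table:

```python
import time
import psycopg

def commits_per_second(sync: str, n: int = 2000) -> float:
    with psycopg.connect("dbname=bench", autocommit=True) as conn:
        conn.execute("CREATE TABLE IF NOT EXISTS kv (k int, v int)")
        conn.execute(f"SET synchronous_commit = {sync}")
        start = time.perf_counter()
        for i in range(n):
            with conn.transaction():  # one durable (or not) commit per row
                conn.execute("INSERT INTO kv VALUES (%s, %s)", (i, i))
        return n / (time.perf_counter() - start)

print("durable commits/sec    :", commits_per_second("on"))
print("non-durable commits/sec:", commits_per_second("off"))
```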
01:07:03.520 --> 01:07:09.280 | |
so the bragging result from the blog post turned around completely and if you had actually configured this fairly | |
01:07:10.400 --> 01:07:15.520 | |
either both of them do it or none of them do it it would have been the exact same | |
01:07:15.520 --> 01:07:21.440 | |
result interesting um and I think it's so funny how defaults sometimes matter like again with the | |
01:07:21.440 --> 01:07:26.720 | |
example I mentioned you know today with the routers bricking themselves because | |
01:07:26.720 --> 01:07:32.640 | |
again the same system wasn't using fsync um they probably wanted to have it on but the default | |
01:07:32.640 --> 01:07:38.080 | |
was off so they erred on the wrong side if you want right Postgres | |
01:07:38.080 --> 01:07:42.880 | |
will always err on the side of caution and I think that's the correct thing if you want to optimize | |
01:07:42.880 --> 01:07:48.240 | |
the hell out of it uh then you can by setting the flags it's just that you need to actually look at | |
01:07:48.240 --> 01:07:54.720 | |
it and not just write a blog post mm-hmm yeah somebody wrote on Bluesky today about um | |
01:07:54.720 --> 01:08:02.720 | |
uh I can't remember who it was but uh how not to do database benchmarking 101 and I guess it was um | |
01:08:02.720 --> 01:08:09.040 | |
just about certain metrics of performance that didn't take into account uh indexes and rebalancing | |
01:08:09.040 --> 01:08:13.680 | |
that's a classic yeah yeah yeah we wrote a paper about this what's that we actually wrote a paper about | |
01:08:13.680 --> 01:08:20.640 | |
this oh yeah uh it's one of our most cited papers it's about database benchmarking I forgot the | |
01:08:20.640 --> 01:08:29.760 | |
actual title of it um it is meant as a how-to but if you're curious uh there's a | |
01:08:29.760 --> 01:08:35.360 | |
paper on this from us go check it out it lists the most common | |
01:08:35.360 --> 01:08:42.400 | |
benchmarking crimes oh yeah and absolutely pre-processing time is definitely one of them | |
01:08:42.400 --> 01:08:49.040 | |
it's an easy way of winning a benchmark to pre-process the hell out of it and uh then | |
01:08:49.040 --> 01:08:54.160 | |
depending on the benchmark and also there is a lot of lawyering going on you | |
01:08:54.160 --> 01:08:59.360 | |
know TPC-H right you've heard of it right it's terrible it's pretty old and busted people should | |
01:08:59.360 --> 01:09:05.520 | |
be switching to TPC-DS a much better benchmark um but even there uh there are real benchmark lawyers | |
01:09:05.520 --> 01:09:10.880 | |
out there right who look through the spec and find like the thing they forgot and they will | |
01:09:10.880 --> 01:09:16.480 | |
exploit that thing and then they will win at the benchmark it's kind of weird well that's | |
01:09:16.480 --> 01:09:21.280 | |
funny and then some databases don't let you benchmark them because of the DeWitt clause the DeWitt | |
01:09:21.280 --> 01:09:27.840 | |
clause yeah yeah like uh I think that's the famous one so then you see DBMS X and DBMS Y | |
01:09:27.840 --> 01:09:31.440 | |
and all that stuff in papers which I don't think is you know helping science a whole lot | |
01:09:32.240 --> 01:09:39.040 | |
but what's that I mean uh because of the DeWitt clause oh okay people can't say I ran this on | |
01:09:39.040 --> 01:09:47.360 | |
Oracle because uh yeah people fear Oracle's you know legal department probably for a reason yeah | |
01:09:47.360 --> 01:09:53.200 | |
um and then instead of saying the result with Oracle was this they will say the result with | |
01:09:53.200 --> 01:10:01.120 | |
DBMS X was this and then it's up to the reader to guess what that was I don't | |
01:10:01.120 --> 01:10:06.800 | |
understand that but then leave them out you know who cares yeah but uh yeah benchmarking is an | |
01:10:06.800 --> 01:10:12.320 | |
interesting thing so because it's so hard to make fair benchmarks we actually don't publish any | |
01:10:12.320 --> 01:10:18.080 | |
benchmarks of DuckDB against something else on like our website so we don't have anything there we | |
01:10:19.200 --> 01:10:24.080 | |
call this a home game obviously it's much more interesting to win an away game yeah and that's | |
01:10:24.080 --> 01:10:29.120 | |
when somebody else runs the benchmark uh so those are the ones that uh I think are more | |
01:10:29.120 --> 01:10:34.480 | |
interesting to us because they probably haven't spent a whole lot of time | |
01:10:34.480 --> 01:10:41.920 | |
optimizing uh which is you know hard not to do if it's your own system uh but uh yeah so | |
01:10:41.920 --> 01:10:46.720 | |
we're not doing anything of that because yeah also raw performance is overrated | |
01:10:48.000 --> 01:10:53.280 | |
I think we've talked about this earlier about the ease of use it's not that DuckDB is not fast I think | |
01:10:53.280 --> 01:11:01.840 | |
it's a very competitive uh query engine uh but it's not the be-all and end-all of data | |
01:11:01.840 --> 01:11:07.440 | |
things you need to solve a problem you're not solving the problem by being two percent faster | |
01:11:07.440 --> 01:11:11.360 | |
and then the user not being able to install it because you're using some Intel intrinsic that they don't | |
01:11:11.360 --> 01:11:18.320 | |
have yeah so it's probably worth more to not have the two percent extra performance | |
01:11:18.320 --> 01:11:22.480 | |
but to know that this will run on every Intel processor made in the last 20 years | |
01:11:22.480 --> 01:11:32.560 | |
just saying that's interesting yeah how did you guys become so cool I don't know I'm not cool I | |
01:11:32.560 --> 01:11:37.200 | |
think I'm database cool like database cool is something very different right | |
01:11:37.200 --> 01:11:41.680 | |
cool I will never achieve cool I know this but you're not like Taylor Swift exactly but you're kind | |
01:11:41.680 --> 01:11:46.480 | |
of the Taylor Swift of the database world right so that's the but I have to admit the bar is much | |
01:11:46.480 --> 01:11:53.520 | |
much much much lower right I mean I really like databases that's right and I have shaved today | |
01:11:54.560 --> 01:12:01.360 | |
so I'm clearly not one of them but uh no I know this is like the | |
01:12:01.360 --> 01:12:07.440 | |
people who obsess about tables and query engines are a strange subgroup of people I will admit we | |
01:12:07.440 --> 01:12:12.000 | |
managed to be cool in that subgroup of people yeah and that's okay that's all we can | |
01:12:12.000 --> 01:12:18.560 | |
really ever hope for it's not going to be coolness uh as defined by others like whatever and what do | |
01:12:18.560 --> 01:12:24.880 | |
you think it was though because I mean you know you enter the database uh field and it | |
01:12:24.880 --> 01:12:32.720 | |
tends to be pretty dry it tends to be pretty you know whatever you want to call it right but um | |
01:12:32.720 --> 01:12:39.040 | |
you guys have a cult following um I mean I can maybe say what originally drew me to the field was | |
01:12:39.040 --> 01:12:48.480 | |
the low amount of bullshit that I initially perceived um so in data systems uh or in systems research in general | |
01:12:49.440 --> 01:12:58.480 | |
like there is a definition of right and wrong of course after 25 years in the field I have learned | |
01:12:58.480 --> 01:13:05.600 | |
that it's not that clear-cut uh but at least there was a definition and I mean when I was at university | |
01:13:05.600 --> 01:13:12.160 | |
there were people doing things like human-computer interaction with like visualization or like | |
01:13:12.160 --> 01:13:18.800 | |
things that are quite hard to actually evaluate in a scientific way you have | |
01:13:18.800 --> 01:13:24.240 | |
to run like a user study you know asking 200 people what they think which plot | |
01:13:24.240 --> 01:13:30.080 | |
they find prettier it's not very hard science I mean it is science of course I don't want to disparage it | |
01:13:30.080 --> 01:13:36.400 | |
but it's not something that I want to do uh I like this idea of there being a | |
01:13:36.400 --> 01:13:42.480 | |
test you run the test you see who's better at least on this specific test under these specific | |
01:13:42.480 --> 01:13:50.640 | |
circumstances so I think that um yeah that's what I really liked about databases uh why the cult following | |
01:13:50.640 --> 01:13:58.160 | |
I think making the user experience great is something that has not occurred to anyone in databases yet | |
01:13:58.160 --> 01:14:03.680 | |
I think we may be the first people to actually care about user experience | |
01:14:04.320 --> 01:14:10.560 | |
um it's not entirely fair because there are other vendors that do but certainly among open source systems | |
01:14:12.080 --> 01:14:16.880 | |
I don't know there's not a lot there there are people in the NoSQL world | |
01:14:16.880 --> 01:14:23.040 | |
that have gotten it right uh and we took some inspiration from SQLite who got it right | |
01:14:23.040 --> 01:14:26.640 | |
right yeah they are the world's most widely used database system for a reason | |
01:14:27.920 --> 01:14:34.160 | |
while still being extremely weird uh in other senses but uh I think that if there is a cult | |
01:14:34.160 --> 01:14:41.840 | |
following I'm you know I'm happy to hear it but um then I think it is because we really put the | |
01:14:41.840 --> 01:14:48.320 | |
user first and not our orthodoxy it's this | |
01:14:48.320 --> 01:14:53.760 | |
needs to be easy to use it needs to be nice to use it needs to you know solve a problem and | |
01:14:53.760 --> 01:15:00.400 | |
not satisfy our academic curiosity or our need to be the fastest because in the end right | |
01:15:00.400 --> 01:15:05.440 | |
like let's say we win on a benchmark say we run this big huge benchmark we optimize we spend a year | |
01:15:05.440 --> 01:15:11.280 | |
optimizing for it and we go out with big fanfare like other database vendors have | |
01:15:11.280 --> 01:15:18.160 | |
recently and say we beat TPC-DS scale factor so-and-so and you know screw these others who are slower | |
01:15:19.040 --> 01:15:25.120 | |
is it gonna make the life of a single one of our users better I don't think so if we spend the | |
01:15:25.120 --> 01:15:29.040 | |
same amount of time on like the CSV reader is that gonna make the life of users better absolutely | |
01:15:29.920 --> 01:15:34.400 | |
so it's clear where we have to go right we're not gonna run a giant | |
01:15:34.400 --> 01:15:41.360 | |
benchmarking campaign that makes sense yeah I would hope that's why people | |
01:15:41.360 --> 01:15:50.160 | |
like DuckDB not us DuckDB closing out like what are you excited about over the next year or two | |
01:15:50.160 --> 01:15:59.120 | |
yeah um well uh i really like this uh extension ecosystem that we started building yeah so for | |
01:15:59.120 --> 01:16:04.480 | |
those who don't know um we basically made a pip for DuckDB that's kind of built into the system you | |
01:16:04.480 --> 01:16:09.920 | |
can just install extensions that can add features like they can add new file formats they can add new | |
01:16:09.920 --> 01:16:18.320 | |
functions and soon they'll be able to add new syntax | |
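That built-in installer is a one-liner to use; a quick sketch with the real spatial extension (the coordinates are arbitrary):

```python
import duckdb

con = duckdb.connect()
con.execute("INSTALL spatial")  # fetched from the extension repository
con.execute("LOAD spatial")     # adds new types and functions such as ST_Point
print(con.execute("SELECT ST_AsText(ST_Point(4.9, 52.37))").fetchone())
```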
01:16:18.320 --> 01:16:27.120 | |
um and I think that's really where the project needs to go to become this runtime fabric I don't know a host for uh the | |
01:16:27.120 --> 01:16:34.560 | |
Cambrian explosion of creativity from people that you know care about one specific | |
01:16:34.560 --> 01:16:39.040 | |
subset of it like we have people in geospatial that are doing cool things there yeah there are people in | |
01:16:39.040 --> 01:16:42.880 | |
like compatibility with other systems that are doing cool things there there are people doing | |
01:16:43.440 --> 01:16:47.760 | |
file formats you know all sorts of crazy things somebody I think made an XML reader | |
01:16:48.640 --> 01:16:53.920 | |
great right like the world has XML files in it regrettably it might even swing back at some | |
01:16:53.920 --> 01:17:02.720 | |
point um so that's I think what I'm really excited about obviously uh to see where this can go and | |
01:17:03.600 --> 01:17:08.880 | |
I think enabling people to build infrastructure is always going to be uh the winning | |
01:17:08.880 --> 01:17:14.240 | |
sort of thing to do and that has nothing to do with like any sort of you know commercial interest | |
01:17:14.240 --> 01:17:21.440 | |
it's just that we really care about what the state of the world is in terms of how people look | |
01:17:21.440 --> 01:17:29.680 | |
at data and uh I want people to just be comfortable like wrangling data and not be terrified of it | |
01:17:29.680 --> 01:17:36.880 | |
so that's awesome yeah thanks man it's good to chat with you finally and good to finally meet I know | |
01:17:36.880 --> 01:17:41.520 | |
that we've been in each other's orbit for a bit and uh yeah thanks so much uh Joe for the | |
01:17:41.520 --> 01:17:50.240 | |
you know for the nice chat here in Paris awesome all right well uh au revoir or uh | |
01:17:52.000 --> 01:17:53.360 | |
all right | |