Skip to content

Instantly share code, notes, and snippets.

@andytill
Last active January 11, 2017 09:00
Show Gist options
  • Star 1 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save andytill/28944d0744a5487b68c6311836a81531 to your computer and use it in GitHub Desktop.
Save andytill/28944d0744a5487b68c6311836a81531 to your computer and use it in GitHub Desktop.
basho-sort-blog.md

Row Sorting Features in Riak TS 1.5

Riak TS 1.5 introduces two new features for retrieving rows in a sort order other than the order that LevelDB naturally stores keys. The first is a hint to the storage engine about how the keys for a table should be stored disk. The second is a traditional SQL ORDER BY clause that works in the same as you'd expect from a relational database.

LevelDB Sort Order

LevelDB is the backend storage engine for Riak TS. It is useful to understand how rows are sorted by the local key in LevelDB, without the new features. We will use the table and data below for the first example, note that columns a and b are in the partition key, and the local key contains columns a, b and importantly c which is additional to the partition key.

The partition key's hash determines which partition the rows get written to in the cluster. The local key is used to store the value on disk, and is stored in sorted order. Each row is stored against a unique local key.

CREATE TABLE table1(
    a VARCHAR NOT NULL  ,
    b TIMESTAMP NOT NULL,
    c VARCHAR NOT NULL,
    PRIMARY KEY((a,QUANTUM(b,1,'m')),a,b,c)
);
INSERT INTO table1 VALUES
    ('dby', '2017-01-08 11:05:51','a'),
    ('dby', '2017-01-08 11:05:51','b'),
    ('dby', '2017-01-08 11:05:52','b'),
    ('dby', '2017-01-08 11:05:53','c');

With table1 created, we can run the following query in the shell. It covers all of the rows that were inserted. Rows are returned in the order that they are stored in LevelDB. Insertion order does not affect how keys are stored on disk.

SELECT * FROM table1 WHERE a = 'dby' AND b >= '2017-01-08 11:05:51' AND b <= '2017-01-08 11:05:53';

+---+--------------------+-+
| a |         b          |c|
+---+--------------------+-+
|dby|2017-01-08T11:05:51Z|a|
|dby|2017-01-08T11:05:51Z|b|
|dby|2017-01-08T11:05:52Z|b|
|dby|2017-01-08T11:05:53Z|c|
+---+--------------------+-+

LevelDB sorts by column so rows are sorted by column a, then all rows with the same column a value are sorted by column b starting with the lowest value and so on. VARCHAR and BLOB values are sorted by byte values smallest to largest e.g. 0x01 before 0x02 and 'a' before 'b'.

Crafting a local key that stores keys in the order required by your queries will provide the lowest overhead on writes and queries.

Changing the Storage Order

In Riak TS 1.5 we don't have to settle for LevelDBs default sort order. Adding DESC to a column name in the local key will reverse the order that the rows are stored in for that table.

This makes queries equally as fast as those on the table in the first example because the data on disk is already in the correct order, and no additional work has to be done. A negligible amount of work has to be done on writes to create the local key, for integer and timestamp columns the value is negated and for varchar and blob columns each byte is negated so that 255 becomes 0.

This table is similar to the previous one but applies the DESC keyword to column b in the local key which is the timestamp.

CREATE TABLE table2(
    a VARCHAR NOT NULL,
    b TIMESTAMP NOT NULL,
    c VARCHAR NOT NULL,
    PRIMARY KEY((a,QUANTUM(b,1,'m')),a,b DESC)
);
INSERT INTO table2 VALUES
    ('dby', '2017-01-08 11:05:51','a'),
    ('dby', '2017-01-08 11:05:51','b'),
    ('dby', '2017-01-08 11:05:52','b'),
    ('dby', '2017-01-08 11:05:53','c');
SELECT * FROM table2
WHERE a = 'dby' AND b >= '2017-01-08 11:05:51' AND b <= '2017-01-08 11:05:53';

+---+--------------------+-+
| a |         b          |c|
+---+--------------------+-+
|dby|2017-01-08T11:05:53Z|c|
|dby|2017-01-08T11:05:52Z|b|
|dby|2017-01-08T11:05:51Z|b|
+---+--------------------+-+

This method of sorting is preferable if the queries are known when the table is being designed. By sorting when the data is written, which occurs once, Riak TS does not have to sort per query using ORDER BY.

Sorting with ORDER BY

It is possible that this will not cover your sorting requirements. In this case we have a flexible ORDER BY clause that allows ascending and descending sorting on any column name that is defined in the table.

Using table1 and data from the first example we can create a query to order the rows by column c in descending, greatest first order.

SELECT * FROM table1
WHERE a = 'dby' AND b >= '2017-01-08 11:05:51' AND b <= '2017-01-08 11:05:53'
ORDER BY c DESC;

+---+--------------------+--+
| a |         b          |c |
+---+--------------------+--+
|dby|2017-01-08T11:05:53Z|c |
|dby|2017-01-08T11:05:51Z|b |
|dby|2017-01-08T11:05:52Z|b |
|dby|2017-01-08T11:05:51Z|a |
+---+--------------------+--+

When a query with an ORDER BY clause is issued, an instance of LevelDB is created on the node that received the request. When the query results are retrieved by that node, they are put into the temporary LevelDB instance and a key is created for each row using the ORDER BY clause. When DESC is applied to a column in the ORDER BY clause. The sort order is the same as described in the LevelDB Sort Order section and the value "negation" for local keys is used, described in the Changing the Storage Order section.

The required flexibility of ORDER BY means that the sorting is done as part of the query and you may notice an increase in latency.

Conclusion

We've gone through three ways that you can get results from Riak TS in the order that your application requires, each with different trade offs and performance.

Get Started Now with Riak TS 1.5
  • Download Open Source
  • Sign In to download Enterprise Edition
  • Install on a supported distribution
  • Use Riak Shell to execute the example statements and explore the new features.
@hmmr
Copy link

hmmr commented Jan 4, 2017

Can you clarify on the interactions between DESC in ORDER BY and DESC in DDL (or rather, on the lack thereof)? That is, given a DDL where column X appears in the primary key definition with a DESC qualifier, a SELECT .. ORDER BY _X_ (where ASC is implied) will still have X ASC-ordered. This detail may or may not be obvious.

@paegun
Copy link

paegun commented Jan 4, 2017

  1. w/i the introductory paragraph, the order of topics to be presented is given as "...first...ORDER BY...." and "...second...storage engine....", but the order of the topics presented is the reverse of that. Presenting storage (natural) order first is good as ORDER BY is a query-time sort function incurred on each query using ORDER BY.
  2. The presentation of the topics from natural order to augmented query order (ORDER BY) leads well to the table design subtopic reminding the reader of the importance of setting the natural order in a way that optimizing for write as well as for the various queries that will be issued against the table.
  3. I believe the table design subtopic should be lifted out as a topic since there are several points herein.
  4. Default and altered natural order IMHO are subtopics of natural order.
  5. w/i the natural order topic, the information presented wrt how leveldb is used to achieve minimal impact on write performance while providing much improved read performance (not requiring an ORDER BY) is very well presented IMO.
  6. w/i the default natural order topic, the mention of partition key is good, but IMO should include a brief paragraph explaining that natural order has no effect on partition key as the partition key elements are used to determine the vnode ownership of the rows, a feature which has no order component.
  7. w/i the default natural order topic, the example includes the insertion of a row that has 'ab' for column 'c', but the query excludes this row via the predicate. I believe simply excluding the insert of the row w/ 'ab' is best since that makes the examples in ascending and descending order coherent. However, the descending order example does not demonstrate order priority.
  8. w/i the default natural order topic, the example only demonstrates ORDER BY order priority via columns 'b' and 'c'. Due to the requirement that non-timestamp fields of the partition key be equality matched, not range nor allowing for multiple ranges via OR, we can not demonstrate that the order progression would also work for column 'a'. IMO we should mention this fact as viewing this from a SQL perspective, this gap in functionality is apparent. It is a worthwhile tradeoff to require that the overall solution gains the AP and massive scale characteristics of Riak at the expense of requiring multiple queries from the application layer to achieve this feature. If we have a blog or documentation that already explains this, we should reference that here.
  9. w/i the altered natural order topic, there is a typo 'firt' which should be 'first'.
  10. w/i the query order topic, IMHO "It is possible that this will not cover your sorting requirements.", should be expanded to state "While, natural order should cover the most prominent query cases, it is possible that some queries will not be covered by natural order, so require query-time ordering."

@gordonguthrie
Copy link

Is it worth saying "you can copy and paste these examples into riak-shell for your self" - and then have someone test them, obvs ;-)

@andytill
Copy link
Author

andytill commented Jan 5, 2017

@gordonguthrie there is a line under Getting Started at the bottom that says the example statements can be executed using Riak Shell, does it need more?

@andytill
Copy link
Author

andytill commented Jan 5, 2017

@paegun going to have to go through these one at a time! github doesn't seem to allow me to link to the diffs directly, you should be able to make them out in the list of revisions though, I have tackled them in order.

  1. fixed
  2. For 2 and 3 are you suggesting another topic to be covered in another blog?
  3. I gave it it's own three hash section because all three variants have their own sorting rules, and it was a large (code at least) feature in 1.5.
  4. thanks!
  5. yes maybe, it is hard to show that without showing what partitions the sub queries hash to using the EXPLAIN statement.
  6. deleted this row. I was planning to show sort behaviour between varchar lengths but realised it would get very repetitive.
  7. I don't know if we do explain this and the reasons why.
  8. fixed
  9. I didn't use the word "most" because I don't know what the use cases are, do you think this sentence is too negative?

@paegun
Copy link

paegun commented Jan 5, 2017

@andytill Yep, the revisions clearly show the changes. And of the longish list of review notes, only topic hierarchy and typos remain.

For 2, 3, and 4, I'm not suggesting another blog, just that the hierarchy of topics could be (read: as I see it :p ):

Row Sorting Features in Riak TS 1.5

Natural Order

LevelDB (default) Sort Order

Changing the Storage Order

Query Order

Sorting with ORDER BY

Conclusion

Get Started Now with Riak TS 1.5

A couple more typos remain:

  • firt
  • LevdelDB

@andytill
Copy link
Author

andytill commented Jan 5, 2017

Fixed the typos.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment