TL;DR - I think I have an answer that I'm sort of okay with ... and I might have decided that I need to write up a bug regarding a regression.
The whole thing has gotten a bit complicated - so you can skip it if you like...
If you consider nodetool getendpoints as a tool in our toolkit, we'd like to
know how to use it on realistic data. The examples in the exercises used only
integer types, not anything more complex.
So, here's what we need to give nodetool:
$ ./bin/nodetool -h ... getendpoints - Print the end points that owns the key ...
But how do you pass <key> here when you have a composite partition key, like the one in musicdb.album?
The DDL for album is:
CREATE TABLE album (
title text,
year int,
genre text,
performer text,
tracks map<int, text>,
PRIMARY KEY ((title, year))
)
That's the major spirit of the question.
How about we assume that we want the location for the replica set for the 1972 album released by Elvis Presley, "Elvis Sings Hits From His Movies, Volume 1".
How do I pass that to nodetool getendpoints?
I did some wild Googling, and I found a blog post from The Last Pickle about the PRIMARY KEY statement.
For a given table events:
CREATE TABLE events (
device_id int,
year_month int,
sequence timestamp,
pressure int,
temperature int,
is_dam_dirty_apes boolean,
PRIMARY KEY ((device_id, year_month), sequence)
);
They peeked under the covers with cassandra-cli and showed the "RowKey", and
then used that value in nodetool getendpoints as the <key> argument:
[default@dev] list events;
Using default limit of 100
Using default column limit of 100
-------------------
RowKey: 2:201302
=> (column=2013-02-20 10\:58\:40+1300:, value=, timestamp=1357869160739000)
=> (column=2013-02-20 10\:58\:40+1300:is_dam_dirty_apes, value=01, timestamp=1357869160739000)
=> (column=2013-02-20 10\:58\:40+1300:pressure, value=000011d0, timestamp=1357869160739000)
=> (column=2013-02-20 10\:58\:40+1300:temperature, value=00000015, timestamp=1357869160739000)
-------------------
$ bin/nodetool -h 127.0.0.1 -p 7100 getendpoints dev events 2:201302
127.0.0.2
For our table album, the partition key is (title, year).
CREATE TABLE album (
title text,
year int,
genre text,
performer text,
tracks map<int, text>,
PRIMARY KEY ((title, year))
)
If a title is simple, without spaces, it seems to work like this:
$ ./nodetool -p 7100 getendpoints musicdb album Pinkerton:1996
127.0.0.1
I know from cassandra-cli that the "RowKey" is Pinkerton:1996.
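To make the key format concrete, here is a minimal Python sketch that builds the string cassandra-cli shows as the "RowKey" by joining the partition key components with a colon. The helper name row_key is hypothetical, and the sketch only mirrors the display format seen in this post; a title that itself contains a colon would presumably need escaping, which I have not verified.

```python
def row_key(*components):
    """Join partition key components with ':' as cassandra-cli displays them.

    Assumption: no component contains ':' itself; escaping is not handled.
    """
    return ":".join(str(component) for component in components)


# Keys that appear in this post.
print(row_key("Pinkerton", 1996))
print(row_key(2, 201302))
print(row_key("Elvis Sings Hits From His Movies, Volume 1", 1972))
```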
But ... what about spaces in our title?
Considering that I know the "RowKey" is: Elvis Sings Hits From His Movies, Volume 1:1972
This works in the lo-fi world of Thrift (like you might expect):
[default@musicdb] get album['Elvis Sings Hits From His Movies, Volume 1:1972'];
=> (name=, value=, timestamp=1410900116143001)
=> (name=genre, value=526f636b, timestamp=1410900116143001)
=> (name=performer, value=456c76697320507265736c6579, timestamp=1410900116143001)
=> (name=tracks:00000001, value=446f776e20627920746865205269766572736964652f5768656e20746865205361696e747320476f206d61726368696e6720496e, timestamp=1410900116143001)
=> (name=tracks:00000002, value=546865792052656d696e64204d6520746f6f206d756368206f6620596f75, timestamp=1410900116143001)
=> (name=tracks:00000003, value=436f6e666964656e6365, timestamp=1410900116143001)
=> (name=tracks:00000004, value=4672616e6b696520616e64204a6f686e6e79, timestamp=1410900116143001)
=> (name=tracks:00000005, value=477569746172204d616e, timestamp=1410900116143001)
=> (name=tracks:00000006, value=4c6f6e672d4c6567676564204769726c, timestamp=1410900116143001)
=> (name=tracks:00000007, value=596f7520446f6e2774204b6e6f77204d65, timestamp=1410900116143001)
=> (name=tracks:00000008, value=486f7720576f756c6420596f75204c696b6520746f204265, timestamp=1410900116143001)
=> (name=tracks:00000009, value=42696720426f7373204d616e, timestamp=1410900116143001)
=> (name=tracks:0000000a, value=4f6c64204d6163446f6e616c64, timestamp=1410900116143001)
Returned 13 results.
Elapsed time: 2.01 msec(s).
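As a side note on reading that output: the hex values are the raw bytes of each cell. A small Python sketch, assuming text columns are UTF-8 and the map keys are big-endian signed integers (the helper names are hypothetical), turns them back into something readable.

```python
def decode_text(hex_value):
    """Decode a hex string from cassandra-cli as UTF-8 text."""
    return bytes.fromhex(hex_value).decode("utf-8")


def decode_int(hex_value):
    """Decode a hex string from cassandra-cli as a big-endian signed int."""
    return int.from_bytes(bytes.fromhex(hex_value), "big", signed=True)


# Values copied from the output above.
print(decode_text("526f636b"))                    # genre
print(decode_text("456c76697320507265736c6579"))  # performer
print(decode_int("0000000a"))                     # last tracks map key
```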
HOWEVER ... it does not translate to the command line with nodetool:
student@cascor:~/cassandra$ ./bin/nodetool -p 7100 getendpoints musicdb album "Elvis Sings Hits From His Movies, Volume 1:1972"
./bin/nodetool: 61: [: Elvis: unexpected operator
getendpoints requires ks, cf and key args
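The "[: Elvis: unexpected operator" message looks like what a shell prints when a variable is expanded without quotes inside a [ ... ] test, so the key gets split on its spaces even though I quoted it on the command line. I have not inspected line 61 of the nodetool script, so treat that as an assumption. The Python sketch below only reproduces that class of failure with a one-line sh test; the word "placeholder" and the helper run_test are made up for illustration.

```python
import subprocess

KEY = "Elvis Sings Hits From His Movies, Volume 1:1972"


def run_test(script, arg):
    """Run a one-line sh script with `arg` as $1; return (exit code, stderr)."""
    proc = subprocess.run(
        ["sh", "-c", script, "sh", arg],
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stderr.strip()


# Unquoted: $1 is split on spaces, so `[` sees many words and reports an error.
unquoted = run_test('[ $1 = placeholder ]', KEY)

# Quoted: "$1" stays one word, so `[` does an ordinary string comparison.
quoted = run_test('[ "$1" = placeholder ]', KEY)

print("unquoted:", unquoted)
print("quoted:  ", quoted)
```

The exact error text depends on which shell sh points to, but the unquoted form fails with a test syntax error while the quoted form simply evaluates to false.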
It seems that this was written up and fixed at one point:
Again, as with "Pinkerton", a key with no spaces works against another table (like albums_by_genre):
student@cascor:~/cassandra$ ./bin/nodetool -p 7100 getendpoints musicdb albums_by_genre "Punk"
127.0.0.1
But with anything containing a space, we're DOOMED!
student@cascor:~/cassandra$ ./bin/nodetool -p 7100 getendpoints musicdb albums_by_genre "Middle Eastern"
./bin/nodetool: 61: [: Middle: unexpected operator
getendpoints requires ks, cf and key args
I did notice in CASSANDRA-4551 that the help text for nodetool used to ask for
the <key> in HEX format. And, today, when I saw the token() function, I thought
I'd try that out:
cqlsh:musicdb> SELECT title, token(title, year)
FROM album
WHERE
title = 'Elvis Sings Hits From His Movies, Volume 1' AND
year = 1972;
title | token(title, year)
--------------------------------------------+---------------------
Elvis Sings Hits From His Movies, Volume 1 | 9124020880974048405
(1 rows)
Which I think works!
$ ./bin/nodetool -p 7100 getendpoints musicdb album 9124020880974048405
127.0.0.1
Funny thing is, being a Bootcamp student, I am not 100% sure that this has
identified the node. But, given that token() gives us, well, a token, we can
divine from nodetool ring whether the value is owned by a node.
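One way to cross-check by hand: each node owns the range from the previous node's token (exclusive) up to its own token (inclusive), wrapping around at the end of the ring. The Python sketch below applies that rule to a list of (token, address) pairs. The ring values here are hypothetical, not taken from my cluster; the real ones would come from nodetool ring. It also only finds the primary owner and ignores any further replicas.

```python
from bisect import bisect_left


def owner_for_token(ring, token):
    """Return the address owning `token`.

    ring: list of (token, address) pairs, e.g. copied from `nodetool ring`.
    A node owns (previous token, its own token], wrapping around the ring.
    """
    ordered = sorted(ring)
    tokens = [t for t, _ in ordered]
    idx = bisect_left(tokens, token)
    if idx == len(tokens):
        # Past the largest token: wrap around to the first node.
        idx = 0
    return ordered[idx][1]


# Hypothetical three-node ring, for illustration only.
example_ring = [
    (-9223372036854775808, "127.0.0.1"),
    (-3074457345618258603, "127.0.0.2"),
    (3074457345618258602, "127.0.0.3"),
]

print(owner_for_token(example_ring, 9124020880974048405))
```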
Putting my customer hat on for a moment (as an OPS/admin person) ... it seems
like having to run a query to get a token to tell me the endpoint is a bit of
a runaround (feels like a M*A*S*H 4077 storyline to me!).