Add field to Riak YZ Schema with CRDTs
This gist captures what needs to be done to add a new field to Riak's Yokozuna
search index.
Sources:
- https://github.com/basho/yokozuna/issues/130
- http://riak-users.197444.n3.nabble.com/How-to-update-existed-schema-td4032143.html
The code below is for illustration purposes only. Use at your own risk.
1. Create or update the schema file
2. Upload the schema to one node in the cluster
cat schema/my_bucket.xml | curl -XPUT http://127.0.0.1:49001/search/schema/my_bucket -H 'Content-Type:application/xml' --data-binary @-
3. Reload YZ index on each node
a. individual rpc calls on each node:
rpc:block_call('riak@172.17.0.1', yz_index, reload, [<<"my_bucket">>]).
rpc:block_call('riak@172.17.0.2', yz_index, reload, [<<"my_bucket">>]).
rpc:block_call('riak@172.17.0.3', yz_index, reload, [<<"my_bucket">>]).
b. via multicall:
rpc:multicall(['riak@172.17.0.1','riak@172.17.0.2','riak@172.17.0.3'], yz_index, reload, [<<"my_bucket">>]).
If all is well, each yz_index:reload call should return {ok, Nodes}, where Nodes is the
list of nodes in your Riak cluster. If something goes wrong,
you'll get {error, Errors}, where Errors is a list of errors
for each node that had an error. Note that rpc:multicall wraps the
per-node results in a {Results, BadNodes} tuple.
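The upload in step 2 can be double-checked by reading the schema back from the same endpoint. The following is a minimal sketch, not a tested procedure: RIAK_URL and the helper names schema_url, upload_schema and fetch_schema are illustrative assumptions, and the host and port are simply the ones used in the curl command above.

```shell
#!/bin/sh
# Sketch: upload a Yokozuna schema, then read it back to confirm it was stored.
# RIAK_URL and the function names are assumptions made for this example.
RIAK_URL="${RIAK_URL:-http://127.0.0.1:49001}"

# Build the schema endpoint URL for a given schema name.
schema_url() {
    printf '%s/search/schema/%s' "$RIAK_URL" "$1"
}

# PUT the schema file ($2) under the schema name ($1).
# -f makes curl exit non-zero when the server answers with an HTTP error.
upload_schema() {
    curl -fsS -XPUT "$(schema_url "$1")" \
        -H 'Content-Type: application/xml' \
        --data-binary "@$2"
}

# GET the stored schema so it can be compared with the local file.
fetch_schema() {
    curl -fsS "$(schema_url "$1")"
}
```

For example, `upload_schema my_bucket schema/my_bucket.xml && fetch_schema my_bucket | grep some_field` should print the new field definition if the upload went through (some_field stands in for whatever field you added).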
At this point, any newly inserted data is searchable on the new field. To get old data
re-indexed with the new field definition, we need to read and rewrite every key in the bucket:
%% Pid is an open riakc_pb_socket connection; the tuple is {BucketType, Bucket}.
18> {ok, Keys} = riakc_pb_socket:list_keys(Pid, {<<"my_bucket">>,<<"my_bucket">>}).
19> lists:foreach(fun(E) ->
%% Fetch each map, then add and remove a dummy set element so the write triggers a re-index.
{ok, M1} = riakc_pb_socket:fetch_type(Pid, {<<"my_bucket">>, <<"my_bucket">>}, E),
M2 = riakc_map:update({<<"some_field">>, set}, fun(S) -> riakc_set:del_element(<<"1">>, riakc_set:add_element(<<"1">>, S)) end, M1),
riakc_pb_socket:update_type(Pid, {<<"my_bucket">>, <<"my_bucket">>}, E, riakc_map:to_op(M2)) end, Keys).
WARNING: the code above can wreak havoc on your cluster, especially if you have gazillions
of keys: listing keys is an expensive, cluster-wide operation, and every key is then
rewritten. Think carefully. Unfortunately, this is the only way to achieve what we
need to do as of 06/26/2015.
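Once the rewrite has finished, a search query against the index gives a rough check that old data now carries the new field. This is a sketch under assumptions: RIAK_URL, search_url and count_matches are made-up names, the index is assumed to be called my_bucket as in the reload calls, and some_field_set is a hypothetical Solr field name that must be replaced with the field defined in your schema.

```shell
#!/bin/sh
# Sketch: ask the search index how many documents match a field query.
# RIAK_URL, the function names and the field name used below are assumptions.
RIAK_URL="${RIAK_URL:-http://127.0.0.1:49001}"

# Build the query URL from an index name ($1) and a field:value query ($2).
# rows=0 asks Solr for the match count only, without returning documents.
search_url() {
    printf '%s/search/query/%s?wt=json&rows=0&q=%s' "$RIAK_URL" "$1" "$2"
}

# Print the raw JSON response; the match count is in response.numFound.
count_matches() {
    curl -fsS "$(search_url "$1" "$2")"
}
```

Running `count_matches my_bucket 'some_field_set:*'` before and after the rewrite should show numFound growing as old keys get re-indexed. Queries containing spaces or other reserved characters would need URL-encoding first, which this sketch does not do.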