@danostrowski
Last active December 15, 2015 13:29
Relatively small Riak pre-commit hook written in Erlang (for me) by @Vagabond and @evanmcc.
% Purpose: I use this pre-commit hook to mark objects in a bucket as "dirty" using a secondary index.
% A script then scrapes out all dirty objects, does some processing, and saves them with the index
% "dirty_bin = false"; on that save, the pre-commit hook erases the "dirty_bin" index.
% In essence it works as: `if dirty_bin == false; del dirty_bin; else dirty_bin = true; end`
%
% To install this pre-commit hook (like any Riak pre-commit hook written in Erlang), create an Erlang file and
% put it in your "basho-patches" directory. For me, on Ubuntu, this was "/usr/lib/riak/lib/basho-patches".
% Once it is there, compile it to a .beam file. I used the erlc compiler that ships with Riak,
% which, on my Ubuntu system, was at "/usr/lib/riak/erts-5.8.5/bin/erlc".
%
% Once you do this and *restart Riak*, then your library should be available to Riak and you can install it
% by modifying the properties of the bucket you'd like the pre-commit hook to run on.
% See here: http://docs.basho.com/riak/latest/references/apis/http/HTTP-Set-Bucket-Properties/
%
% Essentially, the bucket has properties; one of them is called "precommit", and it stores a list of dicts.
% So, in Python, you would do bucket.set_property('precommit', [{'fun': 'mark_dirty', 'mod': 'mylib'}])
%
% Once that is set, any object you save should invoke this pre-commit hook.
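%
% A rough sketch of the same step done from the Riak console (`riak attach`) instead of a client.
% Assumptions here: <<"mybucket">> is a placeholder bucket name, and the {struct, ...} shape is how Erlang hooks
% are represented in bucket properties on Riak 1.x; check your release before relying on it.
%   riak_core_bucket:set_bucket(<<"mybucket">>,
%       [{precommit, [{struct, [{<<"mod">>, <<"mylib">>}, {<<"fun">>, <<"mark_dirty">>}]}]}]).
%   %% Read the properties back to confirm the hook is listed:
%   proplists:get_value(precommit, riak_core_bucket:get_bucket(<<"mybucket">>)).
% Note that this replaces the whole "precommit" list, so include any hooks that are already installed.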
%
% So to recap: 1) .erl in basho-patches 2) Compile it 3) Restart Riak 4) Install bucket property via API
%
% Couple of random things that I learned:
% 1. Once the .beam is in place, you can attach to the Riak console using `riak attach` and then you can actually
% call your function, if you like, using mylib:mark_dirty(...)
% 2. It appears that if you update the metadata in the console but don't save the object, the change doesn't show up
% in riak_object:get_metadata(). That is, after calling mark_dirty() on an Obj, rp() on the result will show the
% index data, but rp(riak_object:get_metadata(...)) on it will not.
% 3. You can reload your module (if you have made changes and recompiled it) through the Riak console by using `l(mylib).`
% (This is Erlang stuff, not Riak stuff.) In addition, if you are in the right path (which you can check with pwd()),
% you can compile with `c(mylib).`
% 4. A super helpful reference about the Riak objects in question is this file:
% https://github.com/basho/riak_kv/blob/master/src/riak_object.erl
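%
% A console sketch for point 2. Reading riak_object.erl suggests that update_metadata/2 only stages the new
% metadata until the updates are applied, which would explain the behaviour. This assumes get_update_metadata/1
% and apply_updates/1 behave as that file describes; the bucket, key and value are placeholders.
%   Obj0 = riak_object:new(<<"mybucket">>, <<"mykey">>, <<"myvalue">>).
%   Obj1 = mylib:mark_dirty(Obj0).
%   %% Staged metadata, where the hook's index entry should appear:
%   rp(dict:to_list(riak_object:get_update_metadata(Obj1))).
%   %% Fold the staged metadata into the object, after which get_metadata/1 should show it too:
%   rp(dict:to_list(riak_object:get_metadata(riak_object:apply_updates(Obj1)))).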
%
-module(mylib).
-export([mark_dirty/1]).

mark_dirty(Obj) ->
    %% Metadata is just a dict.
    MD = riak_object:get_metadata(Obj),
    %% Secondary indexes is a list in that dict. Should be there, but might not.
    Indexes = case dict:find(<<"index">>, MD) of
        error ->
            %% no index data at all, so add "dirty" index
            [{<<"dirty_bin">>, <<"true">>}];
        {ok, I} ->
            case lists:keyfind(<<"dirty_bin">>, 1, I) of
                false ->
                    %% index not present, mark dirty
                    [{<<"dirty_bin">>, <<"true">>}|I];
                {<<"dirty_bin">>, <<"false">>} ->
                    %% marked "clean" so delete the dirty index
                    lists:keydelete(<<"dirty_bin">>, 1, I);
                {<<"dirty_bin">>, <<"true">>} ->
                    %% already marked dirty leave it alone
                    I;
                _ ->
                    %% unexpected value, delete this clause if you want to just crash here
                    I
            end
    end,
    riak_object:update_metadata(Obj, dict:store(<<"index">>, Indexes, MD)).
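
%% A usage sketch, not part of the original hook: walking the "clean" case from the Riak console.
%% Assumptions: riak_object:new/4 accepts a metadata dict as its fourth argument, and the bucket, key and
%% value are placeholders.
%%   MD = dict:store(<<"index">>, [{<<"dirty_bin">>, <<"false">>}], dict:new()).
%%   Obj = riak_object:new(<<"mybucket">>, <<"mykey">>, <<"myvalue">>, MD).
%%   Staged = riak_object:get_update_metadata(mylib:mark_dirty(Obj)).
%%   dict:fetch(<<"index">>, Staged).
%% With the clauses above, the last call should come back without a dirty_bin entry. Starting from an object
%% with no "index" metadata at all should instead yield a single {<<"dirty_bin">>, <<"true">>} entry.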