Created
June 2, 2012 07:03
experimental checksumming for logstash events as a filter
require "logstash/filters/base"
require "logstash/namespace"
require "yaml"

class LogStash::Filters::Checksum < LogStash::Filters::Base
  config_name "checksum"
  plugin_status "experimental"

  # A list of keys to use in creating the string to checksum.
  # Keys will be sorted before building the string;
  # keys and values will then be concatenated with pipe delimiters
  # and checksummed.
  config :keys, :validate => :array, :default => ["@message", "@source_host", "@timestamp", "@source_path", "@type", "@source"]

  # Name of the OpenSSL digest algorithm to use.
  config :algorithm, :validate => ["md5", "sha1", "sha256", "sha384"], :default => "sha256"

  public
  def register
    require 'openssl'
  end

  public
  def filter(event)
    return unless filter?(event)

    @logger.debug("Running checksum filter", :event => event)

    # Start from an empty string for every event so the checksum
    # does not depend on events seen earlier.
    to_checksum = ""
    @keys.sort.each do |k|
      @logger.debug("Adding key to string", :current_key => k)
      to_checksum << "|#{k}|#{event[k]}"
    end
    to_checksum << "|"
    @logger.debug("Final string built", :to_checksum => to_checksum)

    digested_string = OpenSSL::Digest.hexdigest(@algorithm, to_checksum)
    @logger.debug("Digested string", :digested_string => digested_string)
    event.fields['logstash_checksum'] = digested_string
  end
end
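The string-building step can be tried outside of Logstash. The following is a minimal sketch, assuming the event is an ordinary Ruby Hash; `checksum_for` is a hypothetical helper name, not part of the plugin, and it uses `OpenSSL::Digest.new(...).hexdigest` rather than the class-level call in the filter.

```ruby
require "openssl"

# Hypothetical helper (not part of the plugin): builds the same
# "|key|value|...|" string as the filter, then digests it.
def checksum_for(event, keys, algorithm = "sha256")
  to_checksum = keys.sort.map { |k| "|#{k}|#{event[k]}" }.join + "|"
  OpenSSL::Digest.new(algorithm).hexdigest(to_checksum)
end

# Illustrative event; a real Logstash event carries more fields.
event = {
  "@message"     => "foobarbangbaz",
  "@source_host" => "jvstratusmbp.local",
  "@type"        => "stdin",
}

a = checksum_for(event, ["@type", "@message", "@source_host"])
b = checksum_for(event, ["@message", "@source_host", "@type"])

puts a
puts a == b
```

Because the keys are sorted before concatenation, the order in which they are listed in the config should not change the resulting checksum.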
jvstratusmbp :: ~/development/logstash ‹master*› » ruby --1.9 bin/logstash -f logstash.conf
{:args=>["agent", "-f", "logstash.conf"]}
{:run=>"agent"}
{:remaining=>[]}
doneargs
Using experimental plugin 'checksum'. This plugin is untested and may change in the future. For more information about plugin statuses, see http://logstash.net/docs/1.1.0.1/plugin-status {"level":"warn"}
foobarbangbaz
{
  "@source":"stdin://jvstratusmbp.local/",
  "@type":"stdin",
  "@tags":[],
  "@fields":{
    "logstash_checksum":"34092978fb4055baa980815768bddc5342d601b59b48cd147ee44447ceff6929"
  },
  "@timestamp":"2012-06-02T07:00:48.045000Z",
  "@source_host":"jvstratusmbp.local",
  "@source_path":"/",
  "@message":"foobarbangbaz"
}
input { stdin { type => 'stdin' } }
filter { checksum { } }
output { stdout { debug => true debug_format => "json" }}
In general, I like this. In the future, we will do:
filter { checksum { ... } }
output { elasticsearch { document_id => "%{logstash_checksum}" } }
And 'dedup' will be fulfilled! ALSO THIS WILL PERMIT REINDEXING OF LOGS THAT FAILED PARSING WOO
(assuming the name 'checksum' stays, which it may not!)
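The dedup idea rests on using the checksum as the document's identity, so that writing the same event twice overwrites instead of duplicating. A rough sketch of that behaviour, with a plain Ruby Hash standing in for the index (the `index` hash and `store` helper are illustrative assumptions, not Logstash or Elasticsearch APIs):

```ruby
require "openssl"

# Stand-in for an index keyed by document id (assumption for illustration).
index = {}

# Hypothetical helper: derives an id from the sorted key/value pairs
# and stores the event under that id.
def store(index, event)
  id = OpenSSL::Digest.new("sha256").hexdigest(event.sort.flatten.join("|"))
  index[id] = event
  id
end

first  = store(index, { "@message" => "foobarbangbaz", "@type" => "stdin" })
second = store(index, { "@type" => "stdin", "@message" => "foobarbangbaz" })
third  = store(index, { "@message" => "something else", "@type" => "stdin" })

puts first == second  # same content, same id
puts index.size       # the repeated event did not add a second entry
```

The same property is what would make reindexing safe: replaying an event that was already stored lands on the same id.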
This is awesome. I think we should include an option to specify the field name we put the checksum in (maybe default to @checksum?).
- sorting keys makes sense
- I agree on making keys a list of fields.
Nice work (like everything else in Logstash). Thanks!
If you need to calculate a hash for a single field, you could use the following ruby code:
ruby {
code => "event.to_hash.merge!('message_hash' => OpenSSL::Digest.hexdigest('md5', event['message']))"
}
@keys.sort.each ...
Shouldn't the order be up to the user?
filter { checksum { string => "%{@message} ..." } }
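One way to read this suggestion is to let the user supply a format string and checksum its expansion, so field order is whatever the string says. A minimal sketch, assuming a hypothetical `expand` helper that substitutes `%{field}` references from a plain Hash; the `string` option does not exist in the filter above, and Logstash's own field-reference expansion is not used here.

```ruby
require "openssl"

# Hypothetical helper: replaces each %{name} with the event's value for name.
def expand(template, event)
  template.gsub(/%\{([^}]+)\}/) { event[$1].to_s }
end

# Illustrative event as a plain Hash.
event = { "@message" => "foobarbangbaz", "@type" => "stdin" }

message_first = expand("%{@message}|%{@type}", event)
type_first    = expand("%{@type}|%{@message}", event)

puts message_first
puts OpenSSL::Digest.new("sha256").hexdigest(message_first)
puts OpenSSL::Digest.new("sha256").hexdigest(type_first)
```

The trade-off is that two configs listing the same fields in a different order would then produce different checksums, which sorting the keys avoids.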