@yaauie
Last active March 8, 2021 21:03
###############################################################################
# extract-nested-set.logstash-filter-ruby.rb
# ---------------------------------
# A script for a Logstash Ruby Filter that collects the value of a sub-field
# from each entry of an array into a target array.
#
# This script has three required parameters:
# - `source`: a field reference to the source array
# - `search`: the sub-field to look up in each entry of the source array
# - `target`: a field reference to the target array
# And two optional parameters (both default to `false`):
# - `coerce`: when `coerce => true`, a non-array value in either the source
#             or the target field is first wrapped in a single-entry array;
#             otherwise such events are passed through unmodified.
# - `unique`: when `unique => true`, duplicate values are dropped from the
#             target array; otherwise every extracted value is appended,
#             even if it is already present.
###############################################################################
#
# Copyright 2021 Ry Biesemeyer
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
require 'set' # Set is used when `unique => true`

def register(params)
  params = params.dup # isolate
  @source = extract_required_string(params, 'source')
  @target = extract_required_string(params, 'target')
  @search = extract_required_string(params, 'search')
  @coerce = extract_boolean(params, 'coerce', default: false)
  @unique = extract_boolean(params, 'unique', default: false)
  # anything left over was not consumed above and is therefore unknown
  params.empty? || report_configuration_error("unknown script parameter(s): #{params.keys}.")
end
def report_configuration_error(message)
  raise LogStash::ConfigurationError, message
end

def extract_boolean(params, name, default: false)
  value = params.delete(name)
  # fall back only when the param is absent, so an explicit `false` is honored
  value = default if value.nil?
  case value
  when true, 'true' then true
  when false, 'false' then false
  else report_configuration_error "invalid value for param `#{name}`: `#{value}` (expected `true` or `false`)"
  end
end

def extract_required_string(params, name)
  params.delete(name) || report_configuration_error("missing required param `#{name}`")
end
def filter(event)
  return [event] unless event.include?(@source)

  source = event.get(@source)
  unless @coerce || source.kind_of?(::Array)
    logger.debug("source field `#{@source}` not an array", :event => event.to_hash) if logger.debug?
    return [event]
  end

  target = event.get(@target)
  unless target.nil? || @coerce || target.kind_of?(::Array)
    logger.debug("target field `#{@target}` not an array", :event => event.to_hash) if logger.debug?
    return [event]
  end

  # to prevent partial failure, we clone the event and return the
  # modified clone on success or the original unmodified event
  # on failure
  clone_event = LogStash::Event.new(event.to_hash_with_metadata)

  source = clone_event.get(@source) || []
  source = Array[source] if @coerce && !source.kind_of?(Array)
  return [event] if source.empty?

  target = clone_event.get(@target) || []
  target = Array[target] if @coerce && !target.kind_of?(Array)
  target = Set.new(target) if @unique

  # collect the searched sub-field from every entry that has it
  source.each do |entry|
    target << entry.fetch(@search) if entry.include?(@search)
  end

  clone_event.set(@target, target.to_a) unless target.empty?

  # success. cancel original and return modified clone
  event.cancel
  return [clone_event]
rescue => e
  logger.error('failed to extract nested set of fields from event',
               exception: e.message, event: event.to_hash, source: @source, target: @target)
  event.tag('_extractnestedset_error')
  [event]
end
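
A note on the coercion step: the script wraps a lone value with `Array[value]` rather than `Array(value)`. The two behave differently when the value is a Hash, which is the likely shape of a single source entry. The snippet below is a small standalone illustration in plain Ruby, not part of the script; the `entry` value is made up for the example.

```ruby
# A lone entry, as the source field might hold it when it is not an array.
entry = { "author" => "Alice", "title" => "Fields" }

# Array[...] builds a new one-element array that holds the Hash itself.
wrapped = Array[entry]

# Kernel#Array converts the Hash via to_a, yielding key/value pairs instead.
flattened = Array(entry)

puts wrapped.length    # 1 (the Hash is kept intact)
puts flattened.length  # 2 (one pair per key)
```

With `Array(entry)` the extraction loop would iterate over key/value pairs and never find the `search` key, so wrapping with `Array[...]` is the form that keeps the entry usable.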

Suppose you had events with the following structure:

{
  "books": [
    {"author":"Alice",  "title":"Fields" },
    {"author":"Bob",    "title":"Oceans" },
    {"author":"Connie", "title":"Rivers" },
    {"author":"Connie", "title":"Lakes"  },
    {"author":"David",  "title":"Streams"},
    {"author":"Eunice", "title":"Creaks" }
  ]
}

And the following filter configuration:

filter {
  ruby {
    path => "${PWD}/extract-nested-set.logstash-filter-ruby.rb"
    script_params => {
      source => "[books]"
      search => "author"
      target => "authors"
      unique => true
      coerce => true
    }
  }
}

The filter would then produce an event with:

{
  "books": [
    {"author":"Alice",  "title":"Fields" },
    {"author":"Bob",    "title":"Oceans" },
    {"author":"Connie", "title":"Rivers" },
    {"author":"Connie", "title":"Lakes"  },
    {"author":"David",  "title":"Streams"},
    {"author":"Eunice", "title":"Creaks" }
  ],
  "authors": ["Alice","Bob","Connie","David","Eunice"]
}
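
To see the effect of `unique` without running Logstash, the extraction loop can be approximated on plain Ruby Hashes. This is a rough sketch under stated assumptions: `extract_nested` is a hypothetical helper written for this example, a Hash stands in for `LogStash::Event`, field-reference syntax such as `[books]` is not handled, and the `coerce` and error-handling paths are left out.

```ruby
require 'set'

# Hypothetical stand-in for the script's extraction loop, operating on a
# plain Hash instead of a LogStash::Event.
def extract_nested(event, source:, search:, target:, unique: false)
  entries = event.fetch(source, [])
  values  = event.fetch(target, []).dup      # do not mutate the input event
  values  = Set.new(values) if unique        # a Set silently ignores repeats
  entries.each do |entry|
    values << entry.fetch(search) if entry.include?(search)
  end
  # like the script, leave the event untouched when nothing was collected
  values.empty? ? event : event.merge(target => values.to_a)
end

event = {
  "books" => [
    { "author" => "Alice",  "title" => "Fields" },
    { "author" => "Connie", "title" => "Rivers" },
    { "author" => "Connie", "title" => "Lakes"  }
  ]
}

with_duplicates = extract_nested(event, source: "books", search: "author", target: "authors")
deduplicated    = extract_nested(event, source: "books", search: "author", target: "authors", unique: true)

puts with_duplicates["authors"].join(",")  # Alice,Connie,Connie
puts deduplicated["authors"].join(",")     # Alice,Connie
```

Ruby's `Set` keeps insertion order, so the deduplicated values come out in the order they were first seen in the source array.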