@t9md
Last active February 1, 2019 06:33
Extract fields, since awk is very awkward for me.
#!/usr/bin/env ruby
require 'optparse'

class CLI
  # Print the option summary (and an optional error message), then exit.
  def usage(msg = nil)
    puts @op.to_s
    puts "error: #{msg}" if msg
    exit 1
  end

  def parse_options(argv = ARGV)
    @op = OptionParser.new
    # default values
    opts = {
      report: false,
      split: "\t",
      join: "\t",
      fields: [],
    }
    @op.on('-r', '--report', "Report field configuration from very 1st line. (default: #{opts[:report]})") {|v| opts[:report] = v }
    # A literal \t typed on the command line is converted to a real tab.
    @op.on('-s', '--split VALUE', "string value (default: #{opts[:split].inspect})") {|v|
      opts[:split] = v.gsub('\\t', "\t")
    }
    @op.on('-j', '--join VALUE', "string value (default: #{opts[:join].inspect})") {|v|
      opts[:join] = v.gsub('\\t', "\t")
    }
    @op.on('-f', '--fields one,two,three', Array, "fields to extract (default: #{opts[:fields]})") {|v| opts[:fields] = v }
    begin
      args = @op.parse!(argv)
    rescue OptionParser::ParseError => e
      # ParseError also covers a missing argument, not only an unknown option.
      usage e.message
    end
    [opts, args]
  end

  def run
    opts, args = parse_options
    if opts[:report]
      puts "opts: #{opts}"
      puts "args: #{args}"
      puts
      # Only the very first line is inspected.
      ARGF.each do |line|
        line.chomp.split(opts[:split]).each_with_index do |e, idx|
          # Pass the field as an argument so a "%" inside it is printed as-is.
          puts format("%2d: %s", idx + 1, e)
        end
        break
      end
      puts
      return
    end

    # Field numbers are 1-based on the command line.
    fields_to_extract = opts[:fields].map {|n| n.to_i }
    ARGF.each do |line|
      columns = line.chomp.split(opts[:split])
      extracted = fields_to_extract.map {|n| columns[n - 1] }
      puts extracted.join(opts[:join])
    end
  end
end

CLI.new.run

help

$ ruby extract-fields.rb -h
Usage: extract-fields [options]
    -r, --report                     Report field configuration from very 1st line. (default: false)
    -s, --split VALUE                string value (default: "\t")
    -j, --join VALUE                 string value (default: "\t")
    -f, --fields one,two,three       fields to extract (default: [])
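
The -s and -j values go through a small unescape step in the script, so that a literal \t typed on the command line becomes a real tab. A minimal sketch of that step follows; the helper name unescape_tab is hypothetical and only wraps the gsub call the script uses.

```ruby
# Hypothetical helper (not in the script) wrapping the gsub used for -s and -j.
def unescape_tab(value)
  # In single quotes, '\\t' is the two characters backslash and t.
  # Each such pair is replaced by one real tab character.
  value.gsub('\\t', "\t")
end

sep = unescape_tab('\t')   # what the script receives for -s '\t'
p sep                      # shows the string as a single tab
p unescape_tab('--')       # values without \t pass through unchanged
```

Only \t is handled; other escapes such as \n are left as typed.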

Now let me explain with this sample.txt, where each field is separated by a tab (\t) character.

$ cat sample.txt
foo1	foo2	foo3
bar1	bar2	bar3
baz1	baz2	baz3

-r is useful for checking field numbers against the very first line.

$ cat sample.txt | ruby extract-fields.rb -r
opts: {:report=>true, :split=>"\t", :join=>"\t", :fields=>[]}
args: []

 1: foo1
 2: foo2
 3: foo3
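
The numbering shown by -r can be sketched roughly like this. The helper report_fields is a hypothetical name, and it returns the lines instead of printing them, which the script itself does not do.

```ruby
# Hypothetical helper: number the fields of one line, starting from 1.
def report_fields(line, split_by: "\t")
  line.chomp.split(split_by).each_with_index.map do |field, idx|
    # The field is passed as a format argument, so a "%" in it is kept as-is.
    format("%2d: %s", idx + 1, field)
  end
end

puts report_fields("foo1\tfoo2\tfoo3\n")
```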
 

Now let's extract fields 1 and 3.

$ cat sample.txt | ruby extract-fields.rb -f 1,3
foo1	foo3
bar1	bar3
baz1	baz3
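
The core of the extraction is just split, pick by 1-based number, and join. A minimal sketch, assuming a hypothetical helper extract_fields whose keyword names split_by and join_with are my own and stand in for the -s and -j options:

```ruby
# Hypothetical helper (not in the script): pick 1-based fields from one line.
def extract_fields(line, fields, split_by: "\t", join_with: "\t")
  columns = line.chomp.split(split_by)
  # Field n lives at index n - 1. A field number past the end of the line
  # gives nil, which Array#join renders as an empty string.
  fields.map { |n| columns[n - 1] }.join(join_with)
end

puts extract_fields("foo1\tfoo2\tfoo3\n", [1, 3])
puts extract_fields("foo1\tfoo2\tfoo3\n", [3, 1], join_with: "--")
```

Because the output follows the order of the requested numbers, reordering or repeating fields needs no extra logic.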

This time I extract fields 1 and 3 again, but in a custom order.

$ cat sample.txt | ruby extract-fields.rb -f 3,1
foo3	foo1
bar3	bar1
baz3	baz1

With the -j option, I can join fields with a custom string; here I use --.

$ cat sample.txt | ruby extract-fields.rb -f 3,1 -j '--'
foo3--foo1
bar3--bar1
baz3--baz1