@matpalm
Created July 3, 2010 02:46
mat@ubishop:~/qwe$ cat test.input
blah de blah
cat goal dog
dum dum dum
mat@ubishop:~/qwe$ cat terms_in_lines_with_goal.rb
#!/usr/bin/env ruby
# if a line has 'goal' in it, write each term to a separate line
STDIN.each do |line|
  next unless line =~ /goal/
  puts line.split.join("\n")
end
mat@ubishop:~/qwe$ cat test.input | ruby terms_in_lines_with_goal.rb
cat
goal
dog
mat@ubishop:~/qwe$ pig -x local
grunt> lines = load 'test.input' as (terms:chararray);
grunt> dump lines
(blah de blah)
(cat goal dog)
(dum dum dum)
grunt> terms = stream lines through `ruby terms_in_lines_with_goal.rb`;
grunt> dump terms;
(cat)
(goal)
(dog)
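
The filter above can also be sketched as a small function so it can be checked without starting pig. This is only an illustration: `terms_in_goal_lines` is a hypothetical helper name that does not appear in the original script, and a StringIO stands in for the contents of test.input.

```ruby
require 'stringio'

# Hypothetical helper (not in the original script): collects every term
# from each line that contains 'goal', reading from any IO-like object.
def terms_in_goal_lines(io)
  terms = []
  io.each_line do |line|
    next unless line.include?('goal')
    terms.concat(line.split)
  end
  terms
end

# The same three lines as test.input.
input = StringIO.new("blah de blah\ncat goal dog\ndum dum dum\n")
puts terms_in_goal_lines(input)
```

Passing `$stdin` instead of the StringIO should give a script that behaves like terms_in_lines_with_goal.rb when used in the stream statement.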