@infectious
Created February 27, 2013 10:19
ETL question:
How do I transform an extract so that each input row produces multiple output rows? For example, in the following there is an array within each extracted row, and I want each array member to contribute a separate row.
E.g., here is one row of an extract. The array called 'splits' within it has two members (each a hash).
{:name=>"segment_feed", :hour=>"2013_02_05_19", :timestamp=>"20130205204446",
:splits=>[{"part"=>"0", "status"=>"new", "checksum"=>"3980ec0b30f78e15782df5dc29ec89e4"},
{"part"=>"1", "status"=>"new", "checksum"=>"fec249e666448b236ea6a4367563ccd6"}]}
I want the following two rows in the load (as many rows as there are split parts):
{:name=>"segment_feed", :hour=>"2013_02_05_19", :timestamp=>"20130205204446", :split=>"0"}
{:name=>"segment_feed", :hour=>"2013_02_05_19", :timestamp=>"20130205204446", :split=>"1"}
What is the general trick I need to perform in the transform?
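One possible approach, sketched in plain Ruby and not tied to any particular ETL framework: treat the transform as a function that returns an array of rows for each input row, then flatten the results with `flat_map`. The helper name `explode_splits` is hypothetical, and the sketch assumes the rows are ordinary Ruby hashes shaped like the example above. How this hooks into a given ETL tool's transform step is not shown here.

```ruby
# One extracted row, copied from the example above.
row = {
  :name => "segment_feed",
  :hour => "2013_02_05_19",
  :timestamp => "20130205204446",
  :splits => [
    { "part" => "0", "status" => "new", "checksum" => "3980ec0b30f78e15782df5dc29ec89e4" },
    { "part" => "1", "status" => "new", "checksum" => "fec249e666448b236ea6a4367563ccd6" }
  ]
}

# Hypothetical helper: returns one output row per member of :splits.
# Each output row keeps the shared fields and carries the member's
# "part" value under the :split key.
def explode_splits(row)
  base = row.reject { |key, _value| key == :splits }
  row.fetch(:splits, []).map do |split|
    base.merge(:split => split["part"])
  end
end

# flat_map concatenates the per-row arrays into a single list of rows.
extracted_rows = [row]
load_rows = extracted_rows.flat_map { |r| explode_splits(r) }

load_rows.each { |r| p r }
```

Note that in this sketch a row with an empty or missing `:splits` array yields no output rows at all. If such rows should still be loaded, the helper would need to return `[base]` in that case.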