
@Remiii
Last active September 24, 2018 10:50
AWK - split XML subnodes into separate files
# This awk script splits an XML file and exports every X child nodes into a separate file.
# It doesn't work if the node being split (myChildNode) has an attribute.
# It doesn't work if that node contains a subnode with the same tag name, etc.
# Use at your own risk; think before using.
# Usage:
# - Replace "myNumberOfNodesByOutputFile" with the number of nodes per output file
# - Replace "myChildNode" with the tag of the nodes to split
# - Replace "myParentNode" with the tag of their parent
# Run (the ./out directory must already exist):
# $ awk -f xml-split.awk myXML.xml
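BEGIN { count = 0 }    # number the output files from 0

# Each line that opens a child node starts a new output file.
/<myChildNode>/ {
    rfile = "./out/part-" count ".xml"
    print "<?xml version=\"1.0\" encoding=\"UTF-8\"?>" > rfile
    print "<myParentNode>" > rfile
    print $0 > rfile
    for (iLoop = 0; iLoop < myNumberOfNodesByOutputFile; iLoop++) {
        # Copy lines up to the end of the current child node (or of the parent).
        while ((status = getline) > 0 && $0 !~ /<\/myChildNode>/ && $0 !~ /<\/myParentNode>/)
            print > rfile
        if (status <= 0 || $0 ~ /<\/myParentNode>/)
            break    # end of input or end of the parent node
        print $0 > rfile    # closing tag of the child node
    }
    print "</myParentNode>" > rfile
    close(rfile)
    count++
}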
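
To see what the script produces, here is a minimal end-to-end sketch. The tag names `items` and `item`, the sample data, and the choice of 2 nodes per file are assumptions made for illustration only. The embedded awk program is a variant of the script above with the placeholders filled in; numbering the files from 0 and stopping at end of input are choices of this sketch.

```sh
#!/bin/sh
# Hypothetical walk-through: "items"/"item" and 2 nodes per file are
# illustrative assumptions, not values from the original script.
set -eu

workdir=$(mktemp -d)
cd "$workdir"
mkdir out    # awk does not create the output directory

# Sample input: every tag sits on its own line and has no attributes.
cat > sample.xml <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<items>
<item>
<name>a</name>
</item>
<item>
<name>b</name>
</item>
<item>
<name>c</name>
</item>
</items>
EOF

# Variant of the split script with the placeholders replaced.
cat > xml-split.awk <<'EOF'
BEGIN { count = 0 }
/<item>/ {
    rfile = "./out/part-" count ".xml"
    print "<?xml version=\"1.0\" encoding=\"UTF-8\"?>" > rfile
    print "<items>" > rfile
    print $0 > rfile
    for (iLoop = 0; iLoop < 2; iLoop++) {
        # Copy lines until the current <item> (or the parent) closes.
        while ((status = getline) > 0 && $0 !~ /<\/item>/ && $0 !~ /<\/items>/)
            print > rfile
        if (status <= 0 || $0 ~ /<\/items>/)
            break
        print $0 > rfile
    }
    print "</items>" > rfile
    close(rfile)
    count++
}
EOF

awk -f xml-split.awk sample.xml
ls out
```

With this input, `out/part-0.xml` holds the first two `<item>` nodes and `out/part-1.xml` holds the third, each wrapped in its own XML declaration and `<items>` element.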
@jbelmaro

it works great! thanks!
