@bobbzorzen
Created December 19, 2013 09:16
A Sed assignment
1a <q1dataclean.txt sed -n 's/(.*\/\([A-Z][a-z]\{2\}[0-9]\{4\}AD\)[0-9A-Z].*>\([A-Z][a-z]\{2\}[0-9]\{4\}AD\))/({2012AD\21944AD\1})/p;s/(.*>\([A-Z][a-z]\{2\}[0-9]\{4\}AD\))/({2012AD\11944AD})/p;s/(.*\/\([A-Z][a-z]\{2\}[0-9]\{4\}AD\)[0-9A-Z].*)/({2012AD1944AD\1})/p;s/(\/.*)/({2012AD1944AD})/p'
1b I prayed to the sed gods to enlighten me, and so it was. The sed MASTER was born. Or maybe I just watched a 40-minute YouTube tutorial to learn sed, then spent 3 hours ripping my hair out, followed by another 2 hours of experimenting with regex and backreferences, nagging the teacher, and comparing my result with the proper hits file.
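The comparison against a hits file mentioned above could look roughly like the sketch below. The file names, the sample lines, and the simple `keep N` pattern are all made up for illustration; the real assignment data and hits file are not reproduced here.

```shell
# Hypothetical stand-in files in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' 'keep 42' 'drop me' 'keep 7' > "$tmp/input.txt"
printf '%s\n' '<42>' '<7>' > "$tmp/hits.txt"

# Run the sed expression and save what it prints.
sed -n 's/^keep \([0-9][0-9]*\)$/<\1>/p' < "$tmp/input.txt" > "$tmp/actual.txt"

# diff exits with status 0 only when both files are identical.
if diff "$tmp/hits.txt" "$tmp/actual.txt" > /dev/null; then
  result=match
else
  result=differ
fi
echo "$result"
rm -rf "$tmp"
```

When the two files differ, running `diff` without the redirect shows exactly which lines are missing or extra.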
2a < q2dataclean.txt sed -n 's/^\(0x[a-z0-9]\{3\}\)\([1-9][0-9]\{0,1\}:[0-9]\{1,2\}[amp]\{2\}\).*/<([Jane|Alice|\2|Mary|\1])>/p ; s/^\(0x[a-z0-9]\{3\}\)[^1-9].*/<([Jane|Alice||Mary|\1])>/p'
2b I spent approximately 20-30 minutes identifying the replacement pattern and then approximately 1-2 hours constructing a pattern that I thought would match. It turned out I had written some of my numerical character classes a bit wonky. My friend's keen eye helped me identify the broken character classes; I then fixed them and all was well once more.
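One way to catch a wonky character class early is to test that fragment of the pattern on its own. The sketch below isolates the time part of the expression from 2a and feeds it three invented time strings; these inputs are assumptions, not lines from q2dataclean.txt.

```shell
# Hypothetical time strings, one per line.
times=$(printf '%s\n' '9:05am' '12:30pm' '0:15am' |
  sed -n '/^[1-9][0-9]\{0,1\}:[0-9]\{1,2\}[amp]\{2\}$/p')

# Only the lines accepted by the time fragment are printed.
printf '%s\n' "$times"
```

Here `0:15am` is rejected because the hour must start with a digit in `[1-9]`. Note that `[amp]\{2\}` is a loose class: it accepts any two of the letters a, m, p, so it would also match `mm` or `pa`.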
3a <q3dataclean.txt sed -n 's/^\([0-9]\{3\}\.[0-9]\{2\}\)\[([a-zA-Z]*\([0-9]\{3\}\.[0-9]\{2\}\).*/{{Gwen:\2:June:\1:Eve}}/p ; s/^[A-Z].*/{{Gwen::June::Eve}}/p'
3b I identified the pattern and then wrote the sed. For once it was that simple... I split it into 2 parts to match the 2 different patterns that occur.
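The two-part approach can be tried on a tiny input. The sketch below runs the expression from 3a against two invented lines, one per line shape; the sample lines are assumptions and do not come from q3dataclean.txt.

```shell
# Hypothetical input: one line with two numbers, one line with none.
out=$(printf '%s\n' '123.45[(abc678.90xyz' 'Nothing here' |
  sed -n 's/^\([0-9]\{3\}\.[0-9]\{2\}\)\[([a-zA-Z]*\([0-9]\{3\}\.[0-9]\{2\}\).*/{{Gwen:\2:June:\1:Eve}}/p ; s/^[A-Z].*/{{Gwen::June::Eve}}/p')

# Each input line is rewritten by exactly one of the two substitutions.
printf '%s\n' "$out"
```

The first substitution leaves a line that starts with `{`, so the second one, which requires a leading capital letter, cannot fire on the same line and print it twice.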
4a <q4dataclean.txt sed -n 's/.*\([0-9]\{3\}\)\.\([0-9]\{3\}\)[0-9]\{4\}AD.*/[\212:49\1|in12:51amwhen]/p'
4b I'm starting to think that the first question was the hardest... I identified the pattern and wrote the sed expression to match it. I used wc -l to make sure I had matched all the expected rows.
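The `wc -l` check can be sketched as follows, using the expression from 4a on three invented lines. The sample data and the expected count of 2 are assumptions for illustration only.

```shell
# Hypothetical stand-in data: two lines that should match, one that should not.
tmp=$(mktemp -d)
printf '%s\n' 'a 123.4567890AD b' 'no match' 'x 987.6543210AD y' > "$tmp/sample.txt"

# Count the lines that sed prints; tr strips the padding some wc versions add.
matched=$(sed -n 's/.*\([0-9]\{3\}\)\.\([0-9]\{3\}\)[0-9]\{4\}AD.*/[\212:49\1|in12:51amwhen]/p' < "$tmp/sample.txt" | wc -l | tr -d ' ')

echo "matched rows: $matched"
rm -rf "$tmp"
```

A matching count only shows that the right number of rows was printed, not that each row has the right content, so it works best alongside a line-by-line comparison.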
5a <q5dataclean.txt sed -n 's/^{[a-zA-Z]*\([0-9]\{2\}\-[0-9]\{2\}\-[0-9]\{2\}\).*:\([0-9]\{1,2\}:[0-9]\{1,2\}[amp]\{2\}\):[0-9]\{1,2\}:[0-9]\{1,2\}[amp]\{2\}}$/Milly|Eve|\2|Eve|\1/p ; s/^.*:\([0-9]\{1,2\}:[0-9]\{1,2\}[amp]\{2\}\):[0-9]\{1,2\}:[0-9]\{1,2\}[amp]\{2\}}$/Milly|Eve|\1|Eve|--/p'
5b As before, it was merely a matter of identifying the pattern, writing the sed, and then trying again and again until the pattern was correct.
6a <q6dataclean.txt sed -n 's/^[^$]*\(\$[0-9]\{2\}\.[0-9]\{2\}\)[a-z]*\([0-9]\{1,2\}:[0-9]\{2\}[amp]\{2\}\).*/*={\/\2\/\1\/fee\/(+=)*=}/p ; s/^[^$]*\(\$[0-9]\{2\}\.[0-9]\{2\}\)[a-z].*/*={\/\/\1\/fee\/(+=)*=}/p'
6b I wrote the expression I thought would work and then spent the next 30 minutes debugging it and finding the places where I had forgotten a range or a backslash.
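A forgotten backslash is easy to miss because sed does not report an error. In basic regular expressions, which sed uses by default, `\{2\}` is a repetition count while a bare `{2}` is just three literal characters. The sketch below shows the difference on two invented lines.

```shell
# Hypothetical sample: one line with two digits, one with a literal "{2}".
sample=$(printf '%s\n' 'ab12' 'ab1{2}')

# With backslashes: "two digits in a row".
with_backslashes=$(printf '%s\n' "$sample" | sed -n '/[0-9]\{2\}/p')

# Without backslashes: "one digit followed by the literal text {2}".
without_backslashes=$(printf '%s\n' "$sample" | sed -n '/[0-9]{2}/p')

echo "with:    $with_backslashes"
echo "without: $without_backslashes"
```

The escaped form selects `ab12`, while the unescaped form silently selects `ab1{2}` instead.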