Last active September 18, 2022 10:10
`sed` to extract matched strings from a file

NOTE: This gist was written without much attention, so please excuse any typos or unclear sentences.

How to match a pattern and print only part of each matching line in a file.

You may run into a case where you need to extract some information from a log file and get it into a CSV file.

This happened to me once: I had to look into the logs of an email sendout worker to find out the ids of the users we had sent emails to on a particular day. Luckily, our email service logs the user object and the type of email sent to that user. Each line in the log file looks like this:

07-07-19 11:57:14 user_signup_confirmation "deliver_now" with options {:applicant=>#<Account id: 1234, email: "">}
07-07-19 12:01:56 password_reset_instructions "deliver_now" with options {:applicant=>#<Account id: 98, email: "">}
07-07-19 12:45:09 user_signup_confirmation "deliver_now" with options {:applicant=>#<Account id: 1432, email: "">}

and I needed output that looks like this:

user_signup_confirmation  User: 1234
user_signup_confirmation  User: 1432

Of course, I could have written a Ruby or PHP script to read the file and print to the console, but that means some effort in writing the script, loading libraries, and so on.

This can be done in a single line with sed, which is available on virtually every Linux distribution. sed can run a pattern substitution command on each line of a file, and that command can be told to print only the lines where the substitution matched. The sed command looks like this:

sed -n -e 's/.*\(user_signup_confirmation\).*Account id: \([0-9]*\).*/\1\tUser: \2/p' email_sendout_log.log
  • -n tells sed not to print anything by default.
  • -e supplies the command (script) that sed runs on each line of the file.
  • s/../../p is the substitution command.
    • the first part between /'s is the pattern to match. The \( and \) in it mark groups whose matched text is captured.
    • the second part is the replacement string. The captured strings are available as \1, \2 and so on, to use in the replacement as shown in the command above. The \t (a tab) is understood by GNU sed; other sed implementations may not interpret it.
    • the trailing p prints the line only when a substitution was made, so together with -n only the matching lines are output.
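
To try this without a real log, here is a minimal sketch that recreates the three sample lines and runs the command against them. The file names (sample_sendout.log, signup_users.txt, sendout.csv) are made up for illustration. The second command is a variant, not part of the original one-liner: it assumes the timestamp is always two space-separated fields, as in the sample, and uses -E (extended regular expressions) so groups can be written as ( ) without backslashes. It captures whatever email type appears and writes comma-separated output, since the goal was a CSV file.

```shell
# Recreate the three sample log lines from above in a scratch file.
# (sample_sendout.log, signup_users.txt and sendout.csv are hypothetical names.)
cat > sample_sendout.log <<'EOF'
07-07-19 11:57:14 user_signup_confirmation "deliver_now" with options {:applicant=>#<Account id: 1234, email: "">}
07-07-19 12:01:56 password_reset_instructions "deliver_now" with options {:applicant=>#<Account id: 98, email: "">}
07-07-19 12:45:09 user_signup_confirmation "deliver_now" with options {:applicant=>#<Account id: 1432, email: "">}
EOF

# 1) The command from above, unchanged apart from the file name.
#    -n suppresses automatic printing; the trailing p prints only lines
#    where the substitution matched. \t in the replacement relies on GNU sed.
sed -n -e 's/.*\(user_signup_confirmation\).*Account id: \([0-9]*\).*/\1\tUser: \2/p' sample_sendout.log > signup_users.txt
cat signup_users.txt

# 2) CSV variant (an assumption, not the original command):
#    group 1 = email type (third space-separated field), group 2 = account id.
sed -n -E 's/^[^ ]+ [^ ]+ ([a-z_]+) .*Account id: ([0-9]+).*/\1,\2/p' sample_sendout.log > sendout.csv
cat sendout.csv
```

With the sample lines, the first command should keep only the two user_signup_confirmation rows, while the second emits one type,id row for every log line, including the password_reset_instructions one.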

