@compor
Created June 4, 2014 12:53
associates 2 columns of data using the first one's elements as key
#!/bin/bash
# Parses an unsorted two-column document read from standard input:
#
# ip date
#
# The same ip may appear multiple times, and so may the exact same ip-date pair.
# Outputs:
#
# ip | # of occurrences | list of dates of occurrences
#
# Each ip appears only once in the output (as the key), and identical ip-date pairs
# from the input are counted only once.
#
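awk '{
    # Count each distinct ip-date pair once. An exact lookup in seen[] avoids
    # match(), which treats the date as a regular expression and also matches substrings.
    if( !( ( $1, $2 ) in seen ) ) {
        seen[ $1, $2 ] = 1
        occurences[ $1 ] = occurences[ $1 ] + 1
        # Prepend the new date; skip the separator for the first one.
        date[ $1 ] = ( $1 in date ) ? $2" "date[ $1 ] : $2
    }
}
END {
    # Iteration order of "for ... in" is unspecified in awk.
    for( ip in date )
        printf( "%s\t|\t%d\t|\t%s\n", ip, occurences[ ip ], date[ ip ] );
}'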
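
A minimal usage sketch, assuming the grouping logic is wrapped in a shell function. The function name `associate` and the sample addresses and dates are made up for illustration. This version deduplicates with a `seen` array keyed on the ip-date pair, and pipes the result through `sort` because awk does not guarantee the order of `for ... in`.

```shell
#!/bin/bash
# Hypothetical wrapper: groups "ip date" lines from stdin by ip,
# counting each distinct ip-date pair once.
associate() {
  awk '
    !seen[$1, $2]++ {
      count[$1]++
      # Prepend the date; no separator is needed for the first one.
      dates[$1] = ($1 in dates) ? $2 " " dates[$1] : $2
    }
    END {
      for (ip in dates)
        printf("%s\t|\t%d\t|\t%s\n", ip, count[ip], dates[ip])
    }
  '
}

# Made-up sample input; the last line repeats an earlier ip-date pair.
result=$(printf '%s\n' \
  '10.0.0.1 2014-06-01' \
  '10.0.0.2 2014-06-01' \
  '10.0.0.1 2014-06-02' \
  '10.0.0.1 2014-06-01' | associate | sort)

printf '%s\n' "$result"
```

With this input, `10.0.0.1` is reported with a count of 2 and `10.0.0.2` with a count of 1, since the repeated pair is ignored.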