@jpzhu
Created June 13, 2018 01:31
sort, keep order
cat -n file | sort -k2 -k1n | uniq -f1 | sort -nk1,1 | cut -f2-
How it works:
On a GNU system, cat -n prepends a line number to each line: the number is right-aligned with leading spaces and followed by a <tab> character. cat pipes this numbered output to sort.
sort's -k2 option tells it to consider only the characters from the second field to the end of the line when sorting. By default sort splits fields at whitespace, which here means cat's inserted spaces and <tab>.
Following -k2 with -k1n makes sort compare the 2nd field first and then, when the -k2 fields are identical, compare the 1st field numerically. Repeated lines are therefore grouped together, in the order in which they appeared.
The results are piped to uniq, which is told to ignore the first field (-f1, with fields again separated by whitespace). Of each run of identical lines uniq keeps only the first, so its output is the list of unique lines from the original file, each still carrying the line number of its first occurrence. This output is piped back to sort.
This time sort sorts numerically on the first field (cat's inserted line number), restoring the order of the original file, and pipes the results to cut.
Lastly, cut removes the line numbers that cat inserted: -f2- prints from the 2nd field through the end of the line, and cut's default delimiter is a <tab> character.
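To see what cat -n actually emits, the small sketch below replaces each tab with a visible '|'. The helper name number_lines is made up for illustration, and the padding shown assumes GNU cat.

```shell
# number_lines is a hypothetical helper, not part of the original command.
# It numbers stdin with cat -n and turns each tab into '|' so it is visible.
number_lines() {
  cat -n | tr '\t' '|'
}

printf 'bb\naa\nbb\n' | number_lines
# With GNU cat this prints:
#      1|bb
#      2|aa
#      3|bb
```

The number is padded to six columns, so the first field is the padded number and everything after the tab is the original line.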
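The effect of the two sort keys can be checked on a tiny input. This is only a sketch: sort_by_content is a hypothetical name, and the tab is again shown as '|' for readability.

```shell
# sort_by_content is a hypothetical helper for illustration.
# Group identical lines together (-k2), earliest line number first (-k1n).
sort_by_content() {
  cat -n | sort -k2 -k1n
}

printf 'bb\naa\nbb\naa\n' | sort_by_content | tr '\t' '|'
# With GNU coreutils this prints:
#      2|aa
#      4|aa
#      1|bb
#      3|bb
```

Both copies of each line end up adjacent, and within each group the copy that appeared first in the input comes first.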
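As a rough check of this stage, the sketch below stops the pipeline right after uniq. The name first_occurrences is an assumption made for this example.

```shell
# first_occurrences is a hypothetical helper for illustration.
# After sorting, uniq -f1 skips the line-number field when comparing,
# so only the first line of each group of identical lines survives.
first_occurrences() {
  cat -n | sort -k2 -k1n | uniq -f1
}

printf 'bb\naa\nbb\naa\n' | first_occurrences | tr '\t' '|'
# With GNU coreutils this prints:
#      2|aa
#      1|bb
```

The surviving line numbers (2 and 1) are the positions where aa and bb first appeared, which is what the later numeric sort relies on.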
To illustrate:
$ cat file
bb
aa
bb
dd
cc
dd
aa
bb
cc
$ cat -n file | sort -k2 -k1n | uniq -f1 | sort -nk1,1 | cut -f2-
bb
aa
dd
cc
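For repeated use, the pipeline could be wrapped in a shell function that reads standard input. This is a minimal sketch, assuming GNU coreutils; the name dedupe_keep_order is invented here. It uses explicit numeric keys (-k1n as a tie-breaker and -nk1,1 for the final sort) so that line numbers such as 9 and 10 are compared as numbers rather than as text.

```shell
# dedupe_keep_order is a hypothetical helper name, not from the gist.
# Reads lines on stdin, drops repeated lines, keeps first-seen order.
dedupe_keep_order() {
  cat -n |            # number every line: "<padded number><tab><line>"
    sort -k2 -k1n |   # group identical lines, earliest number first
    uniq -f1 |        # keep the first line of each group
    sort -nk1,1 |     # restore original order by line number
    cut -f2-          # strip the number, keep the line
}

printf '%s\n' bb aa bb dd cc dd aa bb cc | dedupe_keep_order
# prints:
# bb
# aa
# dd
# cc
```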