@jclosure
Created February 3, 2015 03:26
CLI Big Data Hacks
# Count the number of lines in a file
wc -l < file.txt
# Jump to a line with less
less +392800g file.txt
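# When you only need to see a few lines rather than browse interactively,
# sed can print a line range and then quit. A minimal sketch; the temp
# directory and the file name sample.txt are made up for illustration.

```shell
# Hypothetical scratch file with 10 numbered lines
workdir=$(mktemp -d)
seq 1 10 | sed 's/^/line /' > "$workdir/sample.txt"

# Print lines 4 through 6, then quit at line 6 so the rest of the file is never read
window=$(sed -n '4,6p;6q' "$workdir/sample.txt")
echo "$window"
```

# Quitting early matters on large files: without the trailing 6q, sed keeps
# scanning to the end even though nothing more will be printed.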
# View the top of a file (e.g., to see the column headers of a CSV file)
head file.txt
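# To find out which field numbers to pass to cut later, it can help to list
# the header columns one per line with their positions. A sketch that assumes
# a simple comma-delimited header with no quoted commas; people.csv is a
# made-up sample file.

```shell
# Hypothetical sample CSV
workdir=$(mktemp -d)
printf 'id,name,email\n1,Ann,ann@example.com\n' > "$workdir/people.csv"

# Take the header line, put each column on its own line, and number the lines
cols=$(head -n 1 "$workdir/people.csv" | tr ',' '\n' | awk '{ print NR ": " $0 }')
echo "$cols"
```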
# Grab the first 8 fields of a delimited file and save them to another file
cut -d',' -f1-8 ./file.csv > file2.csv
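# One caveat: cut splits on every delimiter and knows nothing about CSV
# quoting, so a quoted field that contains a comma is cut in the middle.
# The sample row below is invented to show the effect.

```shell
# A row whose second field is quoted and contains a comma
row='1,"Smith, John",x'

# Asking for the first two fields splits inside the quoted name
naive=$(printf '%s\n' "$row" | cut -d',' -f1-2)
echo "$naive"
```

# If the data may contain quoted delimiters, a CSV-aware parser is the safer
# choice; cut is fine when fields never contain the delimiter.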
# Split a large file (1048576 is Excel's max rows)
split -l 1048576 ./file.csv segment
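# A plain split leaves the header row only in the first segment. One way to
# repeat it in every segment is to split the data rows alone and prepend the
# header afterwards. A sketch on a tiny made-up file (data.csv, prefix chunk_,
# 2 rows per chunk); for Excel you would presumably use 1048575 data rows per
# chunk so that header plus data stays within 1048576.

```shell
# Hypothetical sample: one header line plus 5 data rows
workdir=$(mktemp -d)
printf 'id,name\n1,a\n2,b\n3,c\n4,d\n5,e\n' > "$workdir/data.csv"

# Remember the header, then split everything after it into 2-line chunks
header=$(head -n 1 "$workdir/data.csv")
tail -n +2 "$workdir/data.csv" | split -l 2 - "$workdir/chunk_"

# Rewrite each chunk as <chunk>.csv with the header on top
for f in "$workdir"/chunk_*; do
  { echo "$header"; cat "$f"; } > "$f.csv"
  rm "$f"
done
ls "$workdir"
```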
# Reverse the lines in a file
# (buffers every line in memory, then prints from the last index down)
gawk '{ L[n++] = $0 }
      END { while (n--) print L[n] }' file.txt
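# On systems with GNU coreutils, tac does the same reversal in one command.
# This assumes tac is installed, which is not the case on a stock macOS;
# the file name sample.txt is made up.

```shell
# Hypothetical three-line sample
workdir=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$workdir/sample.txt"

# tac prints the file last line first
reversed=$(tac "$workdir/sample.txt")
echo "$reversed"
```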
# Delete a line from a file by line number
# Example:
$ grep -n foo file.txt
1:foo
711835:foo
$ sed -i.bak -e '711835d' file.txt
$ grep -n foo file.txt
1:foo
# To delete lines 5 through 10 and line 12:
sed -e '5,10d;12d' file.txt
# This prints the result to the screen and leaves the file unchanged. To edit the file in place instead:
sed -i.bak -e '5,10d;12d' file.txt
# This backs the original up to file.txt.bak and deletes the given lines from file.txt.
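# After an in-place edit, comparing the backup against the edited file is a
# quick way to confirm that only the intended lines went away. A sketch on a
# made-up 15-line file, nums.txt.

```shell
# Hypothetical sample: the numbers 1 to 15, one per line
workdir=$(mktemp -d)
seq 1 15 > "$workdir/nums.txt"

# Delete lines 5-10 and 12 in place; the original is kept as nums.txt.bak
sed -i.bak -e '5,10d;12d' "$workdir/nums.txt"

# diff exits with status 1 when the files differ, so guard it with || true;
# lines starting with "<" exist only in the backup, i.e. they were deleted
removed=$({ diff "$workdir/nums.txt.bak" "$workdir/nums.txt" || true; } | grep -c '^<')
remaining=$(wc -l < "$workdir/nums.txt")
echo "removed=$removed remaining=$remaining"
```

# If the diff shows more than you meant to delete, copying the .bak file back
# over the edited one restores the original.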