Created
February 3, 2015 03:26
CLI Big Data Hacks
# Count the number of lines in a file
wc -l < file.txt
# Jump to a line with less
less +392800g file.txt
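# For scripts, where an interactive pager is not an option, a window of lines
# can be printed with sed instead. A minimal sketch, assuming a POSIX sed; the
# temp directory, sample data, and line numbers are made up for illustration.

```shell
# Hypothetical sample data: a 1000-line file in a throwaway directory
tmpdir=$(mktemp -d)
seq 1 1000 > "$tmpdir/file.txt"

# Print lines 500-502, then quit at line 502 so the rest of the file is never read
window=$(sed -n -e '500,502p' -e '502q' "$tmpdir/file.txt")
printf '%s\n' "$window"
```

# The early quit is what keeps this cheap on very large files.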
# View the top of a file (e.g. the column headers of a CSV file)
head file.txt
# Grab the first 8 fields of a delimited file and save them to another file
cut -d',' -f1-8 ./file.csv > file2.csv
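# Before cutting, it can help to check how many fields the header really has.
# A minimal sketch with made-up sample data; it assumes no field contains a
# quoted comma, since cut and awk -F',' both split on every comma.

```shell
# Hypothetical sample data: a 10-column CSV in a throwaway directory
tmpdir=$(mktemp -d)
printf 'a,b,c,d,e,f,g,h,i,j\n1,2,3,4,5,6,7,8,9,10\n' > "$tmpdir/file.csv"

# Count the fields in the header row
nfields=$(head -n 1 "$tmpdir/file.csv" | awk -F',' '{ print NF }')

# Keep the first 8 fields and inspect the resulting header
cut -d',' -f1-8 "$tmpdir/file.csv" > "$tmpdir/file2.csv"
first_row=$(head -n 1 "$tmpdir/file2.csv")
echo "$nfields fields in, header out: $first_row"
```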
# Split a large file (1048576 is Excel's max rows)
split -l 1048576 ./file.csv segment
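# A plain split leaves the CSV header only in the first segment. One possible
# way to repeat it in every segment is sketched below; the sample data, the
# 2-row segment size, and the segment_ prefix are assumptions for illustration.

```shell
# Hypothetical sample data: a header plus 5 data rows
tmpdir=$(mktemp -d)
printf 'id,name\n1,a\n2,b\n3,c\n4,d\n5,e\n' > "$tmpdir/file.csv"

# Remember the header, then split only the data rows (line 2 onward)
header=$(head -n 1 "$tmpdir/file.csv")
tail -n +2 "$tmpdir/file.csv" | split -l 2 - "$tmpdir/segment_"

# Rewrite each segment as <name>.csv with the header on top
for f in "$tmpdir"/segment_*; do
  { printf '%s\n' "$header"; cat "$f"; } > "$f.csv"
  rm "$f"
done

segment_count=$(ls "$tmpdir"/segment_*.csv | wc -l | tr -d ' ')
echo "wrote $segment_count segments"
```

# For Excel, the per-segment row count would need to leave room for the header row.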
# Reverse the lines in a file on the CLI
gawk '{ L[n++] = $0 }
      END { while (n--) print L[n] }' file.txt
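# Where GNU coreutils is installed, tac does the same job without holding the
# logic in awk. A minimal sketch that falls back to the awk approach when tac
# is missing; the sample data is made up for illustration.

```shell
# Hypothetical sample data: three lines in a throwaway directory
tmpdir=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$tmpdir/file.txt"

if command -v tac >/dev/null 2>&1; then
  # tac prints the file last line first
  reversed=$(tac "$tmpdir/file.txt")
else
  # Fallback: buffer every line in an array, then print from the end
  reversed=$(awk '{ L[n++] = $0 } END { while (n--) print L[n] }' "$tmpdir/file.txt")
fi
printf '%s\n' "$reversed"
```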
# Delete a line from a file by line number
# Example:
$ grep -n foo file.txt
1:foo
711835:foo
$ sed -i.bak -e '711835d' file.txt
$ grep -n foo file.txt
1:foo
# To delete lines 5 through 10 and line 12:
sed -e '5,10d;12d' file.txt
# This prints the result to the screen. To save the result to the same file:
sed -i.bak -e '5,10d;12d' file.txt
# This backs the original up to file.txt.bak and deletes the given lines in place.
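# Swapping d for p under sed -n previews exactly the lines a delete expression
# would remove, which is a cheap check before editing in place. A minimal
# sketch on made-up sample data (a 15-line file in a temp directory).

```shell
# Hypothetical sample data: the numbers 1 to 15, one per line
tmpdir=$(mktemp -d)
seq 1 15 > "$tmpdir/file.txt"

# Preview: print only lines 5-10 and 12, the ones about to be deleted
preview=$(sed -n -e '5,10p;12p' "$tmpdir/file.txt")
printf '%s\n' "$preview"

# Delete them in place, keeping the original as file.txt.bak
sed -i.bak -e '5,10d;12d' "$tmpdir/file.txt"

remaining=$(wc -l < "$tmpdir/file.txt" | tr -d ' ')
backup_lines=$(wc -l < "$tmpdir/file.txt.bak" | tr -d ' ')
echo "$remaining lines left, $backup_lines lines in the backup"
```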