lontivero/README.md

## README.md

      
    Useful tips and tricks


## linux-tips-tricks.txt
Basics

Learn basic bash. Actually, read the whole bash man page; it's pretty easy to follow and not that long. Alternate shells can be nice, but bash is powerful and always available (learning mainly zsh or tcsh restricts you in many situations).
Learn vim. There's really no competition for random Linux editing (even if you use Emacs or Eclipse most of the time).
Know ssh, and the basics of passwordless authentication, via ssh-agent, ssh-add, etc.
Be familiar with bash job management: &, Ctrl-Z, Ctrl-C, jobs, fg, bg, kill, etc.
Be familiar with basic file and network management (less, head and tail and tail -f, chown, chmod, df, mount, ip or ifconfig, dig, etc.).
Know regular expressions well, and the various flags to grep/egrep. The -o, -A, and -B options are worth knowing.
Learn to use aptitude or yum (depending on distro) to find and install packages.

Everyday use

In bash, use Ctrl-R to search through command history.
In bash, use Ctrl-W to kill the last word, and Ctrl-U to kill the line. See man readline for default keybindings in bash. There are a lot. For example Alt-. cycles through previous arguments, and Alt-* expands a glob.
To go back to the previous working directory: cd -
Use xargs. It's very powerful. Note you can control how many items are passed to each command invocation (-L for input lines, -n for arguments) as well as parallelism (-P). If you're not sure it'll do the right thing, use xargs echo first. Also, -I{} is handy. Examples:
find . -name \*.py | xargs grep some_function
cat hosts | xargs -I{} ssh root@{} hostname
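The batching and substitution options can be tried safely with echo standing in for the real command. This is a small sketch; the input words and the host names alpha and beta are made up for illustration.

```bash
# -n 2 passes at most two arguments to each echo invocation
pairs=$(printf '%s\n' one two three four | xargs -n 2 echo)
echo "$pairs"
# one two
# three four

# -I{} substitutes each input line into the command, one invocation per line
hosts=$(printf '%s\n' alpha beta | xargs -I{} echo "would ssh to {}")
echo "$hosts"
# would ssh to alpha
# would ssh to beta
```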
pstree -p is a helpful display of the process tree.
Use pgrep and pkill to find or signal processes by name (-f is helpful).
Know the various signals you can send processes. For example, to suspend a process, use kill -STOP [pid]. For the full list, see man 7 signal.
Use nohup or disown if you want a background process to keep running after you log out or close the terminal.
Check what processes are listening via netstat -lntp. See also lsof.
In bash scripts, use set -x for debugging output. Use set -e to abort on errors. Consider using set -o pipefail as well, to be strict about errors (though this topic is a bit subtle). For more involved scripts, also use trap.
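As a rough sketch of how these options combine, the function below runs in a subshell so the strict settings and the trap stay local to it. The name run_job and the sample data are hypothetical.

```bash
# The parenthesised body makes the whole function run in a subshell.
run_job() (
    set -e -o pipefail            # stop at the first failing command or pipeline stage
    tmp=$(mktemp)                 # scratch file for intermediate output
    trap 'rm -f "$tmp"' EXIT      # clean up when the subshell exits, even on error
    printf 'b\na\n' | sort > "$tmp"
    head -n 1 "$tmp"
)

first=$(run_job)
echo "$first"   # a
```

Adding set -x at the top of the function body would print each command to stderr as it runs.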
In bash scripts, subshells (written with parentheses) are convenient ways to group commands. A common example is to temporarily move to a different working directory, e.g.
# do something in current dir
(cd /some/other/dir; other-command)
# continue in original dir
In bash, note there are lots of handy variations on variable expansion. These include ${name:?error message}, which is useful for checking arguments in bash scripts, and arithmetic expansion, such as i=$(( (i + 1) % 5 )). Also trimming of strings via ${var%suffix} and ${var#prefix}. For example if var=foo.pdf, then echo ${var%.pdf}.txt prints "foo.txt".
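A short sketch of these expansions side by side. The path /tmp/reports/foo.pdf and the function name require_arg are made up for illustration.

```bash
var=foo.pdf
txt_name="${var%.pdf}.txt"          # strip the ".pdf" suffix, then append ".txt"

path=/tmp/reports/foo.pdf
base="${path#/tmp/reports/}"        # strip a known prefix, leaving foo.pdf

i=4
i=$(( (i + 1) % 5 ))                # arithmetic expansion: 4 wraps around to 0

echo "$txt_name $base $i"           # foo.txt foo.pdf 0

# ${1:?message} aborts with the message on stderr when the argument is missing.
# The subshell body keeps the abort from ending the calling shell.
require_arg() ( : "${1:?usage: require_arg NAME}"; echo "got $1" )
require_arg demo                    # got demo
```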
Know about "here documents" in bash, as in cat <<EOF ....
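For instance, a here document can feed a multi-line block to a command, and quoting the delimiter turns off variable expansion inside it. The variable names below are illustrative only.

```bash
name=world

# Unquoted delimiter: $name is expanded inside the here document
greeting=$(cat <<EOF
Hello, $name
EOF
)

# Quoted delimiter: the body is taken literally
literal=$(cat <<'EOF'
Hello, $name
EOF
)

echo "$greeting"   # Hello, world
echo "$literal"    # Hello, $name
```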
Use man ascii for a good ASCII table, with hex and decimal values.
On remote ssh sessions, use screen or dtach to save your session, in case it is interrupted.
For web debugging, curl and curl -I are handy, and/or their wget equivalents.
To convert HTML to text: lynx -dump -stdin
If you must handle XML, xmlstarlet is good.
For Amazon S3, s3cmd is convenient (albeit immature, with occasional misfeatures).
In ssh, knowing how to port tunnel with -L or -D (and occasionally -R) is useful, e.g. to access web sites from a remote server.
It can be useful to make a few optimizations to your ssh configuration; for example, this .ssh/config contains settings to avoid dropped connections in certain network environments, skip confirmation when connecting to new hosts, forward authentication, and use compression (which is helpful with scp over low-bandwidth connections). Note that StrictHostKeyChecking=no and ForwardAgent=yes trade security for convenience, so enable them only for hosts and networks you trust:
TCPKeepAlive=yes
ServerAliveInterval=15
ServerAliveCountMax=6
StrictHostKeyChecking=no
Compression=yes
ForwardAgent=yes
If you are halfway through typing a command but change your mind, hit Alt-# to add a # at the beginning and enter it as a comment (or use Ctrl-A, #, enter). You can then return to it later via command history.

Data processing

Know about sort and uniq (including uniq's -u and -d options).
Know about cut, paste, and join to manipulate text files. Many people use cut but forget about join.
It is remarkably helpful sometimes that you can do set intersection, union, and difference of text files via sort/uniq. Suppose a and b are text files that are already uniqued. This is fast, and works on files of arbitrary size, up to many gigabytes. (Sort is not limited by memory, though you may need to use the -T option if /tmp is on a small root partition.)
cat a b | sort | uniq > c   # c is a union b
cat a b | sort | uniq -d > c   # c is a intersect b
cat a b b | sort | uniq -u > c   # c is set difference a - b
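To see the three set operations on concrete data, here is a small sketch using two throwaway files in a temporary directory. The file contents are invented for illustration, and paste is only used to join the output lines for display.

```bash
dir=$(mktemp -d)
printf '%s\n' apple banana cherry > "$dir/a"
printf '%s\n' banana cherry date  > "$dir/b"

# Each pipeline is the recipe from above, with paste joining the result onto one line
union=$(cat "$dir/a" "$dir/b" | sort | uniq | paste -s -d ' ' -)
intersection=$(cat "$dir/a" "$dir/b" | sort | uniq -d | paste -s -d ' ' -)
difference=$(cat "$dir/a" "$dir/b" "$dir/b" | sort | uniq -u | paste -s -d ' ' -)

echo "$union"          # apple banana cherry date
echo "$intersection"   # banana cherry
echo "$difference"     # apple

rm -rf "$dir"
```

The difference works because every line of b appears at least twice in the combined input, so uniq -u can only keep lines that came from a alone.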
Know that locale affects a lot of command line tools, including sorting order and performance. Most Linux installations will set LANG or other locale variables to a local setting like US English. This can make sort or other commands run many times slower. (Note that even if you use UTF-8 text, you can safely sort by ASCII order for many purposes.) To disable slow i18n routines and use traditional byte-based sort order, use export LC_ALL=C (in fact, consider putting this in your .bashrc).
Know basic awk for simple data munging. For example, summing all numbers in the third column of a text file: awk '{ x += $3 } END { print x }'. This is probably 3X faster and 3X shorter than equivalent Python.
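A quick check of that one-liner on three made-up lines of whitespace-separated data:

```bash
# Sum the third column of the sample input
total=$(printf '%s\n' 'a x 10' 'b y 20' 'c z 12' | awk '{ x += $3 } END { print x }')
echo "$total"   # 42
```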
Use shuf to shuffle or select random lines from a file.
Know sort's options. Know how keys work (-t and -k). In particular, watch out that you need to write -k1,1 to sort by only the first field; -k1 means sort according to the whole line.
Stable sort (sort -s) can be useful. For example, to sort first by field 2, then secondarily by field 1, you can use sort -k1,1 | sort -s -k2,2
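The key and stability behavior is easier to see on a tiny input. This is a sketch on invented comma-separated data; LC_ALL=C is set only to make the ordering predictable, and paste joins the output for display.

```bash
# Two-pass sort: first by field 1, then a stable sort by field 2
by_2_then_1=$(printf '%s\n' 'b,2' 'a,2' 'c,1' \
    | LC_ALL=C sort -t, -k1,1 \
    | LC_ALL=C sort -s -t, -k2,2 \
    | paste -s -d ' ' -)
echo "$by_2_then_1"        # c,1 a,2 b,2

# -k1,1 compares only field 1; with -s, lines with equal keys keep input order
first_field_only=$(printf '%s\n' 'a,2' 'a,1' | LC_ALL=C sort -s -t, -k1,1 | paste -s -d ' ' -)
echo "$first_field_only"   # a,2 a,1

# -k1 compares from field 1 to the end of the line, so field 2 also takes part
from_field_one=$(printf '%s\n' 'a,2' 'a,1' | LC_ALL=C sort -s -t, -k1 | paste -s -d ' ' -)
echo "$from_field_one"     # a,1 a,2
```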
If you ever need to write a tab literal in a command line in bash (e.g. for the -t argument to sort), press Ctrl-V <tab>.
For binary files, use hd for simple hex dumps and bvi for binary editing.
Also for binary files, strings (plus grep, etc.) lets you find bits of text.
To convert text encodings, try iconv. Or uconv for more advanced use (it supports some advanced Unicode things, such as transforms for normalization, accent removal, etc.).
To split files into pieces, see split (to split by size) and csplit (to split by a pattern).

System debugging

To know disk/cpu/network status, use iostat, netstat, top (or the better htop), and (especially) dstat. Good for getting a quick idea of what's happening on a system.
To know memory status, run and understand the output of free. In particular, be aware the "cached" value is memory held by the Linux kernel as file cache, so effectively counts toward the "free" value.
Use mtr as a better traceroute, to identify network issues.
To find which socket or process is using bandwidth, try iftop or nethogs.
The ab tool (comes with Apache) is helpful for quick-and-dirty checking of web server performance.
For more serious network debugging, wireshark or tshark.
Know strace, and that you can strace a running process (with -p). This can be helpful if a program is failing, hanging, or crashing, and you don't know why.
Know about ldd to check shared libraries etc.
Know how to connect to a running process with gdb and get its stack traces.
Use /proc. It's amazingly helpful sometimes when debugging live problems. Examples: /proc/cpuinfo, /proc/xxx/cwd, /proc/xxx/exe, /proc/xxx/fd/, /proc/xxx/smaps.
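As a small sketch, assuming a Linux system with /proc mounted, the current shell can inspect its own entry (here $$ plays the role of xxx):

```bash
pid=$$                                       # PID of the current shell
cwd=$(readlink "/proc/$pid/cwd")             # working directory of the process
exe=$(readlink "/proc/$pid/exe")             # path of the binary it is running
fd_count=$(ls "/proc/$pid/fd" | wc -l)       # number of open file descriptors
echo "pid=$pid cwd=$cwd exe=$exe open_fds=$fd_count"
```

The same paths work for any other PID you have permission to inspect.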
When debugging why something went wrong in the past, sar can be very helpful. It shows historic statistics on CPU, memory, network, etc.
Use dmesg whenever something's acting really funny (it could be hardware or driver issues).

## rsync-commands.txt
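# remove files whose names start with ":" (null-delimited, so names with spaces are safe)
find . -name ':*' -print0 | xargs -0 rm
# remove hidden files and directories; -mindepth 1 keeps "." itself out of the match
find . -mindepth 1 -name '.*' -print0 | xargs -0 rm -r
# clear the execute bit on files whose extension starts with j, t, c, n, or x
find . -name '*.[jtcnx]*' -print0 | xargs -0 chmod -x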

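# main backup command template (-n makes this a dry run; drop it to actually copy)
rsync -avzn --exclude='.*' --exclude=':*' ./ /Volumes/Mac-Backup-Joy-1.5TB/joy/photography/

# function to sync the current dir to the backup disk, deleting backup files that no longer exist under the current dir
# $1 is an optional extra rsync flag (e.g. -n); it is left unquoted so that it disappears when empty
rsyncd () {
    local dest
    dest=$(pwd | sed 's|/Users/joy/Pictures/photography|/Volumes/Mac-Backup-Joy-1.5TB/joy/photography|')
    rsync -avz $1 --exclude='.*' --exclude=':*' --delete-after ./ "$dest"
}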