knbknb/command-line-course.edx.md

## command-line-course.edx.md

      
Bash Course on edX

Shell Command Language
Which option of the uniq command allows you to specify the number of fields to ignore in its comparisons?
uniq  -f
  -f, --skip-fields=N   avoid comparing the first N fields
      --group[=METHOD]  show all items, separating groups with an empty line;
                          METHOD={separate(default),prepend,append,both}
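A small sketch of what -f does. The sample lines are invented for illustration: the first field differs on every line, so plain uniq would keep all three, while -f 1 skips that field when comparing.

```shell
# Invented sample input; field 1 is just a running number.
deduped=$(printf '%s\n' '1 apple' '2 apple' '3 pear' | uniq -f 1)
# Only the first line of each run of equal "rest of line" values is kept.
printf '%s\n' "$deduped"
```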

less command

Inside less, press b to scroll back one window (Space scrolls forward).
less can easily be configured to behave like more. The command to do that: export LESS=-XEmR
The main advantage of less is that it allows forward and backward scrolling and, as mentioned, that it performs well with large files, since it does not read in the entire file before starting.
There is also 'most', which can display multiple files and has left/right scrolling as well.
Calculator

bc
sqrt(3)
1
scale = 5  # set number of digits/decimal precision
sqrt(3)
1.73205
Save a calculation
echo '2 ^ 64 - 1' > chess-rice
bc < chess-rice
Wildcard expansion

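myfile.[^c]  # matches myfile. followed by any single character other than c (POSIX spelling: myfile.[!c])
In a glob, [a-z]? matches exactly two characters: one lowercase letter followed by any one character. This is unlike a regular expression, where [a-z]? stands for zero or one character.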
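A quick way to check glob behaviour is a scratch directory. This is a sketch with invented file names; it uses the POSIX spelling [!c], which is equivalent to [^c] in bash.

```shell
# Scratch directory with invented file names, so nothing real is touched.
globdir=$(mktemp -d)
touch "$globdir/myfile.c" "$globdir/myfile.h" "$globdir/myfile.o" \
      "$globdir/myfile.txt" "$globdir/ab"

# Exactly one character after the dot, and not c; .txt is too long to match.
not_c=$(cd "$globdir" && echo myfile.[!c])
echo "$not_c"

# [a-z]? needs exactly two characters, so only "ab" matches.
two_chars=$(cd "$globdir" && echo [a-z]?)
echo "$two_chars"

rm -r "$globdir"
```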
HERE Documents

A "here document" uses the ability of the shell to specify the standard input for a command
right after its invocation, inline in the script.
If the delimiter is preceded by a backslash (or quoted), variables and commands in the body are NOT expanded:
cat <<\EOF
bla
bla
bl
EOF
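To see the difference, compare the same body with a plain and an escaped delimiter. A minimal sketch; the variable name is made up for the example.

```shell
name=world   # hypothetical variable, only for this demo

# Plain delimiter: $name is expanded.
expanded=$(cat <<EOF
Hello $name
EOF
)

# Escaped delimiter: the body is taken literally.
literal=$(cat <<\EOF
Hello $name
EOF
)

echo "$expanded"
echo "$literal"
```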
Command grouping (Group commands)

; -- use a simple semicolon to put several commands on one line
and use curly braces { and } to group them
{ echo -n 'Today is '; date; }  # oneliner format
multiline format
{
  echo -n 'Today is '
  date -R; # format according to email standards : Wed, 15 Apr 2020 09:56:39 +0200
}
{ and } are reserved words, not metacharacters, so they must be surrounded by whitespace and the last command in the group needs a terminating ; or newline. See Compound commands in fileman-bash.pdf
{ ls /tmp/x && rm /tmp/x ; }    # the old file /tmp/x is deleted, if it exists
{ ls /tmp/x || touch /tmp/x ; } # /tmp/x is created if it does not exist yet, so afterwards it is always there
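One practical use of grouping is a single redirection for several commands. A sketch, assuming a throwaway file created with mktemp:

```shell
report=$(mktemp)   # hypothetical scratch file

# Both echo commands write to the same file through one redirection.
{
  echo 'Report header'
  echo 'Report body'
} > "$report"

line_count=$(wc -l < "$report" | tr -d ' ')
echo "$line_count"
rm "$report"
```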
Timezone Calculator

TZ=US/Pacific date
Scripting Tricks

if test sourcefile -nt testfile -a -r testfile   # -nt is the "newer than" file-test operator, -a the AND condition, -r tests readability
Make find and xargs work together
find ... -print0 | xargs -0 ...  # argument separator is the NUL character
Format the output of stat: stat -c '%Y %n' /tmp/myfile    # prints modification time (Unix time) and file name
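The tricks above can be tried in a scratch directory. This is a sketch with invented file names; -print0 and -0 are extensions found in GNU and BSD find/xargs, not strict POSIX.

```shell
workdir=$(mktemp -d)
touch "$workdir/plain.txt" "$workdir/with space.txt"

# NUL-separated names survive the space in "with space.txt".
file_count=$(find "$workdir" -type f -print0 | xargs -0 ls -1 | wc -l | tr -d ' ')
echo "$file_count"

# Give one file an old timestamp, then use -nt and -r as in the note above.
touch -t 202001010000 "$workdir/plain.txt"
if test "$workdir/with space.txt" -nt "$workdir/plain.txt" -a -r "$workdir/plain.txt"
then
  newer=yes
else
  newer=no
fi
echo "$newer"

rm -r "$workdir"
```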
Data processing Pipelines

git log parsing: check out 1 GB of Unix commit history
git clone --mirror https://github.com/dspinellis/unix-history-repo.git
cd unix-history-repo.git

git log --pretty=format:%aD FreeBSD-release/10.0.0 |
cut -d, -f1 |
sort |
uniq -c |
sort -rn

### Pattern to generate a simple frequency count of something
    <something> |
    sort |
    uniq -c |
    sort -rn
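The same pattern on a tiny invented input, to show the shape of the result (count first, most frequent value on top):

```shell
# Invented sample data standing in for <something>.
counts=$(printf '%s\n' Mon Tue Mon Wed Mon Tue |
  sort |
  uniq -c |
  sort -rn)
printf '%s\n' "$counts"

# uniq -c pads the count with spaces; awk is used here to normalise the top line.
top=$(printf '%s\n' "$counts" | awk 'NR == 1 { print $1, $2 }')
echo "$top"
```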
For loop: IMDB data
for f in title.{akas,basics,crew,episode,principals,ratings}.tsv.gz
do
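  curl -s -L -O "https://datasets.imdbws.com/$f"   # -O saves the download under the remote file name
  gunzip "$f"
done
curl options: -s silent, -L follow redirects, --compressed request a compressed response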
git command

To process the output of git blame with Unix tools,
the so-called porcelain format is the most appropriate.
git blame --line-porcelain mydir/myfile.txt
Find the *.vue files with the most commits (changed most often)
    find .  -type f -name "*.vue"  |  while read f ; do   echo -n "$f ";   git log --follow --oneline "$f" | wc -l; done | sort -k 2nr | more
Dates of the commits that touched a specific file:
git blame --line-porcelain myFile.txt  | grep "author-time" | sort -u | cut -c 13- | xargs -I{} date +%Y-%m-%d -d @{}
Date command

get Unix time: date +%s
inverse operation: date -d @1582286183
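A round trip between the two forms, as a sketch. It assumes GNU date (the -d @N syntax is not POSIX) and uses -u so the result does not depend on the local time zone.

```shell
stamp=1582286183

# Unix time -> readable UTC timestamp.
iso=$(date -u -d "@$stamp" '+%Y-%m-%d %H:%M:%S')
echo "$iso"

# Readable UTC timestamp -> Unix time again.
back=$(date -u -d "$iso" +%s)
echo "$back"
```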
Tools to analyse compiled files:
- nm - list symbols from object files
- dumpbin - Windows tool for the same task
- ldd - print shared object dependencies
- strip - discard symbols from object files
fmt - simple optimal text formatter, expects a list of space-separated strings
dd

create a bootable pen drive
curl -s -L ftp://site./some-bootable-img | sudo dd of=/dev/usbdrive bs=10240
write a number of blocks of arbitrary size
dd if=/dev/zero of=/tmp/nullfile bs=32k count=12
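The resulting size is simply bs times count. A sketch that writes to a temporary file instead of /tmp/nullfile and checks the byte count:

```shell
nullfile=$(mktemp)   # scratch file instead of /tmp/nullfile

# 12 blocks of 32 KiB each; dd's transfer statistics on stderr are discarded.
dd if=/dev/zero of="$nullfile" bs=32k count=12 2>/dev/null

size=$(wc -c < "$nullfile" | tr -d ' ')
echo "$size"
rm "$nullfile"
```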
awk command

general rule: an awk program consists of "pattern" + "action" pairs

the pattern is typically a regex, e.g. /error/
the action is given in curly braces, e.g. {print $1}
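A minimal pattern-plus-action example on invented input: the regex selects lines starting with "a", the action prints their first field.

```shell
# Invented sample lines: a name and a number.
names=$(printf '%s\n' 'alice 30' 'bob 25' 'anna 41' | awk '/^a/ { print $1 }')
printf '%s\n' "$names"
```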

sed command

print specific lines of a file: sed -n 1000,1005p myfile
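The same line-range selection, tried on generated input instead of myfile (seq stands in for a real file here):

```shell
# seq produces the numbers 1..2000, one per line, so line N contains N.
picked=$(seq 1 2000 | sed -n '1000,1005p')
printf '%s\n' "$picked"
```

For very large files, appending a quit command, as in sed -n '1000,1005p;1005q', stops sed from reading the rest of the input.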
xmlstarlet command

xmlstarlet sel -t -c //xpath myfile.xml
sort command

sort -t : -k 5 -n -r country-population   # -t sets the field separator (awk uses -F:, cut uses -d:)
sort -k 3M -k 2n   # sort by 3rd field as month name, then by 2nd field numerically
sort -C myfile  # check if a file is sorted
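A sketch of the field-separator and check options on invented data (the name:number pairs are made up, not taken from country-population):

```shell
# Sort by the 2nd colon-separated field, numerically, largest first.
ranked=$(printf '%s\n' 'a:3' 'b:10' 'c:2' | sort -t : -k 2 -n -r)
printf '%s\n' "$ranked"

# -C prints nothing; the exit status says whether the input is already sorted.
if printf '%s\n' 1 2 3 | sort -C -n; then sorted=yes; else sorted=no; fi
if printf '%s\n' 3 1 2 | sort -C -n; then unsorted=yes; else unsorted=no; fi
echo "$sorted $unsorted"
```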
data processing and reporting

differences

diff command
cmp command  # compare binary files
create 256 bytes of random binary data

dd if=/dev/random of=random--binary-data bs=1 count=256
hexdump random--binary-data
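Random data is handy for trying out cmp. A sketch with temporary files; it reads from /dev/urandom rather than /dev/random (an assumption made here so the example never blocks).

```shell
first=$(mktemp); second=$(mktemp)   # scratch files for the demo

dd if=/dev/urandom of="$first" bs=1 count=256 2>/dev/null
cp "$first" "$second"

# cmp -s is silent; exit status 0 means the files are byte-identical.
if cmp -s "$first" "$second"; then same_before=yes; else same_before=no; fi

# Append one byte to the copy and compare again.
printf 'x' >> "$second"
if cmp -s "$first" "$second"; then same_after=yes; else same_after=no; fi

echo "$same_before $same_after"
rm "$first" "$second"
```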
shuf -n 1  # shuffle input, pick first element
pygmentize   # syntax highlighter and code formatter
Other hacks not mentioned in course

but still useful
pstree -p -C age - process tree annotated with PIDs and colored by age (red/green)
journalctl _PID=7546  - show log entries that the process has created
ls -p /proc | grep -v / | column - exclude process IDs (which are directories, not files) from ls /proc output
https://access.redhat.com/solutions/406773 - Interpreting /proc/meminfo and free output for Red Hat Enterprise Linux 5, 6 and 7
Network traffic
sudo iftop -i eno1  - high-level view of IP traffic going in and out of your machine
sudo nethogs eno1 - which process is creating network traffic on your machine?
https://cockpit-project.org - new browser based admin tool
apachetop -f /var/log/apache2/access.log  - monitor access to web pages in real time
Parallel

* parallel
# clean up a big tweets file fetched from Twitter's streaming API
# (files are 1 tweet/line, but sometimes a line holds only a fragment;
# thus, remove fragments without "created_at")
# in 100 MB blocks
parallel --pipepart -a stream__TrumpShutdown.json --block 100M grep created_at > /mnt/virtualbox2/data/stream__TrumpShutdown.json