marioa/shell-2-instructor.md

Introduction to the Shell - part II

Second part of my instructor script for steps 4-7 of the Software Carpentry shell lesson.

Loops [15 mins]

Objectives


Write a loop that applies one or more commands separately to each file in a set of files.
Trace the values taken on by a loop variable during execution of the loop.
Explain the difference between a variable’s name and its value.
Explain why spaces and some punctuation characters shouldn’t be used in file names.
Demonstrate how to see what commands have recently been executed.
Re-run recently executed commands without retyping them.


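Want to copy each *.dat file to original-<name>.dat.
# Go to data-shell/creatures
cp *.dat original-*.dat    # Fails: the shell expands the wildcards before cp runs
cp basilisk.dat unicorn.dat original-*.dat    # What cp actually sees; the last argument is not a directory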

# Need to use a loop
for filename in *.dat   # filename is a variable, $ to dereference
do
   echo $filename
done

# Delimit the variable name
for filename in *.dat; do
   echo ${filename}name
done
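
The braces matter because the shell otherwise reads as many characters as it can as the variable's name. A minimal sketch of the difference between a name and its value; the value `unicorn.dat` is only illustrative, and the sketch assumes the shell is not running with `set -u`:

```shell
filename=unicorn.dat

# Without braces the shell looks up a variable called "filenamename",
# which is unset here, so the result is an empty string.
unbraced="$filenamename"

# With braces the name ends at "}", so the text "name" is appended to the value.
braced="${filename}name"

echo "unbraced: '$unbraced'"
echo "braced:   '$braced'"
```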

# Print lines 81-100 of each file (the last 20 of the first 100)
for filename in *.dat
do
    echo $filename
    head -n 100 $filename | tail -n 20
done

for filename in *.dat   # slightly clearer output
do
    echo
    echo $filename
    echo
    head -n 100 $filename | tail -n 20
done

# Bad: a file name with an embedded space
cp unicorn.dat "red dragon.dat"

for filename in *.dat   # unquoted $filename splits "red dragon.dat" into two words
do
    echo
    echo $filename
    echo
    head -n 100 $filename | tail -n 20
done

rm red*

# Do not use embedded spaces in file and directory names.
# Back to the original task: copy each file to original-<name>
for filename in *.dat
do
    cp $filename original-$filename
done
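
If a name with a space does turn up, quoting the loop variable is one way to keep the copy loop working. The sketch below is a demo under assumptions: it runs in a scratch directory created with `mktemp`, and the two file names are illustrative stand-ins for the lesson data.

```shell
# Work in a throwaway directory so no lesson data is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Illustrative files, one with an embedded space.
printf 'a\n' > unicorn.dat
printf 'b\n' > "red dragon.dat"

# The *.dat list is expanded once, before the loop starts,
# so the copies made inside the loop are not visited again.
for filename in *.dat
do
    # Double quotes keep "red dragon.dat" as a single argument.
    cp "$filename" "original-$filename"
done

ls
```

Avoiding spaces in names remains the simpler rule; the quotes are a safety net for files you did not name yourself.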

Nelle's processing of files

cd ../north-pacific-gyre/2012-07-03/

# List all files that end with an A or B
for datafile in *[AB].txt
do
   echo $datafile
done

# Run an analysis program: the first argument
# is the input file, the second is the
# output file (echo makes this a dry run)
for datafile in *[AB].txt
do
   echo bash goostats $datafile stats-$datafile
done
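
Prefixing the command with `echo` prints what would run without running it. The sketch below shows the same dry-run idea with a switch. The `dry_run` variable, the sample file names, and the use of `wc -l` as a stand-in for the analysis program are all assumptions made for illustration.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical data files; only the ones ending in A or B should be processed.
printf '1\n2\n' > sampleA.txt
printf '1\n2\n3\n' > sampleB.txt
printf '1\n' > sampleZ.txt

dry_run=yes    # hypothetical switch: set to "no" to run for real

for datafile in *[AB].txt
do
    if [ "$dry_run" = yes ]; then
        # Show the command instead of running it.
        echo "wc -l $datafile > stats-$datafile"
    else
        wc -l "$datafile" > "stats-$datafile"
    fi
done
```

Once the printed commands look right, flip the switch (or delete the `echo`) and run the loop again.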


Short-cuts

^A takes you to the beginning of a line
^E to the end of a line
^U deletes from the cursor back to the start of the line
^P goes to the previous command
^N goes forward (if you have gone back)

history

Repeat a command by doing !number.

^R searches backwards through the history
!! repeats the last command
!$ is the last word of the previous command

Shell scripts [15 mins]

Objectives


Write a shell script that runs a command or series of commands for a fixed set of files.
Run a shell script from the command line.
Write a shell script that operates on a set of files defined by the user on the command line.
Create pipelines that include shell scripts you, and others, have written.


cd ../../molecules
nano middle.sh  

# Add
head -n 15 octane.pdb | tail -n 5

# Will show lines 11-15 of octane.pdb
bash middle.sh

Be careful when editing scripts: use a plain text editor, because word processors introduce special characters.
Edit the script to read:
head -n 15 "$1" | tail -n 5   # Quotes for embedded spaces

# Now run:
bash middle.sh octane.pdb
bash middle.sh pentane.pdb

Edit the script to read:
head -n "$2" "$1" | tail -n "$3"

Run:
bash middle.sh pentane.pdb 15 5
bash middle.sh pentane.pdb 20 5

Edit and add comments:
# Select lines from the middle of a file.
# Usage: bash middle.sh filename end_line num_lines
head -n "$2" "$1" | tail -n "$3"
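
One way to check that the three-argument version selects the lines you expect is to run it on a file whose contents are its own line numbers. A sketch, assuming `seq` is available; `numbers.txt` is a hypothetical test file and the script is a scratch copy of the one above.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Scratch copy of the script from the notes.
cat > middle.sh <<'EOF'
# Select lines from the middle of a file.
# Usage: bash middle.sh filename end_line num_lines
head -n "$2" "$1" | tail -n "$3"
EOF

seq 1 30 > numbers.txt          # line N contains the number N

# end_line=15, num_lines=5 should select lines 11 to 15.
result=$(bash middle.sh numbers.txt 15 5)
echo "$result"
```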


How do we put this in a shell script?
wc -l *.pdb | sort -n


Edit the file:
nano sorted.sh

and add:
# Sort filenames by their length.
# Usage: bash sorted.sh one_or_more_filenames
wc -l "$@" | sort -n

Need to make sure comments remain updated.
bash sorted.sh *.pdb ../creatures/*.dat

# but
bash sorted.sh   # Appears to do nothing (wc waits for input from stdin)

Could fix with:
if [ "$#" -lt 1 ]; then
   echo "Usage: bash $0 <list of files>" >&2
   exit 1
fi
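
Put together, the guarded script might look like the sketch below. It is written to a scratch directory and run twice, once without arguments and once with two files; `long.txt` and `short.txt` are hypothetical names used only for the demo.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > sorted.sh <<'EOF'
# Usage: bash sorted.sh one_or_more_filenames
if [ "$#" -lt 1 ]; then
    echo "Usage: bash $0 <list of files>" >&2
    exit 1
fi
wc -l "$@" | sort -n
EOF

printf '1\n2\n3\n' > long.txt
printf '1\n' > short.txt

# No arguments: the guard prints the usage message to stderr and exits with status 1.
bash sorted.sh 2>/dev/null
status=$?
echo "exit status without arguments: $status"

# With arguments: the shortest file comes first and the "total" line last.
bash sorted.sh long.txt short.txt
```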

Edit the script:
# List files sorted by number of lines.
wc -l "$@" | sort -n


If you manage to do something and want to preserve a record of it then:
history | tail -n 5 > redo-figure-3.sh

Edit the file to turn it into a shell script.
Nelle can now put her analysis into a shell script.

Exercises


Write a shell script called longest.sh that takes:

the name of a directory and
a filename extension as its parameters,

and prints out the name of the file with the most lines in that directory with that extension.


wc -l "$1"/*."$2" | sort -n | tail -n 2 | head -n 1
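
To see why the answer takes the second-to-last line, it helps to run the same pipeline on a small directory. A sketch under assumptions: the `data` directory and its three `.pdb` files are made up for the demo.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > longest.sh <<'EOF'
# Print the line count and name of the longest file with a given extension.
# Usage: bash longest.sh directory extension
wc -l "$1"/*."$2" | sort -n | tail -n 2 | head -n 1
EOF

mkdir data
printf '1\n' > data/a.pdb
printf '1\n2\n3\n' > data/b.pdb
printf '1\n2\n' > data/c.pdb

# sort -n puts the "total" line last, so the longest file is the line before it.
longest=$(bash longest.sh data pdb)
echo "$longest"
```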

To debug a script use:
bash -x scriptname
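
With `-x`, bash prints each command to stderr after expanding variables and before running it, prefixed with `+` by default. A small sketch; `greet.sh` is a hypothetical script written only for this demo.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > greet.sh <<'EOF'
name="$1"
echo "hello $name"
EOF

# Keep only the trace (stderr) and discard the script's normal output.
trace=$(bash -x greet.sh world 2>&1 >/dev/null)
echo "$trace"
```

The trace shows `$1` already replaced by its value, which makes a mistyped or missing argument easy to spot.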

Finding Things [15 mins]

Objectives


Use grep to select lines from text files that match simple patterns.
Use find to find files whose names match simple patterns.
Use the output of one command as the command-line parameters to another command.
Explain what is meant by ‘text’ and ‘binary’ files, and why many common tools don’t handle the latter well.


grep is a contraction of global/regular expression/print

cd ../writing
ls
cat haiku.txt

# A haiku is a Japanese poem of seventeen syllables, in three lines of five,
# seven, and five syllables.

grep not haiku.txt

grep day haiku.txt
grep -w day haiku.txt   # No match: "day" never appears as a whole word
grep " day " haiku.txt

grep "is not" haiku.txt
grep -n "it" haiku.txt   # line number where it matches
cat -n haiku.txt         # check

grep -n -w "the" haiku.txt
grep -n -i -w "the" haiku.txt
grep -n -w -v "the" haiku.txt
grep -n -w -v -H "the" haiku.txt
grep --help
man grep

# Use extended regular expressions
grep -E '^.o' haiku.txt  # Second character is an o
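
The effect of the flags is easier to follow on a file small enough to check by eye. A sketch using a made-up `sample.txt` (an assumption for illustration, not the lesson's haiku.txt):

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical three-line sample.
cat > sample.txt <<'EOF'
Today is not yesterday
The day is long
today the sun is out
EOF

# Plain search: "day" matches inside Today, yesterday and today as well.
substring=$(grep -n day sample.txt)
echo "$substring"

# -w: only "day" as a whole word.
whole_word=$(grep -n -w day sample.txt)
echo "$whole_word"

# -i -w: "the" as a whole word, ignoring case.
any_case=$(grep -n -i -w the sample.txt)
echo "$any_case"

# -v inverts the match: lines without the word "the" in any case.
inverted=$(grep -n -v -i -w the sample.txt)
echo "$inverted"
```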

Finding files:
ls -1 # one column output
ls -R # recursive listing
ls -1R | grep -i empty

find .  # Note use of the .
find . -type d
find . -type f
find . -name *.txt  # Not what you expect: the shell expands *.txt before find runs
find . -name "*.txt"

wc -l $(find . -name '*.txt')
wc -l `find . -name "*.txt"`

grep "FE" $(find .. -name '*.pdb')

find . -name "*.txt" -exec grep "FE" {} \;
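
The `$(find ...)` form splits its output on whitespace, so it trips over file names with spaces. One alternative is to end `-exec` with `+` instead of `\;`, which hands the file names to the command directly and in batches. A sketch in a scratch directory; the directory layout and file names are assumptions for the demo.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

mkdir -p notes/drafts
printf 'one\ntwo\n' > notes/a.txt
printf 'one\n' > "notes/drafts/my notes.txt"
printf 'x\n' > notes/skip.dat

# {} + passes many file names to a single wc call, each as its own argument,
# so the space in "my notes.txt" is not a problem.
counts=$(find . -name '*.txt' -exec wc -l {} +)
echo "$counts"
```

Because `wc` receives both files in one call, it also prints a combined total line, which `{} \;` (one call per file) would not.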

Looking inside binary files
od `which ls` | head
hexdump `which ls` | head
xxd `which ls` | head
strings `which ls`
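
To see what separates binary from text, it can help to dump a few bytes you control rather than a whole program. A small sketch using `od -c`, which prints each byte as a character or an escape; the three bytes are an arbitrary example.

```shell
# Three bytes: 'A', a NUL byte, and 'B'. NUL does not occur in ordinary text files.
dump=$(printf 'A\0B' | od -c)
echo "$dump"
```

The NUL shows up as `\0` in the dump. Bytes like this are why line-oriented tools such as grep and cat give poor results on binary files.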

If there is time there are some exercises at the end of the notes.


This work is licensed under a Creative Commons Attribution 4.0 International License.