@marioa
Last active March 27, 2019 21:51
Second part of the instructor script for the software carpentry shell lesson.

Introduction to the Shell - part II

Second part of my instructor script for steps 4-7 of the software carpentry shell lesson.

Loops [15 mins]

Objectives

  • Write a loop that applies one or more commands separately to each file in a set of files.
  • Trace the values taken on by a loop variable during execution of the loop.
  • Explain the difference between a variable’s name and its value.
  • Explain why spaces and some punctuation characters shouldn’t be used in file names.
  • Demonstrate how to see what commands have recently been executed.
  • Re-run recently executed commands without retyping them.

Want to copy each *.dat file to original-*.dat.

# Go to data-shell/creatures
cp *.dat original-*.dat   # Fails: the shell expands this to the next line
cp basilisk.dat unicorn.dat original-*.dat   # Last argument is not a directory
# Need to use a loop
for filename in *.dat   # filename is a variable, $ to dereference
do
   echo $filename
done
# Delimit the variable name
for filename in *.dat; do
   echo ${filename}name
done
# Show lines 81-100 of each file (the last 20 of the first 100)
for filename in *.dat
do
    echo $filename
    head -n 100 $filename | tail -n 20
done
for filename in *.dat   # slightly clearer output
do
    echo
    echo $filename
    echo
    head -n 100 $filename | tail -n 20
done
# Bad: a file name containing a space
cp unicorn.dat "red dragon.dat"
for filename in *.dat   # unquoted $filename is split into "red" and "dragon.dat"
do
    echo
    echo $filename
    echo
    head -n 100 $filename | tail -n 20
done
rm red*
# Avoid embedded spaces in file and directory names
for filename in *.dat
do
    cp $filename original-$filename
done
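
If a name with a space cannot be avoided, quoting the variable keeps it in one piece. A minimal sketch, run in a scratch directory with made-up files (the directory and file contents are assumptions for illustration, not part of the lesson data):

```shell
# Work in a scratch directory so nothing real is touched (illustrative setup).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'line\n' > unicorn.dat
printf 'line\n' > "red dragon.dat"

# Quoting "$filename" passes a name with an embedded space as one argument.
# The *.dat list is expanded once, before the loop starts, so the new
# original-* copies are not picked up by the loop itself.
for filename in *.dat
do
    cp "$filename" "original-$filename"
done

ls
```

The unquoted form in the notes is fine as long as every file name is free of spaces and wildcard characters.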

Nelle's processing of files

cd ../north-pacific-gyre/2012-07-03/

# List all files that end with an A or B
for datafile in *[AB].txt
do
   echo $datafile
done

# Run the analysis program: the first argument
# is the input file, the second is the
# output file. The echo makes this a dry run.
for datafile in *[AB].txt
do
   echo bash goostats $datafile stats-$datafile
done
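
The dry-run idea can be tried without Nelle's data. The sketch below uses made-up, empty files named in the style of her data set and saves the commands the loop would run; the file names and `planned-commands.txt` are assumptions for illustration, and goostats itself is never executed.

```shell
# Scratch directory with made-up file names in the style of Nelle's data.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch NENE01729A.txt NENE01729B.txt NENE01812Z.txt

# Dry run: echo prints each command instead of executing it.
# *[AB].txt matches names ending in A.txt or B.txt, so the Z file is skipped.
for datafile in *[AB].txt
do
    echo bash goostats "$datafile" "stats-$datafile"
done > planned-commands.txt

cat planned-commands.txt
```

Once the printed commands look right, removing the echo runs them for real.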

Short-cuts

  • ^A takes you to the beginning of a line
  • ^E to the end of a line
  • ^U deletes from the cursor back to the start of the line
  • ^P goes to the previous command
  • ^N goes forward (if you have gone back)
history

Repeat a command by doing !number.

  • ^R searches backwards through the history
  • !! repeats the last command
  • !$ is the last word of the previous command

Shell scripts [15 mins]

Objectives

  • Write a shell script that runs a command or series of commands for a fixed set of files.
  • Run a shell script from the command line.
  • Write a shell script that operates on a set of files defined by the user on the command line.
  • Create pipelines that include shell scripts you, and others, have written.

cd ../../molecules
nano middle.sh  
# Add
head -n 15 octane.pdb | tail -n 5
# Will show lines 11-15 of octane.pdb
bash middle.sh

Be careful to edit scripts with a plain text editor: word processors add formatting characters that the shell cannot interpret.

Edit the script to read:

head -n 15 "$1" | tail -n 5   # Quotes for embedded spaces
# Now run:
bash middle.sh octane.pdb
bash middle.sh pentane.pdb

Edit the script to read:

head -n "$2" "$1" | tail -n "$3"

Run:

bash middle.sh pentane.pdb 15 5
bash middle.sh pentane.pdb 20 5

Edit and add comments:

# Select lines from the middle of a file.
# Usage: bash middle.sh filename end_line num_lines
head -n "$2" "$1" | tail -n "$3"
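
To check which lines the three arguments select, the same pipeline can be wrapped in a function and run on a file whose line N contains the number N. This is a sketch only: the function name `middle` and the generated sample file are assumptions for illustration.

```shell
# middle FILE END_LINE NUM_LINES: print the NUM_LINES lines ending at END_LINE.
middle() {
    head -n "$2" "$1" | tail -n "$3"
}

# Made-up input: a 30-line file in which line N contains the number N.
sample=$(mktemp)
seq 1 30 > "$sample"

middle "$sample" 15 5   # prints 11 to 15, one number per line
middle "$sample" 20 5   # prints 16 to 20, one number per line
```

The second argument is where the selection ends and the third is how many lines are kept, which matches the usage comment in the script.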

How do we put this in a shell script?

wc -l *.pdb | sort -n

Edit the file:

nano sorted.sh

and add:

# Sort filenames by their length.
# Usage: bash sorted.sh one_or_more_filenames
wc -l "$@" | sort -n

Need to make sure comments remain updated.

bash sorted.sh *.pdb ../creatures/*.dat

# but
bash sorted.sh   # Appears to hang: wc is waiting for input on stdin

Could fix with:

if [ "$#" -lt 1 ]; then
   echo "Usage: bash $0 <list of files>" >&2
   exit 1
fi
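
A sketch of how the guard and the pipeline fit together, written as a function so it can be pasted into a terminal. The function name `sorted` and the sample files are assumptions for illustration; inside sorted.sh itself the guard would use exit rather than return.

```shell
# sorted FILE...: list files sorted by number of lines.
sorted() {
    if [ "$#" -lt 1 ]; then
        echo "Usage: sorted <list of files>" >&2
        return 1   # in a script file this would be: exit 1
    fi
    wc -l "$@" | sort -n
}

# Made-up input files with 3 lines and 1 line.
workdir=$(mktemp -d)
printf 'a\nb\nc\n' > "$workdir/long.txt"
printf 'a\n' > "$workdir/short.txt"

sorted "$workdir/long.txt" "$workdir/short.txt"
sorted || echo "refused to run without arguments"
```

Sending the usage message to stderr with >&2 keeps it out of any pipeline that consumes the script's normal output.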

Edit the script:

# List files sorted by number of lines.
wc -l "$@" | sort -n

If you get something working and want to keep a record of how you did it:

history | tail -n 5 > redo-figure-3.sh

Edit the file to remove the history line numbers and make it into a shell script.

Nelle can now put her analysis into a shell script.

Exercises

  • Write a shell script called longest.sh that takes:

    • the name of a directory and
    • a filename extension as its parameters,

    and prints out the name of the file with the most lines in that directory with that extension.

wc -l "$1"/*."$2" | sort -n | tail -n 2 | head -n 1
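
The one-liner prints the line count as well as the name. A possible variant that prints only the name is sketched below as a function; the function name `longest`, the sed step, and the sample files are assumptions added for illustration and are not part of the lesson's solution.

```shell
# longest DIR EXT: print the name of the file in DIR with extension EXT
# that has the most lines.
longest() {
    # With several files, wc ends with a "total" line, so the longest file is
    # second from the end; with a single file, that file's line is the only one.
    # sed strips the leading count.
    wc -l "$1"/*."$2" | sort -n | tail -n 2 | head -n 1 | sed 's/^ *[0-9][0-9]* //'
}

# Made-up input: two .pdb files and one .txt file of different lengths.
workdir=$(mktemp -d)
printf '1\n2\n' > "$workdir/a.pdb"
printf '1\n2\n3\n4\n5\n' > "$workdir/b.pdb"
seq 1 9 > "$workdir/c.txt"

longest "$workdir" pdb
```

The sketch does not handle a directory with no matching files; in that case wc reports an error for the unexpanded pattern.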

To debug a script use:

bash -x scriptname

Finding Things [15 mins]

Objectives

  • Use grep to select lines from text files that match simple patterns.
  • Use find to find files whose names match simple patterns.
  • Use the output of one command as the command-line parameters to another command.
  • Explain what is meant by ‘text’ and ‘binary’ files, and why many common tools don’t handle the latter well.

  • grep is a contraction of global/regular expression/print
cd ../writing
ls
cat haiku.txt

# a Japanese poem of seventeen syllables, in three lines of five,
# seven, and five.

grep not haiku.txt

grep day haiku.txt
grep -w day haiku.txt   # No "day" word.
grep " day " haiku.txt

grep "is not" haiku.txt
grep -n "it" haiku.txt   # line number where it matches
cat -n haiku.txt         # check

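grep -n -w "the" haiku.txt
grep -n -i -w "the" haiku.txt      # -i: ignore case
grep -n -w -v "the" haiku.txt      # -v: invert, show lines that do not match
grep -n -w -v -H "the" haiku.txt   # -H: prefix each line with the file name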
grep --help
man grep

# Use extended regular expressions
grep -E '^.o' haiku.txt  # Second character is an o
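
The effect of each flag is easier to see on a small made-up text. The sketch below does not use the lesson's haiku.txt; the sample lines are assumptions, and -c (count matching lines) is used here only to make the results compact.

```shell
# Made-up sample text (not the lesson's haiku.txt).
sample=$(mktemp)
cat > "$sample" <<'EOF'
The day is long
Today is not the day
Someday soon
EOF

grep -c day "$sample"        # lines containing "day" anywhere: 3
grep -c -w day "$sample"     # lines containing "day" as a whole word: 2
grep -c -i -w the "$sample"  # whole word "the", ignoring case: 2
grep -c -v -w day "$sample"  # lines without the whole word "day": 1
grep -E '^.o' "$sample"      # lines whose second character is "o"
```

The last command prints "Today is not the day" and "Someday soon", since both have "o" as their second character.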

Finding files:

ls -1 # one column output
ls -R # recursive listing
ls -1R|grep -i empty

find .  # Note use of the .
find . -type d
find . -type f
find . -name *.txt  # Not what you expect: the shell expands *.txt before find sees it
find . -name "*.txt"

wc -l $(find . -name '*.txt')
wc -l `find . -name "*.txt"`   # older backtick form of $(...)

grep "FE" $(find .. -name '*.pdb')

find . -name "*.txt" -exec grep "FE" {} \;
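
The $(find ...) form splits its output on whitespace, so it breaks on paths that contain spaces, whereas -exec passes each path to the command intact. A minimal sketch with made-up files (the directory layout and contents are assumptions for illustration):

```shell
# Scratch tree with a space in both a directory name and a file name.
workdir=$(mktemp -d)
mkdir -p "$workdir/sub dir"
printf 'FE 1\nxx\n' > "$workdir/a.txt"
printf 'FE 2\n' > "$workdir/sub dir/b c.txt"
printf 'FE 3\n' > "$workdir/notes.md"

# Ending -exec with + hands many files to a single grep call;
# ending it with \; would run grep once per file.
# -H makes grep print the file name even when it is given only one file.
find "$workdir" -name '*.txt' -exec grep -H "FE" {} +
```

Two matching lines are printed, one per .txt file; notes.md is skipped because it does not match the quoted pattern.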

Looking inside binary files

od `which ls` | head
hexdump `which ls` | head
xxd `which ls` | head
strings `which ls`

If there is time there are some exercises at the end of the notes.


Creative Commons Licence
This work is licensed under a Creative Commons Attribution 4.0 International License.