Second part of my instructor script, covering steps 4-7 of the Software Carpentry shell lesson.
- Write a loop that applies one or more commands separately to each file in a set of files.
- Trace the values taken on by a loop variable during execution of the loop.
- Explain the difference between a variable’s name and its value.
- Explain why spaces and some punctuation characters shouldn’t be used in file names.
- Demonstrate how to see what commands have recently been executed.
- Re-run recently executed commands without retyping them.
Want to copy each *.dat file to original-*.dat.
# Go to data-shell/creatures
cp *.dat original-*.dat # Fails: the shell expands this to
cp basilisk.dat unicorn.dat original-*.dat # and original-*.dat is not a directory
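To show why the cp fails, put echo in front of it so the shell prints the expanded command instead of running it. A minimal sketch, assuming a throwaway directory (demo_dir is a made-up name) holding just the two creature files:

```shell
# Work in a temporary directory so no real data is touched.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch basilisk.dat unicorn.dat

# The shell expands both patterns before cp would run.
# original-*.dat matches nothing, so bash passes it through unchanged.
expanded=$(echo cp *.dat original-*.dat)
echo "$expanded"
```

The printed line is what cp would actually have received.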
# Need to use a loop
for filename in *.dat # filename is a variable; prefix it with $ to get its value
do
echo $filename
done
# Delimit the variable name
for filename in *.dat; do
echo ${filename}name
done
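Why the braces matter: without them the shell reads the longest possible variable name. A small sketch, where filenamename is a hypothetical second variable defined only to make the ambiguity visible:

```shell
filename=basilisk.dat
# Hypothetical variable, present only to show the difference.
filenamename="(a different variable)"

with_braces="${filename}name"    # value of filename followed by the text "name"
without_braces="$filenamename"   # value of the variable called filenamename

echo "$with_braces"
echo "$without_braces"
```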
# Show lines 81-100 of each file
for filename in *.dat
do
echo $filename
head -n 100 $filename | tail -n 20
done
for filename in *.dat # slightly clearer output
do
echo
echo $filename
echo
head -n 100 $filename | tail -n 20
done
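To convince learners which lines head -n 100 | tail -n 20 selects, numbered input helps. A sketch that uses seq as a stand-in for a data file:

```shell
# seq prints 1..200, one number per line, so each line's content is its line number.
selected=$(seq 1 200 | head -n 100 | tail -n 20)

first=$(echo "$selected" | head -n 1)
last=$(echo "$selected" | tail -n 1)
echo "lines $first to $last"
```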
# Bad
cp unicorn.dat "red dragon.dat"
for filename in *.dat # same loop, now with a space in one file name
do
echo
echo $filename
echo
head -n 100 $filename | tail -n 20
done
rm red*
# Do not use spaces in file and directory names
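If a file name with a space cannot be avoided, quoting the variable keeps it in one piece. A sketch, assuming a temporary directory that contains only one such file; the counters are illustrative names:

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
printf 'one\ntwo\n' > "red dragon.dat"

unquoted_count=0
quoted_count=0
for filename in *.dat; do
    # Unquoted: the value is split on spaces into "red" and "dragon.dat".
    for word in $filename; do
        unquoted_count=$((unquoted_count + 1))
    done
    # Quoted: the value stays a single word.
    for word in "$filename"; do
        quoted_count=$((quoted_count + 1))
    done
    first_line=$(head -n 1 "$filename")   # works because of the quotes
done
echo "unquoted: $unquoted_count words, quoted: $quoted_count word"
```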
for filename in *.dat
do
cp $filename original-$filename
done
cd ../north-pacific-gyre/2012-07-03/
# List all files that end with an A or B
for datafile in *[AB].txt
do
echo $datafile
done
# Run an analysis program: the first argument is the input file,
# the second argument is the output file.
# echo prints each command without running it (a dry run).
for datafile in *[AB].txt
do
echo bash goostats $datafile stats-$datafile
done
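The echo trick can be tried without the real data or goostats. A sketch with made-up file names (sampleA.txt and friends are assumptions, not files from the lesson):

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
# Hypothetical file names, chosen only to exercise the *[AB].txt pattern.
touch sampleA.txt sampleB.txt sampleZ.txt

# echo prints each command instead of running it, so goostats need not exist.
planned=$(for datafile in *[AB].txt; do
    echo bash goostats "$datafile" "stats-$datafile"
done)
echo "$planned"
```

Once the printed commands look right, drop the echo to run them for real.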
Short-cuts
- ^A takes you to the beginning of a line
- ^E to the end of a line
- ^U deletes from the cursor back to the beginning of the line
- ^P goes to the previous command
- ^N goes forward (if you have gone back)
history
Repeat a command by doing !number.
- ^R searches backwards through the history
- !! repeats the last command
- !$ expands to the last word of the previous command
- Write a shell script that runs a command or series of commands for a fixed set of files.
- Run a shell script from the command line.
- Write a shell script that operates on a set of files defined by the user on the command line.
- Create pipelines that include shell scripts you, and others, have written.
cd ../../molecules
nano middle.sh
# Add
head -n 15 octane.pdb | tail -n 5
# Will show lines 11-15 of octane.pdb
bash middle.sh
Be careful to edit scripts with a plain text editor: word processors introduce special characters.
Edit the script to read:
head -n 15 "$1" | tail -n 5 # Quotes for embedded spaces
# Now run:
bash middle.sh octane.pdb
bash middle.sh pentane.pdb
Edit the script to read:
head -n "$2" "$1" | tail -n "$3"
Run:
bash middle.sh pentane.pdb 15 5
bash middle.sh pentane.pdb 20 5
Edit and add comments:
# Select lines from the middle of a file.
# Usage: bash middle.sh filename end_line num_lines
head -n "$2" "$1" | tail -n "$3"
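The same pipeline can be tried without creating middle.sh by wrapping it in a function; inside a function "$1", "$2" and "$3" behave as they do in a script. A sketch in which the function name middle and the seq-generated sample file are assumptions:

```shell
# Select lines from the middle of a file.
# Usage: middle filename end_line num_lines
middle() {
    head -n "$2" "$1" | tail -n "$3"
}

# A 30-line sample file whose content is its own line numbers.
sample=$(mktemp)
seq 1 30 > "$sample"

result=$(middle "$sample" 15 5)
echo "$result"
```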
How do we put this in a shell script?
wc -l *.pdb | sort -n
Edit the file:
nano sorted.sh
and add:
# Sort filenames by their length.
# Usage: bash sorted.sh one_or_more_filenames
wc -l "$@" | sort -n
Need to make sure comments remain updated.
bash sorted.sh *.pdb ../creatures/*.dat
# but
bash sorted.sh # Appears to do nothing (wc waits for input from stdin)
Could fix with:
if [ "$#" -lt 1 ]; then
echo "Usage: bash $0 <list of files>" >&2
exit 1
fi
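To see the check and the pipeline working together, here is a sketch written as a function (sorted is a hypothetical name; return replaces exit so the calling shell keeps running):

```shell
# List files sorted by number of lines.
sorted() {
    if [ "$#" -lt 1 ]; then
        echo "Usage: sorted <list of files>" >&2
        return 1
    fi
    wc -l "$@" | sort -n
}

# With no arguments the function reports usage instead of waiting on stdin.
status_without_args=0
sorted 2>/dev/null || status_without_args=$?

# With arguments it behaves like the original pipeline.
demo_dir=$(mktemp -d)
seq 1 3 > "$demo_dir/long.txt"
seq 1 1 > "$demo_dir/short.txt"
smallest_count=$(sorted "$demo_dir/long.txt" "$demo_dir/short.txt" | head -n 1 | awk '{print $1}')
echo "exit status: $status_without_args, smallest line count: $smallest_count"
```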
Edit the script:
# List files sorted by number of lines.
wc -l "$@" | sort -n
If you manage to do something and want to preserve a record of it then:
history | tail -n 5 > redo-figure-3.sh
Edit the file to remove the history line numbers and the final history command, turning it into a shell script.
Nelle can now put her analysis into a shell script.
- Write a shell script called longest.sh that takes:
  - the name of a directory and
  - a filename extension
  as its parameters, and prints out the name of the file with the most lines in that directory with that extension.
wc -l "$1"/*."$2" | sort -n | tail -n 2 | head -n 1
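The one-liner prints the line count along with the name. One possible variant that prints only the name is sketched here as a function (longest is an assumed name; file names are assumed to contain no spaces):

```shell
# Usage: longest directory extension
longest() {
    # wc -l prints "<count> <path>" per file, plus a "total" line when there
    # is more than one file; drop that line, sort by count, keep the largest.
    wc -l "$1"/*."$2" | grep -v ' total$' | sort -n | tail -n 1 | awk '{print $2}'
}

demo_dir=$(mktemp -d)
seq 1 2 > "$demo_dir/a.pdb"
seq 1 5 > "$demo_dir/b.pdb"
seq 1 9 > "$demo_dir/c.txt"    # longer, but the wrong extension

winner=$(longest "$demo_dir" pdb)
echo "$winner"
```

Unlike tail -n 2 | head -n 1, filtering out the total line does not rely on a total line being there.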
To debug a script use:
bash -x scriptname
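With -x, bash prints each command to stderr, prefixed with +, after expanding variables and before running it. A small sketch that captures the trace of an inline two-command script (the variable greeting is illustrative):

```shell
# 2>&1 >/dev/null keeps the trace (stderr) and discards the normal output.
trace=$(bash -x -c 'greeting=hello; echo "$greeting"' 2>&1 >/dev/null)
echo "$trace"
```

The trace shows echo hello rather than echo "$greeting", which is what makes -x useful for spotting a variable with an unexpected value.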
- Use grep to select lines from text files that match simple patterns.
- Use find to find files whose names match simple patterns.
- Use the output of one command as the command-line parameters to another command.
- Explain what is meant by ‘text’ and ‘binary’ files, and why many common tools don’t handle the latter well.
grep is a contraction of global/regular expression/print.
cd ../writing
ls
cat haiku.txt
# A haiku is a Japanese poem of seventeen syllables, in three lines of five,
# seven, and five.
grep not haiku.txt
grep day haiku.txt
grep -w day haiku.txt # -w matches whole words only; no line has "day" as a word
grep " day " haiku.txt
grep "is not" haiku.txt
grep -n "it" haiku.txt # line number where it matches
cat -n haiku.txt # check
grep -n -w "the" haiku.txt
grep -n -i -w "the" haiku.txt
grep -n -w -v "the" haiku.txt
grep -n -w -v -H "the" haiku.txt
grep --help
man grep
# Use extended regular expressions
grep -E '^.o' haiku.txt # Second character is an o
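The flags are easier to compare on a tiny file where the answers can be counted by hand. A sketch using a three-line sample (the text is made up, not taken from haiku.txt) and -c to count matching lines:

```shell
sample=$(mktemp)
printf '%s\n' 'The day is long' 'today is not' 'the end' > "$sample"

substring_hits=$(grep -c day "$sample")         # "day" and "today"
word_hits=$(grep -c -w day "$sample")           # only the whole word "day"
any_case_hits=$(grep -c -i -w the "$sample")    # "The" and "the"
inverted_hits=$(grep -c -w -v the "$sample")    # lines without lowercase "the"
second_char_o=$(grep -n -E '^.o' "$sample")     # any character, then an o

echo "$substring_hits $word_hits $any_case_hits $inverted_hits"
echo "$second_char_o"
```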
Finding files:
ls -1 # one column output
ls -R # recursive listing
ls -1R|grep -i empty
find . # Note use of the .
find . -type d
find . -type f
find . -name *.txt # Not what you expect: the shell expands *.txt before find runs
find . -name "*.txt"
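The difference shows up as soon as a matching file sits in the current directory. A sketch in a temporary directory with one .txt file at the top level and one in a subdirectory (the names are made up):

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
mkdir sub
touch notes.txt sub/deep.txt

# Unquoted: the shell rewrites the command to "find . -name notes.txt".
unquoted=$(find . -name *.txt | wc -l | tr -d ' ')
# Quoted: find receives the pattern itself and matches at every depth.
quoted=$(find . -name "*.txt" | wc -l | tr -d ' ')

echo "unquoted found $unquoted, quoted found $quoted"
```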
wc -l $(find . -name '*.txt')
wc -l `find . -name "*.txt"`
grep "FE" $(find .. -name '*.pdb')
find . -name "*.txt" -exec grep "FE" {} \;
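One reason to prefer -exec over $(find ...) is file names containing spaces: the output of command substitution is split into words, while -exec hands each name over intact. A sketch with made-up files, using grep -l to list the files that match:

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
echo "FE 1.0" > "my notes.txt"
echo "FE 2.0" > plain.txt

# Command substitution: "./my notes.txt" is split into "./my" and "notes.txt",
# neither of which exists, so only plain.txt is searched successfully.
via_substitution=$(grep -l "FE" $(find . -name '*.txt') 2>/dev/null | wc -l | tr -d ' ')

# -exec: each file name is passed to grep as a single argument.
via_exec=$(find . -name '*.txt' -exec grep -l "FE" {} \; | wc -l | tr -d ' ')

echo "substitution matched $via_substitution file(s), -exec matched $via_exec"
```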
Looking inside binary files
od `which ls` |head
hexdump `which ls`|head
xxd `which ls`|head
strings `which ls`
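To connect text and binary, it can help to dump a few known bytes first. A sketch that sends the text Hi, a zero byte and a newline through od, asking for one hexadecimal value per byte (hex_bytes is an illustrative name):

```shell
# -An hides the offset column, -tx1 prints each byte as two hex digits.
hex_bytes=$(printf 'Hi\000\n' | od -An -tx1 | tr -d ' \n')
echo "$hex_bytes"
```

48 and 69 are the codes for H and i, 0a is the newline, and 00 has no printable character, which is the kind of byte that line-oriented tools like grep and wc are not designed for.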
If there is time, there are some exercises at the end of the notes.
This work is licensed under a Creative Commons Attribution 4.0 International License.