First part of my instructor script for steps 1-4 of the Software Carpentry shell lesson.
Computers are used to:
- run programs
- store data
- communicate with each other
- and interact with us, via:
  - screens
  - mice/touchpads
  - keyboards
Two kinds of interface:
- graphical user interface, or GUI
- command-line interface, or CLI
  - The heart of a CLI is a read-evaluate-print loop, or REPL
The shell acts as the interface between you and the OS. There are various shells; we shall be using bash (Bourne Again SHell), one of the most common.
- Commands are terse and cryptic.
- Can build powerful pipeline workflows.
- Sometimes it is the only way to interact with remote computers.
Nelle's pipeline example in the introduction section.
Need to download:
http://swcarpentry.github.io/shell-novice/data/data-shell.zip
# Can use a browser to download this or
wget http://swcarpentry.github.io...
# or
curl -o data-shell.zip http://swcarpentry.github.io/...
# Unzip the file.
# Go to where the directory is.
# Change into the root directory.
- Explain similarities and differences between files and directories.
- Translate absolute paths into relative paths and vice versa.
- Construct absolute and relative paths that identify specific files and directories.
- Explain the steps in the shell’s read-run-print cycle.
- Identify the actual command, flags, and filenames in a command-line call.
- Demonstrate the use of tab completion, and explain its advantages.
- file system
- files contain information/data.
- directories (or folders) are special files that contain other files/directories.
The prompt lets you interact with the shell. You can set the prompt using:
PS1='$ '
Shows the shell is waiting for a command. Type:
whoami
This shows your user id. When we type whoami, the shell:
- finds a program called whoami,
- runs that program,
- displays that program’s output, then
- displays a new prompt to tell us that it’s ready for more commands.
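The first step of that cycle can be made visible. The following is a small sketch, assuming a POSIX-style shell such as bash; `command -v` reports which program the shell would run, and the path it prints will differ between systems. The variable name `whoami_path` is only for illustration.

```shell
# Step 1: ask the shell which program it would run for "whoami".
# The path printed varies from system to system.
whoami_path=$(command -v whoami)
echo "The shell found: $whoami_path"

# Steps 2 and 3: run that program and display its output.
"$whoami_path"
```

After the output appears, an interactive shell would print a fresh prompt, which is the last step in the list.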
Can find our current working directory:
pwd
What you get back will depend on your OS:
- Windows:
C:\Users\mario
- Linux:
/home/mario
- Mac:
/Users/mario
The / is the directory separator (\ on Windows).
A / on its own refers to the root directory.
- hierarchical file system.
- Your home directory is the directory you are placed in when you log in.
- As a normal user you will only be able to add/modify files in your home directory or below.
You can list the files & directories using:
ls
Adding the -F flag appends a / to directory names:
ls -F
ls has lots of options; you can find out what they are with:
ls --help # does not work on a mac
man ls
You can find out more about a command by typing --help after it (supported by some commands) or by using man command.
man is not available on all OSs. In the man viewer, type q to quit; type h to see the other commands available.
We are going to assume that we are now nelle. We will change directory to:
cd data-shell
pwd # Difference between relative/absolute paths
ls
ls -F # Can look in directories without changing into them
ls -F data
ls -F data/pdb
cd data
pwd # want to go back up
cd ..
pwd # Note if you type cd without arguments -> home directory
cd
pwd
cd - # only works for the previous directory
pwd # ~ is also a shortcut for your home directory
The .. is a special directory name pointing to the parent directory. Files whose names begin with a . are normally hidden; ls only shows them with -a.
ls -F -a
ls -Fa
ls ..
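The navigation commands above can be tried safely in a throwaway directory. This is a minimal sketch, assuming `mktemp -d` is available; the file names `notes.txt` and `.hidden` are hypothetical and only stand in for real data.

```shell
# Build a small throwaway tree (the file names are hypothetical).
scratch=$(mktemp -d)
mkdir -p "$scratch/data/pdb"
touch "$scratch/data/notes.txt" "$scratch/data/.hidden"

cd "$scratch/data/pdb"
cd ..                          # relative path: go up to the parent
parent_via_dotdot=$(pwd)

cd "$scratch/data"             # absolute path to the same place
parent_via_absolute=$(pwd)

visible=$(ls | xargs)          # names starting with . are left out
everything=$(ls -a | xargs)    # -a also lists ., .. and .hidden
echo "ls:    $visible"
echo "ls -a: $everything"
```

Both `cd` commands end up in the same directory, and only `ls -a` shows `.hidden`.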
# Nelle going to do some data analysis
north-pacific-gyre # Origin of the data
cd north-pacific-gyre
2012-07-03 # Good for ordering data when listing
# Show tab completion
# Use of mkdir -p
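A possible demonstration of `mkdir -p`, sketched in a scratch directory so nothing in data-shell is touched. The directory names are borrowed from the notes above; the scratch location is an assumption.

```shell
scratch=$(mktemp -d)
cd "$scratch"

# Without -p this would fail, because the parent directory
# north-pacific-gyre does not exist yet.
mkdir -p north-pacific-gyre/2012-07-03

# Running it again is harmless: -p does not complain about
# directories that already exist.
mkdir -p north-pacific-gyre/2012-07-03

ls -F north-pacific-gyre
```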
- Create a directory hierarchy that matches a given diagram.
- Create files in that hierarchy using an editor or by copying and renaming existing files.
- Display the contents of a directory using the command line.
- Delete specified files and/or directories.
Go to:
/Users/nelle/Desktop/data-shell
ls -F
mkdir thesis
ls -F
Good names for files and directories:
- Don't use embedded white space; spaces are used to separate arguments.
- Don't begin a name with - (commands treat such names as flags).
- Stay with letters, numbers, ., - and _.
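To show why spaces in names cause trouble, something like the following could be run in a scratch directory. It is only a sketch; the names `my` and `thesis` are illustrative.

```shell
scratch=$(mktemp -d)
cd "$scratch"

mkdir my thesis      # two arguments: makes "my" AND "thesis"
mkdir "my thesis"    # quoted: one directory with a space in its name

ls -F                # three directories are listed
```

Every later command that touches `my thesis` needs the quotes too, which is why the advice is to avoid spaces in the first place.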
cd thesis
nano draft.txt # open -a textedit draft.txt
# Show how to get out of vi, emacs
# Control keys in nano
ls
rm draft.txt # Not recoverable
ls
nano draft.txt
ls
cd ..
rm thesis # oops
rm -r thesis # Dangerous command
rm -r -f thesis # Incredibly DANGEROUS command
rm -r -i thesis # Safer version
mkdir thesis
nano thesis/draft.txt
ls thesis
mv thesis/draft.txt thesis/quotes.txt # Can clobber files,
# can use mv -i (interactive)
ls thesis
mv thesis/quotes.txt . # move to the current directory
ls thesis
ls quotes.txt
cp quotes.txt thesis/quotations.txt
ls quotes.txt thesis/quotations.txt
rm quotes.txt
ls quotes.txt thesis/quotations.txt
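Two of the risks mentioned in the comments above (mv clobbering a file, and rm needing -r for directories) can be shown without touching real data. A minimal sketch in a scratch directory; the file contents are made up for the demonstration.

```shell
scratch=$(mktemp -d)
cd "$scratch"

echo "draft text"  > draft.txt
echo "quotes text" > quotes.txt

# mv overwrites an existing target without asking.
mv draft.txt quotes.txt
quotes_now=$(cat quotes.txt)
echo "quotes.txt now contains: $quotes_now"

# rm without -r refuses to delete a directory.
mkdir thesis
if rm thesis 2>/dev/null; then
    plain_rm="removed"
else
    plain_rm="refused"
fi
echo "rm thesis: $plain_rm"

rm -r thesis    # -r removes the directory and everything in it
```

The original contents of quotes.txt are gone after the mv, which is the case `mv -i` guards against when used interactively.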
Use extensions to provide information about the contents of the files.
- Redirect a command’s output to a file.
- Process a file instead of keyboard input using redirection.
- Construct command pipelines with two or more stages.
- Explain what usually happens if a program or pipeline isn’t given any input to process.
- Explain Unix’s ‘small pieces, loosely joined’ philosophy.
Go to the data-shell/molecules directory.
ls molecules # Protein data bank files
cd molecules
wc *.pdb # Wildcards
Wildcards:
* matches anything (zero or more characters)
? matches exactly one character
What will the following return?
ls ??t*
ls [ce]*
# also ...
ls *.{txt,pdf}
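One way to check the answers is to use `echo`, which prints exactly what the shell expands a pattern to. The sketch below recreates a set of empty .pdb files in a scratch directory; apart from octane.pdb, the file names are assumed stand-ins for the contents of the molecules directory.

```shell
scratch=$(mktemp -d)
cd "$scratch"
# Assumed stand-ins for the files in the molecules directory.
touch cubane.pdb ethane.pdb methane.pdb octane.pdb pentane.pdb propane.pdb

# echo shows exactly what the shell expands each pattern to.
third_is_t=$(echo ??t*)        # any two characters, then t, then anything
starts_c_or_e=$(echo [ce]*)    # first character is c or e
echo "??t*  -> $third_is_t"
echo "[ce]* -> $starts_c_or_e"
```

With these six files, `??t*` expands to methane.pdb and octane.pdb, and `[ce]*` expands to cubane.pdb and ethane.pdb. The shell does the expansion before ls ever runs.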
# Can run
wc -l *.pdb # wc -c and wc -w
wc -l *.pdb > lengths.txt
ls lengths.txt
cat lengths.txt
less lengths.txt
sort -n lengths.txt
sort -n lengths.txt > sorted-lengths.txt
# Do not redirect to self
head -n 1 sorted-lengths.txt
sort -n lengths.txt | head -n 1 # pipes - remove intermediate files
wc -l *.pdb | sort -n
wc -l *.pdb | sort -n | head -n 1
# redirecting stdin
# What do you think this will do?
wc -l < octane.pdb
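The redirection and pipe commands above can be rehearsed on two tiny files with known line counts. This is a sketch only: `long.pdb` and `short.pdb` are hypothetical files, and `awk` is used here just to pull the filename out of the wc output.

```shell
scratch=$(mktemp -d)
cd "$scratch"
printf 'line\nline\nline\n' > long.pdb    # 3 lines
printf 'line\n' > short.pdb               # 1 line

# Two steps with an intermediate file...
wc -l *.pdb > lengths.txt
sort -n lengths.txt | head -n 1

# ...or one pipeline with no intermediate file.
shortest=$(wc -l *.pdb | sort -n | head -n 1 | awk '{print $2}')
echo "Shortest file: $shortest"

# Redirecting stdin: wc never sees a filename,
# so it prints only the count.
wc -l < long.pdb
```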
cd ../north-pacific-gyre/2012-07-03
wc -l *.txt
wc -l *.txt | sort -n | head -n 5 # Whoops
wc -l *.txt | sort -n | tail -n 5 # Note file with Z (missing info)
ls *Z.txt
mv NENE02018B.txt NENE02018B.txt.problem
ls *[AB].txt
ls -1 # if it gives 1 file per line, count number of files
ls -1 *.txt | wc -l
# why?
echo "10
2
19
22
6" > nums.txt
cat nums.txt
sort nums.txt # Why?
# what is the difference?
wc -l < nums.txt
wc -l nums.txt
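The two "why?" questions can be answered side by side. A small sketch using the same numbers as above, written to a scratch directory; `xargs` is used only to put each result on one line.

```shell
scratch=$(mktemp -d)
cd "$scratch"
printf '10\n2\n19\n22\n6\n' > nums.txt

# Plain sort compares character by character, so "19" comes before "2".
as_text=$(sort nums.txt | xargs)
# sort -n compares the values as numbers.
as_numbers=$(sort -n nums.txt | xargs)
echo "sort:    $as_text"
echo "sort -n: $as_numbers"

# Given a filename, wc prints the count and the name.
wc -l nums.txt
# Given only stdin, wc has no name to print, so it prints just the count.
wc -l < nums.txt
```

Plain sort gives 10 19 2 22 6, while sort -n gives 2 6 10 19 22.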
echo hello > t1.txt
echo hello >> t2.txt
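The difference between > and >> only shows up when each command is run more than once. A minimal sketch, assuming a scratch directory so no existing t1.txt or t2.txt is affected:

```shell
scratch=$(mktemp -d)
cd "$scratch"

# > replaces the file's contents each time.
echo hello > t1.txt
echo hello > t1.txt

# >> adds to the end of the file each time.
echo hello >> t2.txt
echo hello >> t2.txt

t1_lines=$(wc -l < t1.txt | tr -d ' ')
t2_lines=$(wc -l < t2.txt | tr -d ' ')
echo "t1.txt has $t1_lines line(s); t2.txt has $t2_lines line(s)"
```

t1.txt ends up with one line and t2.txt with two.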
echo "john
john
mary
peter
mary
mary
john
peter" > names.txt
uniq names.txt # only removes neighbouring dups
# How can we remove all dups?
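One possible answer, sketched with the same names as above: sort first so that duplicates become neighbours, then pipe into uniq. `xargs` is used only to print each result on one line.

```shell
scratch=$(mktemp -d)
cd "$scratch"
printf 'john\njohn\nmary\npeter\nmary\nmary\njohn\npeter\n' > names.txt

# uniq alone only collapses duplicates that are next to each other.
adjacent_only=$(uniq names.txt | xargs)
# Sorting first puts all duplicates side by side, so uniq removes them all.
all_removed=$(sort names.txt | uniq | xargs)
echo "uniq:        $adjacent_only"
echo "sort | uniq: $all_removed"
```

uniq alone leaves john mary peter mary john peter; sort | uniq leaves john mary peter.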
This work is licensed under a Creative Commons Attribution 4.0 International License.