@marioa
Last active March 27, 2019 21:51
First part of my instructor script for steps 1-4 of the software carpentry shell lesson.

Introduction to the shell - Part I

Introduction [5 mins]

Learning objectives

Computers are used to:

  • run programs
  • store data
  • communicate with each other
  • and interact with us
    • screens
    • mice/touchpads
    • keyboards

They have two kinds of interface:

  • graphical user interface, or GUI
  • command-line interface, or CLI
    • The heart of a CLI is a read-evaluate-print loop, or REPL

The shell acts as the interface between you and the OS. There are various shells; we shall be using bash (Bourne Again SHell), one of the most common.

  • Commands are terse and cryptic.
  • Can build powerful pipeline workflows.
  • Sometimes it is the only way to interact with remote computers.

Nelle's pipeline example in the introduction section.

Need to download:


http://swcarpentry.github.io/shell-novice/data/data-shell.zip

# Can use a browser to download this or

wget http://swcarpentry.github.io...

# or

curl -o data-shell.zip http://swcarpentry.github.io/...

# Unzip the file.
# Go to where the directory is.
# Change into the root directory.

Navigating Files and Directories [15 mins]

Learning objectives

  • Explain similarities and differences between files and directories.
  • Translate absolute paths into relative paths and vice versa.
  • Construct absolute and relative paths that identify specific files and directories.
  • Explain the steps in the shell’s read-run-print cycle.
  • Identify the actual command, flags, and filenames in a command-line call.
  • Demonstrate the use of tab completion, and explain its advantages.

  • file system
    • files contain information/data.
    • directories (or folders) special files that contain other files/directories.

The prompt allows you to interact with the shell. You can set the prompt using:

PS1='$ '

Shows the shell is waiting for a command. Type:

whoami

This prints your user id. When you type it, the shell:

  1. finds a program called whoami,
  2. runs that program,
  3. displays that program’s output, then
  4. displays a new prompt to tell us that it’s ready for more commands.
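
The first of these steps, finding the program, can be checked by hand. A minimal sketch, assuming a POSIX-like shell with whoami installed; the variable names whoami_path and current_user are only for illustration:

```shell
# Ask the shell where it would find the program called whoami.
whoami_path=$(command -v whoami)
echo "whoami lives at: $whoami_path"

# Run the program and capture the output the shell would normally just display.
current_user=$(whoami)
echo "you are: $current_user"
```

The path that is printed will differ between systems, which is a useful talking point in its own right.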

Can find our current working directory:

pwd

What you get back will depend on your OS:

  • Windows: C:\Users\mario ...
  • Linux: /home/mario
  • Mac: /Users/mario

The / is a directory separator (could be \ for Windows).

  • / is referred to as the root directory.
  • hierarchical file system.
  • Your home directory is the directory you are placed in when you login.
  • As a normal user you will only be able to add/modify files in your home directory or below.

You can list the files & directories using:

ls

Adding the -F flag will append a / to directories:

ls -F 

ls has lots of options, you can find out what these are:

ls --help  # does not work on a mac
man ls

You can find out more about a command with its --help option, where supported, or with man command. Not all systems provide man. Within man, type q to quit or h to list the other commands available.

We are going to assume that we are now Nelle. We will change directory to:

cd data-shell
pwd  # Difference between relative/absolute paths
ls
ls -F  # Can look in directories without changing into them
ls -F data
ls -F data/pdb
cd data
pwd # want to go back up
cd ..
pwd # Note if you type cd without arguments -> home directory
cd
pwd
cd - # only works for the previous directory
pwd  # ~ is also a shortcut for your home directory
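
The difference between absolute and relative paths can be rehearsed safely in a throwaway directory. This is only a sketch: the scratch directory comes from mktemp and the variable names are hypothetical, so nothing in the real data-shell tree is touched.

```shell
# Build a small throwaway tree that mimics data-shell/data.
scratch=$(mktemp -d)
mkdir -p "$scratch/data-shell/data"

cd "$scratch/data-shell"   # absolute path: starts from /
start=$(pwd)
cd data                    # relative path: starts from where we are now
cd ..                      # .. is the parent, so we are back where we started
back=$(pwd)

cd data
cd - > /dev/null           # cd - returns to the previous directory (and prints it)
previous=$(pwd)
echo "$start"
echo "$back"
echo "$previous"
```

All three lines printed should be the same directory.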

The .. is a special directory name pointing to the parent directory. Files whose names begin with a . are normally hidden from ls; the -a flag shows them.

ls -F -a
ls -Fa
ls ..
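
A quick way to see the effect of -a is to create one ordinary file and one dot file and compare the two listings. A sketch with made-up file names in a scratch directory:

```shell
scratch=$(mktemp -d)
cd "$scratch"
touch visible.txt .hidden.txt   # hypothetical file names

plain=$(ls)      # skips names that start with a .
all=$(ls -a)     # also lists . and .. and .hidden.txt
echo "ls shows:"
echo "$plain"
echo "ls -a shows:"
echo "$all"
```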

# Nelle is going to do some data analysis
north-pacific-gyre # Origin of the data
cd north-pacific-gyre    
2012-07-03     # Good for ordering data when listing
               # Show tab completion
               # Use of mkdir -p
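
The two notes about date-named directories and mkdir -p can be shown together. A minimal sketch in a scratch directory; the two extra dates are invented purely to show the ordering:

```shell
scratch=$(mktemp -d)
cd "$scratch"

# -p creates any missing parent directories and does not complain if they already exist.
mkdir -p north-pacific-gyre/2012-07-03
mkdir -p north-pacific-gyre/2012-07-03     # running it again is harmless
mkdir -p north-pacific-gyre/2012-06-30 north-pacific-gyre/2011-12-31   # invented dates

# Year-month-day names sort chronologically in a plain listing.
listing=$(ls north-pacific-gyre)
echo "$listing"
```

Without -p, the first mkdir would fail because north-pacific-gyre does not exist yet.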

Working With Files and Directories [15 mins]

Learning objectives

  • Create a directory hierarchy that matches a given diagram.
  • Create files in that hierarchy using an editor or by copying and renaming existing files.
  • Display the contents of a directory using the command line.
  • Delete specified files and/or directories.

Go to:

/Users/nelle/Desktop/data-shell

ls -F

mkdir thesis

ls -F

Good names for files and directories:

  • Don't use embedded whitespace; spaces are used to separate arguments.
  • Don't begin a name with -.
  • Stay with letters, numbers, ., - and _.

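
Why these rules matter is easiest to see by breaking them on purpose. The sketch below uses invented names in a scratch directory and assumes a POSIX-like shell:

```shell
scratch=$(mktemp -d)
cd "$scratch"

mkdir my thesis                          # two arguments, so two directories: "my" and "thesis"
unquoted=$(ls | wc -l | tr -d ' ')
mkdir "my thesis"                        # quoting makes it one name, but every later command must quote it too
quoted=$(ls | wc -l | tr -d ' ')
echo "entries after unquoted mkdir: $unquoted, after quoted mkdir: $quoted"

# A name starting with - looks like an option; putting ./ in front makes it unambiguous.
touch ./-draft.txt
rm ./-draft.txt
```
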
cd thesis
nano draft.txt  # open -a textedit draft.txt
                # Show how to get out of vi, emacs
                # Control keys in nano
ls
rm draft.txt # Not recoverable
ls

nano draft.txt
ls
cd ..
rm thesis   # oops

rm -r thesis     # Dangerous command
rm -r -f thesis  # Incredibly DANGEROUS command
rm -r -i thesis  # Safer version

mkdir thesis
nano thesis/draft.txt
ls thesis

mv thesis/draft.txt thesis/quotes.txt # Can clobber files,
                                      # can use mv -i (interactive)
ls thesis

mv thesis/quotes.txt .    # move to the current directory
ls thesis
ls quotes.txt

cp quotes.txt thesis/quotations.txt
ls quotes.txt thesis/quotations.txt

rm quotes.txt
ls quotes.txt thesis/quotations.txt
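
The clobbering behaviour of mv is worth demonstrating on files that do not matter. A sketch with invented file contents; the -n (no overwrite) flag is assumed to be available, as it is in GNU and BSD mv, but it is not guaranteed everywhere:

```shell
scratch=$(mktemp -d)
cd "$scratch"
echo "old quotes" > quotes.txt
echo "new draft"  > draft.txt

mv draft.txt quotes.txt              # silently replaces the existing quotes.txt
after_clobber=$(cat quotes.txt)
echo "$after_clobber"

echo "another draft" > draft.txt
mv -n draft.txt quotes.txt || true   # -n refuses to overwrite; some versions report this as a failure
after_noclobber=$(cat quotes.txt)
echo "$after_noclobber"
```

In an interactive session mv -i is usually the better habit, since it asks before overwriting instead of skipping quietly.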

Use extensions to provide information about the contents of the files.

Pipes and Filters [15 min]

Learning objectives

  • Redirect a command’s output to a file.
  • Process a file instead of keyboard input using redirection.
  • Construct command pipelines with two or more stages.
  • Explain what usually happens if a program or pipeline isn’t given any input to process.
  • Explain Unix’s ‘small pieces, loosely joined’ philosophy.

Go to the data-shell/molecules directory.

ls molecules # Protein data bank files

cd molecules
wc *.pdb      # Wildcards

Wildcards

  • * matches zero or more characters
  • ? matches exactly one character
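
The difference between the two wildcards shows up when a pattern could match nothing. A sketch using empty files in a scratch directory; the file names are assumed to mirror those in the molecules directory:

```shell
scratch=$(mktemp -d)
cd "$scratch"
touch cubane.pdb ethane.pdb methane.pdb octane.pdb pentane.pdb propane.pdb

# echo shows what the shell expands each pattern to before any command runs.
star=$(echo *ethane.pdb)       # * may match nothing at all, so ethane.pdb counts
question=$(echo ?ethane.pdb)   # ? must match exactly one character
prefix=$(echo p*)
echo "$star"
echo "$question"
echo "$prefix"
```

Using echo rather than ls makes it clear that the shell, not the command, does the expansion.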

What will the following return?

ls ??t*
ls [ce]*

# also ...
ls *.{txt,pdf}

# Can run
wc -l *.pdb    # wc -c and wc -w

wc -l *.pdb > lengths.txt
ls lengths.txt
cat lengths.txt
less lengths.txt

sort -n lengths.txt
sort -n lengths.txt > sorted-lengths.txt 
# Do not redirect to self
head -n 1 sorted-lengths.txt
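
What goes wrong when a command redirects to its own input can be shown on a copy. A sketch with invented numbers; sort -o is suggested here as one safe alternative, not as the lesson's own method:

```shell
scratch=$(mktemp -d)
cd "$scratch"
printf '20\n9\n107\n' > lengths.txt      # invented line counts

cp lengths.txt broken.txt
sort -n broken.txt > broken.txt          # the shell empties broken.txt before sort gets to read it
broken_size=$(wc -c < broken.txt | tr -d ' ')
echo "bytes left after redirecting to self: $broken_size"

sort -n -o lengths.txt lengths.txt       # -o lets sort write back to its own input safely
first=$(head -n 1 lengths.txt)
echo "smallest value: $first"
```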

sort -n lengths.txt | head -n 1 # pipes - remove intermediate files
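
To convince learners that the pipe really replaces the intermediate files, both routes can be run side by side. A sketch with three invented files of 3, 1 and 2 lines:

```shell
scratch=$(mktemp -d)
cd "$scratch"
printf 'a\nb\nc\n' > long.pdb
printf 'a\n'       > short.pdb
printf 'a\nb\n'    > middle.pdb

# Two steps with intermediate files...
wc -l *.pdb > lengths.txt
sort -n lengths.txt > sorted-lengths.txt
via_files=$(head -n 1 sorted-lengths.txt)

# ...and the same answer from one pipeline, with nothing extra left on disk.
via_pipe=$(wc -l *.pdb | sort -n | head -n 1)
echo "$via_files"
echo "$via_pipe"
```

Both lines should name short.pdb; the exact spacing in front of the count depends on the wc implementation.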

wc -l *.pdb | sort -n
wc -l *.pdb | sort -n | head -n 1

# Redirecting stdin. What do you think this will do?
wc -l < octane.pdb

cd ../north-pacific-gyre/2012-07-03
wc -l *.txt
wc -l *.txt | sort -n | head -n 5 # Whoops
wc -l *.txt | sort -n | tail -n 5 # Note file with Z (missing info)
ls *Z.txt
mv NENE02018B.txt NENE02018B.txt.problem
ls *[AB].txt

ls -1  # if it gives 1 file per line, count number of files
ls -1 *.txt| wc -l

# why?
echo "10
2
19
22
6" > nums.txt
cat nums.txt
sort nums.txt    # Why?
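
The answer to the "why?" is that sort compares text unless told otherwise. A sketch that recreates nums.txt in a scratch directory and runs both forms:

```shell
scratch=$(mktemp -d)
cd "$scratch"
printf '10\n2\n19\n22\n6\n' > nums.txt

as_text=$(sort nums.txt)         # compares character by character, so "10" comes before "2"
as_numbers=$(sort -n nums.txt)   # -n compares the numeric values
echo "sort:"
echo "$as_text"
echo "sort -n:"
echo "$as_numbers"
```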

# what is the difference?
wc -l < nums.txt
wc -l nums.txt
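
The difference is who opens the file. A sketch, again recreating nums.txt in a scratch directory so it runs on its own:

```shell
scratch=$(mktemp -d)
cd "$scratch"
printf '10\n2\n19\n22\n6\n' > nums.txt

with_name=$(wc -l nums.txt)        # wc opens the file itself, so it knows the name and prints it
without_name=$(wc -l < nums.txt)   # the shell opens the file; wc only sees unnamed standard input
echo "$with_name"
echo "$without_name"
```

The second form is handy in scripts when only the number is wanted.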

echo hello > t1.txt    # > overwrites the file each time
echo hello >> t2.txt   # >> appends to the file; run both twice and compare
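
Running each command twice makes the difference between the two operators visible. A sketch in a scratch directory:

```shell
scratch=$(mktemp -d)
cd "$scratch"

echo hello > t1.txt     # > starts the file afresh each time
echo hello > t1.txt
echo hello >> t2.txt    # >> adds to the end, creating the file if needed
echo hello >> t2.txt

t1_lines=$(wc -l < t1.txt | tr -d ' ')
t2_lines=$(wc -l < t2.txt | tr -d ' ')
echo "t1.txt: $t1_lines line(s), t2.txt: $t2_lines line(s)"
```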

echo "john
john
mary
peter
mary
mary
john
peter" > names.txt
uniq names.txt # only removes neighbouring dups
# How can we remove all dups?
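
One possible answer, sketched here for the instructor: sort first so that every duplicate becomes a neighbour, then pass the result to uniq. The file is recreated in a scratch directory so the sketch runs on its own.

```shell
scratch=$(mktemp -d)
cd "$scratch"
printf 'john\njohn\nmary\npeter\nmary\nmary\njohn\npeter\n' > names.txt

neighbours_only=$(uniq names.txt)      # still six lines: only adjacent repeats collapse
all_removed=$(sort names.txt | uniq)   # sorting first makes every duplicate adjacent
counted=$(sort names.txt | uniq -c)    # -c also reports how often each name occurred
echo "$all_removed"
echo "$counted"
```

sort -u names.txt gives the same three names in one step, but the pipeline fits the theme of this section better.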

Creative Commons Licence
This work is licensed under a Creative Commons Attribution 4.0 International License.