@douglatornell
Last active December 23, 2015 22:19
{
"metadata": {
"name": ""
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#Automation - Shell Scripts"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##Learning Goals\n",
"- Write a series of commands as a shell script\n",
"- Run shell scripts and with and without execute permissions set on them\n",
"- Use variables and loop constructs in shell scripts\n",
"- Explain the use of the `at` and `cron` commands for deferred script execution"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##Shell Script Basics\n",
"\n",
"A shell script is basically just a file containing a collections of commands that you\n",
"could have typed at the command line one after the other.\n",
"`bash` has some of the features of a programming language like\n",
"variables, loops, and conditionals,\n",
"but the syntax is quite different from Python's.\n",
"\n",
"A simple `bash` script:\n",
"\n",
"```bash\n",
"#!/usr/bin/env bash\n",
"\n",
"# Hello bash\n",
"\n",
"echo hello world!\n",
"echo from bash\n",
"echo\n",
"echo here is the contents of your directory:\n",
"ls\n",
"```\n",
"\n",
"The first line is a special command called a \"shebang\"\n",
"(In the mists of time of Unnix history `#` was called \"sharp\" and `!` was called \"bang\").\n",
"It tells the operating system to use the program `/usr/bin/env` for find `bash`\n",
"and use it to run the commands in the file.\n",
"\n",
"Comment lines in `bash` start with `#`.\n",
"That's why,\n",
"if your operating system doesn't know what to do with the shebang command on the 1st line\n",
"it ignores it.\n",
"\n",
"The rest of the commands do exactly what they would do if you typed them in order."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"###Running Shell Scripts and Permissions\n",
"\n",
"If you save the commands above in a file\n",
"(let's call it `simple.sh`)\n",
"you can execute the script at the command line with:\n",
"```bash\n",
"$ source simple.sh\n",
"```\n",
"or\n",
"\n",
"```bash\n",
"$ . simple.sh\n",
"```\n",
"\n",
"`.` and `source` are synonyms.\n",
"\n",
"Using `ls -l simple.sh` you will probably see that its permissions are something like:\n",
"\n",
"```bash\n",
"-rw-r--r-- 1 doug staff 31 24 Sep 11:30 simple.sh\n",
"```\n",
"\n",
"You can change the permissions to make the script executable with:\n",
"\n",
"```bash\n",
"$ chmod +x simple.sh\n",
"$ ls -l simple.sh\n",
"-rwxr-xr-x 1 doug staff 31 24 Sep 11:30 simple.sh\n",
"```\n",
"\n",
"and,\n",
"having done so,\n",
"run the script with:\n",
"\n",
"```bash\n",
"$ ./simple.sh\n",
"```\n",
"\n",
"It's a security feature of `bash` that your current directory is not included in `PATH`,\n",
"so,\n",
"you have to prefix the script name with `./` to tell bash to run it from the current\n",
"directory.\n",
"If you put a shell script in a directory that is included in your `PATH` you can run\n",
"it by name,\n",
"just like any other command.\n",
"By convention,\n",
"people often put general purpose scripts that they have made executable for frequent use\n",
"in `$HOME/bin/` or `~/bin/` and add that directory to their `PATH` \n",
"in their `$HOME/.bashrc` file with:\n",
"\n",
"```bash\n",
"export PATH=$HOME/bin:$PATH\n",
"```"
]
},
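{
"cell_type": "markdown",
"metadata": {},
"source": [
"One practical difference between the two ways of running a script is worth seeing:\n",
"`source` (or `.`) runs the commands in your current shell,\n",
"while running the file with `bash` (or as `./simple.sh`) starts a separate child process.\n",
"The following is a minimal sketch of that difference;\n",
"the script name `set_greeting.sh` and the variable `GREETING` are made up for illustration:\n",
"\n",
"```bash\n",
"#!/usr/bin/env bash\n",
"\n",
"# Make sure the (hypothetical) variable is not already set\n",
"unset GREETING\n",
"\n",
"# Create a one-line throwaway script in a temporary directory\n",
"WORKDIR=$(mktemp -d)\n",
"echo 'GREETING=hello' > \"$WORKDIR/set_greeting.sh\"\n",
"\n",
"# Running the script starts a child process,\n",
"# so GREETING is still not set in this shell afterwards\n",
"bash \"$WORKDIR/set_greeting.sh\"\n",
"AFTER_RUN=${GREETING:-unset}\n",
"\n",
"# Sourcing the script (`.` is a synonym for `source`) runs it in this shell,\n",
"# so GREETING is set afterwards\n",
". \"$WORKDIR/set_greeting.sh\"\n",
"AFTER_SOURCE=${GREETING:-unset}\n",
"\n",
"echo \"after running: $AFTER_RUN\"\n",
"echo \"after sourcing: $AFTER_SOURCE\"\n",
"\n",
"# Clean up the temporary directory\n",
"rm -r \"$WORKDIR\"\n",
"```\n",
"\n",
"This is why settings such as the `PATH` change above go in a file that the shell sources,\n",
"rather than in a script that you execute."
]
},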
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##Shell Variables and Loops\n",
"\n",
"Variables are assigned in `bash` using `=` \n",
"and their values are obtained by pre-fixing their names with `$`:\n",
"\n",
"```bash\n",
"#!/usr/bin/env bash\n",
"\n",
"FOO=bar\n",
"echo $FOO\n",
"\n",
"RAZZLE=\"dazzle do\"\n",
"echo $RAZZLE\n",
"```\n",
"\n",
"Note that spaces are not allowed around `=`\n",
"and values that contain spaces must be quoted."
]
},
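{
"cell_type": "markdown",
"metadata": {},
"source": [
"Quoting matters when a variable is used as well as when it is assigned.\n",
"The sketch below is an illustration only:\n",
"the helper function `count_args` and the variable `MESSAGE` are made-up names,\n",
"and the function exists just to show how many separate arguments a command receives.\n",
"\n",
"```bash\n",
"#!/usr/bin/env bash\n",
"\n",
"FOO=bar\n",
"RAZZLE=\"dazzle do\"\n",
"\n",
"# Braces mark where a variable name ends when other text follows it\n",
"MESSAGE=\"${FOO}_${RAZZLE}\"\n",
"echo \"$MESSAGE\"\n",
"\n",
"# Hypothetical helper: print the number of arguments it was given\n",
"count_args() { echo $#; }\n",
"\n",
"# Unquoted, the value is split at the space into two arguments\n",
"UNQUOTED=$(count_args $RAZZLE)\n",
"\n",
"# Quoted, the value is passed as a single argument\n",
"QUOTED=$(count_args \"$RAZZLE\")\n",
"\n",
"echo \"unquoted: $UNQUOTED, quoted: $QUOTED\"\n",
"```\n",
"\n",
"Quoting a variable where it is used is a reasonable default\n",
"unless you specifically want the value split into several arguments."
]
},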
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Exercise:**\n",
"Write a shell script to run `climate_data.py` to download a climate data file.\n",
"Use shell variables to hold the command line arguments.\n",
"Run the script and confirmed that it worked."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `for` loop in `bash` allows us to iterate over a collection of argument values.\n",
"The arguments are given in a white-space separated list\n",
"and the loop syntax it:\n",
"\n",
"```bash\n",
"#!/usr/bin/env bash\n",
"\n",
"START_DATES=\"2013-08 2013-09\"\n",
"DATA_FREQ=\"hourly\"\n",
"\n",
"for START_DATE in $START_DATES\n",
"do\n",
" python climate_data.py $START_DATE $DATA_FREQ\n",
"done\n",
"```"
]
},
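{
"cell_type": "markdown",
"metadata": {},
"source": [
"Loops can be nested to cover every combination of two lists.\n",
"The following sketch assumes a second list, `DATA_FREQS`, and a counter, `COUNT`,\n",
"neither of which is part of the script above.\n",
"It uses `echo` to print each command instead of running it,\n",
"which is a handy way to check a loop before letting it do real work:\n",
"\n",
"```bash\n",
"#!/usr/bin/env bash\n",
"\n",
"START_DATES=\"2013-08 2013-09\"\n",
"DATA_FREQS=\"hourly daily\"\n",
"COUNT=0\n",
"\n",
"for START_DATE in $START_DATES\n",
"do\n",
"    for DATA_FREQ in $DATA_FREQS\n",
"    do\n",
"        # Dry run: print the command rather than executing it\n",
"        echo python climate_data.py $START_DATE $DATA_FREQ\n",
"        # Arithmetic expansion adds one to the counter\n",
"        COUNT=$((COUNT + 1))\n",
"    done\n",
"done\n",
"\n",
"echo \"$COUNT commands would be run\"\n",
"```\n",
"\n",
"Removing the `echo` in front of `python` turns the dry run into the real thing."
]
},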
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If we decided instead that we wanted to store the command line argument pairs\n",
"for `climate_data.py` in a file\n",
"(let's call it `data_wanted.txt`),\n",
"on pair per line like:\n",
"\n",
" 2010-08 hourly\n",
" 2010 daily\n",
"\n",
"we can iterate over the lines read from `stdin` with the script:\n",
"\n",
"```bash\n",
"#!/usr/bin/env bash\n",
"\n",
"while read LINE\n",
"do\n",
" python climate_data.py $LINE\n",
"done\n",
"```\n",
"\n",
"and run it by piping our data file into our script:\n",
"\n",
" cat data_wanted.txt | ./get_climate_data.sh"
]
},
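{
"cell_type": "markdown",
"metadata": {},
"source": [
"`read` can also split each line into several variables,\n",
"which is useful when the fields need to be handled separately.\n",
"The sketch below is a variation on the script above, not a replacement for it.\n",
"It feeds the loop from a here-document so that it is self-contained\n",
"(with a real file you would end the loop with `done < data_wanted.txt`),\n",
"it prints the commands instead of running them,\n",
"and `LAST_FREQ` is a made-up variable used only to show that values set in the loop persist:\n",
"\n",
"```bash\n",
"#!/usr/bin/env bash\n",
"\n",
"# -r stops read from treating backslashes in the data specially\n",
"while read -r START_DATE DATA_FREQ\n",
"do\n",
"    # Dry run: print the command rather than executing it\n",
"    echo python climate_data.py \"$START_DATE\" \"$DATA_FREQ\"\n",
"    LAST_FREQ=$DATA_FREQ\n",
"done <<EOF\n",
"2010-08 hourly\n",
"2010 daily\n",
"EOF\n",
"\n",
"echo \"last frequency read: $LAST_FREQ\"\n",
"```"
]
},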
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Optional shell examples for looping through files\n",
"\n",
"using Python script that accepts arg1 and outputs arg2:\n",
"\n",
"```bash\n",
"for file in *.csv\n",
"do \n",
" python SortCSV.py $file $file\".sorted_CSV\"\n",
"done\n",
"```\n",
"\n",
"Do multiple types of files (if applicable)\n",
"Using ython script that accepts arg1 and outputs arg2:\n",
"\n",
"```bash\n",
"for file in *.csv *tsv\n",
"do \n",
" python SortCSV.py $file $file\".sorted_CSV\"\n",
"done\n",
"```\n",
"\n",
"To loop over files in a directory:\n",
"\n",
"```bash\n",
"Directory=/home/julia/*\n",
"for file in $Directory\n",
"do\n",
" python SortCSV.py $file $file\".sorted_CSV\"\n",
"done\n",
"```\n",
"\n",
"And finally useful for running programs at different parameters:\n",
"\n",
"```bash\n",
"for number in {1,2,3,4}\n",
"do\n",
" echo $number\n",
" program_for_clustering.py percent_clustering -$number\n",
"done \n",
"```\n",
"\n",
"More correct and succinct syntax would be: \n",
"\n",
"```bash\n",
"for number in {1..4}\n",
"do\n",
" echo $number\n",
" program_for_clustering.py percent_clustering -$number\n",
"done\n",
"```"
]
},
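{
"cell_type": "markdown",
"metadata": {},
"source": [
"Two things commonly trip up loops over files:\n",
"file names that contain spaces, and patterns that match nothing.\n",
"The following sketch shows one way to guard against both.\n",
"It creates its own throwaway files in a temporary directory\n",
"(`WORKDIR` and `COUNT` are made-up names)\n",
"and prints the `SortCSV.py` commands instead of running them:\n",
"\n",
"```bash\n",
"#!/usr/bin/env bash\n",
"\n",
"# Set up two throwaway files, one with a space in its name\n",
"WORKDIR=$(mktemp -d)\n",
"touch \"$WORKDIR/a.csv\" \"$WORKDIR/b c.csv\"\n",
"\n",
"COUNT=0\n",
"for file in \"$WORKDIR\"/*.csv\n",
"do\n",
"    # If nothing matches, the pattern itself is left in $file; skip it\n",
"    [ -e \"$file\" ] || continue\n",
"    # Dry run: quoting keeps a name with spaces together as one argument\n",
"    echo python SortCSV.py \"$file\" \"$file.sorted_CSV\"\n",
"    COUNT=$((COUNT + 1))\n",
"done\n",
"echo \"$COUNT files found\"\n",
"\n",
"# Clean up the temporary directory\n",
"rm -r \"$WORKDIR\"\n",
"```"
]
},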
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##Deferred Execution with `at`\n",
"\n",
"Both Unix-like and Windows operating systems have an `at` command that allows\n",
"a shell script to be run at a future time.\n",
"\n",
"The syntax differs significantly between Unix and Windows.\n",
"\n",
"On OS/X and Linux the command:\n",
"\n",
"```bash\n",
"at -f ./foo.sh 6pm tomorrow\n",
"```\n",
"\n",
"will run the `foo.sh` script at 18:00 the next day,\n",
"and:\n",
"\n",
"```bash\n",
"at -f ./foo.sh now + 3 hours\n",
"```\n",
"\n",
"will run the script 3 hours in the future.\n",
"\n",
"On Windows... must run Git Bash as administrator for this to work!\n",
"\n",
"```bash\n",
"at 00:00 -f ./foo.sh\n",
"```\n",
"\n",
"will run it at midnight\n",
"\n",
"```bash\n",
"at 23:00 /every:M,T,W,Th,F -f ./foo.sh\n",
"```\n",
"\n",
"Will run the file every weekday at 11pm. \n",
"\n",
"On OS/X `at` is disabled by default.\n",
"See the instructions in `man atrun` to enable it.\n",
"\n",
"On Windows the Task Scheduler service must be running to use `at`.\n",
"\n",
"Note that `at` jobs only run when computers are awake.\n",
"Scheduling an `at` job for the middle of the night on a laptop or desktop machine\n",
"that has power-saving settings that put it to sleep is unlikely to give you the\n",
"results you want."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##Scheduled Execution with `cron`\n",
"\n",
"`cron` is a Unix utility that allows jobs to be scheduled for repeated execution.\n",
"The schedule and commands to run are stored in a `crontab` file and accessed via the `crontab` command.\n",
"`man 5 crontab` provides all the details of the syntax that may be used in a `crontab` file:\n",
"Here is an example of a `crontab` that will run one script at midnight every day,\n",
"and another at 01:15 on the first day of January, April, July, and October:\n",
"\n",
" @midnight /path/to/daily_job.sh\n",
" 15 1 1 1,4,7,10 * /path/to/quarterly_job.sh\n",
" \n",
"`cron` jobs are run as though you are working at the root (`/`) of the file system,\n",
"so you need to use absolute paths or include `cd` commands in your scripts.\n",
"\n",
"The `/next` and `/every` flags of the Windows `at` command offer some `cron`-like functionality.\n",
"\n",
"The same caveats mentioned above apply to scheduled jobs -\n",
"the computer must be away to execute them."
]
},
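{
"cell_type": "markdown",
"metadata": {},
"source": [
"One way to include the `cd` command mentioned above is to start a script\n",
"by changing to the directory that contains it.\n",
"The following is a minimal sketch of such an opening for a hypothetical `daily_job.sh`;\n",
"the variable name `SCRIPT_DIR` is an assumption, not a convention that `cron` requires:\n",
"\n",
"```bash\n",
"#!/usr/bin/env bash\n",
"\n",
"# Absolute path of the directory that contains this script\n",
"SCRIPT_DIR=$(cd \"$(dirname -- \"$0\")\" && pwd)\n",
"\n",
"# Work from that directory so that relative paths in later commands\n",
"# resolve the same way no matter where the job was started from\n",
"cd \"$SCRIPT_DIR\" || exit 1\n",
"\n",
"# Record when and where the job ran\n",
"echo \"$(date '+%Y-%m-%d %H:%M') job started in $(pwd)\"\n",
"```\n",
"\n",
"Nobody is watching a terminal when a scheduled job runs,\n",
"so it is common to append its output to a log file in the `crontab` entry\n",
"(the log path here is a placeholder):\n",
"\n",
"    @midnight /path/to/daily_job.sh >> /path/to/daily_job.log 2>&1"
]
},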
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##Key Points\n",
"- Use `source` or `.` to run a shell script,\n",
"or make it executable so that you can invoke it with its path and name\n",
"- Shell variables are assigned like `FOO=bar` and used like `echo $FOO`\n",
"- `bash` provides a `for` loop construct to iterate over a list of values,\n",
"and a `while` loop to iterate over data from `stdin`\n",
"- On Linux and OS/X `cron` allows jobs to be run at scheduled times,\n",
"possibly on a repeating schedule\n",
"- On Windows the `/next` and `/every` flags of the `at` command provide some `cron`-like functions\n",
"- A computer must be awake to execute a deferred or scheduled job"
]
}
],
"metadata": {}
}
]
}