@gbishop
Last active July 18, 2022 11:43
Allow arguments to be passed to notebooks via URL or command line.
{
"cells": [
{
"metadata": {},
"cell_type": "markdown",
"source": "# Args - parse arguments for CLI and Notebooks\n\nThis is my attempt at a simple argument parser for mostly key=value pairs that I can use both on the command line and in IPython notebooks. I support query parameters in notebook URLs and a _Run_ command for notebooks.\n\nI convert this notebook to a Python script so that I can import it and use it like this.\n\n```\nimport Args\n\n# In a notebook you need a cell boundary here so that the collection of the query string\n# can happen before the Parse. Is there a better way? The ideal would be an effectively\n# synchronous call between python and the javascript in the front end but that seems\n# unlikely.\n\nargs = Args.Parse(\n    verbose=False, # optional boolean, verbose alone will set it\n    due='', # optional string\n    timeout=30, # optional int\n    limit=int, # required int\n    assignment=str, # required string\n    _config=\"{assignment}/config.json\" # read values from a config file too.\n)\n```\n\n`_config` allows me to specify the path to a JSON configuration file whose values are used unless the same argument is supplied directly. The path is interpolated with all the args using the string `format` method.\n\nThe returned value is a named tuple with attributes given by the keyword arguments and an attribute `extra_` that gathers other arguments.\n\nIn the shell I can run a Python script the normal way with args like this:\n```\npython3 script.py verbose limit=5 assignment=A2 rain.txt\n```\n\nOr I can run it as a notebook with\n```\njpn script verbose limit=5 assignment=A2 rain.txt\n```\n\nOr I can link to the notebook with\n```\nhttp://localhost:8990/notebooks/script.ipynb?limit=5&assignment=A2&rain.txt\n```\n\nI went for this minimalist key=value format because it is simple to implement and meets my needs."
},
{
"metadata": {},
"cell_type": "markdown",
"source": "## Handle URL query strings\n\nI'm injecting a bit of Javascript into the notebook so that I can grab the query string off the URL. You must put the import and the Parse call below in separate cells so that we have a chance to grab the query string off the URL."
},
{
"metadata": {
"collapsed": false,
"trusted": true
},
"cell_type": "code",
"source": "from IPython.display import Javascript, display\nimport os\n\n# I will pack the json encoded argv into this environment variable when running with nbconvert.\nevname = 'IPYTHONARGV'\n\n# try to detect that I'm in a notebook\ntry:\n    __IPYTHON__\nexcept NameError:\n    inIPython = False\nelse:\n    inIPython = True\n    if evname not in os.environ:\n        # Javascript to grab the query string\n        display(Javascript('''\n            require(['base/js/namespace'], function(Jupyter) {\n                var ready = function() {\n                    var query = window.location.search.substring(1);\n                    Jupyter.notebook.kernel.execute(\"import Args; Args._grabQS('\" + query + \"')\");\n                };\n                // If the kernel is ready when we get here the event apparently doesn't fire. They should\n                // use promises instead.\n                if (Jupyter.notebook.kernel) {\n                    ready();\n                } else {\n                    Jupyter.notebook.events.on('kernel_ready.Kernel', ready);\n                }\n            });\n        '''))\n\n_argv = []\ndef _grabQS(qs):\n    '''Convert query string into an argv like list'''\n    # do I need to urldecode or some such?\n    global _argv\n    _argv = [ kv for kv in qs.split('&') if kv ]\n",
"execution_count": 6,
"outputs": [
{
"output_type": "display_data",
"data": {
"application/javascript": "\n require(['base/js/namespace'], function(Jupyter) {\n console.log('requirejs ready');\n var ready = function() {\n console.log('ready called');\n var query = window.location.search.substring(1);\n Jupyter.notebook.kernel.execute(\"import Args; Args._grabQS('\" + query + \"')\");\n };\n // If the kernel is ready when we get here the event apparently doesn't fire. They should\n // use promises instead.\n if (Jupyter.notebook.kernel) {\n ready();\n } else {\n console.log('waiting for ready');\n Jupyter.notebook.events.on('kernel_ready.Kernel', ready);\n }\n });\n ",
"text/plain": "<IPython.core.display.Javascript object>"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "## Figure out where the args are supplied.\n\nThere are several possible sources:\n\n1. For notebooks running interactively I expect them in the query string as collected above.\n2. For notebooks running from the CLI I expect them JSON-encoded in an environment variable.\n3. For normal scripts I expect them in `sys.argv`."
},
{
"metadata": {
"collapsed": true,
"trusted": true
},
"cell_type": "code",
"source": "import sys\nimport os\nimport json\n\nif not _argv:\n    if evname in os.environ:\n        _argv = json.loads(os.environ[evname])\n    elif 'ipykernel' not in sys.argv[0]:\n        _argv = sys.argv[1:]\n    else:\n        _argv = []",
"execution_count": null,
"outputs": []
},
{
"metadata": {},
"cell_type": "markdown",
"source": "## Process the supplied argument definition and defaults"
},
{
"metadata": {
"collapsed": true,
"trusted": true
},
"cell_type": "code",
"source": "from collections import namedtuple\nimport os.path as osp\n\ndef Parse(**kwargs):\n    '''Return an object containing arguments collected from an optional configuration file,\n    then the values specified here, then values from the sys.argv, the URL query string, or an\n    environment variable.'''\n\n    args = {}\n    types = {}\n    extra = []\n    required = set()\n    supplied = set()\n\n    def addValue(k, v):\n        if v is None:\n            if k in types and types[k] is bool:\n                args[k] = True\n            elif k not in types:\n                extra.append(k)\n            else:\n                raise ValueError('{k} without value expected {t}'.format(k=k, t=types[k]))\n        elif k in types:\n            if types[k] is bool:\n                # a config file may supply a real JSON boolean rather than a string\n                args[k] = v if isinstance(v, bool) else str(v).lower() not in ['0', 'false']\n            else:\n                try:\n                    args[k] = types[k](v)\n                except ValueError:\n                    raise ValueError('{k}={v} expected {t}'.format(k=k, v=v, t=types[k]))\n        else:\n            raise ValueError('{k}={v} unexpected argument'.format(k=k, v=v))\n        supplied.add(k)\n\n    # first fill the defaults\n    for k, v in kwargs.items():\n        if k.startswith('_'):\n            continue\n        if isinstance(v, type): # a required argument\n            required.add(k)\n            types[k] = v\n        else:\n            args[k] = v\n            types[k] = type(v)\n\n    try:\n        # get values from _argv\n        for a in _argv:\n            if '=' in a:\n                # split on the first '=' only so that values may contain '='\n                k, v = a.split('=', 1)\n            else:\n                k, v = a, None\n            addValue(k, v)\n\n        # get the values from the config file but don't overwrite supplied values\n        if '_config' in kwargs:\n            path = kwargs['_config'].format(**args)\n            if osp.exists(path):\n                with open(path, 'r') as fp:\n                    for k, v in json.load(fp).items():\n                        if k not in supplied:\n                            addValue(k, v)\n\n        # make sure we got the required values\n        omitted = required - supplied\n        if omitted:\n            raise ValueError('missing required argument{} {}'.format('s'[len(omitted)==1:], omitted))\n    except ValueError:\n        # print a usage message\n        print('args:', ' '.join([ '{}={}'.format(k,t.__name__) for k,t in types.items() ]))\n        raise\n\n    attrs = sorted(args.keys())\n    attrs.append('extra_')\n    args['extra_'] = extra\n\n    return namedtuple('Args', attrs)(**args)\n",
"execution_count": null,
"outputs": []
},
{
"metadata": {},
"cell_type": "markdown",
"source": "## Run Notebooks from the CLI\n\nSometimes I want to run the notebook without loading it up in the browser. `nbconvert` does nearly everything I need but it won't pass args. This is my attempt to cure that.\n\nI'm going to allow running them two ways.\n\n1. Simply convert the notebook to a Python script and run it normally.\n2. Use `nbconvert` to execute the notebook but arrange to pass in an environment variable with the JSON-encoded argv.\n\nI run this with a tiny script like the one below. `Run` picks the first way when the script is invoked as `jpy` and the second when it is invoked as `jpn`:\n```\n#!/usr/bin/env python3\nimport Args\nArgs.Run()\n```"
},
{
"metadata": {
"collapsed": false,
"trusted": true
},
"cell_type": "code",
"source": "def Run():\n    import os\n    import os.path as osp\n    import sys\n    import subprocess\n    import json\n\n    name = osp.basename(sys.argv[0])\n    notebook = sys.argv[1]\n\n    if name == 'jpy':\n        script = notebook.replace('.ipynb', '.py')\n        if not osp.exists(script) or osp.getmtime(notebook) > osp.getmtime(script):\n            subprocess.call(['jupyter', 'nbconvert', '--log-level', '0', '--to', 'script', notebook])\n        os.execlp('python3', script, script, *sys.argv[2:])\n\n    elif name == 'jpn':\n        env = os.environ\n        env['IPYTHONARGV'] = json.dumps(sys.argv[2:])\n        os.execlpe('jupyter', 'jupyter', 'nbconvert', '--execute', '--inplace', '--to', 'notebook', notebook,\n                   '--output', notebook, env)",
"execution_count": 6,
"outputs": []
}
],
"metadata": {
"kernelspec": {
"name": "python3",
"display_name": "Python 3",
"language": "python"
},
"language_info": {
"mimetype": "text/x-python",
"nbconvert_exporter": "python",
"name": "python",
"pygments_lexer": "ipython3",
"version": "3.4.3",
"file_extension": ".py",
"codemirror_mode": {
"version": 3,
"name": "ipython"
}
},
"gist_id": "acf40b86a9bca2d571fa"
},
"nbformat": 4,
"nbformat_minor": 0
}
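
The `_grabQS` cell in the notebook asks whether the query string needs URL decoding. Since `window.location.search` is percent-encoded, a value such as `A 2` would arrive as `A%202`, so decoding each item is probably worthwhile. The following is a minimal sketch of that idea, not part of the notebook itself; the function name `decode_query_string` is hypothetical, and it assumes `+` should be read as a space, as in HTML form encoding.

```python
from urllib.parse import unquote_plus


def decode_query_string(qs):
    """Split a raw URL query string into an argv-like list.

    Hypothetical stand-in for the notebook's _grabQS: each item is split
    on '&' first and decoded afterwards, so an encoded '&' (%26) inside a
    value does not produce a spurious extra argument.
    """
    return [unquote_plus(kv) for kv in qs.split('&') if kv]


if __name__ == '__main__':
    print(decode_query_string('limit=5&assignment=A%202&rain.txt'))
```

Splitting before decoding matters here: decoding the whole string first would turn an encoded `%26` back into `&` and break the value in two.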
@brazilbean

Some great ideas here. Thank you for sharing!
