Last active
April 13, 2022 15:16
Demonstration of YAML parsers in the Jupyter Notebook http://nbviewer.ipython.org/gist/tonyfast/e4bf0b6d77225faf1b04
{ | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"# Using YAML in IPython notebooks\n", | |
"\n", | |
"## 1. Using PyYAML\n", | |
"\n", | |
"[PyYaml]() is a python module to load yaml files and strings.\n", | |
"\n", | |
"``pip install pyyaml``\n", | |
"\n", | |
"## 2. Using YamlMagic\n", | |
"\n", | |
"[Yamlmagic]() is method of creating python variables using yaml syntax in Jupyter Notebook cells. Yamlmagic uses [cell magic]().\n", | |
"\n", | |
"1. ``pip install yamlmagic``\n", | |
"1. ``%load_ext yamlmagic``" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## The Magics\n", | |
"\n", | |
"* __line magic__ starts with `%`\n", | |
"* __cell magic__ starts with `%%`" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 1, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"application/json": { | |
"cell": { | |
"!": "OSMagics", | |
"HTML": "Other", | |
"SVG": "Other", | |
"bash": "Other", | |
"capture": "ExecutionMagics", | |
"debug": "ExecutionMagics", | |
"file": "Other", | |
"html": "DisplayMagics", | |
"javascript": "DisplayMagics", | |
"latex": "DisplayMagics", | |
"perl": "Other", | |
"prun": "ExecutionMagics", | |
"pypy": "Other", | |
"python": "Other", | |
"python2": "Other", | |
"python3": "Other", | |
"ruby": "Other", | |
"script": "ScriptMagics", | |
"sh": "Other", | |
"svg": "DisplayMagics", | |
"sx": "OSMagics", | |
"system": "OSMagics", | |
"time": "ExecutionMagics", | |
"timeit": "ExecutionMagics", | |
"writefile": "OSMagics" | |
}, | |
"line": { | |
"alias": "OSMagics", | |
"alias_magic": "BasicMagics", | |
"autocall": "AutoMagics", | |
"automagic": "AutoMagics", | |
"autosave": "KernelMagics", | |
"bookmark": "OSMagics", | |
"cat": "Other", | |
"cd": "OSMagics", | |
"clear": "KernelMagics", | |
"colors": "BasicMagics", | |
"config": "ConfigMagics", | |
"connect_info": "KernelMagics", | |
"cp": "Other", | |
"debug": "ExecutionMagics", | |
"dhist": "OSMagics", | |
"dirs": "OSMagics", | |
"doctest_mode": "BasicMagics", | |
"ed": "Other", | |
"edit": "KernelMagics", | |
"env": "OSMagics", | |
"gui": "BasicMagics", | |
"hist": "Other", | |
"history": "HistoryMagics", | |
"install_default_config": "DeprecatedMagics", | |
"install_ext": "ExtensionMagics", | |
"install_profiles": "DeprecatedMagics", | |
"killbgscripts": "ScriptMagics", | |
"ldir": "Other", | |
"less": "KernelMagics", | |
"lf": "Other", | |
"lk": "Other", | |
"ll": "Other", | |
"load": "CodeMagics", | |
"load_ext": "ExtensionMagics", | |
"loadpy": "CodeMagics", | |
"logoff": "LoggingMagics", | |
"logon": "LoggingMagics", | |
"logstart": "LoggingMagics", | |
"logstate": "LoggingMagics", | |
"logstop": "LoggingMagics", | |
"ls": "Other", | |
"lsmagic": "BasicMagics", | |
"lx": "Other", | |
"macro": "ExecutionMagics", | |
"magic": "BasicMagics", | |
"man": "KernelMagics", | |
"matplotlib": "PylabMagics", | |
"mkdir": "Other", | |
"more": "KernelMagics", | |
"mv": "Other", | |
"notebook": "BasicMagics", | |
"page": "BasicMagics", | |
"pastebin": "CodeMagics", | |
"pdb": "ExecutionMagics", | |
"pdef": "NamespaceMagics", | |
"pdoc": "NamespaceMagics", | |
"pfile": "NamespaceMagics", | |
"pinfo": "NamespaceMagics", | |
"pinfo2": "NamespaceMagics", | |
"popd": "OSMagics", | |
"pprint": "BasicMagics", | |
"precision": "BasicMagics", | |
"profile": "BasicMagics", | |
"prun": "ExecutionMagics", | |
"psearch": "NamespaceMagics", | |
"psource": "NamespaceMagics", | |
"pushd": "OSMagics", | |
"pwd": "OSMagics", | |
"pycat": "OSMagics", | |
"pylab": "PylabMagics", | |
"qtconsole": "KernelMagics", | |
"quickref": "BasicMagics", | |
"recall": "HistoryMagics", | |
"rehashx": "OSMagics", | |
"reload_ext": "ExtensionMagics", | |
"rep": "Other", | |
"rerun": "HistoryMagics", | |
"reset": "NamespaceMagics", | |
"reset_selective": "NamespaceMagics", | |
"rm": "Other", | |
"rmdir": "Other", | |
"run": "ExecutionMagics", | |
"save": "CodeMagics", | |
"sc": "OSMagics", | |
"set_env": "OSMagics", | |
"store": "StoreMagics", | |
"sx": "OSMagics", | |
"system": "OSMagics", | |
"tb": "ExecutionMagics", | |
"time": "ExecutionMagics", | |
"timeit": "ExecutionMagics", | |
"unalias": "OSMagics", | |
"unload_ext": "ExtensionMagics", | |
"who": "NamespaceMagics", | |
"who_ls": "NamespaceMagics", | |
"whos": "NamespaceMagics", | |
"xdel": "NamespaceMagics", | |
"xmode": "BasicMagics" | |
} | |
}, | |
"text/plain": [ | |
"Available line magics:\n", | |
"%alias %alias_magic %autocall %automagic %autosave %bookmark %cat %cd %clear %colors %config %connect_info %cp %debug %dhist %dirs %doctest_mode %ed %edit %env %gui %hist %history %install_default_config %install_ext %install_profiles %killbgscripts %ldir %less %lf %lk %ll %load %load_ext %loadpy %logoff %logon %logstart %logstate %logstop %ls %lsmagic %lx %macro %magic %man %matplotlib %mkdir %more %mv %notebook %page %pastebin %pdb %pdef %pdoc %pfile %pinfo %pinfo2 %popd %pprint %precision %profile %prun %psearch %psource %pushd %pwd %pycat %pylab %qtconsole %quickref %recall %rehashx %reload_ext %rep %rerun %reset %reset_selective %rm %rmdir %run %save %sc %set_env %store %sx %system %tb %time %timeit %unalias %unload_ext %who %who_ls %whos %xdel %xmode\n", | |
"\n", | |
"Available cell magics:\n", | |
"%%! %%HTML %%SVG %%bash %%capture %%debug %%file %%html %%javascript %%latex %%perl %%prun %%pypy %%python %%python2 %%python3 %%ruby %%script %%sh %%svg %%sx %%system %%time %%timeit %%writefile\n", | |
"\n", | |
"Automagic is ON, % prefix IS NOT needed for line magics." | |
] | |
}, | |
"execution_count": 1, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"%lsmagic\n", | |
"# line magic" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 11, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"1\n", | |
"CPU times: user 32 µs, sys: 1 µs, total: 33 µs\n", | |
"Wall time: 58.9 µs\n" | |
] | |
} | |
], | |
"source": [ | |
"%%time\n", | |
"# cell magic\n", | |
"a = 1\n", | |
"print(a)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Using PyYaml" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 16, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"{'b': {'c': 'some string'}, 'a': 1}\n", | |
"{'c': 'some string'}\n" | |
] | |
} | |
], | |
"source": [ | |
"import yaml\n", | |
"yamlFromModule = yaml.safe_load(\"\"\"\n", | |
"a : 1\n", | |
"b: \n", | |
" c: some string\n", | |
"\"\"\")\n", | |
"print(yamlFromModule)\n", | |
"print(yamlFromModule['b'])" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 41, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"{'b': {'c': 'some string'}, 'a': 1}\n", | |
"{'c': 'some string'}\n" | |
] | |
} | |
], | |
"source": [ | |
"import yaml\n", | |
"yamlFromModuleLine = yaml.safe_load(\"a : 1\\nb:\\n c: some string\\n\")\n", | |
"print(yamlFromModuleLine)\n", | |
"print(yamlFromModuleLine['b'])" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Using yamlmagic" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 20, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"# load the magic \n", | |
"%load_ext yamlmagic" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 24, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"application/javascript": [ | |
"\n", | |
" require(\n", | |
" [\n", | |
" \"notebook/js/codecell\",\n", | |
" \"codemirror/mode/yaml/yaml\"\n", | |
" ],\n", | |
" function(cc){\n", | |
" cc.CodeCell.options_default.highlight_modes.magic_yaml = {\n", | |
" reg: [\"^%%yaml\"]\n", | |
" }\n", | |
" }\n", | |
" );\n", | |
" " | |
], | |
"text/plain": [ | |
"<IPython.core.display.Javascript object>" | |
] | |
}, | |
"metadata": {}, | |
"output_type": "display_data" | |
} | |
], | |
"source": [ | |
"%%yaml yamlFromMagic\n", | |
"a : 1\n", | |
"b: \n", | |
" c: some string" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 23, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"{'b': {'c': 'some string'}, 'a': 1}\n" | |
] | |
} | |
], | |
"source": [ | |
"print(yamlFromMagic)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Loading YAML from a remote location" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 25, | |
"metadata": { | |
"collapsed": true | |
}, | |
"outputs": [], | |
"source": [ | |
"import requests" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 26, | |
"metadata": { | |
"collapsed": true | |
}, | |
"outputs": [], | |
"source": [ | |
"remoteYamlResponse = requests.get(\n", | |
" url = \"https://raw.githubusercontent.com/ICBacon/icbacon.github.io/master/_config.yml\"\n", | |
")" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 29, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"yamlFromRemote = yaml.safe_load(remoteYamlResponse.text)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 32, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"The keys in this yaml file are: \n", | |
"github\n", | |
"markdown\n", | |
"title\n", | |
"email\n", | |
"description\n", | |
"url\n", | |
"disqus_username\n", | |
"baseurl\n", | |
"twitter_username\n" | |
] | |
} | |
], | |
"source": [ | |
"print('The keys in this yaml file are: ')\n", | |
"for key in yamlFromRemote:\n", | |
" print(key)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Hacking yaml front matter" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 42, | |
"metadata": { | |
"collapsed": true | |
}, | |
"outputs": [], | |
"source": [ | |
"urlStream = 'https://raw.githubusercontent.com/ICBacon/icbacon.github.io/master/_posts/2015-07-10-my-first-day-at-continuum.md'\n", | |
"stream = requests.get(\n", | |
" url = urlStream,\n", | |
")" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 47, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"{'title': 'Installing Miniconda3 for a Mac OS X', 'layout': 'post'}\n" | |
] | |
} | |
], | |
"source": [ | |
"# Split the string using the stream tokens ---\n", | |
"frontMatter = yaml.safe_load(stream.text.split('---')[1])\n", | |
"print(frontMatter)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 48, | |
"metadata": { | |
"collapsed": true | |
}, | |
"outputs": [], | |
"source": [ | |
"def myStreamParser( url ):\n", | |
" \"\"\"\n", | |
" Convert a remote file with yaml front matter\n", | |
" into Python variables.\n", | |
" \"\"\"\n", | |
" stream = requests.get(\n", | |
" url = url,\n", | |
" )\n", | |
" return yaml.safe_load(stream.text.split('---')[1])" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 51, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"layout\n", | |
"content1\n", | |
"im\n", | |
"flickr1\n", | |
"link1\n" | |
] | |
} | |
], | |
"source": [ | |
"yamlStreamFunc = myStreamParser(\n", | |
" 'https://raw.githubusercontent.com/tonyfast/nsf-goali/gh-pages/_posts/2014-08-06-Markdown-Example.html'\n", | |
")\n", | |
"for key in yamlStreamFunc:\n", | |
" print(key)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Yaml can load JSON!" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 53, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"True\n" | |
] | |
} | |
], | |
"source": [ | |
"remoteJSON = requests.get(\n", | |
" url='https://api.github.com/users/icbacon/events'\n", | |
")\n", | |
"print(yaml.safe_load(remoteJSON.text) == remoteJSON.json())" | |
] | |
} | |
], | |
"metadata": { | |
"kernelspec": { | |
"display_name": "Python 3", | |
"language": "python", | |
"name": "python3" | |
}, | |
"language_info": { | |
"codemirror_mode": { | |
"name": "ipython", | |
"version": 3 | |
}, | |
"file_extension": ".py", | |
"mimetype": "text/x-python", | |
"name": "python", | |
"nbconvert_exporter": "python", | |
"pygments_lexer": "ipython3", | |
"version": "3.4.3" | |
} | |
}, | |
"nbformat": 4, | |
"nbformat_minor": 0 | |
} |
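
The front-matter cells in the notebook above call `split('---')[1]`, which cuts the text at every `---`, including one that appears inside a value. The sketch below shows one possible alternative that only treats `---` as a delimiter when it sits on a line of its own. The helper name `split_front_matter` and the sample text are hypothetical and are not part of the notebook.

```python
def split_front_matter(text):
    """Return (front_matter, body) for text that may open with a '---' block.

    Hypothetical helper, not part of the notebook. If the text does not
    start with a '---' line, or the block is never closed, front_matter
    is None and the text is returned unchanged as the body.
    """
    lines = text.splitlines()
    # Front matter has to begin on the very first line.
    if not lines or lines[0].strip() != '---':
        return None, text
    # Search for the closing delimiter on a line of its own.
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == '---':
            front_matter = '\n'.join(lines[1:index])
            body = '\n'.join(lines[index + 1:])
            return front_matter, body
    return None, text


# Made-up sample: '---' also appears inside a value and in the body.
sample = """---
layout: post
title: Notes --- and dashes
---
Body text with a --- horizontal rule.
"""

front_matter, body = split_front_matter(sample)
print(front_matter)
print(body)
```

When `front_matter` is not `None`, it can be passed to `yaml.safe_load` in the same way the notebook passes the result of `split('---')[1]`.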