@tonyfast
Last active April 13, 2022 15:16
Demonstration of YAML parsers in the Jupyter Notebook http://nbviewer.ipython.org/gist/tonyfast/e4bf0b6d77225faf1b04
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Using YAML in IPython notebooks\n",
"\n",
"## 1. Using PyYAML\n",
"\n",
"PyYAML is a Python module for loading YAML files and strings.\n",
"\n",
"``pip install pyyaml``\n",
"\n",
"## 2. Using yamlmagic\n",
"\n",
"yamlmagic creates Python variables from YAML syntax in Jupyter Notebook cells. It is implemented as a cell magic.\n",
"\n",
"1. ``pip install yamlmagic``\n",
"1. ``%load_ext yamlmagic``"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## The Magics\n",
"\n",
"* A __line magic__ starts with `%` and applies to the rest of that line.\n",
"* A __cell magic__ starts with `%%` and applies to the entire cell."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"application/json": {
"cell": {
"!": "OSMagics",
"HTML": "Other",
"SVG": "Other",
"bash": "Other",
"capture": "ExecutionMagics",
"debug": "ExecutionMagics",
"file": "Other",
"html": "DisplayMagics",
"javascript": "DisplayMagics",
"latex": "DisplayMagics",
"perl": "Other",
"prun": "ExecutionMagics",
"pypy": "Other",
"python": "Other",
"python2": "Other",
"python3": "Other",
"ruby": "Other",
"script": "ScriptMagics",
"sh": "Other",
"svg": "DisplayMagics",
"sx": "OSMagics",
"system": "OSMagics",
"time": "ExecutionMagics",
"timeit": "ExecutionMagics",
"writefile": "OSMagics"
},
"line": {
"alias": "OSMagics",
"alias_magic": "BasicMagics",
"autocall": "AutoMagics",
"automagic": "AutoMagics",
"autosave": "KernelMagics",
"bookmark": "OSMagics",
"cat": "Other",
"cd": "OSMagics",
"clear": "KernelMagics",
"colors": "BasicMagics",
"config": "ConfigMagics",
"connect_info": "KernelMagics",
"cp": "Other",
"debug": "ExecutionMagics",
"dhist": "OSMagics",
"dirs": "OSMagics",
"doctest_mode": "BasicMagics",
"ed": "Other",
"edit": "KernelMagics",
"env": "OSMagics",
"gui": "BasicMagics",
"hist": "Other",
"history": "HistoryMagics",
"install_default_config": "DeprecatedMagics",
"install_ext": "ExtensionMagics",
"install_profiles": "DeprecatedMagics",
"killbgscripts": "ScriptMagics",
"ldir": "Other",
"less": "KernelMagics",
"lf": "Other",
"lk": "Other",
"ll": "Other",
"load": "CodeMagics",
"load_ext": "ExtensionMagics",
"loadpy": "CodeMagics",
"logoff": "LoggingMagics",
"logon": "LoggingMagics",
"logstart": "LoggingMagics",
"logstate": "LoggingMagics",
"logstop": "LoggingMagics",
"ls": "Other",
"lsmagic": "BasicMagics",
"lx": "Other",
"macro": "ExecutionMagics",
"magic": "BasicMagics",
"man": "KernelMagics",
"matplotlib": "PylabMagics",
"mkdir": "Other",
"more": "KernelMagics",
"mv": "Other",
"notebook": "BasicMagics",
"page": "BasicMagics",
"pastebin": "CodeMagics",
"pdb": "ExecutionMagics",
"pdef": "NamespaceMagics",
"pdoc": "NamespaceMagics",
"pfile": "NamespaceMagics",
"pinfo": "NamespaceMagics",
"pinfo2": "NamespaceMagics",
"popd": "OSMagics",
"pprint": "BasicMagics",
"precision": "BasicMagics",
"profile": "BasicMagics",
"prun": "ExecutionMagics",
"psearch": "NamespaceMagics",
"psource": "NamespaceMagics",
"pushd": "OSMagics",
"pwd": "OSMagics",
"pycat": "OSMagics",
"pylab": "PylabMagics",
"qtconsole": "KernelMagics",
"quickref": "BasicMagics",
"recall": "HistoryMagics",
"rehashx": "OSMagics",
"reload_ext": "ExtensionMagics",
"rep": "Other",
"rerun": "HistoryMagics",
"reset": "NamespaceMagics",
"reset_selective": "NamespaceMagics",
"rm": "Other",
"rmdir": "Other",
"run": "ExecutionMagics",
"save": "CodeMagics",
"sc": "OSMagics",
"set_env": "OSMagics",
"store": "StoreMagics",
"sx": "OSMagics",
"system": "OSMagics",
"tb": "ExecutionMagics",
"time": "ExecutionMagics",
"timeit": "ExecutionMagics",
"unalias": "OSMagics",
"unload_ext": "ExtensionMagics",
"who": "NamespaceMagics",
"who_ls": "NamespaceMagics",
"whos": "NamespaceMagics",
"xdel": "NamespaceMagics",
"xmode": "BasicMagics"
}
},
"text/plain": [
"Available line magics:\n",
"%alias %alias_magic %autocall %automagic %autosave %bookmark %cat %cd %clear %colors %config %connect_info %cp %debug %dhist %dirs %doctest_mode %ed %edit %env %gui %hist %history %install_default_config %install_ext %install_profiles %killbgscripts %ldir %less %lf %lk %ll %load %load_ext %loadpy %logoff %logon %logstart %logstate %logstop %ls %lsmagic %lx %macro %magic %man %matplotlib %mkdir %more %mv %notebook %page %pastebin %pdb %pdef %pdoc %pfile %pinfo %pinfo2 %popd %pprint %precision %profile %prun %psearch %psource %pushd %pwd %pycat %pylab %qtconsole %quickref %recall %rehashx %reload_ext %rep %rerun %reset %reset_selective %rm %rmdir %run %save %sc %set_env %store %sx %system %tb %time %timeit %unalias %unload_ext %who %who_ls %whos %xdel %xmode\n",
"\n",
"Available cell magics:\n",
"%%! %%HTML %%SVG %%bash %%capture %%debug %%file %%html %%javascript %%latex %%perl %%prun %%pypy %%python %%python2 %%python3 %%ruby %%script %%sh %%svg %%sx %%system %%time %%timeit %%writefile\n",
"\n",
"Automagic is ON, % prefix IS NOT needed for line magics."
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%lsmagic\n",
"# line magic"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1\n",
"CPU times: user 32 µs, sys: 1 µs, total: 33 µs\n",
"Wall time: 58.9 µs\n"
]
}
],
"source": [
"%%time\n",
"# cell magic\n",
"a = 1\n",
"print(a)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Using PyYAML"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'b': {'c': 'some string'}, 'a': 1}\n",
"{'c': 'some string'}\n"
]
}
],
"source": [
"import yaml\n",
"yamlFromModule = yaml.safe_load(\"\"\"\n",
"a : 1\n",
"b: \n",
" c: some string\n",
"\"\"\")\n",
"print(yamlFromModule)\n",
"print(yamlFromModule['b'])"
]
},
{
"cell_type": "code",
"execution_count": 41,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'b': {'c': 'some string'}, 'a': 1}\n",
"{'c': 'some string'}\n"
]
}
],
"source": [
"import yaml\n",
"yamlFromModuleLine = yaml.safe_load(\"a : 1\\nb:\\n c: some string\\n\")\n",
"print(yamlFromModuleLine)\n",
"print(yamlFromModuleLine['b'])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Using yamlmagic"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# load the magic \n",
"%load_ext yamlmagic"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"application/javascript": [
"\n",
" require(\n",
" [\n",
" \"notebook/js/codecell\",\n",
" \"codemirror/mode/yaml/yaml\"\n",
" ],\n",
" function(cc){\n",
" cc.CodeCell.options_default.highlight_modes.magic_yaml = {\n",
" reg: [\"^%%yaml\"]\n",
" }\n",
" }\n",
" );\n",
" "
],
"text/plain": [
"<IPython.core.display.Javascript object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%%yaml yamlFromMagic\n",
"a : 1\n",
"b: \n",
" c: some string"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'b': {'c': 'some string'}, 'a': 1}\n"
]
}
],
"source": [
"print(yamlFromMagic)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Loading YAML from a remote location"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import requests"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"remoteYamlResponse = requests.get(\n",
"    url=\"https://raw.githubusercontent.com/ICBacon/icbacon.github.io/master/_config.yml\"\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"yamlFromRemote = yaml.safe_load(remoteYamlResponse.text)"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The keys in this yaml file are: \n",
"github\n",
"markdown\n",
"title\n",
"email\n",
"description\n",
"url\n",
"disqus_username\n",
"baseurl\n",
"twitter_username\n"
]
}
],
"source": [
"print('The keys in this yaml file are: ')\n",
"for key in yamlFromRemote:\n",
" print(key)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Hacking YAML front matter"
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"urlStream = 'https://raw.githubusercontent.com/ICBacon/icbacon.github.io/master/_posts/2015-07-10-my-first-day-at-continuum.md'\n",
"stream = requests.get(url=urlStream)"
]
},
{
"cell_type": "code",
"execution_count": 47,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'title': 'Installing Miniconda3 for a Mac OS X', 'layout': 'post'}\n"
]
}
],
"source": [
"# Front matter is the text between the first two '---' markers\n",
"frontMatter = yaml.safe_load(stream.text.split('---')[1])\n",
"print(frontMatter)"
]
},
{
"cell_type": "code",
"execution_count": 48,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"def myStreamParser(url):\n",
"    \"\"\"\n",
"    Fetch a remote file and parse its YAML front matter\n",
"    into a Python object.\n",
"    \"\"\"\n",
"    stream = requests.get(url=url)\n",
"    # Front matter is the text between the first two '---' markers\n",
"    return yaml.safe_load(stream.text.split('---')[1])"
]
},
{
"cell_type": "code",
"execution_count": 51,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"layout\n",
"content1\n",
"im\n",
"flickr1\n",
"link1\n"
]
}
],
"source": [
"yamlStreamFunc = myStreamParser(\n",
" 'https://raw.githubusercontent.com/tonyfast/nsf-goali/gh-pages/_posts/2014-08-06-Markdown-Example.html'\n",
")\n",
"for key in yamlStreamFunc:\n",
" print(key)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### YAML can load JSON!\n",
"\n",
"JSON is close to a subset of YAML, so `yaml.safe_load` parses most JSON documents."
]
},
{
"cell_type": "code",
"execution_count": 53,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"True\n"
]
}
],
"source": [
"remoteJSON = requests.get(\n",
" url='https://api.github.com/users/icbacon/events'\n",
")\n",
"print(yaml.safe_load(remoteJSON.text) == remoteJSON.json())"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.4.3"
}
},
"nbformat": 4,
"nbformat_minor": 0
}