Skip to content

Instantly share code, notes, and snippets.

@noteed
Last active January 9, 2023 21:53
Show Gist options
  • Star 1 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save noteed/8af266e250f485ad078df65525f47ba8 to your computer and use it in GitHub Desktop.
Save noteed/8af266e250f485ad078df65525f47ba8 to your computer and use it in GitHub Desktop.
Pandoc, YAML, JSON, CSV, Sqlite
hosts.csv
hosts.db

Pandoc, YAML, JSON, CSV, SQLite

This gists shows a little idea: use Pandoc to format data into HTML, or any other output format supported by Pandoc. This uses the ability of Pandoc to uses templates, and to receive data as YAML either from a separate file, or within a metadata block contained in the document itself..

It also shows a few scripts to use other data formats than YAML and extract metadata blocks as JSON. Note that multiple metadata blocks can be present within a single document.

See also this Gist to see a small Scotty-based web server using full-text search against a SQLite database containing Markdown files.

Self-contained Pandoc document

The document.md file contains data in its metadata YAML block and is its own template:

$ pandoc --template document.md document.md
---
title: The document title
host:
- name: host-1
  ip: 172.17.0.1
- name: host-2
  ip: 172.17.0.2
---

Some hosts:

- host-1: 172.17.0.1
- host-2: 172.17.0.2

The result is a normal Markdown document that can be processed with Pandoc as usual, e.g. to produce an HTML page.

Separate data

The data can also come from an external file (here, hosts.yml):

$ pandoc --template document.md hosts.yml document.md
---
title: The document title
host:
- name: host-1
  ip: 172.17.0.1
- name: host-2
  ip: 172.17.0.2
---

Some hosts:

- host-1: 172.17.0.1
- host-2: 172.17.0.2
- host-3: 172.17.0.3

Note: here the hosts key in the metadata provided by document.md is not used because Pandoc is left-biased when merging multiple metadata blocks.

Other formats

Several scripts are provided to use other data format. For instance:

  • From hosts.json (one object per line) to CSV,
  • From CSV to a SQLite database,
  • From a SQLite database to JSON again (a list of objects),
  • From a SQLite database to YAML (using the previous script).

This shows that the data rendered by Pandoc can come from SQLite with minimal efforts.

Extract metadata block

It is also possible to extract the data "stored" within a metadata block by using a dedicated template (here, metadata.tpl):

$ pandoc --template metadata.tpl document.md
{"title":"The document title","host":[{"ip":"172.17.0.1","name":"host-1"},{"ip":"172.17.0.2","name":"host-2"}]}

TODO

Rendering JSON as a nested table (with the appropriate definitions in html.jq):

$ echo '{"a": {"b": {"c": 2}}, "d": 3}' | jq -r 'include "html"; render_as_table'
#! /usr/bin/env bash
sqlite3 hosts.db -cmd '.mode csv' '.import hosts.csv hosts'
# Demonstrate the import succeeded
sqlite3 hosts.db '.schema hosts'
sqlite3 hosts.db 'select * from hosts'
title host
The document title
name ip
host-1
172.17.0.1
name ip
host-2
172.17.0.2

Some hosts:

$for(host)$

  • $host.name$: $host.ip$ $endfor$
{ "name":"host-1", "ip":"172.17.0.1" }
{ "name":"host-2", "ip":"172.17.0.2" }
{ "name":"host-3", "ip":"172.17.0.3" }
def td: "<td>\(.)</td>";
def th: "<th>\(.)</th>";
def tr:"<tr>\(.)</tr>";
def table: "<table>\(.)</table>";
def tbody: "<tbody>\(.)</tbody>";
def thead: "<thead>\(.)</thead>";
def render_as_table:
"\(
to_entries | map(.key|th) | join("") | tr|thead
)\n\(
to_entries | map("\(
.value |
if type=="object" then
render_as_table|td
elif type=="array" then
render_as_table|td
else
td end
)") | join("") | tr | tbody
)" | table ;
#! /usr/bin/env bash
# Construct the CSV header
head -1 hosts.json | jq -r 'keys | @csv' > hosts.csv
# Extract rows
cat hosts.json | jq -r '[.[]] | @csv' >> hosts.csv
#! /usr/bin/env bash
sqlite3 hosts.db \
"select json_group_array(json_object('ip', ip, 'name', name)) as host
from (select * from hosts)" | jq .
#! /usr/bin/env bash
echo '---'
echo host:
./sqlite-to-json.sh | jq -r -f to-yaml.jq
echo '---'
def y:
(objects | to_entries[] | (.value | type) as $type |
if $type == "array" then
"\(.key):", (.value | y)
elif $type == "object" then
"\(.key):", " \(.value | y)"
else
"\(.key): \(.value)"
end
)
// (arrays | select(length > 0)[] | [y] |
" - \(.[0])", " \(.[1:][])"
)
// .
;
y
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment