Skip to content

Instantly share code, notes, and snippets.

@sgharms
Created July 25, 2018 03:01
Show Gist options
  • Star 5 You must be signed in to star a gist
  • Fork 1 You must be signed in to fork a gist
  • Save sgharms/cb9451b35dfa88543f5c62694aa07c03 to your computer and use it in GitHub Desktop.
Save sgharms/cb9451b35dfa88543f5c62694aa07c03 to your computer and use it in GitHub Desktop.
Convert Gitbook export JSON blob to Markdown
#!/usr/bin/env node
const fs = require('fs');
const processFile = process.argv[2];
const theStack = [];
let nestDepth = -1;
const applyMarks = textNode => {
let theText = textNode.text;
// Plain text
if (textNode.marks.length == 0) return theText;
let signals = textNode
.marks
.map( m => m.type );
for (let signal of signals ) {
switch (signal) {
case "italic":
theText = `_${theText}_`;
break;
case "bold":
theText = `**${theText}**`;
break;
case "code":
theText = `\`${theText}\``;
break;
}
}
return theText;
}
const processRange = (o, accum, depth) => {
accum.push(applyMarks(o));
}
const defLink = (node) => {
let t = [];
parseBlob(node.nodes, t, 0)
return `[${t.join()}](${node.data.href})`;
}
const determineKind = (o, accum, depth) => {
switch (o.kind) {
case "document":
parseBlob(o.nodes, accum, depth + 1);
break;
case "range":
processRange(o, accum, depth);
break;
case "inline":
/* Here we assume all inlines are links, probably not true in the
* general case. I'd imagine img are probably handled like this, but we
* didn't use those in our docs */
accum.push(defLink(o));
break;
case "text":
parseBlob(o.ranges, accum, depth + 1)
break;
case "block":
processBlock(o, accum, depth,
() => parseBlob(o.nodes, accum, depth + 1))
break;
}
}
const processBlock = (block, accum, depth, cb) => {
const breakline = () => accum.push("\n\n");
if (block.type.startsWith("heading-")) {
// Make sure headings "pop" vertically"
let prefix = (accum.length > 0) ? "\n" : '';
let headingLevel = parseInt(block.type.split('-').pop());
let mdHeader = prefix + "#".repeat(headingLevel);
/* Implement a custom parseHeader queue */
/* Sometimes sprintf() would be great */
let newStack = []
parseBlob(block.nodes, newStack, depth);
let body = newStack.join('');
accum.push(`${mdHeader} ${body}`);
breakline();
}
switch (block.type) {
case "paragraph":
cb();
/* Total hack to make the paragraphs gap properly when they're used as
* plain old paragraphs, but not when they're (oddly?) paragraphs nested
* inside of other list-items (wha?). I might not understand this very
* well, but easy to tweak the sources in vim from this assumption */
if (accum[accum.length - 1] !== "\n") accum.push("\n");
break;
case "list-unordered":
cb();
break;
case "list-item":
nestDepth++;
accum.push(" ".repeat(nestDepth) + "* ");
cb();
nestDepth--;
}
}
const parseBlob = (src, accum, depth) => {
// The "heart" of the recursion: document, Object of something that has an
// array that needs to be passed down, or the Array of things that need to be
// treated as Objects by this method.
if (depth < 0) parseBlob(src.document, accum, 0);
if (Array.isArray(src)) {
for (let sub_s of src) {
parseBlob(sub_s, accum, depth + 1)
}
return;
}
if (typeof(src) === "object") {
determineKind(src, accum, depth)
return;
}
throw(`The src in parseBlob was unintelligible`);
}
const renderAsMarkdown = (data) => {
parseBlob(data, theStack, -1)
console.log(theStack.join(''));
};
fs.readFile(processFile, 'utf8', (err, data) => renderAsMarkdown(JSON.parse(data)));
@RohanM
Copy link

RohanM commented Jul 22, 2022

Thanks very much for sharing this! I ended up using this gist as the starting point for a more comprehensive crawler and exporter:

https://github.com/conversation/gitbook-to-md

@sgharms
Copy link
Author

sgharms commented Jul 27, 2022 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment