@wernsey
Last active July 18, 2023 12:29
An Awk script to generate HTML documentation from source code with /* */ style comments
BEGIN {
if(!title) title = "Documentation"
print "<!DOCTYPE html>\n<html>\n<head>\n<title>" title "</title>";
print "<style><!--";
print "body {font-family:Arial, Verdana, Helvetica, sans-serif;margin-left:20px;margin-right:20px;}";
print "h1 {color:#575c91;border:none;padding:5px;}";
print "h2 {color:#575c91;border:none;padding:5px;}";
print "h3 {color:#9191c1;border:none;padding:5px;}";
print "a{padding:2px;border-radius:2px;}";
print "a:link {color: #575c91;}";
print "a:visited {color: #575c91;text-decoration:none;}";
print "a:active {background:#575c91;color:#f0f0ff;}";
print "a:hover {background:#b8b8e0;color:#f0f0ff;}";
print "code,strong {color:#575c91}";
print "pre {color:#575c91;background:#d4d4ff;border:none;border-radius:5px;padding:7px;margin-left:15px;margin-right:15px;}";
print "div.title {color:#575c91;font-weight:bold;background:#b8b8e0;border:none;border-radius:5px;padding:10px;margin:10px 5px;font-family:monospace;}";
print "div.box {background:#f0f0ff;border:none;border-radius:5px;margin:10px 2px;padding:1px;}";
print "div.inner-box {border:none;margin:5px;padding:3px;}";
print "--></style>";
print "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=utf-8\">";
print "</head>\n<body>";
}
/\/\*/ { comment = 1; }
/\*1/ { if(!comment) next; s = substr($0, index($0, "*1") + 2); print "<h1>" filter(s) "</h1>"; next;}
/\*2/ { if(!comment) next; s = substr($0, index($0, "*2") + 2); print "<h2>" filter(s) "</h2>"; next;}
/\*3/ { if(!comment) next; s = substr($0, index($0, "*3") + 2); print "<h3>" filter(s) "</h3>"; next;}
/\*@/ { if(!comment) next; s = substr($0, index($0, "*@") + 2); print "<div class=\"box\"><div class=\"title\">" filter(s) "</div><div class=\"inner-box\">"; div+=2; next;}
/\*#[ \t\r]*$/ { if(!comment) next; if(!pre) print "<br>"; next;}
/\*#/ { if(!comment) next; s = substr($0, index($0, "*#") + 2); print filter(s);}
/\*&/ { if(!comment) next; s = substr($0, index($0, "*&") + 2); print "<code>" filter(s) "</code><br>"; next;}
/\*X/ { if(!comment) next; s = substr($0, index($0, "*X") + 2); print "<p><strong>Example:</strong><code>" filter(s) "</code></p>"; next;}
/\*N/ { if(!comment) next; s = substr($0, index($0, "*N") + 2); print "<p><strong>Note:</strong><em>" filter(s) "</em></p>"; next;}
/\*\[/ { if(!comment) next; pre=1; print "<pre>"; next;}
/\*]/ { if(!comment) next; pre=0; print "</pre>"; next;}
/\*\{/ { if(!comment) next; print "<ul>"; next;}
/\*\*/ { if(!comment) next; s = substr($0, index($0, "**") + 2); print "<li>" filter(s); next;}
# Mistake on my part where *} at the beginning of a line clashes with the *} to insert the </strong>
/^[ \t]*\*}/ { if(!comment) next; print "</ul>"; next;}
/\*-/ { if(!comment) next; print "<hr size=2>"; next;}
/\*=/ { if(!comment) next; print "<hr size=5>"; next;}
/\*\// { comment=0; while(div > 0) {print "</div>"; div--;} }
END { print "</body></html>" }
function filter(ss, j, k1, k2, k3)
{
gsub(/&/, "\\&amp;", ss);
gsub(/</, "\\&lt;", ss);
gsub(/>/, "\\&gt;", ss);
gsub(/\\n[ \t\r]*$/, "<br>", ss);
gsub(/{{/, "<code>", ss);
gsub(/}}/, "</code>", ss);
gsub(/{\*/, "<strong>", ss);
gsub(/\*}/, "</strong>", ss);
gsub(/{\//, "<em>", ss);
gsub(/\/}/, "</em>", ss);
gsub(/{_/, "<u>", ss);
gsub(/_}/, "</u>", ss);
# Hyperlinks (excuse my primitive regex)
gsub(/https?:\/\/[a-zA-Z0-9._\/%~-]+/, "<a href=\"&\">&</a>", ss);
# Use a ##word to specify an anchor, eg. ##foo gets translated to <span id="foo">foo</span>
while(j = match(ss, /##[A-Za-z0-9_]+/)) {
k1 = substr(ss, 1, j - 1);
k2 = substr(ss, j + 2, RLENGTH-2);
k3 = substr(ss, j + RLENGTH);
ss = k1 "<span id=\"" k2 "\">" k2 "</span>" k3
}
# Use a ~~word to link to an anchor, eg. ~~foo gets translated to <a href="#foo">foo</a>
while(j = match(ss, /~~[A-Za-z0-9_]+/)) {
k1 = substr(ss, 1, j - 1);
k2 = substr(ss, j + 2, RLENGTH-2);
k3 = substr(ss, j + RLENGTH);
ss = k1 "<a href=\"#" k2 "\">" k2 "</a>" k3
}
gsub(/\*\//, "", ss);
return ss;
}
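
To give a feel for the markup the script expects, here is a minimal sketch that runs a cut-down version of three of its rules (`*1`, `*#` and `*&`) over a made-up comment. The sample input and the variable names `sample`, `prog` and `html` are assumptions for illustration. The sketch skips the HTML escaping, lists and boxes that the full script handles, and it writes the `{*` and `*}` patterns as bracket expressions so that they behave the same across awk implementations.

```sh
# Hypothetical input: a C comment using the *1, *# and *& markers.
sample='/*1 My Library
 *# A {*small*} helper library.
 *& int add(int a, int b);
 */'

# Cut-down sketch of three of the rules above (not the full script):
# no HTML escaping, no lists, no boxes.
prog='
/\/\*/ { comment = 1 }
/\*1/  { if (!comment) next
         print "<h1>" substr($0, index($0, "*1") + 2) "</h1>"; next }
/\*#/  { if (!comment) next
         s = substr($0, index($0, "*#") + 2)
         gsub(/[{][*]/, "<strong>", s)    # {* opens bold
         gsub(/[*][}]/, "</strong>", s)   # *} closes bold
         print s; next }
/\*&/  { if (!comment) next
         print "<code>" substr($0, index($0, "*&") + 2) "</code><br>"; next }
/\*\// { comment = 0 }
'

html=$(printf '%s\n' "$sample" | awk "$prog")
printf '%s\n' "$html"
```

Each marker is cut off together with everything before it, so the text after it keeps its leading space. With the full script you would pass the source file as input and, optionally, set the page title with awk's `-v title=...` option, since the BEGIN block falls back to "Documentation" only when `title` is unset.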
@coderofsalvation

finally something which works without installing operating-system-size toolchains :D

@wernsey
Author

wernsey commented Mar 30, 2023

Thanks for the comment.

This script is very old, though. In the meantime I've created this one, wernsey/d.awk, that uses the same idea, but allows you to write your documentation in Markdown. It is a lot more complicated than this script, but it doesn't require an operating-system-size toolchain either.

(I also describe in the alternatives section how you could use an Awk script to extract Markdown comments from your code and then create an HTML document using Markdeep, which is a much simpler Awk script, but requires the third-party markdeep.js)

@coderofsalvation

coderofsalvation commented Apr 2, 2023

Thanks. For another project (which simply needs to generate a README.md for GitHub), I've decided to refactor/extend one of your snippets.

NOTE: I've added variable expansion (so I can use the awk script recursively or include dynamically generated text or files, e.g. $(cat bar.md) as in the header comment below).
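
The expansion mentioned in the note comes down to awk's `cmd | getline`. Below is a minimal sketch of that idea on a single made-up line, assuming at most one `$(...)` per line. The names `line`, `prog`, `out` and `expanded` are illustrative only. Unlike a `sub()` call, the sketch splices the output in by position, so an `&` in the command output is not treated as "the matched text".

```sh
# Hypothetical input line; single quotes keep the shell from expanding it.
line='built by $(echo awk)'

prog='
/\$\(/ {
    cmd = $0
    sub(/^.*\$\(/, "", cmd)   # keep what follows "$("
    sub(/\).*$/, "", cmd)     # drop ")" and everything after it
    out = ""                  # getline leaves out untouched on failure
    cmd | getline out         # first line of the command output only
    close(cmd)
    # Splice by position instead of sub(), so "&" in out stays literal.
    start = index($0, "$(")
    $0 = substr($0, 1, start - 1) out substr($0, start + length(cmd) + 3)
}
{ print }
'

expanded=$(printf '%s\n' "$line" | awk "$prog")
printf '%s\n' "$expanded"
```

Every `$(...)` found in the scanned text is executed as a shell command, so this kind of expansion is only appropriate for input you trust.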

snippet

# a no-nonsense source-to-markdown generator which scans for:
#
# /**
#  * # foo
#  *
#  * this is markdown $(cat bar.md)
#  */
#
#  var foo; //  comment with 2 leading spaces is markdown too $(date)
#
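/\$\(/                   { cmd=$0;                          # handles one $(command) per line
                           gsub(/^.*\$\(/,"",cmd);          # drop everything up to and including "$("
                           gsub(/\).*/,"",cmd);             # drop ")" and the rest of the line
                           stdout="";                       # a failed getline would keep the previous value
                           cmd | getline stdout; close(cmd);  # only the first output line is kept
                           gsub(/[\\&]/,"\\\\&",stdout);    # escape \ and & so sub() inserts them literally
                           sub(/\$\(.*\)/,stdout);
                         }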
/\/\*\*/                 { doc=1; sub(/^.*\/\*/,""); }
doc && /\*\//            { doc=0;
                           sub(/[[:space:]]*\*\/.*/,"");
                           sub(/^[[:space:]]*\*[[:space:]]?/,"");
                           print
                         }
doc && /^[[:space:]]*\*/ { sub(/^[[:space:]]*\*[[:space:]]?/,""); 
                           print 
                         }
!doc && /\/\/  /         { sub(".*//  ",""); 
                           sub("# ","\n# ");
                           sub("> ","\n> ");
                           print
                         }
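
For a quick look at what the doc-comment rules produce, here is a small sketch that feeds a made-up source file through a reduced copy of them. The sample text and the names `sample`, `prog` and `md` are assumptions. The sketch leaves out the `$(...)` expansion and the heading/quote handling for `//  ` comments, and it uses `[ \t]` instead of `[[:space:]]` so that older awk implementations accept it.

```sh
# Hypothetical source file with one doc comment and one "//  " comment.
sample='/**
 * # foo
 *
 * this is markdown
 */
var foo; //  comment with 2 leading spaces'

# Reduced copy of the rules above: doc comments and "//  " comments only.
prog='
/\/\*\*/           { doc = 1; sub(/^.*\/\*/, "") }
doc && /\*\//      { doc = 0
                     sub(/[ \t]*\*\/.*/, "")      # strip the closing */
                     sub(/^[ \t]*\*[ \t]?/, "")   # strip a leading *
                     print }
doc && /^[ \t]*\*/ { sub(/^[ \t]*\*[ \t]?/, ""); print }
!doc && /\/\/  /   { sub(/.*\/\/  /, ""); print }
'

md=$(printf '%s\n' "$sample" | awk "$prog")
printf '%s\n' "$md"
```

The opening `/**` and closing `*/` lines each come out as an empty line, so the extracted Markdown starts with a blank line and has another one before the `//  ` comment text.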

Quite bizarre how much you can do with a few awk lines ♥
