@janert
Last active November 27, 2024 10:53

A Hugo Survival Guide

Hugo is a static site generator: it takes some plain-text content, marries it to a bunch of HTML templates, and produces a set of complete, static HTML pages that can be served by any generic, stand-alone web server. Simple.

Or maybe not. Hugo does a lot of things automatically, relying on conventions and implicit rules, rather than on explicit configuration. For example, it tries to match each piece of content with the most "appropriate" template. Hugo will also generate certain pages entirely by itself (mostly content summaries and directory listings, or technical files like sitemap.xml).

All this convenience comes at a price, however: Hugo's operations can appear very opaque. This would matter less if Hugo were configured to "just work" right out of the box. But Hugo by itself only provides the transformational engine: to actually produce output, it also needs a set of page templates (a "theme"). Not understanding the interplay between the website author's source files, Hugo's processing rules, and the theme templates will result in hours of frustration.

Another difficulty for newcomers is that the "static site model" breaks with many assumptions we have come to make about the way the Web works. Because the generated website is completely static, all visible URLs must map to filesystem entities (files and directories) exactly. This is very different from the typical web app or REST API, where a URL is merely a logical identifier that can be interpreted essentially arbitrarily.

This write-up is an attempt to summarize my understanding of "how Hugo works". It makes no attempt at being complete, and eschews most detailed howtos or reference documentation. Instead, it concentrates on the "Big Picture" ideas and architectural concepts that I could not find in the official Hugo documentation.

Disclaimer

I am not a Hugo developer, not even a power-user. I have never attempted to write a theme. I merely want to use Hugo to build my personal website. But to even use Hugo with confidence, I found it necessary to research its behavior to a far greater level of detail than was easily accessible in the current documentation. This write-up is essentially a cleaned-up version of the notes I took while doing this research.

My own understanding of Hugo is rudimentary; if you find a mistake, please let me know so that I can fix it.

Part 1: A Roadmap to Hugo Concepts

Hugo takes a bunch of input files (typically in Markdown format), marries them to an appropriate set of templates, and produces a set of static HTML pages. These generated files are intended to be served "as is" by a web server, without any further processing; in particular, without any URL rewriting. The consequence is that for any publicly visible URL, there must be a filesystem entity (either a file, or a directory containing an index.html file).

Note 1:

The entire site generated by Hugo is completely static. Its filesystem paths correspond directly to public URLs. To have a public URL https://example.com/a/b/c requires a file or directory at location a/b/c under the web server's docroot.

Basic as this concept is, it keeps tripping me up. I am so used to application servers and web apps, which treat URLs as merely logical "endpoints", that I need to keep reminding myself: with Hugo, if you expect to have some public URL, then you must have a filesystem entity with exactly the same path!
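To make Note 1 concrete, here is a minimal sketch of what a plain static web server does with a request path. It is not taken from any particular server; the function name resolve_static_url and its behavior (file first, then index.html inside a directory) are my own simplification for illustration.

```python
from pathlib import Path
from typing import Optional


def resolve_static_url(docroot: Path, url_path: str) -> Optional[Path]:
    """Return the file a plain static server would send for url_path, or None."""
    # Drop empty, "." and ".." segments so the lookup stays below docroot.
    parts = [p for p in url_path.split("/") if p not in ("", ".", "..")]
    target = docroot.joinpath(*parts)
    if target.is_file():
        return target
    index = target / "index.html"
    if target.is_dir() and index.is_file():
        return index
    return None  # no filesystem entity at that path, hence no page
```

With this model, a request for /a/b/c can only succeed if a/b/c or a/b/c/index.html exists below the docroot; there is no third option.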

Having established that, to Hugo, a website corresponds directly to a set of filesystem objects, the next important fact is that Hugo expects a website to be laid out as a strict hierarchy: a tree-like system of directories with files in them.

Note 2:

To Hugo, a website is a strictly hierarchical tree of directories with files (or directories) in them.

That makes good sense, but it is quite different from the way web applications usually organize and manage content. In the typical "shopping cart" application, pages are not so much organized hierarchically, as laterally: a series of workflow steps (checkout, shipment, payment, and so on) that have to be traversed in sequence. And for a site built using single-page architecture, the concept of a "page" as a distinct entity has become fluid anyway, so that the idea of a hierarchy of pages is almost meaningless.

Hugo is not built for this kind of flexibility. In Hugo's world, a website consists of directories with files in them. If a user selects a file, the user receives that file. If the user navigates to a directory, the user will see a list of items (files) in that directory.

Hugo generates HTML pages from a set of flat files that contain the "content" to be shown on the website. Each individual file in the source directory becomes a single page in the website, and with the same path as the source file. And vice versa: each page of content in the generated website requires its own, separate input file, and at the same location in the filesystem.

Note 3:

Each content file in the source directory becomes a separate page in the website, and with the same path. Conversely, every page in the website requires a separate input file, and at the same location in the filesystem.

The template that Hugo will use to render the contents of the input file is a "single page template". (The other kind is the "list template", which we will see in a moment.)

This concept is so central to Hugo that it is baked into its architecture: each page is based on a single input file, and its URL is given by the input file's path in the filesystem. This means that it is not possible to combine several, independent bits of content on a single page. As opposed to other templating frameworks, Hugo does not support a "component model", where a page is built up from widget-like components: every page is either a list, or corresponds to a single input file. (It is possible to "decorate" a page with things like menus, footers, sidebars. But it is in general not possible to collate a single web page from multiple bits of input.)

So far we talked about files in the input tree. Directories are treated differently. For each directory, Hugo creates a page that displays all the items (typically files) in that directory. The template used for this page is a "list template".

Note 4:

For each directory, Hugo automatically generates a list of the items (typically files) contained in that directory.

While it is possible to modify or augment the information that is displayed on these "list" pages, it is important to realize that they are usually created automatically. This is in contrast to "single" pages, which are created specifically to display an item of content provided by the user.

To generate any output, Hugo needs at least one "single page" and one "list" template --- otherwise, it doesn't know how to produce output. (It is possible to have multiple templates, with different ones being used for different parts of the site.) A set of templates, together with required items such as stylesheets or JavaScript files, constitutes a "theme".

Note 5:

Hugo is just a templating engine. It must be supplemented with a set of templates that define the look and feel of the generated website.

Unfortunately, Hugo ships without a default or fall-back theme. To do anything with Hugo, it is necessary to either adopt a theme (for instance from the repository at themes.gohugo.io), or to create at least a "single page" and a "list" template.

Templates are more than merely aesthetic "skin". They determine what information is visible on a page and can be surprisingly complex. As Hugo parses the input content, it builds up internal data structures, then selects an appropriate template ("single page" for files, "list" for directories), and passes the content data to the template. The template, in turn, must navigate the data structure passed to it by the Hugo engine, and build up a complete HTML page from it.
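As a toy model of this selection step, the following Python sketch picks a "single" or a "list" template depending on whether the input is a file or a directory, and hands the template a small data structure. This is neither Hugo's code nor its template language: the Page class, its fields, and the two template strings are hypothetical stand-ins.

```python
from dataclasses import dataclass, field
from string import Template
from typing import List

# Hypothetical stand-ins for the two templates a theme must provide.
SINGLE = Template("<h1>$title</h1><article>$body</article>")
LIST = Template("<h1>$title</h1><ul>$items</ul>")


@dataclass
class Page:
    title: str
    body: str = ""
    children: List["Page"] = field(default_factory=list)
    is_dir: bool = False


def render(page: Page) -> str:
    """Pick the template by page kind and fill it from the page data."""
    if page.is_dir:
        # A directory becomes a list page enumerating its items.
        items = "".join(f"<li>{child.title}</li>" for child in page.children)
        return LIST.substitute(title=page.title, items=items)
    # A file becomes a single page showing its own content.
    return SINGLE.substitute(title=page.title, body=page.body)
```

Even in this toy version, the template decides which parts of the data become visible: the list template shows only the titles of the children and ignores their bodies.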

Review and Outlook

When working with Hugo, it is important to keep its operating model and its limitations firmly in mind.

  • All content is static. URLs are fixed and must correspond to items in the filesystem.

  • Each piece of content is rendered as a single, separate page; it is not possible to combine independent blocks of content to form a page. For a directory, Hugo will automatically generate a list of all items contained in the directory.

In practice, the first is inherent in the "static website" architecture. It does take some getting used to, in particular when coming from a background of application servers or web apps, and it does pose problems as soon as any sort of user response is desired. (For example, it is not easy to support user comments on blog entries.)

More problematic, in my experience, is the decision to treat all content as either "single page" or "list of items". Although valid on some level (even a site like Amazon consists primarily of list and single-detail pages), I think it is just too limiting and constrains the design of the resulting website too much. A comparison with Amazon is revealing: an Amazon single-detail page does not only contain information about one item, but also a list of other items ("Customers who bought also bought..."), as well as a list of reviews! This has nothing to do with the static website model (although they probably don't, Amazon could pre-generate its single-detail pages, including recommendations and reviews, updating them once a day), but stems from Hugo's expectation that the final website must map, directly, to a hierarchy of filesystem objects.

In fact, much of Hugo's complexity comes from the fact that its underlying architectural metaphors are simply too restrictive, necessitating all kinds of ad-hoc workarounds (for menus, partials, URL management, and so on --- we will cover some of them later) that make Hugo so opaque and difficult to understand!

Why Hugo?

The idea of a static-site generator has a lot of appeal; and being independent of both WordPress and the various blogging platforms is very attractive. Among the various static-site generators, Hugo and Jekyll seem to be the two most popular. As a non-Ruby developer, I ruled out Jekyll. I did not want to become dependent on the larger Node/npm ecosystem, so that ruled out the JavaScript contenders. And none of the Python entries seemed to match any of the other options in popularity. That left Hugo.

Part 2: Input Formats, Directory Layout, Tooling and Workflow

In its most basic mode of operation, Hugo expects a hierarchy of input files in a source directory (by default called content/). When run, Hugo marries these to the appropriate templates and places the generated HTML files in an output directory (by default: public/). The content of this directory can be placed, as is, in the docroot of a public-facing web server. If Hugo encounters other files (such as images or stylesheets) in the source directory, it copies them unmodified to the output directory.

Input Formats

Input format for "content" is usually Markdown. Out of the box, Hugo can also handle plain HTML and Emacs Org-Mode. AsciiDoc, reStructuredText, and Pandoc require the appropriate tools to be installed. Hugo determines the format by the file extension (.md, .html, .org) or from the frontmatter or preamble of each file.

Frontmatter

Hugo input files typically begin with a preamble in YAML format. (TOML and JSON are also possible.) The preamble can be used to set configuration values for each piece of content: for example, it is possible to assign keyword tags to the page, to modify its public URL, or to specify the specific template to render it.

Here is an example (in YAML format):

---
title: "A Hugo Survival Guide"
date: "2020-02-22T22:22:22-00:00"
slug: "a-hugo-survival-guide"
---

Notice the YAML-specific fencing with ---. (TOML uses +++, and for JSON, the entire preamble needs to be enclosed in curlies {...}.)
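The three fencing styles can be told apart by looking at the first line of the file. The sketch below illustrates that idea only; it is not how Hugo itself parses files, and the function name split_frontmatter and its return convention are my own invention.

```python
import json
from typing import Optional, Tuple

FENCES = {"---": "yaml", "+++": "toml"}


def split_frontmatter(text: str) -> Tuple[Optional[str], str, str]:
    """Return (format, frontmatter, body); format is None if no preamble is found."""
    lines = text.splitlines()
    if not lines:
        return None, "", text
    first = lines[0].strip()
    if first in FENCES:
        # YAML and TOML: the preamble ends at the next line repeating the fence.
        for i in range(1, len(lines)):
            if lines[i].strip() == first:
                return FENCES[first], "\n".join(lines[1:i]), "\n".join(lines[i + 1:])
        return None, "", text  # opening fence was never closed
    if first.startswith("{"):
        # JSON: the preamble is the first complete {...} object in the file.
        stripped = text.lstrip()
        try:
            _, end = json.JSONDecoder().raw_decode(stripped)
        except json.JSONDecodeError:
            return None, "", text
        return "json", stripped[:end], stripped[end:].lstrip("\n")
    return None, "", text
```

The frontmatter string returned here is still unparsed; turning it into key/value pairs would need a YAML or TOML parser.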

Input File Warnings and Remarks

Frontmatter parameters can interfere with rendering in surreptitious ways. For example, by default, Hugo creates new content items with a frontmatter parameter draft set to true, with the consequence that no output will be generated for this piece of content.

Markdown must be HTML-compatible. For example, it is not possible to have emphasis (_ ... _) span multiple paragraphs. If you want to emphasize two consecutive paragraphs, emphasize them individually.

There may be additional problems when using JavaScript (such as MathJax) on the created website. Any JavaScript library runs after template expansion is complete; it may therefore be necessary to protect some markup from being interpreted as Markdown. (Underscores, specifically, are both used by Markdown and MathJax, but for different purposes: emphasis or italics here, subscripts there.)

Markdown is less expressive than HTML. In particular, Markdown has no provisions to include rich formatting directives (such as color specifications or font changes) in a Markdown file. Embedded HTML or shortcodes provide workarounds. (Shortcodes are pre-defined snippets of Go Template language that can be embedded in a Markdown file and will be evaluated when the content is rendered.) Note that by default, recent versions of Hugo do not pass through embedded HTML, but discard it. (See the unsafe setting below.)

HTML files without frontmatter, in general, will be copied to the output as-is, whereas HTML files with frontmatter (even if it is empty) will be treated as content files and subject to template expansion.

Usually, HTML that is embedded in Markdown is passed through untouched. Recent versions of Hugo do not behave like this by default; instead, the following addition must be made to the global configuration file config.toml to enable this behavior:

[markup]
  [markup.goldmark]
    [markup.goldmark.renderer]
      unsafe = true

Input processors are often external tools or libraries that Hugo uses to parse and process input files; some may behave slightly differently from others. The Hugo documentation may not always be up-to-date in this regard.

Important Frontmatter Parameters

Hugo defines a large number of parameters that can be set via the frontmatter; themes may define additional ones. In general, these parameters are optional: they are a low-level way to override global configurations (either default or from the global configuration file) for a specific piece of content. Two remarks:

  • There is no magic here. All that the "frontmatter" does is to populate or override some of Hugo's internal key/value data structures.

  • Frontmatter is nice because it often gives fairly detailed control over individual pieces of content. At the same time, it scatters configuration information widely. I have found it better to rely as much as possible on global configurations and only override them, where necessary, for individual pieces of content.

Some of the more important pre-defined parameters include the following:

Metadata

date
The primary "date" associated with this page. Other date-related frontmatter variables are expiryDate, publishDate, and lastmod. (If present, Hugo will use the first two to decide whether to publish this piece of content or not.) Be warned that Hugo will not publish posts with a date that lies in the future!

description, keywords, author
May be used by the template to populate <meta> tags in the HTML header. (Not all themes use this information.)

tags, categories
Keyword tags assigned to this piece of content. Hugo uses this information to create lists of all tagged pages. (Also see the discussion of "Taxonomies".)

Naming and URL Management

title
This is typically used by the template to populate the page's headline and to set the page's title in the HTML header. It may be required by some themes for the site to work properly.

slug
Specify the slug (that is, the last part of the URL); if not present, the slug will be generated from the file name.

url
Complete path, relative to the document root, for this piece of content. Intermediate directories will be created as necessary.

aliases
One or more paths, relative to the document root. For each one, Hugo will create a file with a redirect notice that redirects to the current piece of content. (Note the spelling of the keyword!)

Processing instructions

draft
Do not publish, unless the -D or --buildDrafts option is given. (The default archetype sets this parameter to true in newly created files. It is absolutely safe to remove this parameter from the frontmatter of any piece of content.)

layout
The template to be used to render this document. (The details of the template lookup are complicated, see the Hugo documentation.)

weight
A numeric value that is used when sorting pages (in list views, for instance). May be positive or negative, integer or floating point.

markup
Specify the markup format of the current document.
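The parameters that decide whether a page is published at all (draft, date, publishDate, expiryDate) combine roughly as in the sketch below. It models only the rules stated in this write-up, not Hugo's implementation; the function name should_publish and the choice to treat dates without a timezone as UTC are my own assumptions.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def _as_datetime(value: Any) -> Optional[datetime]:
    """Accept a datetime or an ISO 8601 string; return None if absent."""
    if value is None:
        return None
    parsed = value if isinstance(value, datetime) else datetime.fromisoformat(str(value))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    return parsed


def should_publish(front: Dict[str, Any], now: datetime,
                   build_drafts: bool = False) -> bool:
    """Apply the draft and date rules to one page's frontmatter."""
    if front.get("draft", False) and not build_drafts:
        return False  # drafts need -D / --buildDrafts
    date = _as_datetime(front.get("date"))
    publish = _as_datetime(front.get("publishDate"))
    expiry = _as_datetime(front.get("expiryDate"))
    if date is not None and date > now:
        return False  # dated in the future
    if publish is not None and publish > now:
        return False  # not yet due for publication
    if expiry is not None and expiry <= now:
        return False  # already expired
    return True
```

A page that silently fails to appear is worth checking against each of these conditions in turn, since Hugo does not report why it skipped a file.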

Directory Layout

Hugo assumes a specific layout of its working directory. You can use Hugo itself to create a skeleton workspace directory for a new project. Run:

hugo new site demo

This will create a new directory called demo. In this directory, you will find the following files and directories:

config.toml     site-wide configuration file
archetypes/     skeleton documents, frontmatter (see below)
content/        plain-text content (the input directory)
data/
layouts/        template files
static/         static docs, like images or JavaScript files
themes/

All of them are empty or almost empty by default.

All content, in plain text (Markdown), goes into the content/ directory; this is the source or input directory. Notice that while frontmatter is typically YAML, the global config file is usually TOML.

The archetypes/ directory holds skeleton outlines for content documents; essentially some default frontmatter. They are used by the hugo new command to create new content files. Archetypes may contain references to Hugo variables; if so, they are evaluated by hugo new. The frontmatter in the resulting starter files will contain only values, not references to Hugo variables.

Complete, downloaded themes go into the themes/ directory. It is possible to provide customized templates that may override some of the theme's settings; they go into the layouts/ directory.

The static/ directory is a repository for static files that are not rendered via templates, such as images, stylesheets, or JavaScript libraries. It is possible to create subdirectories (such as static/imgs/); they will be replicated into the output directory. The data/ directory, finally, is intended as a place for additional configuration data, or for static data to populate a page.

Hugo recognizes additional directories. For example, if one wants to let Hugo process a stylesheet, then it needs to be placed into an assets/ directory. These directories are not created by default; see the Hugo documentation for additional details.

Finally, many of these directories can be changed using command-line flags. Keep in mind, however, that Hugo will generally only access files that are below the current working directory; it will not follow arbitrary paths.

Tooling

The workflow description below assumes that you already have Hugo installed. As a Go program, Hugo is a stand-alone application that does not depend on external libraries. There are two different versions of Hugo in a release. The "extended" version includes some additional functionality for image processing (scaling and cropping), for the compilation of SASS/SCSS files, and some other features. Be aware that some themes require the extended version!

In general, the hugo command must be executed within the project's workspace directory. The hugo command takes a number of subcommands and command-line options. Some of the most important subcommands are:

hugo
When invoked without a subcommand, Hugo will process the inputs and static files, creating a set of static HTML pages and auxiliary files. By default, Hugo will create a directory public/ inside the project's workspace as destination for the output files. Hugo also creates a resources/ directory as destination for intermediate results (such as cropped images).

hugo server
When invoked with the server command, Hugo does not write its results out to disk. Instead, it starts up an HTTP server (by default on port 1313) and serves the generated pages from memory. The server watches the content directory and configuration file, and automatically refreshes the results when it detects changes. (This is primarily intended as a development tool.) Be warned that by default the server only binds to localhost and will not be visible using the host's public IP address. (This can be overridden using the --bind option.)

hugo new
The new command, when followed by a path in the content/ directory, will create a new file (as well as any required intermediate directories) at the given location. The file will be empty, except for the frontmatter, which will have been pulled from the archetypes/ directory. Any Hugo variables within the frontmatter will have been evaluated.

hugo new site
This, when followed by the desired directory name, will create a new workspace directory of the given name, containing the default workspace directory hierarchy.

Compared to other build systems, Hugo's tooling is fairly rudimentary. Many activities (such as adding a theme) require multiple, manual steps.

There is no clean command or target. If there already is a public/ output directory, Hugo will continue placing results into it. It will be necessary to manually remove public/ (and resources/) to get a fresh start.
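Since the cleanup is manual anyway, it is easy to script. The following is a small sketch of such a helper; the function name clean_workspace is hypothetical, and it assumes the default directory names public/ and resources/ directly below the workspace.

```python
import shutil
from pathlib import Path
from typing import List


def clean_workspace(workspace: Path) -> List[Path]:
    """Delete public/ and resources/ below workspace; return what was removed."""
    removed = []
    for name in ("public", "resources"):
        target = workspace / name
        if target.is_dir():
            shutil.rmtree(target)
            removed.append(target)
    return removed
```

Calling clean_workspace(Path(".")) from the workspace directory before running hugo gives a fresh build. Removing resources/ means that intermediate results, such as cropped images, have to be regenerated.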

Diagnostics are extremely poor. If something doesn't work, it is very difficult to get any information about what might be causing the problem. (At the same time, the hugo command does not follow the Unix tradition to "succeed silently, but fail loudly": it will chattily and unnecessarily tell you when it has created a new file using hugo new, but it will silently discard various input files according to its own processing rules, leaving you none the wiser.)

Finally, the hugo command (in particular the server) can easily get confused or even crash when dealing with malformed input files, or the temporary files that editors sometimes leave in working directories.

Workflow

Here are the steps to create a project, add a theme, and begin creating content. Remember that, out of the box, Hugo does not contain a default theme: hence, not only is it necessary to add a theme, but also to configure it.

Note that this sequence of steps contains additional steps, compared to the version in the Hugo documentation!

  1. Create a workspace and skeleton directory hierarchy using hugo new site

  2. Add a theme into the themes directory in the workspace. This can be done using git or by downloading a theme and copying it to its destination; it is not done using a hugo subcommand.

  3. A theme typically contains an exampleSite directory, containing a complete website demonstrating the theme. Use its global configuration file (typically: exampleSite/config.toml) as basis for the configuration file of your new project. Edit your copy of this file as needed. (The reason is that many themes require specific configuration values to work properly; using the configuration from the example site as basis provides a fairly reliable starting point.)

  4. Possibly copy the contents of the theme's archetypes/ directory into the corresponding top level location. (Again, the reason is that some themes require specific frontmatter variables that the default archetypes will not know about.)

  5. Create some pieces of content using hugo new. Edit them as desired. (Remember to remove any draft: true lines from the frontmatter, otherwise Hugo will ignore the piece of content.) Alternatively, copy the files from the example site into your content/ directory to have a starting point.

  6. Start the development server using hugo server. (Default port 1313, use the -p flag to change the port.)

  7. Examine the site at http://localhost:1313. Hugo injects some JavaScript into the generated HTML which reloads the page automatically whenever the content or configuration of the site changes. This is convenient, but I found it occasionally a bit flaky if many changes are made quickly, or in the presence of symbolic links. (Remember to use the --bind option if you would like to access the development server with any device other than the actual development host.)

  8. Finally, build the site using hugo. Deploy the contents of the public/ output directory to the hosting provider of your choice. Remember that Hugo does not clean an existing output directory automatically if there is one.

Part 3: Processing Model, Input/Output Mapping, URL Management

In principle, Hugo takes a hierarchy of directories and files underneath the source directory, and recreates the same hierarchy in the destination directory: it couldn't be simpler. But there are two circumstances that conspire to turn the whole topic of input/output mapping into the most confusing aspect of working with Hugo:

  • The path names of the generated files will be the public URLs of the finished site. Any amount of URL management, rewriting, or cleaning therefore amounts to changes in the mapping of source to destination files.

  • For each directory, Hugo automatically creates a page, showing all the items in that directory. This page is not based on user-provided content; it is created synthetically by Hugo. But users may want to add to or modify the content of these created pages. Hugo provides a mechanism for doing so that sometimes creates additional confusion. (In particular as the Hugo documentation of this mechanism is not noted for its clarity.)

Clean URLs

The first source of complexity is the desire to have "clean URLs" that end with a directory name, not a filename and extension:

www.example.com/news/what-happened-today/           Clean
www.example.com/news/what-happened-today.html       Ugly

Because in a static site, any public URL must correspond to an object in the filesystem, the generated filesystem objects must be:

public/news/what-happened-today/index.html

Most web servers are configured to silently serve the index.html file when the request URL points to the parent directory.

To create output at this URL, Hugo allows two different input styles:

content/news/what-happened-today.md                 File
content/news/what-happened-today/index.md           Directory with index.md

Either of these alternatives will map to the public URL stated earlier. (Of course, you shouldn't have both of them in your input directory; otherwise, the results will clobber each other.)

Here is the problem: remember that Hugo will automatically create a synthetic page for all directories in the input source tree? Clearly, for the directory what-happened-today in the second alternative, this is not appropriate, because this directory contains only a single item, which is itself a page. Hence Hugo has the special rule:

If a directory contains a file called index.md, then process this directory as if it were a file!

Why, then, allow directories that don't contain items, but that map to single pages at all? Because they prevent cluttering the namespace if there are auxiliary files (such as images)!

Imagine that the page in question was referring to an image, say img.png. Hugo copies files that are not Markdown directly from their location in the source tree to exactly the same position in the destination directory. Hence a file at content/news/img.png would be copied to public/news/img.png, cluttering the namespace in that directory. (Alternatively, you could have all image files in the static/ directory, again cluttering the global namespace.)

By contrast, if the input file resides in its own directory, then the image file can also be placed into that directory:

content/news/what-happened-today/index.md
content/news/what-happened-today/img.png

Both files will be mapped to the directory public/news/what-happened-today/ in the output directory. The image file will be local to this directory, and not clutter the wider namespace.

To summarize:

  • Input can either be a Markdown file with an arbitrary name, or a directory containing a Markdown file named index.md.
  • Either will be mapped to a directory, containing an index.html file, with the content placed into that file.
  • Directories containing an index.md file will not be treated as directories, but will be processed as if they were a file.
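The rules summarized above fit into a few lines of code. This sketch maps a path below content/ to the corresponding path below public/, under the assumptions of this section: Markdown input only, no frontmatter overrides, and no _index.md files. The function name output_path is hypothetical.

```python
from pathlib import PurePosixPath


def output_path(content_file: str) -> str:
    """Map a path below content/ to the path written below public/."""
    path = PurePosixPath(content_file)
    if path.suffix != ".md":
        return str(path)  # other files are copied to the same location
    if path.name == "index.md":
        page_dir = path.parent  # the directory itself is the page
    else:
        page_dir = path.with_suffix("")  # the base name becomes a directory
    return str(page_dir / "index.html")
```

Both input styles from the example, news/what-happened-today.md and news/what-happened-today/index.md, come out as news/what-happened-today/index.html, which is exactly why they must not coexist.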

Customizing Directory Listings

For each directory, Hugo creates a synthetic page, typically showing the items in the directory. It uses the "list" template for the layout of the resulting page, and in general, there is no user-provided "content" for that page.

But what if the user would like to provide some content, after all? Or possibly just some processing instructions in the frontmatter?

To allow for this, Hugo allows for a special file to be placed into a directory. This file must be called _index.md. If such a file is found, then its contents will be made available to the list template that is used to generate the directory listing page. (It is up to the template to make use of the content; the template may ignore it. A typical use is for the _index.md file to contain only processing instructions in its frontmatter.)

To summarize:

  • If a file called _index.md is found in a directory, then its contents will be made available to the list template that is used to generate the directory listing page for this directory.

  • The directory will be processed as a directory, not as a file.
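Taken together with the index.md rule from the previous section, the way a directory is processed depends only on which of the two special files it contains. The sketch below spells out that decision; the function name page_kind and the three labels it returns are made up for illustration.

```python
from typing import Iterable


def page_kind(directory_entries: Iterable[str]) -> str:
    """Classify a content directory by the names of the files directly in it."""
    names = set(directory_entries)
    if "index.md" in names:
        return "single"        # processed as if the directory were a file
    if "_index.md" in names:
        return "list+content"  # list page, with user content passed to the template
    return "list"              # purely synthetic list page
```

The single leading underscore is the whole difference between a directory that is rendered with a single page template and one that is rendered with a list template.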

Overriding Filenames

In everything so far, I assumed that the filesystem name of an object in the source tree was going to become part of the public URL for the generated page. (In the example above, either the file basename or the directory name what-happened-today became part of the public URL.)

But Hugo also allows you to override the filename of the input file through frontmatter parameters! In this case, the generated HTML file can be at an arbitrary position in the destination directory, no matter where its corresponding input file resides in the source tree.

There are three frontmatter parameters that matter in this context:

title
The title parameter is generally important, because many themes use its value for visible headlines. It can also end up in the visible URL: in permalink patterns (see below), the slug falls back to the title if no slug parameter is set.

slug
The last part of a URL, identifying the specific page or piece of content. (In www.example.com/news/what-happened-today/, the slug is what-happened-today.)

url
The full path part of a URL (the part following the domain).

Yet another way to override the default output location is to configure "permalinks" in the global config.toml file. This option is only available for "sections" (that is, for the top-level directories directly underneath content/). For each such "section" a URL pattern can be specified in the site configuration file. For all content in this section, the corresponding output will be generated at the location pointed to by that pattern. The pattern can include fixed strings, as well as certain variables populated by Hugo. For example, it is possible to interject the year into the URL for blog posts:

[permalinks]
  posts = "/blog/:year/:slug/"

This will render all content underneath content/posts/ at URLs whose path starts with the fixed string "blog", followed by the year and the slug (or, if no slug is set, the title) of the piece.
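The expansion of such a pattern can be pictured as a simple token substitution. The sketch below is my own approximation, not Hugo's code: the names expand_permalink and slugify are hypothetical, only a handful of tokens are handled, and Hugo's real rules for turning a title into a URL segment differ in the details.

```python
import re
from datetime import date
from typing import Optional


def slugify(title: str) -> str:
    """Crude stand-in for URL-safe titles: lowercase, dashes between words."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def expand_permalink(pattern: str, page_date: date, title: str,
                     slug: Optional[str] = None) -> str:
    """Replace :year, :month, :day, :slug and :title in a permalink pattern."""
    values = {
        "year": f"{page_date.year:04d}",
        "month": f"{page_date.month:02d}",
        "day": f"{page_date.day:02d}",
        "slug": slug or slugify(title),  # the slug falls back to the title
        "title": slugify(title),
    }
    # Tokens this sketch does not know are left in place unchanged.
    return re.sub(r":([a-z]+)", lambda m: values.get(m.group(1), m.group(0)), pattern)
```

Note that the location of the input file below the section directory plays no role in the result: only the frontmatter values enter the generated path.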

The Home Page

The Home page is a special case: one may think of it as a "content" page. But because it sits at the top of the directory hierarchy, it must be a "list" page. Furthermore, any user-provided content must be in a file called _index.md to ensure that processing does not stop at the root of the document directory! (Many themes provide a special template, called index.html, that is only going to be used to render the home page.)

A Worked Example

The following example shows the contents of a source directory, and the directories and files that Hugo will typically map them to (assuming nothing is overridden in any of the files' frontmatter). (Two dashes -- indicate a missing file!)

content/                public/
  --                      index.html                         LIST page
                        
  stuff.md                stuff/index.html
  
  about/
    index.md              about/index.html

  posts/
    --                    posts/index.html                   LIST page
    first.md              posts/first/index.html
    other/
      post.md             posts/other/post/index.html
      fedex.md            posts/other/fedex/index.html
    second.md             posts/second/index.html
    final/
      index.md            posts/final/index.html

  guides/
    _index.md             guides/index.html                  LIST page
    victor.md             guides/victor/index.html
    hugo.md               guides/hugo/index.html

  bundle/
    index.md              bundle/index.html
    img.png               bundle/img.png                     direct

  problem/
    index.md              problem/index.html                 SINGLE page
    topic.md              --                                 LOST
    text.md               --                                 LOST
    img.png               problem/img.png                    direct

  nested/
    index.md              nested/index.html                  SINGLE page
    img1.png              nested/img1.png                    direct
    deeper/               --                                 LOST
      index.md            --                                 LOST
    img/
      img2.png            nested/img/img2.png                direct
    mixed/
      index.md            --                                 LOST
      img3.png            nested/mixed/img3.png              direct

It is worth studying this example in some detail.

  1. Although there is no user-provided content for it, Hugo does create a home page! Remember that the home page uses a list template. To provide custom content for the home page, it must be in a file called _index.md at the root of the source directory.

  2. The next two pages demonstrate the two possible types of input: either as a named file (stuff.md) or as a named directory (about/) containing an index.md file.

  3. The posts/ directory shows that directories can be nested. The directory listing page for the posts/ directory does not have user-provided content; it is synthetically generated by Hugo.

  4. By contrast, the guides/ directory contains an _index.md file that is used by Hugo to supplement the directory listing page. Hugo still treats the guides/ directory as a directory (not as a page), generating pages for the content items (victor.md and hugo.md).

  5. The bundle/ directory shows how to bundle an image with a page.

  6. The next two directories show some commonly encountered problems. The problem/ directory contains an index.md file, which means that Hugo treats this directory as a "page" and will not process any input (Markdown) files in this directory or any directory below. By contrast, non-input files (such as images) are faithfully copied to the destination directory.

  7. The nested/ directory demonstrates the same problem with nested directories.

Hugo's Processing Model

Hugo's processing model for input files can be summarized like this (this may not be exactly correct, but it seems good enough for now):

  1. Recursively visit each directory.

  2. For each directory, create a public destination directory of the same name.

  3. If the current directory contains index.md, the directory is considered a "leaf directory":

    • use the single page template to transform index.md into index.html in the destination directory.
    • STOP processing any Markdown files in this directory or any of its children.
    • do copy any non-Markdown resources (images, also those in subdirectories) to the destination directory (see step 5).
  4. If the current directory does not contain index.md, then the directory is considered a "branch directory":

    • use the list page template to create index.html in the destination directory, showing items in the current directory.
    • if there is an _index.md in the current directory, include its contents when generating index.html.
  5. For all items in current directory:

    • If Markdown, create a public directory, and use the single page template to create index.html in that directory.
    • Otherwise, copy over directly, without processing, to target directory.
  6. Do not create a public destination directory if it would be empty (because the source directory is empty, or because it contains only materials that would be discarded).
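
The steps above can be turned into a small Python sketch. It follows the list literally (so it emits a list page for every branch directory it visits) and is meant as a mental model only, not as a description of Hugo's internals. The function name build and the nested-dict representation of the source tree are assumptions made for this example.

```python
def build(tree, path="", in_leaf=False, out=None):
    """Map a content tree to {output path: how it was produced}.

    `tree` is a nested dict: directories are dicts, files map to None.
    """
    out = {} if out is None else out
    prefix = path + "/" if path else ""
    is_leaf = "index.md" in tree
    if not in_leaf:
        # Steps 3 and 4: a leaf gets a single page, a branch a list page.
        out[prefix + "index.html"] = "single" if is_leaf else "list"
    for name, child in tree.items():
        if isinstance(child, dict):
            # Step 1: recurse. Below a leaf, only resources survive.
            build(child, prefix + name, in_leaf or is_leaf, out)
        elif not name.endswith(".md"):
            # Step 5: non-Markdown files are copied without processing.
            out[prefix + name] = "direct"
        elif not (in_leaf or is_leaf) and name != "_index.md":
            # Step 5: a named Markdown file becomes name/index.html.
            out[prefix + name[:-3] + "/index.html"] = "single"
        # Any other Markdown file (inside or below a leaf) is lost.
    return out


content = {
    "stuff.md": None,
    "problem": {"index.md": None, "topic.md": None, "img.png": None},
    "nested": {
        "index.md": None,
        "deeper": {"index.md": None},
        "mixed": {"index.md": None, "img3.png": None},
    },
}
for target, how in sorted(build(content).items()):
    print(f"{target:28} {how}")
```

Running it on the problem/ and nested/ directories from the worked example shows topic.md and the nested index.md files dropping out, while the images are still copied.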

Part 4: Templates, Template Selection and Composition, Shortcodes

Hugo uses Go templates to turn Markdown into HTML. Hugo's rules for matching a piece of content with a template are complex. Moreover, each template may in turn be composed of smaller component templates. Shortcodes provide a way to inject template logic directly into Markdown content.

Template Selection

In general, it is unnecessary to configure which template will be used to render a given piece of content: just drop a Markdown file into the content/ directory, and Hugo will select the most appropriate template, automatically.

Hugo tries to find the most specific layout available for each piece of content. To do so, it takes into account both the type of content and its place in the directory hierarchy. It then tries to find a template suitable for the type at a comparable location in the layouts/ directory. The detailed rules for template selection are complex, but in practice, a handful of observations suffice:

  1. First, Hugo determines the type (or "kind", as in "kind of page") of the content. The primary distinction is whether the input represents the content for a single page or a list of items (such as the files in a directory or a list of tags). Based on this distinction, either a single page or a list page template will be used; the filenames of the templates are expected to be single.html or list.html.

    Other possible "kinds" are home, section, taxonomy, taxonomyTerm, all of which, including home, are considered list templates; and RSS, sitemap, robotsTXT, and 404. Breaking all the rules, the template for the home page is called index.html and is located in the layouts/ directory itself, not in a subdirectory.

  2. Next, Hugo considers the location of the content in the source directory, and tries to find a template at a matching location in the layouts/ directory. The idea is that the template directory may replicate some of the directory hierarchy of the source tree, and use this information when selecting a template.

    This is actually quite intuitive. To render the file found at content/blog/some-post.md, Hugo will choose the template found at layouts/blog/single.html (if it exists), rather than the one at layouts/_default/single.html.

  3. To locate a template, Hugo will look in two locations: first in the project's layouts/ directory, and only then in the theme's layouts/ directory. Only if it doesn't find a template in either location will it move up one level in the filesystem hierarchy.

    This "cascade" provides a non-intrusive way to customize a theme without actually having to modify the theme files themselves. (In principle, this also allows upgrading the theme to a new version, without destroying the custom overrides.)

  4. Finally, a specific template file can be identified in the frontmatter of a piece of content using the layout parameter.

There are additional rules, but this will suffice in practice. To summarize:

  • Most content constitutes either a single or a list (or maybe a home) page.

  • Templates are selected by the closest corresponding location of the input content. Exploit this behavior to override themes with local customizations.

  • A specific template can be configured in the content's frontmatter.
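
The lookup order from observations 2 and 3 can be sketched in Python as follows. This is a deliberately simplified model: it ignores the home page's index.html, the layout frontmatter parameter, and all the additional rules mentioned above. The function names and the theme name mytheme are made up for the example.

```python
def template_candidates(section, kind, theme="mytheme"):
    """Yield candidate template paths, most specific first."""
    # The project's own layouts/ directory is consulted before the theme's.
    roots = ["layouts", f"themes/{theme}/layouts"]
    # Try the directory matching the content's section, then fall back to _default.
    for folder in ([section] if section else []) + ["_default"]:
        for root in roots:
            yield f"{root}/{folder}/{kind}.html"


def pick_template(section, kind, existing, theme="mytheme"):
    """Return the first candidate that exists, or None."""
    for candidate in template_candidates(section, kind, theme):
        if candidate in existing:
            return candidate
    return None


existing = {
    "layouts/blog/single.html",
    "themes/mytheme/layouts/_default/single.html",
}
print(pick_template("blog", "single", existing))
print(pick_template("news", "single", existing))
```

Content in blog/ picks up the local override, while content in any other section falls through to the theme's default template.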

Because template selection is partially based on location (that is, filesystem paths or, equivalently, URLs) the Hugo documentation conflates template selection and URL management, but this is unnecessary. It is advantageous to distinguish clearly between these two topics and discuss them separately.

Template Composition

Hugo templates make use of the templating packages (text/template and html/template) in the Go standard library. These packages are extensively documented elsewhere. Check the Go Package Documentation or this series of tutorials.

The Hugo template for a page can be composed from smaller fragments.

  • Partial templates are template snippets that can be included in other pages.

  • A baseof.html template is a wrapper that can provide infrastructure common to all pages. A baseof.html template represents a complete page (but possibly without guts).

The primary difference between these two mechanisms is the lookup order: partial templates live in the partials/ subdirectory of the layouts/ directory. When a page template (such as list.html or single.html) is evaluated, it sucks in the appropriate partials. By contrast, the baseof.html template lives in layouts/_default/ or one of the subdirectories mirroring the page hierarchy. When a page template is invoked and a baseof.html exists, the base template is evaluated first, which then invokes the actual page template. (Both page and base template can invoke partials themselves.)

Shortcodes

Hugo shortcodes are a way to inject template commands into the plain text content. They are passed to the template and evaluated with it. Shortcodes are a way to send content-specific formatting options to the template. In effect, shortcodes are a way to circumvent Markdown's limited markup capabilities, and to reflect the far richer possibilities of HTML.

Technically, shortcodes are template snippets stored in layouts/shortcodes/. The filename of the snippet, without the extension, becomes the shortcode command. For example, a shortcode stored in a file img.html would be invoked using {{< img />}}.

Shortcodes can take parameters that are available when the shortcode is expanded during template evaluation. Parameters are listed after the shortcode command and can be either positional or preceded by a keyword using key=value syntax.

As said earlier, shortcodes are a way to inject HTML formatting into content files. For example, the following trick keeps being rediscovered on the Hugo mailing list (but has only very recently made it into Hugo's standard release). Imagine you would like to style some piece of content using a CSS class. Create a file layouts/shortcodes/div.html with the following content:

<div class="{{ .Get 0 }}">
{{ .Inner }}
</div>

and then use it within a Markdown file like this (callout is supposed to be the name of a CSS class, defined in a suitable stylesheet):

{{< div callout >}}
Some content...
{{< /div >}}

Finally, Hugo recognizes two different forms of invoking a shortcode:

  • {{< img />}} means that the "content" of the shortcode will be passed through to the template directly; it will not be parsed and rendered as Markdown.

  • {{% img /%}} means that the "content" of the shortcode will be treated as Markdown, and be parsed and interpreted accordingly.

The exact behavior of shortcodes has changed in different versions of Hugo; check the documentation for details.

Part 5: Odds and Ends --- Tags, Content Summaries, Menus

Hugo provides some additional features, mostly to organize, summarize, and present content.

Adding Tags to Content

Hugo supports tagging content with keyword "tags", in order to select and display all pages that have been tagged with some term. The Hugo documentation uses the term "taxonomy" for this functionality, and unfortunately makes a total hash of explaining what they are. That's too bad, because it's really very simple.

A "taxonomy" is simply a map (Hashmap, Dictionary, associative array). The keys in this map are strings, the values are ordered lists of pages.

That wasn't so hard, was it?

Here is an example of a taxonomy, for simplicity rendered as JSON:

{
  "tags": {
    "Linux": [ "page1.md", "page2.md", "page3.md" ],
    "Ubuntu": [ "page3.md", "page1.md" ]
  }
}

Tags are assigned to content in the content's frontmatter. For example, the frontmatter of the file page3.md in the code sample above might include the following lines:

tags:
  - "Linux"
  - "Ubuntu"

It's all very simple and straightforward.

Because a Hugo taxonomy is simply an internal data structure, themes can generate pages displaying taxonomy terms and the content associated with them (for example, displaying a list of all pages tagged with "Linux").

Hugo ships with two taxonomies ready to use. (Remember that a Hugo "taxonomy" is simply an instance of a Hashmap.) They are called "tags" and "categories". There is no difference between them: the names are arbitrary, and you can use them in any way you like. However, themes may attach specific semantics to either and render them differently. Check the documentation for the theme of your choice.

Finally, it is possible to create additional "taxonomy" instances in the global configuration file. They can then be used in templates exactly like the built-in taxonomies.

Two concluding remarks:

  • Be aware that, although the term "taxonomy" usually implies a hierarchical ordering, Hugo's taxonomies are "flat": there is no nesting of taxonomy terms.

  • The Hugo documentation often refers to "adding a taxonomy to content", but that is not what's happening. Instead, the content is added to the taxonomy (remember that a taxonomy is a Hashmap).
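
To drive home the "content is added to the taxonomy" point, here is a minimal Python sketch that builds such a map from the frontmatter of a few pages. It only illustrates the data structure; the function name build_taxonomy and the dictionary-based frontmatter are assumptions, and the order within each list is plain insertion order rather than whatever ordering Hugo applies.

```python
from collections import defaultdict


def build_taxonomy(pages, name="tags"):
    """Return {term: [page, ...]} for the taxonomy called `name`.

    `pages` maps a page's filename to its (already parsed) frontmatter.
    """
    taxonomy = defaultdict(list)
    for page, frontmatter in pages.items():
        # Each page adds itself to the list of every term it mentions.
        for term in frontmatter.get(name, []):
            taxonomy[term].append(page)
    return dict(taxonomy)


pages = {
    "page1.md": {"tags": ["Linux", "Ubuntu"]},
    "page2.md": {"tags": ["Linux"]},
    "page3.md": {"tags": ["Linux", "Ubuntu"]},
}
print(build_taxonomy(pages))
```

A template rendering the "Linux" term page then only needs to look up that one key and loop over the list.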

Menus

Like "taxonomies", menus are primarily another internal data structure that templates can access. Basically, a menu is an array of URLs. The template can then render this collection of links as a graphical "menu".

A piece of content can add itself to a menu (through a frontmatter entry). Alternatively, menu entries can be made in the global configuration file.

It is possible to add directories to a menu. The menu entry will link to the directory's "list" page that will display the items contained in the directory.

As with taxonomies, it is possible to have multiple instances of this data structure, and hence multiple, independent menus.

Content Summaries

Hugo has the notion of content "summaries" that a theme may display. For example, one can think of a "list" page, showing not only the title of each post, but also a brief summary of its contents.

There are three ways to define the "summary" for each piece of content:

  • If the content contains the separator <!--more--> (exactly like this), then all content up to that separator constitutes the summary.

  • Alternatively, the summary may be defined in the frontmatter, using the summary: key.

  • Lastly, the length of the summary (in words) can be defined in the global configuration file, using the summaryLength key.

It is not possible to switch off summaries by setting the summary length to zero, or by leaving the frontmatter entry blank. But placing the separator first in the content file (right after the frontmatter) does the trick.
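
The three mechanisms can be combined into one small Python sketch. Two things here are assumptions on my part rather than statements from above: the precedence (separator first, then frontmatter, then automatic truncation) and the default length of 70 words. The automatic truncation is also cruder than what Hugo does.

```python
SEPARATOR = "<!--more-->"


def summary(body, frontmatter=None, summary_length=70):
    """Pick the summary for one piece of content (simplified model)."""
    fm = frontmatter or {}
    if SEPARATOR in body:
        # Everything before the separator is the summary, even if that is nothing.
        return body.split(SEPARATOR, 1)[0].strip()
    if fm.get("summary"):
        return fm["summary"]
    # Fall back to the first `summary_length` words of the content.
    return " ".join(body.split()[:summary_length])


print(repr(summary("Short intro.\n<!--more-->\nThe rest of the post.")))
print(repr(summary("<!--more-->\nNo summary wanted here.")))
```

The second call shows why placing the separator first works as an off switch: the text before it is empty, so the summary is empty as well.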

Syntax Highlighting

Hugo uses the Chroma Go library for adding syntax highlighting to code samples. The generated HTML therefore does not have to rely on external JavaScript libraries. (A gallery of available styles can be found here.)

Unexpected Defaults

Sometimes the default behavior of Hugo is not what one would expect. Moreover, the Hugo documentation may not reflect changes in Hugo's default behavior, leading to even more confusion. A few points that seem to cause frequent confusion include:

  • The default Hugo "archetype" includes a draft: true line in its frontmatter. By default, Hugo (silently) discards content that is labelled as draft. If some new piece of content fails to be processed, check for this first! (I recommend getting rid of the draft entry in the frontmatter entirely; it's just too rich a source of confusion. Once in production, one may want to introduce it again, as a way to structure the workflow. But during set-up and experimentation, it is an unnecessary nuisance.)

  • Recent versions of Hugo do not pass embedded HTML through to downstream processing, but instead discard it. (This is also true for shortcodes that have expanded into HTML.) To allow embedded HTML, it is necessary to add the following to the global configuration file:

    [markup]
      [markup.goldmark]
        [markup.goldmark.renderer]
          unsafe = true
  • By default, the Hugo development server is only visible locally. If you want to access it from another device, you must specify the public IP address using the --bind option.

Part 6: Closing Thoughts --- The Good, the Bad, the Rest

This is the extent to which I have researched Hugo, in order to feel comfortable applying it to my problems. I'm sure there is more --- but that's all I needed, and therefore all I know. (And even that is most likely not entirely accurate.) Should I ever need to learn more, I may add to this write-up.

It's time to take inventory of my impressions of working with Hugo so far: the Good, the Bad, and All That.

The (Mostly) Good...

  • I like the idea of a "static website", although it does take a little getting used to that there is no URL rewriting at runtime. The inability to respond to user input is, of course, a fundamental limitation, which is more painful than I had anticipated. (Very few sites are really, truly, completely static!)

  • I like the idea of writing content in flat text files, using a lightweight data format, but I am not convinced about Markdown. It's never good if the input format is fundamentally less expressive than the intended output --- the mere existence of shortcodes (a blatantly hackish workaround) or the (even more hackish) technique of embedding "pass-through" HTML demonstrates Markdown's inadequacy. Moreover, in the present case, there are two output formats: HTML and LaTeX (at least for someone like me who needs MathJax), requiring even more hackish (and probably brittle) workarounds. I think the ultimate realization is that a contemporary web experience, including color and layout, not to mention MathJax, requires a different, more expressive input format.

  • I like the clear separation of content and presentation: content is kept as Markdown, presentation is (almost entirely) left to templates. No lock-in: it should be fairly easy to take the content to another platform if necessary or desired.

  • I like the "cascade" in the template lookup: to customize a theme, I don't need to touch the original files, but can simply override them in the project's layouts/ directory. I also appreciate how this decouples any customizations from upgrades to the original theme (at least in principle).

  • I like the built-in support for metadata: tag collections, sitemaps, and so on. It's too bad that they only exist as internal data structures, and are not written to file (where they would be available for use by external add-on tools).

  • I actually like the frontmatter, and the ability to add configuration details right in the content itself, without the need to touch a separate configuration file. The downside is that configuration details are potentially scattered around widely. I try to limit the use of frontmatter only to configurations specific to the piece of content they are part of, and put global settings into the global configuration file. (A mechanism for directory-level configuration files, like Apache's .htaccess, would be a nice-to-have.)

  • I like the built-in development server, and the live-reload functionality, but neither is essential. A nice-to-have.

  • I like that (once the entire set-up and configuration is complete!) adding content is indeed very, very simple.

  • Finally, Hugo often gets credit for being fast: that doesn't matter to a small site (like mine) with a few dozen pages, but I can see that it's nice for a larger installation. Even for a small site, it's nice that rebuilds are essentially instantaneous.

The (Mostly) Bad...

  • Hugo is too difficult to understand. The amount of time I spent to get even to my current, limited level of understanding is nothing short of ridiculous. Given what Hugo tries to accomplish, it is too complex; and the documentation, although extensive, is fragmented, and does little to help a mere user make sense of Hugo.

  • Hugo's reliance on opaque rules and implicit conventions makes it very difficult to understand and predict its behavior. This is where the documentation's lack of a comprehensive, conceptual overview is most painfully felt.

  • Hugo relies on implicit rules and conventions in an attempt to dispense with the need for explicit configuration. Unfortunately, understanding and manipulating all the implied rules and conventions causes more pain (and more uncertainty) than some reasonable amount of explicit configuration would. Moreover, I find that the explicit configuration keeps sneaking in by the back door: No piece of content without frontmatter! At that point, I could dispense with all the opaque black magic around conventions and rules, and instead rely on minimal, but explicit configuration to begin with.

  • The "list-and-page" metaphor seems too restrictive for general applications. It does not allow combining two or more pieces of content on the same page, making it impossible, for example, to create a contemporary, rich portal page. (Also notice how many of the themes look alike: austere personal web pages with a text-centric blog. How much complexity would really be required to support just that?)

  • Hugo provides few creature comforts. The tooling is rudimentary. Some of the out-of-the-box defaults (such as the draft: true default archetype) help neither the beginner nor the advanced user. The absence of a built-in theme that is guaranteed available and guaranteed compatible is particularly painful.

  • The absence of diagnostics deserves a separate, dishonorable mention. If things don't go as planned (which, because of all the implied rules, happens a lot --- in particular in the beginning), there is no way of finding out what's wrong. In particular, it should be possible to receive a warning whenever Hugo decides to ignore certain files in the source tree, rather than just discarding them silently.

  • I am ambivalent regarding the choice of Go (Golang) as implementation language. Its main advantage (for this project) is the high execution speed, which is indeed nice, and, to a lesser degree, the sense that it is all more than just "a bunch of scripts". On the downside stands the extreme closedness of the application binary, both with respect to runtime inspection as well as extensibility. By design, Go does not offer a plugin or extension mechanism. (This has changed, very recently.) Runtime integration of Go tools is supposed to occur via HTTP: fine for server applications, but inconvenient for a desktop app. One consequence seems to be that a lot of logic moves into the templates, because they provide a user-space method to add behavior. But as probably many people who have worked with template engines will agree, that's not a good place for application logic! (And that's not even mentioning Go's strange add-on language for template logic, with its mixture of Lisp and Unix pipeline syntax.)

  • I worry how safe it will be to rely on Hugo in the long run. Hugo has already seen a number of false starts and feature set changes. It depends on a large number of external libraries (close to 100). Finally, running Hugo requires a theme, with its own set of updates (or lack thereof). There seems to be lots of potential for either surreptitious or catastrophic version skews, incompatibilities, or other behavior changes. Given the history and state of the project, I am not confident that the Hugo project will isolate or protect the end user effectively from these risks.

In Conclusion

I am glad my site is up; Hugo has been helpful in getting me there.

But I feel that it required too much, in fact way too much effort for what should have been a relatively simple exercise. At this point, I've made the investment, and I hence will stick with it for a while, but I can't help thinking that there has got to be an easier way!

ghost commented Apr 28, 2024

Thanks for this nice piece. I've been banging my head on the table for the last 2 days, getting Hugo to do what I want.
Is it me or is it just very complicated to turn my own design into a template. Thinking of moving to Wordpress...

janert commented Apr 28, 2024

Thanks for the note; glad you found the write-up helpful. Getting it going was a nightmare, but now that things are up and running, it is indeed very easy and comfortable to use. (You might also find this interesting, when evaluating the web-publishing space more generally: https://janert.me/blog/2022/jamstack-reconsidered/.)

Personally, I have not tried to turn a design into templates; I have always started with a theme and tried to work with that, so I can't really comment on the difficulties you might encounter there - sorry!

ghost commented Apr 29, 2024

Thanks, I will continue with Hugo anyway. It's a pity to quit now after all these hours of trying and studying ;-).
Btw; I liked how you talked about Hugo being "implicit". As a matter of fact, I think that is a huge challenge for programming and languages in general. Many frameworks and languages "assume" that simple users know how to do certain things to achieve certain other things to achieve other things that will lead to the desired result. It's a recipe for disaster because it's all these in between steps that make some things hard to do. In the Jamstack article you talked about the "pull" model, which sounds very interesting and certainly easier. It's like putting up the structure of a house and knowing where the windows should go, as opposed to building the house around those windows.

janert commented Apr 29, 2024

I think that is a reasonable strategy: as I said, getting a site up with Hugo for the first time is difficult, but once it's up, things work very smoothly and easily. So, in that sense, I found that the initial investment paid off.
Also keep in mind: I have not kept up with Hugo development, it is possible that some things have gotten easier in the meantime.
