@yaythomas
Last active February 10, 2024 20:45
convert flat hugo content pages to leaf bundles

convert hugo content files to page leaf bundles

This pipeline converts Hugo content pages to page leaf bundles.

From:

|-- content
|   |-- about.md
|   `-- post
|       |-- content-1.md
|       |-- content-2.md
|       `-- content-3.md
`-- static
    `-- images
        |-- image-1.jpg
        |-- image-2.jpg
        `-- image-3.jpg

To:

`-- content
    |-- about.md
    `-- post
        |-- content-1
        |   |-- index.md
        |   `-- images
        |       `-- image-1.jpg
        `-- content-2
            |-- index.md
            `-- images
                |-- image-1.jpg
                `-- image-2.jpg
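
As a rough illustration of the layout change above, here is a minimal Python sketch of how a flat content file maps to its leaf bundle. The function name bundle_paths is hypothetical, and the default output root is assumed to be content/posts/out as described under input arguments. It mirrors what the pipeline does with the file stem; it is not the pipeline itself.

```python
from pathlib import Path


def bundle_paths(md_file, out_root="content/posts/out"):
    """Return (bundle_dir, index_file) for one flat content file.

    Hypothetical helper: the bundle directory takes the markdown file's
    stem as its name, and the page itself becomes index.md inside it.
    """
    md_path = Path(md_file)
    bundle_dir = Path(out_root) / md_path.stem
    return bundle_dir, bundle_dir / "index.md"


bundle_dir, index_file = bundle_paths("content/posts/post1.md")
print(bundle_dir.as_posix())   # content/posts/out/post1
print(index_file.as_posix())   # content/posts/out/post1/index.md
```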

requirements

to run me, you need pypyr.

this assumes Python 3.6 or higher.

tl;dr:

$ pip install pypyr

how to run

Run from the root of your Hugo site repo.

Put the attached flat-to-bundle.yaml file into a directory ./ops.

The ./ops directory is just for tidiness. You can put the flat-to-bundle.yaml file in the repo root, in which case you run $ pypyr flat-to-bundle, or in any other directory, in which case you run $ pypyr my-dir/flat-to-bundle

# use a glob to specify a bunch of files to process in one go:
$ pypyr ops/flat-to-bundle in='content/posts/post*.md'

# process a single file:
$ pypyr ops/flat-to-bundle in='content/posts/post1.md'

# set parent out directory (default is content/posts/out):
$ pypyr ops/flat-to-bundle in='content/posts/post*.md' out=content/out

# set image destination sub-directory in the bundle:
$ pypyr ops/flat-to-bundle in='content/posts/post*.md' out=content/out out_img=img

# set image source root directory (default is './static'):
$ pypyr ops/flat-to-bundle in='content/posts/post*.md' out=content/out in_img=assets out_img=img

input arguments

All of these are optional. Specify an argument only when you want something other than its default value.

  • in: glob for the input file or files to process. Default content/posts/*.md
  • out: create the bundles in this output directory. Default content/posts/out
  • in_img: root directory for input images. Default static. If you keep your images in assets, change that here.
  • out_img: sub-directory for images in the output bundle. Default images.
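
To make in_img and out_img a bit more concrete, here is a small Python sketch of how one image reference could be mapped to a source file, a destination inside the bundle, and a rewritten link. The function name map_image is hypothetical. The sketch uses PurePosixPath for the link so it also runs on Windows, which is an assumption on my part; the pipeline itself uses PosixPath.

```python
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse


def map_image(img_ref, bundle_dir, in_img="static", out_img="images"):
    """Return (source, dest, new_link) for one markdown image reference.

    Hypothetical helper that follows the same idea as the pipeline:
    - source:   the URL path of the reference, looked up under in_img
    - dest:     the file name only, placed in <bundle_dir>/<out_img>
    - new_link: relative link that replaces the original in index.md
    """
    basename = PurePosixPath(img_ref).name
    source = Path(in_img).joinpath(urlparse(img_ref).path.strip("/"))
    dest = Path(bundle_dir).joinpath(out_img, basename)
    new_link = str(PurePosixPath(out_img).joinpath(basename))
    return source, dest, new_link


source, dest, new_link = map_image("/img/img1.jpg", "content/posts/out/post1")
print(source.as_posix())  # static/img/img1.jpg
print(dest.as_posix())    # content/posts/out/post1/images/img1.jpg
print(new_link)           # images/img1.jpg
```

Because only the file name survives in the destination, two images with the same name in different source directories would land on the same file inside a bundle.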

markdown image formats

Tested with the following styles of markdown image declarations:

![image 1 here](/img/img1.jpg)
![image 4 here](http://myurl.blah/img/img4.jpg)
![](/img/img2-arb.jpg)

Note: NOT particularly tested with image-within-link.
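
If you want to check what the pipeline will pick up before you run it, this short Python sketch applies the same regular expression the pipeline uses to a sample string. The names IMG_PATTERN, find_images and sample are made up for the example. The image-within-link line is only a single simple case, so treat it as a hint rather than proof that every such link works.

```python
import re

# Same pattern as in the pipeline: capture whatever sits between the
# parentheses of a ![alt](target) image declaration.
IMG_PATTERN = re.compile(r'(?:!\[.*?\]\()(.+?)(?:\))')


def find_images(markdown_text):
    """Return the set of image targets found in a markdown string."""
    return set(IMG_PATTERN.findall(markdown_text))


sample = """\
![image 1 here](/img/img1.jpg)
![image 4 here](http://myurl.blah/img/img4.jpg)
![](/img/img2-arb.jpg)
[![linked image](/img/img5.jpg)](http://myurl.blah/page)
"""

for target in sorted(find_images(sample)):
    print(target)
```

In this sample the link target http://myurl.blah/page is not reported, because the pattern only matches a target that directly follows a `![...]` part.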

Note: unlikely to work as-is with multilingual sites. You might need to tweak lines 61 & 63 of the pipeline for this.

# to run me, you need pypyr. https://github.com/pypyr/pypyr
#
# run from the root of your hugo repo.
#
# put this file in ./ops/flat-to-bundle.yaml
#
# use a glob to specify a bunch of files to process in one go:
# $ pypyr ops/flat-to-bundle in='content/posts/post*.md'
#
# process a single file:
# $ pypyr ops/flat-to-bundle in='content/posts/post1.md'
#
# set out directory (default is content/posts/out):
# $ pypyr ops/flat-to-bundle in='content/posts/post*.md' out=content/out
#
# set image destination sub-directory in the bundle:
# $ pypyr ops/flat-to-bundle in='content/posts/post*.md' out=content/out out_img=img
#
# set image source root directory (default is 'static'):
# $ pypyr ops/flat-to-bundle in='content/posts/post*.md' out=content/out in_img=assets out_img=img
context_parser: pypyr.parser.keyvaluepairs
steps:
  - name: pypyr.steps.default
    comment: default in/out dirs if not specified
    in:
      defaults:
        in: content/posts/*.md # glob for input file or files.
        out: content/posts/out # create bundles in this root directory
        in_img: static # in images root dir
        out_img: images # dest images sub-dir in bundle
  - name: pypyr.steps.contextsetf
    in:
      contextSetf:
        processed_images: !py set()
  - name: pypyr.steps.glob
    description: '--> processing this path: {in}'
    in:
      glob: '{in}'
  - name: pypyr.steps.call
    comment: iterate over each file found for path glob
    foreach: '{globOut}'
    in:
      call: do_file
  - name: pypyr.steps.echo
    description: --> processed images
    foreach: '{processed_images}'
    in:
      echoMe: "{i}"
  - name: pypyr.steps.echo
    in:
      echoMe: done!
do_file:
  - name: pypyr.steps.py
    description: '--> processing file: {i}'
    in:
      pycode: |
        from pathlib import Path
        current_path = Path(context['i'])
        context['current_file'] = current_path.stem
        current_out_dir = Path(context['out']).joinpath(current_path.stem)
        context['current_out_dir'] = current_out_dir
        context['current_out_file'] = current_out_dir.joinpath('index.md')
  - name: pypyr.steps.py
    comment: parse current .md for ![]() image tags
    in:
      pycode: |
        imgs_in_md = set()
        from pathlib import Path, PosixPath
        from urllib.parse import urlparse
        import re
        with open(context['i']) as f:
            imgs_in_md.update(re.findall(r'(?:!\[.*?\]\()(.+?)(?:\))',f.read()))
        img_path_prefix = PosixPath(context['out_img'])
        img_replace_str = ']({})'
        img_in_root = Path(context['in_img'])
        out_dir = context['current_out_dir']
        img_out_dir = out_dir.joinpath(context['out_img'])
        replace_pairs = {}
        images = {}
        if imgs_in_md:
            img_out_dir.mkdir(parents=True, exist_ok=True)
            for img in imgs_in_md:
                basename = PosixPath(img).name
                replace_pairs[img_replace_str.format(img)] = img_replace_str.format(img_path_prefix.joinpath(basename))
                img_in_path = img_in_root.joinpath(urlparse(img).path.strip('/'))
                images[img_in_path] = out_dir.joinpath(context['out_img'], basename)
                context['processed_images'].add(str(img_in_path))
        context['replacePairs'] = replace_pairs
        context['images'] = images
  - name: pypyr.steps.filereplace
    description: '--> writing to: {current_out_file}'
    in:
      fileReplace:
        in: '{i}'
        out: '{current_out_file}'
        replacePairs: '{replacePairs}'
  - name: pypyr.steps.cmd
    comment: cp instead of mv, in case there are duplicates
    foreach: !py images.items()
    in:
      cmd: cp -f {i[0]} {i[1]}