@ivan3bx
Last active February 9, 2024 00:16
Simple script using Puppeteer to generate image previews from article links on a Hugo blog
version: '2.4'
services:
  hugo:
    build:
      context: .
      dockerfile: Dockerfile-hugo
    mem_limit: 256m
    platform: linux/amd64
    volumes:
      - ../:/var/www
    # Publish the Hugo dev server on localhost port 1313
    ports:
      - "127.0.0.1:1313:1313"
    expose:
      - "1313"
  puppet:
    build:
      context: .
      dockerfile: Dockerfile
    mem_limit: 512m
    volumes:
      - ./:/tmp/data/site
    logging:
      driver: none
    depends_on:
      - hugo
# Dockerfile-hugo (used by the hugo service)
FROM ghcr.io/peaceiris/hugo:v0.122.0-mod
RUN apt-get update && apt-get install -y \
    git bash net-tools \
    && rm -rf /var/lib/apt/lists/*
# Set the working directory
RUN mkdir -p /var/www
WORKDIR /var/www
ENTRYPOINT ["hugo", "server", "--renderToDisk", "--bind", "0.0.0.0", "--baseURL=http://hugo:1313"]
# Dockerfile (used by the puppet service)
FROM ghcr.io/puppeteer/puppeteer:latest
RUN mkdir -p /tmp/data/site
WORKDIR /tmp/data/site
// script.mjs
const puppeteer = require('puppeteer')
// ESM alternative: import puppeteer from 'puppeteer';

// Fetch all "/posts/" URLs from sitemap.xml
async function fetchAndExtract() {
  try {
    const response = await fetch('http://hugo:1313/sitemap.xml')
    const xmlData = await response.text()
    // Regular expression to match <loc> elements
    const locRegex = /<loc>(.*?)<\/loc>/g
    const matchingLocs = []
    let match
    while ((match = locRegex.exec(xmlData)) !== null) {
      const loc = match[1]
      // Keep individual posts only; the bare "/posts/" listing page does not match
      if (loc.match(/.\/posts\/./)) {
        matchingLocs.push(loc)
      }
    }
    console.log('Matching <loc> elements:')
    console.log(matchingLocs)
    return matchingLocs
  } catch (error) {
    console.error('Error fetching or parsing XML:', error)
    return []
  }
}
(async () => {
  // Launch the browser and open a new blank page
  const browser = await puppeteer.launch()
  const page = await browser.newPage()
  console.log("started browser")
  // Set the initial viewport size
  await page.setViewport({ width: 1000, height: 1200 })
  // Get all post URLs
  const urls = await fetchAndExtract()
  console.log(urls.length, "urls found")
  // For each post, screenshot its small.html version
  for (const url of urls) {
    const previewURL = url + 'small.html'
    console.log("target", previewURL)
    await page.goto(previewURL)
    // Wait for the <article> element
    await page.waitForSelector("article")
    const bodyHeight = await page.evaluate(() => {
      const article = document.querySelector("article")
      // Article height plus offsetTop of padding above and below
      return article.clientHeight + 2 * article.offsetTop
    })
    console.log('The height is "%s".', bodyHeight)
    // Resize the viewport to fit the article before capturing
    await page.setViewport({ width: 1000, height: bodyHeight })
    await page.screenshot({ path: previewURL.replaceAll("/", "_") + '.png' })
  }
  await browser.close()
})()
ivan3bx commented Feb 9, 2024

Running this example:

  1. Ensure docker-compose mounts your Hugo site at the right path (the hugo service mounts ../ at /var/www).
  2. docker-compose build should succeed.

Execute the following:

  docker-compose run --rm puppet node -e "$(cat ./script.mjs)"

Images are written to the current directory, one per post URL found in sitemap.xml.
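
To get a rough idea of which files to expect, the filtering and naming steps from the script can be replayed on a sample sitemap. This is only a sketch: the sample URLs and the helper names extractPostUrls and screenshotPath are made up for illustration, while the regular expressions and the replaceAll call are copied from the script.

```javascript
// Hypothetical sitemap content; real URLs come from http://hugo:1313/sitemap.xml
const sampleSitemap = `
<urlset>
  <url><loc>http://hugo:1313/</loc></url>
  <url><loc>http://hugo:1313/posts/</loc></url>
  <url><loc>http://hugo:1313/posts/hello-world/</loc></url>
  <url><loc>http://hugo:1313/about/</loc></url>
</urlset>`

// Same filter as the script: keep <loc> values that contain "/posts/"
// followed by at least one more character (so the listing page is skipped).
function extractPostUrls(xmlData) {
  const locRegex = /<loc>(.*?)<\/loc>/g
  const urls = []
  let match
  while ((match = locRegex.exec(xmlData)) !== null) {
    if (match[1].match(/.\/posts\/./)) {
      urls.push(match[1])
    }
  }
  return urls
}

// Same naming as the script: append "small.html", replace every "/" with "_",
// then add the ".png" extension.
function screenshotPath(url) {
  const previewURL = url + 'small.html'
  return previewURL.replaceAll('/', '_') + '.png'
}

const postUrls = extractPostUrls(sampleSitemap)
const paths = postUrls.map(screenshotPath)
console.log(paths)
// prints [ 'http:__hugo:1313_posts_hello-world_small.html.png' ]
```

Note that the file name keeps the scheme and host of the URL, and that the script expects every post to also render a small.html page.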