@goliatone
Forked from BrianLeishman/payload.json
Last active July 4, 2024 00:01
Serverless HTML to PDF with AWS Lambda + Chrome

AWS Lambda Image/PDF Capture

Puppeteer

To deploy Puppeteer on AWS Lambda we need a slimmed-down Chromium build (@sparticuz/chromium), shipped to the function as a Lambda layer.

Pick a compatible pair of puppeteer-core and Chromium versions from this list.

npm install puppeteer-core@$PUPPETEER_VERSION
npm install @sparticuz/chromium@$CHROMIUM_VERSION

To configure puppeteer:

const puppeteer = require("puppeteer-core");
const chromium = require("@sparticuz/chromium");

const browser = await puppeteer.launch({
  args: chromium.args,
  defaultViewport: chromium.defaultViewport,
  executablePath: await chromium.executablePath(),
  headless: chromium.headless,
  ignoreHTTPSErrors: true,
});

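Alternatively, render the HTML with setContent instead of writing it to a temporary file, and capture only the body element:

const page = await browser.newPage();
await page.setContent(html);

const content = await page.$("body");
const imageBuffer = await content.screenshot({ omitBackground: true });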
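The setContent approach can be wrapped in a small function so it is easy to reuse and to test. This is a minimal sketch, not the gist author's code: `renderBodyPng` is a hypothetical name, and `page` is assumed to be a Puppeteer Page obtained from `browser.newPage()`.

```javascript
// Hypothetical helper: render an HTML string and capture only <body> as a PNG.
// `page` is assumed to be a Puppeteer Page (or an object with the same methods).
async function renderBodyPng(page, html) {
  // Wait for the load event so resources referenced by the HTML are fetched.
  await page.setContent(html, { waitUntil: "load" });

  const body = await page.$("body");
  if (body === null) {
    throw new Error("document has no <body> element");
  }

  // omitBackground drops the default white background, so the PNG can be transparent.
  return body.screenshot({ type: "png", omitBackground: true });
}
```

Because the function only depends on the `page` object it is given, it can be exercised with a stub page in unit tests, without starting Chromium.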

Send the image to S3:

// assumes `s3` is an AWS SDK v3 `S3` client; with SDK v2, append `.promise()`
const s3File = await s3.putObject({
  Bucket: "<Your Bucket Name Here>",
  Key: `${screenshot}.png`,
  Body: imageBuffer,
  ContentType: "image/png",
});
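The AWS SDK for JavaScript expects capitalized parameter names (`Bucket`, `Key`, `Body`, `ContentType`) for this call. As a rough sketch, the parameters for either output format could be built by a helper like the one below; `buildPutObjectParams` and its lookup tables are hypothetical and not part of the gist.

```javascript
// Hypothetical helper: build PutObject parameters for a rendered capture.
const CONTENT_TYPES = { image: "image/png", pdf: "application/pdf" };
const EXTENSIONS = { image: "png", pdf: "pdf" };

function buildPutObjectParams(bucket, baseName, body, format = "image") {
  if (!Object.hasOwn(CONTENT_TYPES, format)) {
    throw new RangeError(`unsupported format: ${format}`);
  }
  return {
    Bucket: bucket,
    // The key's extension follows the chosen format.
    Key: `${baseName}.${EXTENSIONS[format]}`,
    Body: body,
    ContentType: CONTENT_TYPES[format],
  };
}
```

Assuming an SDK v3 `S3` client named `s3`, the upload would then read `await s3.putObject(buildPutObjectParams("<Your Bucket Name Here>", "screenshot", imageBuffer))`.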

Links

How to set up an S3 bucket and access it from the Lambda function: here. How to create a layer.

{
  "html": "<p>Hello, world!</p>",
  "format": "image|pdf"
}

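In the payload, `format` stands for one of the two values `image` or `pdf`. A minimal sketch of validating the event before Chromium is launched might look like this; `parsePayload` is a hypothetical helper, and falling back to `image` when `format` is missing is an assumption that mirrors the handler's else branch.

```javascript
// Hypothetical helper (not part of the gist): validate the invocation payload.
const FORMATS = ["image", "pdf"];

function parsePayload(event) {
  if (!event || typeof event.html !== "string" || event.html.length === 0) {
    throw new TypeError('payload must include a non-empty "html" string');
  }
  // Assumption: a missing format falls back to "image".
  const format = event.format === undefined ? "image" : event.format;
  if (!FORMATS.includes(format)) {
    throw new RangeError(`"format" must be one of: ${FORMATS.join(", ")}`);
  }
  return { html: event.html, format };
}
```

Failing early here avoids paying for a browser launch on a request that cannot succeed.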
const fs = require('fs');
const puppeteer = require('puppeteer-core');
const chromium = require('@sparticuz/chromium');

exports.handler = async (event) => {
  let browser = null;

  // /tmp is private to each execution environment, and an environment
  // handles one invocation at a time; the timestamp keeps files from
  // successive invocations in a reused (warm) environment apart
  const htmlPath = `/tmp/${Date.now()}.html`;

  try {
    browser = await puppeteer.launch({
      args: chromium.args,
      defaultViewport: chromium.defaultViewport,
      // inside Lambda, point at the Chromium binaries shipped in the layer
      executablePath: await chromium.executablePath(
        process.env.AWS_EXECUTION_ENV
          ? '/opt/nodejs/node_modules/@sparticuz/chromium/bin'
          : undefined,
      ),
      headless: chromium.headless,
      ignoreHTTPSErrors: true,
    });

    fs.writeFileSync(htmlPath, event.html);

    const page = await browser.newPage();
    await page.goto(`file://${htmlPath}`);

    const result = event.format === 'pdf'
      ? await makePDF(page)
      : await makeImage(page);

    // Buffer.from also covers Puppeteer versions that return a Uint8Array
    return Buffer.from(result).toString('base64');
  } finally {
    // errors propagate to Lambda as a rejected promise;
    // closing the browser also closes its pages
    if (browser !== null) {
      await browser.close();
    }
    fs.rmSync(htmlPath, { force: true });
  }
};

function makeImage(page) {
  // no `path` option: outside /tmp the Lambda file system is read-only,
  // and the handler only needs the returned bytes
  return page.screenshot();
}

function makePDF(page) {
  return page.pdf({
    displayHeaderFooter: false,
    printBackground: true,
    margin: {
      top: '0.4in',
      bottom: '0.4in',
      left: '0.4in',
      right: '0.4in',
    },
  });
}
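
The handler returns the capture as a base64 string, so the caller has to decode it before saving or forwarding it. The sketch below shows one way to do that on the caller side; `decodeCapture` is a hypothetical helper, and it relies only on the standard PNG and PDF file signatures to tell the two formats apart.

```javascript
// Hypothetical caller-side helper: decode the handler's base64 result
// and detect which format it contains from the file signature.
const PNG_SIGNATURE = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);
const PDF_SIGNATURE = Buffer.from("%PDF-", "ascii");

function decodeCapture(base64) {
  const bytes = Buffer.from(base64, "base64");
  if (bytes.subarray(0, PNG_SIGNATURE.length).equals(PNG_SIGNATURE)) {
    return { kind: "image", bytes };
  }
  if (bytes.subarray(0, PDF_SIGNATURE.length).equals(PDF_SIGNATURE)) {
    return { kind: "pdf", bytes };
  }
  throw new Error("decoded data is neither a PNG nor a PDF");
}
```

When the function is invoked directly, the response payload is the JSON encoding of that string, so it needs a `JSON.parse` before being passed to `decodeCapture`.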
@goliatone (author) commented:

Golang browser drivers and more:

  • rod: A Chrome DevTools Protocol driver for web automation and scraping.
  • chromedp: A faster, simpler way to drive browsers supporting the Chrome DevTools Protocol.
  • playwright-go: Playwright for Go, a browser automation library to control Chromium, Firefox, and WebKit with a single API.
