@bobmonsour
Created April 13, 2023 04:45
An Eleventy filter that extracts the meta description from within the <head> element of a web page
// getDescription - given a url, this Eleventy filter extracts the meta
// description from within the <head> element of a web page using the cheerio
// library.
//
// The full html content of the page is fetched using the eleventy-fetch plugin.
// If you have a lot of links from which you want to extract descriptions, the
// initial build time will be slow. However, the plugin will cache the content
// for a duration of your choosing (in this example, it's set to 1 day).
//
// The description is extracted from the <meta> element with the name attribute
// of "description".
//
// If no description is found, the filter returns an empty string. In the event
// of an error, the filter logs an error to the console and returns the string
// "(no description available)"
//
// Be sure to create a .cache folder in your project root and add .cache to your
// .gitignore file. See https://www.11ty.dev/docs/plugins/fetch/#installation
//
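const EleventyFetch = require("@11ty/eleventy-fetch");
const cheerio = require("cheerio");

// Register this inside your Eleventy config function, where eleventyConfig is
// in scope.
eleventyConfig.addFilter(
  "getDescription",
  async function getDescription(link) {
    try {
      // Fetch the page (cached for 1 day) and parse it with cheerio.
      let htmlcontent = await EleventyFetch(link, {
        duration: "1d",
        type: "buffer",
      });
      const $ = cheerio.load(htmlcontent);
      // console.log(
      //   "description: " + $("meta[name=description]").attr("content")
      // );
      // .attr() returns undefined when the element or attribute is missing,
      // so fall back to an empty string as documented above.
      return $("meta[name=description]").attr("content") || "";
    } catch (e) {
      console.log(
        "Error fetching description for " + link + ": " + e.message
      );
      return "(no description available)";
    }
  }
);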
@bobmonsour
Author

Thanks, Zach. I can't quite understand how to make that work, but I'm still in the early stages of my JavaScript and npm package knowledge journey. Once I understand "cascading asset bucketing" I think I'll be ready to conquer linkedom ;-)

@bobmonsour
Author

For my use case, specifically for the 11tybundle.dev site, I have changed the cache duration to '*', meaning that Eleventy will never fetch new data (after the first success). There's no need for me to be re-fetching complete blog posts to extract a description... once is quite enough.
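
The change described here amounts to swapping the `duration` option. A minimal sketch, assuming a hypothetical helper named `descriptionFetchOptions` (it is not part of the gist or of eleventy-fetch) that builds the options object passed to `EleventyFetch`:

```javascript
// Hypothetical helper (not in the original gist): builds the options object
// passed as the second argument to EleventyFetch(link, options).
function descriptionFetchOptions(cacheForever) {
  return {
    // "*" tells eleventy-fetch to keep the cached copy indefinitely;
    // "1d" is the one-day duration used in the gist.
    duration: cacheForever ? "*" : "1d",
    type: "buffer",
  };
}

// In the filter, the call would then look like:
//   let htmlcontent = await EleventyFetch(link, descriptionFetchOptions(true));
```

With `"*"`, a page whose description changes later will keep its old cached value until the `.cache` folder is cleared, which is likely acceptable when the descriptions come from already-published posts.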
