Last active March 18, 2023 07:26
chat gpt response
require 'csv'
require 'yaml'
require 'date'
require 'redcarpet'

markdown_folder_path = '/path/to/markdown/folder'
csv_file_path = '/path/to/csv/file.csv'

# Initialize CSV file with headers
CSV.open(csv_file_path, 'wb') do |csv|
  csv << ['Date', 'Title', 'URL']
end

# Initialize markdown parser
markdown = Redcarpet::Markdown.new(Redcarpet::Render::HTML)

# Loop through each markdown file in the folder
Dir.glob("#{markdown_folder_path}/*.md") do |markdown_file_path|
  # Read markdown file and extract YAML frontmatter
  file_content = File.read(markdown_file_path)
  # Date and Time are permitted so that unquoted dates load on newer Ruby versions
  frontmatter = YAML.safe_load(file_content.match(/\A---\n(.+?)\n---/m)[1],
                               permitted_classes: [Date, Time])

  # Extract date and title from frontmatter
  date = frontmatter['date']
  title = frontmatter['title']

  # Parse markdown and extract the href value of each anchor tag.
  # scan returns [quote, url] pairs because the pattern has two groups; keep the url.
  html = markdown.render(file_content)
  urls = html.scan(/<a\s+(?:[^>]*?\s+)?href=(["'])(.*?)\1/m).map(&:last)

  # Add each URL to CSV file
  urls.each do |url|
    CSV.open(csv_file_path, 'a+') do |csv|
      csv << [date, title, url]
    end
  end
end

Here's a brief overview of how the script works:

  • The script requires the csv, yaml, and redcarpet libraries, which are used to write the CSV file, parse YAML frontmatter, and render markdown, respectively. Redcarpet is a third-party gem and must be installed first (gem install redcarpet).

  • The script initializes the paths to the folder containing the markdown files and the CSV file that will be created.

  • The script opens the CSV file and adds headers for the three columns.

  • The script initializes a Redcarpet markdown parser.

  • The script uses the Dir.glob method to loop through each markdown file in the folder.

  • For each markdown file, the script reads the file contents and extracts the YAML frontmatter using a regular expression.

  • The script extracts the date and title from the frontmatter.

  • The script parses the markdown file into HTML using the Redcarpet parser.

  • The script uses another regular expression to extract the href attribute value of every anchor tag in the HTML.

  • The script loops through each URL and adds a new row to the CSV file with the date, title, and URL.
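
The URL-extraction step can be tried in isolation, without Redcarpet. The following is a minimal sketch: the helper name extract_hrefs and the sample HTML are hypothetical and not part of the script; only the regular expression is taken from it. Because the pattern has two capture groups (the quote character and the URL), String#scan returns a pair per match, so the sketch keeps only the second element of each pair.

```ruby
# Hypothetical helper (not in the original script): collect href values from HTML.
HREF_PATTERN = /<a\s+(?:[^>]*?\s+)?href=(["'])(.*?)\1/m

def extract_hrefs(html)
  # scan yields [quote, url] for each match; keep only the url.
  html.scan(HREF_PATTERN).map { |_quote, url| url }
end

# Made-up sample input covering both quote styles and an extra attribute.
sample_html = %(<p><a href="https://example.com/a">A</a> and ) +
              %(<a class="x" href='https://example.com/b'>B</a></p>)

puts extract_hrefs(sample_html)
```

A regular expression is enough for the simple anchors a markdown renderer emits, but it is not a general HTML parser; unquoted href values, for instance, would be missed.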

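Note that this script assumes that every markdown file in the folder begins with YAML frontmatter containing date and title fields. If a file has no frontmatter, the regular expression match returns nil and the script stops with a NoMethodError; if a field is missing, the corresponding CSV cell is left empty. If your markdown files have different frontmatter or none at all, you'll need to modify the script accordingly.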
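
One possible way to relax that assumption is to treat missing frontmatter as an empty hash instead of failing. This is only a sketch: the helper name read_frontmatter and the sample strings are hypothetical, and it assumes a Ruby version whose YAML.safe_load accepts the permitted_classes keyword.

```ruby
require 'yaml'
require 'date'

# Hypothetical helper (not in the original script): returns the frontmatter
# as a Hash, or an empty Hash when the file has no frontmatter block.
def read_frontmatter(text)
  match = text.match(/\A---\n(.+?)\n---/m)
  return {} unless match

  # Permit Date and Time so unquoted dates in the frontmatter can be loaded.
  data = YAML.safe_load(match[1], permitted_classes: [Date, Time])
  data.is_a?(Hash) ? data : {}
end

# Made-up sample inputs.
with_frontmatter = "---\ntitle: Hello\ndate: 2023-03-18\n---\nBody text\n"
without_frontmatter = "Just a body, no frontmatter\n"

p read_frontmatter(with_frontmatter)['title']
p read_frontmatter(without_frontmatter)
```

With a helper like this, the main loop could skip a file, or write empty Date and Title cells, when the returned hash has no date or title key.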
