human (Dec 30, 2024, 09:12 PM)
- blog_entry id, title, slug, body, created, metadata, search_document, tweet_html, import_ref, extra_head_html, card_image, custom_template, series_id, is_draft, live_timezone
- blog_entry_tags id, entry_id, tag_id
- blog_tag id, tag, description
SQL query postgresql to return id, title, created for every blog_entry in 2024 that was tagged with tag=llms
assistant (Dec 30, 2024, 09:12 PM)
I'll help you write a SQL query to get blog entries from 2024 tagged with "llms". We'll need to join all three tables to connect the entries with their tags.
SELECT DISTINCT
be.id,
be.title,
be.created
FROM blog_entry be
JOIN blog_entry_tags bet ON be.id = bet.entry_id
JOIN blog_tag bt ON bet.tag_id = bt.id
WHERE
bt.tag = 'llms'
AND EXTRACT(YEAR FROM be.created) = 2024
ORDER BY be.created DESC;
This query:
- Joins
blog_entry
withblog_entry_tags
and then withblog_tag
- Filters for entries tagged with "llms"
- Uses EXTRACT to filter for entries created in 2024
- Uses DISTINCT to avoid duplicate entries in case a post has multiple tags
- Orders results by creation date (newest first)
human (Dec 30, 2024, 09:14 PM)
SELECT DISTINCT
be.created,
be.slug,
be.title,
be.id
FROM blog_entry be
JOIN blog_entry_tags bet ON be.id = bet.entry_id
JOIN blog_tag bt ON bet.tag_id = bt.id
WHERE
bt.tag = 'llms'
AND EXTRACT(YEAR FROM be.created) = 2024
ORDER BY be.created
Now add a url
column - you'll need to calculate that from the be.created and the be.slug where the URL format is e.g. https://simonwillison.net/2024/Dec/9/llama-33-70b/
- note the Dec month and the no-leading-0 9 for the day
assistant (Dec 30, 2024, 09:14 PM)
I'll help you construct the URL using date extraction functions. We'll need to:
- Extract the month as a 3-letter abbreviation (like 'Dec')
- Extract the day without leading zeros
- Combine with the slug
Here's the modified query:
SELECT DISTINCT
be.created,
be.slug,
be.title,
be.id,
'https://simonwillison.net/' ||
EXTRACT(YEAR FROM be.created) || '/' ||
TO_CHAR(be.created, 'Mon') || '/' ||
TRIM(LEADING '0' FROM TO_CHAR(EXTRACT(DAY FROM be.created), '99')) || '/' ||
be.slug || '/' as url
FROM blog_entry be
JOIN blog_entry_tags bet ON be.id = bet.entry_id
JOIN blog_tag bt ON bet.tag_id = bt.id
WHERE
bt.tag = 'llms'
AND EXTRACT(YEAR FROM be.created) = 2024
ORDER BY be.created
This query:
- Uses TO_CHAR with 'Mon' format to get the 3-letter month abbreviation
- Uses EXTRACT(DAY) to get the day number
- Uses TO_CHAR and TRIM to remove leading zeros from the day
- Concatenates all parts with '/' between them
- Keeps all your existing columns and ordering
human (Dec 30, 2024, 09:16 PM)
Now it's this:
SELECT DISTINCT
be.created,
be.slug,
be.title,
be.id,
'https://simonwillison.net/' ||
EXTRACT(YEAR FROM be.created) || '/' ||
TO_CHAR(be.created, 'Mon') || '/' ||
EXTRACT(DAY FROM be.created) || '/' ||
be.slug || '/' as url
FROM blog_entry be
JOIN blog_entry_tags bet ON be.id = bet.entry_id
JOIN blog_tag bt ON bet.tag_id = bt.id
WHERE
bt.tag = 'llms'
AND EXTRACT(YEAR FROM be.created) = 2024
ORDER BY be.created
But I need a query that outputs the following:
- January
- [Title of post](URL to post)
- [Title of post](URl to post)
- February
- ...
So it needs to do some fancy aggregations and grouping by month to generate that text
assistant (Dec 30, 2024, 09:16 PM)
I'll help create a query that generates Markdown-formatted output grouped by month. We'll use string_agg to concatenate the entries and format them as Markdown links:
WITH monthly_posts AS (
SELECT
EXTRACT(MONTH FROM be.created) as month_num,
TO_CHAR(be.created, 'Month') as month_name,
string_agg(
' - [' || be.title || '](https://simonwillison.net/' ||
EXTRACT(YEAR FROM be.created) || '/' ||
TO_CHAR(be.created, 'Mon') || '/' ||
EXTRACT(DAY FROM be.created) || '/' ||
be.slug || '/)',
E'\n'
ORDER BY be.created
) as post_links
FROM blog_entry be
JOIN blog_entry_tags bet ON be.id = bet.entry_id
JOIN blog_tag bt ON bet.tag_id = bt.id
WHERE
bt.tag = 'llms'
AND EXTRACT(YEAR FROM be.created) = 2024
GROUP BY
EXTRACT(MONTH FROM be.created),
TO_CHAR(be.created, 'Month')
ORDER BY month_num
)
SELECT
'- ' || trim(month_name) || E'\n' || post_links as markdown
FROM monthly_posts;
This query:
- Creates a CTE that groups posts by month
- Uses string_agg to create the bullet-pointed links for each post
- Formats each month as a heading with its posts underneath
- Orders everything chronologically by month
- Returns a single column containing the formatted Markdown text
The output will look exactly like your example, with each month as a heading and the posts as nested bullet points with proper Markdown link formatting.
human (Dec 30, 2024, 09:18 PM)
perfect - now add the day of the month like this:
- January
- 1st: [Title of post](URL to post)
- 14th: [Title of post](URl to post)
assistant (Dec 30, 2024, 09:18 PM)
Here's the modified query that adds the day with the correct ordinal suffix (st, nd, rd, th):
WITH monthly_posts AS (
SELECT
EXTRACT(MONTH FROM be.created) as month_num,
TO_CHAR(be.created, 'Month') as month_name,
string_agg(
' - ' ||
EXTRACT(DAY FROM be.created)::integer ||
CASE
WHEN EXTRACT(DAY FROM be.created)::integer % 10 = 1
AND EXTRACT(DAY FROM be.created)::integer != 11 THEN 'st'
WHEN EXTRACT(DAY FROM be.created)::integer % 10 = 2
AND EXTRACT(DAY FROM be.created)::integer != 12 THEN 'nd'
WHEN EXTRACT(DAY FROM be.created)::integer % 10 = 3
AND EXTRACT(DAY FROM be.created)::integer != 13 THEN 'rd'
ELSE 'th'
END ||
': [' || be.title || '](https://simonwillison.net/' ||
EXTRACT(YEAR FROM be.created) || '/' ||
TO_CHAR(be.created, 'Mon') || '/' ||
EXTRACT(DAY FROM be.created) || '/' ||
be.slug || '/)',
E'\n'
ORDER BY be.created
) as post_links
FROM blog_entry be
JOIN blog_entry_tags bet ON be.id = bet.entry_id
JOIN blog_tag bt ON bet.tag_id = bt.id
WHERE
bt.tag = 'llms'
AND EXTRACT(YEAR FROM be.created) = 2024
GROUP BY
EXTRACT(MONTH FROM be.created),
TO_CHAR(be.created, 'Month')
ORDER BY month_num
)
SELECT
'- ' || trim(month_name) || E'\n' || post_links as markdown
FROM monthly_posts;
The key changes are:
- Added the day number using
EXTRACT(DAY FROM be.created)
- Added a CASE statement to handle the ordinal suffixes correctly:
- st for numbers ending in 1 (but not 11)
- nd for numbers ending in 2 (but not 12)
- rd for numbers ending in 3 (but not 13)
- th for all other numbers (including 11th, 12th, 13th)