Skip to content

Instantly share code, notes, and snippets.

@jdevalk
Last active June 14, 2025 15:13
Show Gist options
  • Select an option

  • Save jdevalk/d4ae38001139f128dd65914cf03bd207 to your computer and use it in GitHub Desktop.

Select an option

Save jdevalk/d4ae38001139f128dd65914cf03bd207 to your computer and use it in GitHub Desktop.
WITH site_cms AS (
SELECT
url,
COALESCE(
MAX(IF(category = 'CMS', app, NULL)),
MAX(IF(category = 'Ecommerce' AND app != 'Cart Functionality', app, NULL)),
'Unrecognized'
) AS cms_name
FROM
`httparchive.technologies.2024_05_01_mobile`
GROUP BY
url
)
SELECT
cms_name,
COUNT(*) AS count,
ROUND( 100 * ( COUNT(*) / SUM(COUNT(*)) OVER() ), 2 ) AS market_share
FROM
site_cms
GROUP BY
cms_name
ORDER BY
count DESC
LIMIT 100
WITH site_builder AS (
SELECT
url,
COALESCE(
MAX(IF(category = 'Page builders', app, NULL)),
'Unrecognized'
) AS builder_name
FROM
`httparchive.technologies.2024_05_01_mobile`
GROUP BY
url
)
SELECT
builder_name,
COUNT(*) AS count,
ROUND( 100 * ( COUNT(*) / SUM(COUNT(*)) OVER() ), 2 ) AS market_share
FROM
site_builder
GROUP BY
builder_name
ORDER BY
count DESC
LIMIT 100
WITH site_plugin AS (
SELECT
url,
COALESCE(
MAX(IF(category = 'WordPress plugins' AND app IN ('All in One SEO Pack', 'RankMath SEO', 'Slim SEO', 'The SEO Framework', 'Yoast SEO'), app, NULL)),
'Unrecognized'
) AS plugin_name
FROM
`httparchive.technologies.2024_05_01_mobile`
GROUP BY
url
)
SELECT
plugin_name,
COUNT(*) AS count,
ROUND( 100 * ( COUNT(*) / SUM(COUNT(*)) OVER() ), 2 ) AS market_share
FROM
site_plugin
GROUP BY
plugin_name
ORDER BY
count DESC
LIMIT 100
@matteo-greco
Copy link
Copy Markdown

matteo-greco commented Jun 14, 2025

Hey @jdevalk, I was looking for a way to get page builder stats and this was really useful - thank you!

This doc on minimizing BigQuery costs mentions that legacy tables such as httparchive.technologies.2024_05_01_mobile don't take advantage of partitioning and clustering. So I came up with a modified version of your page builder query. It uses about 25GB of processing - not sure if that's more or less than yours. I'll leave it here in case you want to look at it:

SELECT
  tech.technology AS technology,
  COUNT(*) AS count,
  ROUND(100 * COUNT(*) / SUM(COUNT(*)) OVER(), 2) AS market_share
FROM
  `httparchive.crawl.pages` p,
  UNNEST(p.technologies) AS tech
WHERE
  p.date = '2025-05-01'
  AND p.is_root_page
  AND 'Page builders' IN UNNEST(tech.categories)
  AND EXISTS (
    SELECT 1
    FROM UNNEST(p.technologies) AS t2
    WHERE t2.technology = 'WordPress'
  )
GROUP BY technology
ORDER BY count DESC

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment