Skip to content

Instantly share code, notes, and snippets.

Show Gist options
  • Save kjschiroo/2ea2ee5189797aa96cf0ef105326ad2e to your computer and use it in GitHub Desktop.
Save kjschiroo/2ea2ee5189797aa96cf0ef105326ad2e to your computer and use it in GitHub Desktop.
CREATE TABLE labeled_page
SELECT
article.page_id,
article.page_namespace,
article.page_title,
cl.cl_type AS category_type,
cl.cl_to AS label,
DATE_FORMAT(cl.cl_timestamp, "%Y%m%d%H%i%S") AS last_update
FROM enwiki_p.categorylinks cl
JOIN enwiki_p.page tp ON
cl.cl_from = tp.page_id AND
tp.page_namespace = 1
JOIN enwiki_p.page article ON
article.page_title = tp.page_title AND
article.page_namespace = 0
JOIN pages s_pages ON
article.page_id = s_pages.page_id
WHERE
cl.cl_to LIKE "Top-importance%articles" OR
cl.cl_to LIKE "High-importance%articles" OR
cl.cl_to LIKE "Mid-importance%articles" OR
cl.cl_to LIKE "Low-importance%articles" OR
cl.cl_to LIKE "Unknown-importance%articles" OR
cl.cl_to LIKE "Stub-Class%articles" OR
cl.cl_to LIKE "Start-Class%articles" OR
cl.cl_to LIKE "C-Class%articles" OR
cl.cl_to LIKE "B-Class%articles" OR
cl.cl_to LIKE "GA-Class%articles" OR
cl.cl_to LIKE "A-Class%articles" OR
cl.cl_to LIKE "FA-Class%articles" OR
cl.cl_to LIKE "WikiProject%";
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment