Skip to content

Instantly share code, notes, and snippets.

@whymarrh
Created March 21, 2021 02:05
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save whymarrh/b2255aa7dcdce1b688cee3f23d64971e to your computer and use it in GitHub Desktop.
Save whymarrh/b2255aa7dcdce1b688cee3f23d64971e to your computer and use it in GitHub Desktop.
Select duplicate clues from the J! Archive
-- Using the db from https://github.com/whymarrh/jeopardy-parser
WITH duplicates(clue, occurrences) AS (
SELECT d.clue, COUNT(d.clue) AS occurrences
FROM documents AS d
WHERE d.clue NOT IN (
'=', '...',
'(missing clue)',
'(audio clue)', '[audio clue]', '[audio]',
'[Instrumental theme plays]', '[instrumental]',
'[Music plays]', '[theme music]',
'[video clue]',
'[sports logo]',
'[flag]',
'[State outline]'
)
GROUP BY clue
HAVING occurrences > 1
ORDER BY occurrences DESC
)
SELECT a.airdate, cats.category, d.clue, d.answer
FROM clues c
JOIN airdates a ON a.game = c.game
JOIN documents d ON d.id = c.id
JOIN classifications classes on c.id = classes.clue_id
JOIN categories cats on cats.id = classes.category_id
JOIN duplicates dupes ON dupes.clue = d.clue
WHERE d.clue IN (
SELECT dupes.clue
FROM duplicates AS dupes
WHERE dupes.occurrences > 1
ORDER BY dupes.occurrences DESC
)
ORDER BY dupes.occurrences DESC, d.clue, a.airdate
;
@whymarrh
Copy link
Author

The dependencies for the now-outdated jeopardy-parser:

apt-get install \
build-essential \
python-minimal \
python-pip \
python-dev \
libxml2-dev \
libxslt-dev \
zlib1g-dev

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment