Planning for a Spaced-Repetition Learning System
Planning for an SRS project.
Markdown will be used for flashcards.
A card begins with `***`. The front / back of the card are separated with `---`.
A markdown file ("deck") may contain many cards.
A deck may begin with a heading, description, etc. before the first card.
A deck may contain any number of cards and a user may have any number of decks (in some known location, perhaps
The design goal is that users should be able to use something like https://gist.github.com to create / publish decks.
```markdown
# Jokes

Some of my favorites.

***

Why did the chicken cross the road?

---

To get to the other side!

***

What do you call a cow with no legs?

---

Ground beef!
```
A client should strive to support as much markdown rendering as possible on each card (e.g. images, links, tables, etc).
Beware of writing your own naive markdown parser! Consider the following deck. Would your parser handle this correctly?
````markdown
# Markdown flashcards

Some cards about markdown syntax.

***

What is the markup for a horizontal rule?

---

There are three ways to denote a horizontal rule:

Three hyphens:

```
---
```

Three asterisk:

```
***
```

Or three underscores:

```
___
```
````
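To make the pitfall concrete, here is a small sketch that compares a naive line-based splitter with one that tracks fenced code blocks. The function `split_cards`, its `fence_aware` flag, and the shortened `DECK` string are hypothetical names made up for this illustration. Even the fence-aware variant is only an approximation (it ignores indented code blocks, tilde fences, and longer fences), so a real markdown parser remains the safer choice.

```python
FENCE = "`" * 3  # three backticks, built this way to keep this example easy to embed


def split_cards(deck: str, fence_aware: bool) -> list:
    """Return the raw markdown of each card in a deck.

    A card starts at a line that is exactly '***'. When fence_aware is True,
    such lines are ignored while inside a fenced code block.
    """
    cards = []
    in_fence = False
    for line in deck.splitlines():
        stripped = line.strip()
        if fence_aware and stripped.startswith(FENCE):
            in_fence = not in_fence      # entering or leaving a fenced block
        if stripped == "***" and not in_fence:
            cards.append([])             # start a new card
        elif cards:
            cards[-1].append(line)       # text before the first card is skipped
    return ["\n".join(lines).strip() for lines in cards]


# A shortened version of the deck above: one card whose answer contains '***'.
DECK = "\n".join([
    "# Markdown flashcards",
    "",
    "***",
    "",
    "What is the markup for a horizontal rule?",
    "",
    "---",
    "",
    "Three hyphens:",
    "",
    FENCE, "---", FENCE,
    "",
    "Three asterisks:",
    "",
    FENCE, "***", FENCE,
])

naive_count = len(split_cards(DECK, fence_aware=False))  # miscounts: 2
aware_count = len(split_cards(DECK, fence_aware=True))   # 1
```

The same problem applies to the `---` front / back separator, which also appears inside a fenced block in this deck.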
The decision of which card to show the user next is based on the user's history of answers, so the SRS app needs to track their answers over time.
In order to track answers to questions, questions must be identifiable. One possibility would be to use the hash value of each card. Any change in the card's text would result in losing / resetting the tracking history of that card, which is perhaps a desirable property. This would also require using a markdown parser which allows access to the raw markdown of each question (so that a hash may be computed).
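A minimal sketch of such an identifier, assuming the raw markdown of a card is available as a string. The function name `card_id`, the UTF-8 encoding, and the choice not to normalize whitespace are assumptions of this sketch; MD5 is used here only as a content fingerprint, not for security.

```python
import hashlib


def card_id(raw_markdown: str) -> str:
    """Hex MD5 digest of a card's raw markdown, used as its identifier."""
    return hashlib.md5(raw_markdown.encode("utf-8")).hexdigest()


original = card_id("Why did the chicken cross the road?\n\n---\n\nTo get to the other side!")
edited = card_id("Why did the chicken cross the road?\n\n---\n\nTo get to the other side.")
# Any edit to the card text yields a different id, so its history starts over.
```

Because no normalization is applied, even a whitespace-only edit changes the identifier. Whether that is acceptable is a design decision left open here.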
We'll use something similar to the Leitner algorithm.
- For each card, we track a score and a "last-seen" timestamp.
- All unseen cards start out with a score of 0 (and a NULL timestamp).
- A correct answer increments the score, while an incorrect answer decrements the score.
- The score "truncates" when switching sign, i.e.:
  - An incorrect answer causes a positive score to "jump" to -1, and vice-versa.
  - e.g. an answer sequence of "(unseen), correct, correct, incorrect" would result in a score history of "0, 1, 2, -1".
- When a user starts a new session, the session start-time is remembered.
- The next card chosen will have the most negative score with a last-seen timestamp older than the session start-time.
- When a user answers a card, its last-seen timestamp is updated, ensuring it won't be presented again during the current session.
- When all cards are exhausted, the user will be asked if they would like to start a new session.
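The scoring rule above can be sketched as a single function. The name `update_score` is hypothetical, and the treatment of a score of exactly 0 (it moves to +1 or -1) is inferred from the example sequence rather than stated explicitly.

```python
def update_score(score: int, correct: bool) -> int:
    """Apply one answer to a card's score.

    The score moves one step in the direction of the answer, but an answer
    that contradicts the sign of the score resets it to +1 or -1.
    """
    if correct:
        return score + 1 if score >= 0 else 1
    return score - 1 if score <= 0 else -1


# Replay the example sequence: (unseen), correct, correct, incorrect.
history = [0]
for answer in (True, True, False):
    history.append(update_score(history[-1], answer))
# history is now [0, 1, 2, -1]
```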
Here is a possible database format to track the user's answers:
```sql
CREATE TABLE IF NOT EXISTS srs_stats_v1 (
    card_id   TEXT NOT NULL UNIQUE PRIMARY KEY,  -- MD5 of the card markdown
    score     INTEGER NOT NULL DEFAULT 0,        -- +1 per correct, -1 per incorrect answer
    last_seen INTEGER DEFAULT NULL               -- unix timestamp, null means "never seen"
);
```
Choosing which card to show the user next would be:
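```sql
-- Choose the next card to show to the user.
-- Never-seen cards have a NULL last_seen, so they must be matched explicitly:
-- a plain "last_seen < :session_start" is never true for NULL.
SELECT card_id, score
FROM srs_stats_v1
WHERE last_seen IS NULL OR last_seen < :session_start
ORDER BY score ASC, last_seen ASC, card_id ASC
LIMIT 1;
```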
Inserting a new card would be:
```sql
-- Insert a new card.
INSERT INTO srs_stats_v1 (card_id, score, last_seen)
VALUES (:card_id, 0, NULL);
```
Updating a card would be:
```sql
-- Update a card.
UPDATE srs_stats_v1
SET score = :score, last_seen = :last_seen
WHERE card_id = :card_id;
```
Deleting the tracking data for a card would be:
```sql
-- Delete a card.
DELETE FROM srs_stats_v1
WHERE card_id = :card_id;
```
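As a rough end-to-end check, the queries can be exercised from Python's standard `sqlite3` module against an in-memory database. This is a sketch under several assumptions: SQLite as the database engine, `INSERT OR IGNORE` (SQLite-specific) for "add if needed", hypothetical helper names `next_card` and `record_answer`, and made-up card ids and timestamps. The selection query here spells out a `last_seen IS NULL` check so that never-seen cards stay eligible, since a comparison against NULL is not true in SQL.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS srs_stats_v1 (
    card_id   TEXT NOT NULL UNIQUE PRIMARY KEY,
    score     INTEGER NOT NULL DEFAULT 0,
    last_seen INTEGER DEFAULT NULL
)
"""

NEXT_CARD = """
SELECT card_id, score FROM srs_stats_v1
WHERE last_seen IS NULL OR last_seen < :session_start
ORDER BY score ASC, last_seen ASC, card_id ASC
LIMIT 1
"""

UPDATE_CARD = """
UPDATE srs_stats_v1 SET score = :score, last_seen = :last_seen
WHERE card_id = :card_id
"""


def next_card(db, session_start):
    """Return (card_id, score) for the next due card, or None when the session is exhausted."""
    return db.execute(NEXT_CARD, {"session_start": session_start}).fetchone()


def record_answer(db, card_id, new_score, now):
    """Store the new score and mark the card as seen at time `now`."""
    db.execute(UPDATE_CARD, {"score": new_score, "last_seen": now, "card_id": card_id})


db = sqlite3.connect(":memory:")
db.execute(SCHEMA)
for cid in ("aaa", "bbb"):  # placeholder ids; real ones would be MD5 digests
    db.execute("INSERT OR IGNORE INTO srs_stats_v1 (card_id) VALUES (:card_id)",
               {"card_id": cid})

# One session: answer every due card correctly, one second after the session starts.
session_start = 1000
seen = []
while (row := next_card(db, session_start)) is not None:
    cid, score = row
    record_answer(db, cid, score + 1, now=session_start + 1)
    seen.append(cid)
```

Each card is presented once per session because answering it moves `last_seen` past the session start time.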
When starting the app / starting a new session:
- Each deck file in `~/srs/` is opened and parsed:
  - The markdown snippet of each card is stored in RAM and the MD5 hash is computed.
  - The MD5 sum is added to the database if needed.
- The `SELECT` query is used to choose the next card to show to the user.
  - If that card isn't in the list of cards we just parsed at startup, the database entry is deleted and the `SELECT` query is re-run.
- Rather than keeping each card in RAM, simply keep the card's beginning / ending file offsets in RAM, and read the card from disk on-demand.
- Rather than re-computing the MD5 sum of every card on startup, store the file offsets and MD5 sums in a cache file. If the modification timestamp of the deck file is older than the timestamp of the cache entry, then there is no need to parse the deck file.
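The last two ideas can be sketched with a freshness check and an offset-based read. The function names `deck_needs_parsing` and `read_card`, the use of byte offsets, and the demo file are assumptions of this sketch. The format of the cache file itself is left open.

```python
import os
import tempfile
from typing import Optional


def deck_needs_parsing(deck_path: str, cached_at: Optional[float]) -> bool:
    """True if there is no cache entry, or the deck is not older than the entry."""
    if cached_at is None:
        return True
    return os.path.getmtime(deck_path) >= cached_at


def read_card(deck_path: str, start: int, end: int) -> bytes:
    """Read one card's raw bytes from disk, given its byte offsets [start, end)."""
    with open(deck_path, "rb") as f:
        f.seek(start)
        return f.read(end - start)


# Demo with a throwaway deck file.
data = b"# Deck\n\n***\n\nQ?\n\n---\n\nA!\n"
with tempfile.NamedTemporaryFile(delete=False, suffix=".md") as f:
    f.write(data)
    demo_path = f.name
os.utime(demo_path, (100, 100))  # pretend the deck was last modified at t=100

reparse_with_newer_cache = deck_needs_parsing(demo_path, cached_at=200)  # False
reparse_with_older_cache = deck_needs_parsing(demo_path, cached_at=50)   # True
card = read_card(demo_path, data.index(b"***"), len(data))
os.remove(demo_path)
```

Reading in binary mode keeps the offsets unambiguous and gives the exact bytes to hash. Comparing with `>=` errs on the side of re-parsing when the two timestamps are equal.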