@cellularmitosis cellularmitosis/README.md
Last active Oct 15, 2019

Planning for a Spaced-Repetition Learning System

Blog 2019/8/14

Planning for an SRS project.

Flashcard format

Markdown will be used for flashcards.

A card begins with ***. The front / back of the card are separated with ---.

A markdown file ("deck") may contain many cards.

A deck may begin with a heading, description, etc. before the first card.

A user may have any number of decks, stored in some known location (perhaps ~/srs/).

The design goal is that users should be able to use something like https://gist.github.com to create / publish decks.

Example deck:

# Jokes
Some of my favorites.

***
Why did the chicken cross the road?
---
To get to the other side!

***
What do you call a cow with no legs?
---
Ground beef!

A client should strive to support as much markdown rendering as possible on each card (e.g. images, links, tables, etc.).

Beware of writing your own naive markdown parser! Consider the following deck. Would your parser handle this correctly?

# Markdown flashcards
Some cards about markdown syntax.

***
What is the markup for a horizontal rule?
---
There are three ways to denote a horizontal rule:

Three hyphens:
```
---
```

Three asterisks:
```
***
```

Or three underscores:
```
___
```
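
One way to cope with a deck like this is to track whether the current line is inside a fenced code block and to honor the `***` and `---` markers only outside of one. The following is a minimal sketch under that assumption, not a full markdown parser: `parse_deck` is a hypothetical helper, it only understands backtick fences, and it treats any `---` after the first one in a card as ordinary content.

```python
FENCE = "`" * 3  # a markdown code fence, built this way to avoid a literal fence here

def parse_deck(text):
    """Split deck markdown into (front, back) pairs.

    Card markers that appear inside fenced code blocks are kept as card content.
    """
    cards = []
    front, back = None, None  # line buffers; front stays None until the first card
    in_fence = False

    def finish():
        cards.append(("\n".join(front).strip(), "\n".join(back or []).strip()))

    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith(FENCE):
            in_fence = not in_fence  # opening or closing fence line
        elif not in_fence and stripped == "***":
            if front is not None:
                finish()  # close the previous card
            front, back = [], None
            continue
        elif not in_fence and stripped == "---" and front is not None and back is None:
            back = []  # first separator in this card: switch to the back
            continue
        if front is None:
            continue  # deck preamble (heading, description) before the first card
        (back if back is not None else front).append(line)
    if front is not None:
        finish()
    return cards

jokes = (
    "# Jokes\nSome of my favorites.\n\n"
    "***\nWhy did the chicken cross the road?\n---\nTo get to the other side!\n\n"
    "***\nWhat do you call a cow with no legs?\n---\nGround beef!\n"
)
for front_text, back_text in parse_deck(jokes):
    print(front_text, "->", back_text)
```

A real client would more likely lean on an existing markdown parser; this sketch only illustrates why the fence state matters.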

Tracking

The decision of which card to show the user next is based on the user's history of answers, so the SRS app needs to track their answers over time.

In order to track answers to questions, questions must be identifiable. One possibility is to use a hash of each card's raw markdown as its identifier. Any change to the card's text would then reset the tracking history of that card, which is perhaps a desirable property. This also requires a markdown parser which allows access to the raw markdown of each card (so that a hash may be computed).
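
As an illustration, a card identifier could be derived like this. It is a sketch that assumes MD5 over the UTF-8 bytes of the raw markdown with no whitespace normalization; `card_id` is a hypothetical helper name.

```python
import hashlib

def card_id(raw_markdown: str) -> str:
    """Return the MD5 hex digest of a card's raw markdown (UTF-8 encoded)."""
    return hashlib.md5(raw_markdown.encode("utf-8")).hexdigest()

original = "Why did the chicken cross the road?\n---\nTo get to the other side!"
edited = original.replace("other side", "other side of the road")

print(card_id(original))
print(card_id(edited))  # a different id, so the edited card starts with fresh history
```

Because no normalization is applied, even a whitespace-only edit produces a new identifier. Whether that is acceptable is a design choice.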

Algorithm

We'll use something similar to the Leitner algorithm.

  • For each card, we track a score and a "last-seen" timestamp.
  • All unseen cards start out with a score of 0 (and a NULL timestamp).
  • A correct answer increments the score, while an incorrect answer decrements the score.
  • The score "truncates" when switching sign, i.e.:
    • An incorrect answer causes a positive score to "jump" to -1, and vice-versa.
    • e.g. an answer sequence of "(unseen), correct, correct, incorrect" would result in a score history of "0, 1, 2, -1".
  • When a user starts a new session, the session start-time is remembered.
  • The next card chosen will have the most negative score among cards that are unseen or have a last-seen timestamp older than the session start-time.
  • When a user answers a card, its last-seen timestamp is updated, ensuring it won't be presented again during the current session.
  • When all cards are exhausted, the user will be asked if they would like to start a new session.
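
The scoring and selection rules above can be sketched in a few lines. The function names and the in-memory `stats` dictionary (card id mapped to a `(score, last_seen)` pair) are assumptions made for illustration, and ties between equal scores are broken by card id, which the rules above do not specify.

```python
def update_score(score, correct):
    """Increment on a correct answer, decrement on an incorrect one,
    jumping straight to +1 or -1 when the sign flips."""
    if correct:
        return score + 1 if score >= 0 else 1
    return score - 1 if score <= 0 else -1

def pick_next(stats, session_start):
    """Return the id of the lowest-scoring card not yet seen this session,
    or None when every card has been shown."""
    due = [
        (score, cid)
        for cid, (score, last_seen) in stats.items()
        if last_seen is None or last_seen < session_start
    ]
    return min(due)[1] if due else None

# Replay the example sequence: (unseen), correct, correct, incorrect.
score = 0
history = [score]
for answer in (True, True, False):
    score = update_score(score, answer)
    history.append(score)
print(history)
```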

Database format

Here is a possible database format to track the user's answers:

CREATE TABLE IF NOT EXISTS srs_stats_v1 (
    card_id TEXT NOT NULL UNIQUE PRIMARY KEY,  -- MD5 of the card markdown
    score INTEGER NOT NULL DEFAULT 0,  -- +1 per correct, -1 per incorrect answer; jumps to +1 / -1 when the sign flips
    last_seen INTEGER DEFAULT NULL  -- unix timestamp, null means "never seen"
);

Choosing which card to show the user next would be:

-- Choose the next card to show to the user.
SELECT card_id, score
FROM srs_stats_v1
WHERE last_seen IS NULL OR last_seen < :session_start  -- NULL means "never seen"
ORDER BY score ASC, last_seen ASC, card_id ASC
LIMIT 1;
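
Here is a small sketch of running that selection from Python's standard `sqlite3` module against an in-memory database. The sample rows and the `next_card` helper are made up for illustration. The query carries an explicit `last_seen IS NULL` test because an SQL comparison against NULL never matches, so never-seen cards would otherwise be skipped.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS srs_stats_v1 (
    card_id TEXT NOT NULL UNIQUE PRIMARY KEY,
    score INTEGER NOT NULL DEFAULT 0,
    last_seen INTEGER DEFAULT NULL
);
"""

NEXT_CARD = """
SELECT card_id, score
FROM srs_stats_v1
WHERE last_seen IS NULL OR last_seen < :session_start
ORDER BY score ASC, last_seen ASC, card_id ASC
LIMIT 1;
"""

def next_card(db, session_start):
    """Return (card_id, score) for the next card, or None if the session is exhausted."""
    return db.execute(NEXT_CARD, {"session_start": session_start}).fetchone()

db = sqlite3.connect(":memory:")
db.execute(SCHEMA)
db.executemany(
    "INSERT INTO srs_stats_v1 (card_id, score, last_seen) VALUES (?, ?, ?)",
    [("aaa", 2, 100), ("bbb", -1, 100), ("ccc", 0, None), ("ddd", -5, 250)],
)

print(next_card(db, 200))  # "ddd" was already answered in this session (250 >= 200)
print(next_card(db, 50))   # only the never-seen card is eligible
```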

Inserting a new card would be:

-- Insert a new card.
INSERT INTO srs_stats_v1 (card_id, score, last_seen)
VALUES (:card_id, 0, NULL);

Updating a card would be:

-- Update a card.
UPDATE srs_stats_v1
SET score = :score, last_seen = :last_seen
WHERE card_id = :card_id;
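
To show how the update might be driven by an answer, here is a sketch that reads the current score, applies the sign-flip rule from the Algorithm section, and writes the result back with a fresh timestamp. `record_answer` and `update_score` are hypothetical helper names, and the `now` parameter exists only to make the example deterministic.

```python
import sqlite3
import time

def update_score(score, correct):
    """Increment or decrement, jumping to +1 or -1 when the sign flips."""
    if correct:
        return score + 1 if score >= 0 else 1
    return score - 1 if score <= 0 else -1

def record_answer(db, card_id, correct, now=None):
    """Apply one answer to a tracked card, stamp it as seen, return the new score."""
    now = int(time.time()) if now is None else now
    row = db.execute(
        "SELECT score FROM srs_stats_v1 WHERE card_id = :card_id",
        {"card_id": card_id},
    ).fetchone()
    if row is None:
        raise KeyError(card_id)  # the card was never inserted
    new_score = update_score(row[0], correct)
    db.execute(
        "UPDATE srs_stats_v1 SET score = :score, last_seen = :last_seen "
        "WHERE card_id = :card_id",
        {"score": new_score, "last_seen": now, "card_id": card_id},
    )
    db.commit()
    return new_score

# Demo against an in-memory database with a single unseen card.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE srs_stats_v1 ("
    "card_id TEXT NOT NULL PRIMARY KEY, "
    "score INTEGER NOT NULL DEFAULT 0, "
    "last_seen INTEGER DEFAULT NULL)"
)
db.execute("INSERT INTO srs_stats_v1 (card_id, score, last_seen) VALUES ('abc', 0, NULL)")
scores = [
    record_answer(db, "abc", ok, now=1000 + i)
    for i, ok in enumerate([True, True, False])
]
print(scores)
```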

Deleting the tracking data for a card would be:

-- Delete a card.
DELETE FROM srs_stats_v1
WHERE card_id = :card_id;

App startup

When starting the app / starting a new session:

  • Each deck file in ~/srs/ is opened and parsed:
    • The markdown snippet of each card is stored in RAM and its MD5 hash is computed.
    • The MD5 sum is added to the database if it is not already present.
  • A SELECT query is used to choose the next card to show to the user.
  • If that card isn't in the list of cards we just parsed at startup (e.g. it was edited or removed from its deck), its database entry is deleted and the SELECT query is re-run.
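
The startup steps above might look roughly like this once the decks have been parsed and hashed. This is a sketch under assumptions: `sync_and_pick` is a hypothetical helper, `card_ids` stands for the set of MD5 sums computed at startup, and `INSERT OR IGNORE` is SQLite-specific syntax used here for "add if needed".

```python
import sqlite3

NEXT_CARD = """
SELECT card_id
FROM srs_stats_v1
WHERE last_seen IS NULL OR last_seen < :session_start
ORDER BY score ASC, last_seen ASC, card_id ASC
LIMIT 1;
"""

def sync_and_pick(db, card_ids, session_start):
    """Register newly parsed cards, then return the id of the next card to show.

    Rows whose card no longer exists in any deck are deleted as they come up.
    Returns None when no card is left for this session.
    """
    db.executemany(
        "INSERT OR IGNORE INTO srs_stats_v1 (card_id, score, last_seen) "
        "VALUES (?, 0, NULL)",
        [(cid,) for cid in sorted(card_ids)],
    )
    while True:
        row = db.execute(NEXT_CARD, {"session_start": session_start}).fetchone()
        if row is None:
            return None
        if row[0] in card_ids:
            return row[0]
        # Stale entry: the card was edited or removed, so drop it and retry.
        db.execute("DELETE FROM srs_stats_v1 WHERE card_id = ?", (row[0],))

# Demo: one stale row left over from an earlier version of a deck.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE srs_stats_v1 ("
    "card_id TEXT NOT NULL PRIMARY KEY, "
    "score INTEGER NOT NULL DEFAULT 0, "
    "last_seen INTEGER DEFAULT NULL)"
)
db.execute("INSERT INTO srs_stats_v1 VALUES ('stale', -9, NULL)")
chosen = sync_and_pick(db, {"a", "b"}, session_start=200)
print(chosen)
```

The loop terminates because every pass either returns or deletes one row.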

Optimization opportunities:

  • Rather than keeping each card in RAM, simply keep the card's beginning / ending file offsets in RAM, and read the card from disk on-demand.
  • Rather than re-computing the MD5 sum of every card on startup, store the file offsets and MD5 sums in a cache file. If the modification timestamp of the deck file is older than the timestamp of the cache entry, then there is no need to parse the deck file.
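
Both optimizations can be sketched together. The cache layout below (a dictionary keyed by deck path, holding a `cached_at` timestamp and per-card `md5` / `start` / `end` byte offsets) is an assumption made for illustration, as are the helper names. A deck whose modification time equals the cache timestamp is conservatively treated as stale.

```python
import hashlib
import os
import tempfile

def read_card(deck_path, start, end):
    """Read one card's raw markdown from disk using byte offsets [start, end)."""
    with open(deck_path, "rb") as f:
        f.seek(start)
        return f.read(end - start).decode("utf-8")

def cached_cards(cache, deck_path):
    """Return the cached card list for a deck, or None if it must be re-parsed."""
    entry = cache.get(deck_path)
    if entry is None or os.path.getmtime(deck_path) >= entry["cached_at"]:
        return None
    return entry["cards"]

# Demo with a throwaway deck file containing a single card.
with tempfile.TemporaryDirectory() as tmp:
    deck_path = os.path.join(tmp, "jokes.md")
    content = b"# Jokes\n\n***\nQ1\n---\nA1\n"
    with open(deck_path, "wb") as f:
        f.write(content)

    start, end = content.index(b"***"), len(content)
    mtime = os.path.getmtime(deck_path)
    cache = {
        deck_path: {
            "cached_at": mtime + 1,  # cache written after the deck was last modified
            "cards": [
                {"md5": hashlib.md5(content[start:end]).hexdigest(),
                 "start": start, "end": end}
            ],
        }
    }

    fresh = cached_cards(cache, deck_path)  # usable: no need to parse the deck
    first_card = read_card(deck_path, fresh[0]["start"], fresh[0]["end"])

    cache[deck_path]["cached_at"] = mtime - 1  # pretend the deck changed afterwards
    stale = cached_cards(cache, deck_path)  # None: the deck must be parsed again
```

The dictionary holds only strings and numbers, so it could be persisted as JSON between runs.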