@riceissa
Last active December 15, 2023 09:36
my current understanding of Anki's spacing algorithm
"""
This is my understanding of the Anki scheduling algorithm, which I mostly
got from watching https://www.youtube.com/watch?v=lz60qTP2Gx0
and https://www.youtube.com/watch?v=1XaJjbCSXT0
and from reading
https://faqs.ankiweb.net/what-spaced-repetition-algorithm.html

There is also https://github.com/dae/anki/blob/master/anki/sched.py but I find
it really hard to understand.

Things I don't bother to implement here: the random fudge factor (that Anki
uses to decorrelate cards that were added on the same day and have the same
responses throughout their history), leech tracking, checking if a card from
the same note has been reviewed already that day, delay in response (i.e. I
assume all cards are reviewed exactly on the day they are due).

Update (2023-12-15): Please note that the Anki review algorithm has possibly
changed in many ways since the time when I wrote this program (although I
believe that Anki still uses SM2 by default, so the basic concepts should
still be the same as what is shown below). I have sadly not had the time
or energy to keep up with the latest changes. In particular, Anki now
supports FSRS as an alternative to the SM2 algorithm (which is the algorithm
below); FSRS is not covered at all below.
"""

# "New Cards" tab
NEW_STEPS = [1, 10]  # in minutes
GRADUATING_INTERVAL = 1  # in days
EASY_INTERVAL = 4  # in days
STARTING_EASE = 250  # in percent

# "Reviews" tab
EASY_BONUS = 130  # in percent
INTERVAL_MODIFIER = 100  # in percent
MAXIMUM_INTERVAL = 36500  # in days

# "Lapses" tab
LAPSES_STEPS = [10]  # in minutes
NEW_INTERVAL = 70  # in percent
MINIMUM_INTERVAL = 1  # in days


class Card:
    def __init__(self):
        self.status = 'learning'  # can be 'learning', 'learned', or 'relearning'
        self.steps_index = 0
        self.ease_factor = STARTING_EASE
        self.interval = None

    def __repr__(self):
        return "Card[%s; steps_idx=%s; ease=%s; interval=%s]" % (self.status,
                                                                 self.steps_index,
                                                                 self.ease_factor,
                                                                 str(self.interval))


def schedule(card, response):
    '''response is one of "again", "hard", "good", or "easy"
    returns a result in days'''
    if card.status == 'learning':
        # for learning cards, there is no "hard" response possible
        if response == "again":
            card.steps_index = 0
            return minutes_to_days(NEW_STEPS[card.steps_index])
        elif response == "good":
            card.steps_index += 1
            if card.steps_index < len(NEW_STEPS):
                return minutes_to_days(NEW_STEPS[card.steps_index])
            else:
                # we have graduated!
                card.status = 'learned'
                card.interval = GRADUATING_INTERVAL
                return card.interval
        elif response == "easy":
            card.status = 'learned'
            card.interval = EASY_INTERVAL
            return EASY_INTERVAL
        else:
            raise ValueError("you can't press this button / we don't know how to deal with this case")
    elif card.status == 'learned':
        # In the success branches below, the stored interval itself is capped
        # at MAXIMUM_INTERVAL so that it cannot keep growing past the maximum.
        if response == "again":
            card.status = 'relearning'
            card.steps_index = 0
            card.ease_factor = max(130, card.ease_factor - 20)
            card.interval = max(MINIMUM_INTERVAL, card.interval * NEW_INTERVAL/100)
            return minutes_to_days(LAPSES_STEPS[0])
        elif response == "hard":
            card.ease_factor = max(130, card.ease_factor - 15)
            card.interval = min(MAXIMUM_INTERVAL,
                                card.interval * 1.2 * INTERVAL_MODIFIER/100)
            return card.interval
        elif response == "good":
            card.interval = min(MAXIMUM_INTERVAL,
                                card.interval * card.ease_factor/100
                                * INTERVAL_MODIFIER/100)
            return card.interval
        elif response == "easy":
            card.ease_factor += 15
            card.interval = min(MAXIMUM_INTERVAL,
                                card.interval * card.ease_factor/100
                                * INTERVAL_MODIFIER/100 * EASY_BONUS/100)
            return card.interval
        else:
            raise ValueError("you can't press this button / we don't know how to deal with this case")
    elif card.status == 'relearning':
        if response == "again":
            card.steps_index = 0
            return minutes_to_days(LAPSES_STEPS[0])
        elif response == "good":
            card.steps_index += 1
            if card.steps_index < len(LAPSES_STEPS):
                return minutes_to_days(LAPSES_STEPS[card.steps_index])
            else:
                # we have re-graduated!
                card.status = 'learned'
                # we don't modify the interval here because that was already done when
                # going from 'learned' to 'relearning'
                return card.interval
        else:
            raise ValueError("you can't press this button / we don't know how to deal with this case")


def minutes_to_days(minutes):
    return minutes / (60 * 24)


def human_friendly_time(days):
    if not days:
        return days
    if days < 1:
        return str(round(days * 24 * 60, 2)) + " minutes"
    elif days < 30:
        return str(round(days, 2)) + " days"
    elif days < 365:
        return str(round(days / (365.25 / 12), 2)) + " months"
    else:
        return str(round(days / 365.25, 2)) + " years"


card1 = Card()
# responses = ["good", "good", "good", "again", "good", "good", "good"]
responses = ["good"] * 10
for r in responses:
    print(str(card1) + " [%s]" % r, end="→ ")
    t = schedule(card1, r)
    print(human_friendly_time(t), card1)
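
As a rough cross-check of the run above: once a card has graduated, every "good" answer multiplies the interval by the ease factor (and the interval modifier), so with the default settings the interval should grow geometrically. The sketch below restates that as a closed form. The helper name `interval_after_goods` is hypothetical (it is not part of the script above), and it assumes the ease stays at its starting value, which holds only when "good" is the sole answer given.

```python
# Hypothetical helper, not part of the script above: closed-form interval
# for a freshly graduated card that only ever receives "good" answers.
GRADUATING_INTERVAL = 1   # in days
STARTING_EASE = 250       # in percent
INTERVAL_MODIFIER = 100   # in percent
MAXIMUM_INTERVAL = 36500  # in days


def interval_after_goods(n_reviews):
    """Interval in days after n_reviews "good" answers following graduation."""
    factor = (STARTING_EASE / 100) * (INTERVAL_MODIFIER / 100)
    return min(MAXIMUM_INTERVAL, GRADUATING_INTERVAL * factor ** n_reviews)


for n in range(4):
    print(n, interval_after_goods(n))  # 1.0, 2.5, 6.25, 15.625 days
```

In the ten-"good" run above, the first two answers are spent on the learning steps, so the final interval corresponds to `interval_after_goods(8)`, roughly 1526 days.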
@ilbonte

ilbonte commented Apr 17, 2020

On line 74 I think you mean status, not state

@riceissa
Author

@ilbonte Thanks, fixed.

@ilbonte

ilbonte commented Apr 20, 2020

Also, there is something odd: INTERVAL_MODIFIER is always set to 100 and is always divided by 100.

@riceissa
Author

Also, there is something odd: INTERVAL_MODIFIER is always set to 100 and is always divided by 100.

I think that's the intended behavior (i.e. do absolutely nothing by default). From the Anki docs:

Interval modifier allows you to apply a multiplication factor to the intervals Anki generates. At its default of 100% it does nothing; if you set it to 80% for example, intervals will be generated at 80% of their normal size (so a 10 day interval would become 8 days). You can thus use the multiplier to make Anki present cards more or less frequently than it would otherwise, trading study time for retention or vice versa.
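
To make the quoted example concrete, here is a minimal sketch of the modifier as a plain percentage multiplier. The function name `apply_interval_modifier` is made up for illustration and is not an Anki API; it only mirrors how `INTERVAL_MODIFIER/100` is used in the script.

```python
def apply_interval_modifier(interval_days, modifier_percent=100):
    """Scale an interval (in days) by a percentage; 100 leaves it unchanged."""
    return interval_days * modifier_percent / 100


print(apply_interval_modifier(10))      # 10.0 (default: no effect)
print(apply_interval_modifier(10, 80))  # 8.0  (the 80% example from the docs)
```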

@ilbonte

ilbonte commented Apr 22, 2020

Thanks for the explanation, I missed that :)

@ctrngk

ctrngk commented Sep 22, 2020

Thank you for your code. It's awesome.

By the way, how do you calculate the retention rate?
It is somehow related to mature cards (>21 days) and other things.
Suppose we have 100 mature cards and we review them all on the next day (the 22nd): 50% wrong, 50% good. What are the retention rate and the true retention rate?
Some related sources that I am still confused by:
https://www.youtube.com/watch?v=kOj2xLTX_sY
https://www.reddit.com/r/Anki/comments/9jwosj/calculating_the_ideal_retention_rate_an/

@riceissa
Author

@ctrngk If you have 100 cards and tend to get 90% of them correct if you review them when each card is due (rather than all on the next day), then the (ordinary) retention rate is 0.9 (if you go to Anki stats, this is the number you find in the "Answer Buttons" section where it says "Correct: X%"), and the forgetting index (FI) is 0.1. The true retention is -FI/log(1-FI) ≈ 0.95, where log is the natural logarithm. As Matt explains in the video you link to, the true retention accounts for the fact that reviewing when a card is due is "unfair" (because even if you fail you would have remembered it for possibly most of the time period between the review session where you got it right and the review session where you got it wrong).

If you review all of your cards on a single day, then that is the true retention. (The true retention is just measuring "if you randomly stopped me on the street and asked me to review a random Anki card in my collection, what is the probability I get it right?" So by reviewing all cards at once we simulate this experiment.)

If for some magical reason all of your cards just happened to be due on the same day, then on the day the cards are due your true retention would equal your ordinary retention. But in real life your cards are never all due on the same day, so your true retention is higher.
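
The arithmetic above can be checked with a few lines of Python. This is only a sketch of the formula as stated in this comment; the function name `true_retention` is hypothetical, and the log is taken to be the natural logarithm, which is the reading that reproduces the 0.95 figure.

```python
import math


def true_retention(forgetting_index):
    """Compute -FI / ln(1 - FI) for a forgetting index strictly between 0 and 1."""
    if not 0 < forgetting_index < 1:
        raise ValueError("forgetting index must be strictly between 0 and 1")
    return -forgetting_index / math.log(1 - forgetting_index)


print(round(true_retention(0.1), 2))  # 0.95
```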

@nejedlypetr

Lines 77-79:
the Anki manual says "the current interval is multiplied by the value of new interval", but I have no idea what the "new interval" is
At first I was confused by this as well, but I think that the new interval is already defined on line 31:
NEW_INTERVAL = 70 # in percent

Other than that, great job @riceissa, thanks a lot.

@riceissa
Author

@nejedlypetr Ah ok great! It looks like I was applying the NEW_INTERVAL when going from 'relearning' back to 'learned', whereas the Anki manual says to apply it already when going from 'learned' to 'relearning'. I've fixed this now so it works more like what the Anki manual says.
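
For reference, the lapse step as it now works in the script can be sketched in isolation like this. The helper name `lapse` is made up for illustration; the constants and the ease penalty of 20 percentage points with a floor of 130 are the ones the script uses, and this remains my reading of the manual rather than Anki's actual code.

```python
NEW_INTERVAL = 70     # in percent
MINIMUM_INTERVAL = 1  # in days


def lapse(interval_days, ease_percent):
    """Return (new_interval_days, new_ease_percent) after pressing "again"
    on a learned card, i.e. at the moment it enters relearning."""
    new_ease = max(130, ease_percent - 20)
    new_interval = max(MINIMUM_INTERVAL, interval_days * NEW_INTERVAL / 100)
    return new_interval, new_ease


print(lapse(10, 250))  # (7.0, 230)
print(lapse(1, 130))   # (1, 130): both floors apply
```

When the card later re-graduates from relearning, it simply resumes with the interval computed here.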

@grandinquisitor

there's a spelling inconsistency: LAPSE_STEPS vs LAPSES_STEPS

@riceissa
Author

@grandinquisitor Thanks, fixed!

@L-M-Sherlock

I developed a new spacing algorithm for Anki. Maybe you will be interested in it: https://github.com/open-spaced-repetition/fsrs4anki

@riceissa
Author

@L-M-Sherlock I saw that Anki now supports FSRS by default. I've sadly not had any time to look into FSRS or to use it. I've added a note at the top of the script mentioning this.
