last edited on 2020-05-04 (please see edit history for changes)
See the results here: https://guzey.com/science/sleep/14-day-sleep-deprivation-self-experiment/
This is the protocol for my sleep experiment, which I will start on 2020-04-03. I will sleep 4 hours per night for 2 weeks and evaluate the effects on my cognition using three tasks: the psychomotor vigilance task (PVT; essentially a reaction-time test), the SAT (a 3-hour test that involves reading and math), and a custom Aimgod scenario I called guzey_arena_0 (video). Aimgod is a first-person shooter trainer game that allows the creation of custom scenarios; the scenario I created requires constant attention, eye-hand coordination, and tactical thinking.
(inspired by Van Dongen et al 2003 https://academic.oup.com/sleep/article/26/2/117/2709164)
I'm going to remain inside throughout the experiment. My wife will monitor my behavior between 07:30 and 23:30 every day. I will not be monitored between 23:30 and 03:30, so during that time I will check in in a Google Sheet every 15 minutes to confirm that I'm not sleeping. I will work, play video games, watch movies, browse the internet, read, and walk around the apartment, but will not engage in any vigorous activities. I will avoid direct sunlight and will turn off all lights during sleep times. I did not use any caffeine, alcohol, tobacco, or medications in the 2 weeks before the experiment. I do not have any medical, psychiatric, or sleep-related disorders, aside from occasionally experiencing stress-related strain in my chest, and I will write down any unusual symptoms I experience during the experiment. I have not done regular night work or rotating shift work in the past 2 years, and I have not travelled across time zones in the 3 months before the experiment.
Structure of the experiment
Days -7 to -1 (2020-03-27 to 2020-04-02): adaptation.
I will give myself a sleep opportunity of 8 hours (23:30-07:30) for 7 days prior to the experiment to make sure I don't carry over any sleep debt. My average sleep duration on these days (2020-03-27 to 2020-04-02) was INPUT, with a standard deviation of INPUT.
Day -1 (2020-04-02): practice
I will give myself a sleep opportunity of 8 hours (23:30-07:30). I will practice the tasks I'm going to use in order to mitigate the most extreme practice effects during the experiment. I will perform all tasks as follows:
- PVT at 07:30, 13:30, and 19:30.
- SAT at 07:40
- Aimgod guzey_arena_0 at 12:30-12:50, 15:30-15:50, 18:30-18:50 and 21:30-21:50 (I also played about 2 hours of guzey_arena_0 on 2020-03-31 and 2020-04-01 while designing the experiment)
Days 1 (2020-04-03), 19 (2020-04-21), 20 (2020-04-22): control.
I will give myself a sleep opportunity of 8 hours (23:30-07:30). I will perform all tasks on these days as follows:
- PVT at 07:40, 13:00, 13:20, 13:40, 16:00, 16:20, 16:40, 19:00, 19:20, 19:40, 22:00, 22:20, 22:40
- SAT at 07:50
- Aimgod: 11 sessions each at 12:00, 18:00, and 23:00
Days 2 to 12 (2020-04-04 to 2020-04-14): sleep deprivation
I will give myself a sleep opportunity of 4 hours (03:30-07:30). I will not test myself on days 2 to 12.
Days 13-15 (2020-04-15 to 2020-04-17): sleep deprivation, testing
I will perform all tasks on days 13, 14, 15 as follows:
- PVT at 07:40, 13:00, 13:20, 13:40, 16:00, 16:20, 16:40, 19:00, 19:20, 19:40, 22:00, 22:20, 22:40 (on day 15 I'm going to bed at 22:30, so the last three PVT tests will be at 21:00, 21:20, and 21:40 instead of 22:00, 22:20, and 22:40)
- SAT at 07:50
- Aimgod: 11 sessions each at 12:00, 18:00, and 00:00 (on day 15 I'm going to bed at 22:30, so the last Aimgod test will be at 22:00)
Days 16-18 (2020-04-18 to 2020-04-20): recovery
I will give myself a sleep opportunity of 9 hours (22:30-07:30). I will not test myself on days 16-18.
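As a rough cross-check of the schedule above, the per-condition sample counts can be tallied as in the sketch below. The variable names are illustrative only, and the tally assumes that every scheduled test on the three control days (1, 19, 20) and the three sleep-deprived testing days (13, 14, 15) is actually completed.

```python
# Sketch: tally planned samples per condition from the testing schedule.
# All names here are illustrative; the numbers come from the schedule above.

# PVT start times on a testing day (on day 15 the last three shift one hour
# earlier, but the count per day stays the same).
PVT_TIMES = [
    "07:40",
    "13:00", "13:20", "13:40",
    "16:00", "16:20", "16:40",
    "19:00", "19:20", "19:40",
    "22:00", "22:20", "22:40",
]

AIMGOD_SLOTS_PER_DAY = 3      # three Aimgod time slots per testing day
AIMGOD_SESSIONS_PER_SLOT = 11

TEST_DAYS = {
    "control": [1, 19, 20],
    "experimental": [13, 14, 15],
}

planned = {}
for condition, days in TEST_DAYS.items():
    planned[condition] = {
        "pvt": len(PVT_TIMES) * len(days),
        "aimgod": AIMGOD_SLOTS_PER_DAY * AIMGOD_SESSIONS_PER_SLOT * len(days),
    }

for condition, counts in planned.items():
    print(condition, counts)
```

Both conditions come out to 39 PVT tests and 99 Aimgod sessions, which are the sample sizes used in the power calculation.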
Rationale behind sample sizes and data analysis
I will test two statistical hypotheses in the study:
- number of lapses on PVT (defined as response time >500ms) is different between control and experimental conditions
- time to finish the scenario (i.e. time to kill 20 bots) in guzey_arena_0 is different between control and experimental conditions
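To make the first outcome measure concrete, here is a minimal sketch of counting lapses in one PVT session using the >500ms definition above. The function name and the reaction times are hypothetical placeholders, not data from the experiment or the output format of any particular PVT tool.

```python
# Sketch: count PVT lapses, defined in the protocol as response time > 500 ms.
LAPSE_THRESHOLD_MS = 500


def count_lapses(reaction_times_ms):
    """Return how many responses in a session were slower than the threshold."""
    return sum(1 for rt in reaction_times_ms if rt > LAPSE_THRESHOLD_MS)


# Hypothetical reaction times (ms) for a single session, for illustration only.
example_session = [245, 310, 512, 298, 750, 500, 333]
example_lapses = count_lapses(example_session)
print(example_lapses)  # 512 and 750 exceed the threshold; 500 does not
```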
Effect sizes for the number of PVT lapses under sleep deprivation have previously been reported to be huge. In particular, Basner and Dinges 2011 report:
Effect sizes were high for both TSD (1.59–1.94) and PSD (0.88–1.21) for PVT metrics related to lapses and to measures of psychomotor speed
Here TSD is total sleep deprivation and PSD is partial sleep deprivation, in which participants were allowed to sleep for 4 hours a night for 5 days. Studies show that the effect increases monotonically with the length of sleep deprivation (e.g. Van Dongen et al 2003), so I expect that the effect of sleeping for 4 hours a night for 2 weeks will be larger than this.
To be on the safe side, I'm going to collect enough PVT samples (n=39 control; n=39 experimental) to detect an effect size of 0.8 with 97.5% confidence level and 89.5% power.
I will collect enough Aimgod samples (n=99 control; n=99 experimental) to detect an effect size of 0.5 (medium-size effect) with 97.5% confidence level and 89.5% power.
Note that the 97.5% confidence level is the standard 95% confidence level Bonferroni-corrected for 2 comparisons, and that 89.5% power per test ensures that we achieve 80% joint power (the probability of correctly rejecting both null hypotheses if they're both false).
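The numbers above can be sanity-checked with the sketch below. It is not necessarily the calculation used to plan the study. It assumes the effect sizes are standardized mean differences, uses a normal approximation to the two-sample test so that it needs only the standard library, and treats the two tests as independent when multiplying their powers. An exact calculation based on the noncentral t distribution would give slightly lower power than this approximation.

```python
# Sketch: approximate power of a two-sided, two-sample comparison of means,
# using a normal approximation (an assumption made to avoid third-party
# dependencies; it slightly overstates power relative to the t-test).
from math import sqrt
from statistics import NormalDist


def approx_power(effect_size, n_per_group, alpha):
    """Approximate power for equal group sizes and a two-sided test."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = effect_size * sqrt(n_per_group / 2)
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


alpha = 0.05 / 2  # Bonferroni correction for 2 comparisons

pvt_power = approx_power(0.8, 39, alpha)
aimgod_power = approx_power(0.5, 99, alpha)

# Joint power if each test has 89.5% power and the tests are independent.
joint_power = 0.895 ** 2

print(f"PVT power    ~ {pvt_power:.3f}")
print(f"Aimgod power ~ {aimgod_power:.3f}")
print(f"Joint power  ~ {joint_power:.3f}")
```

Both approximate powers land at roughly 0.90, close to the 89.5% quoted above, and 0.895 squared is about 0.80.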
I expect that practice effects will accumulate monotonically with repeated performance of the SAT and Aimgod, so I'm trying to equalize the average practice effects between control and experimental conditions (control: day 1 (low practice), day 19 (medium practice), day 20 (high practice); experimental: day 13 (medium practice), day 14 (medium practice), day 15 (medium practice)).
PVT is resistant to practice effects.
I will perform two two-tailed t-tests, one per hypothesis, to compare means at alpha=0.025.
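A minimal sketch of such a comparison is below. It computes the pooled-variance two-sample t statistic with the standard library and, as a stated shortcut, approximates the two-tailed p-value with a normal distribution. That is reasonable at the planned sample sizes (39 and 99 per condition) but not for the tiny placeholder lists used here. An exact p-value would use the t distribution with n1 + n2 - 2 degrees of freedom (for example via scipy.stats.ttest_ind). The data values are hypothetical placeholders, not results.

```python
# Sketch: two-sample comparison of means at the per-test alpha of 0.025.
from math import sqrt
from statistics import NormalDist, mean, variance

ALPHA = 0.025  # per-test threshold from the protocol


def pooled_t(control, experimental):
    """Pooled-variance t statistic for experimental minus control."""
    n1, n2 = len(control), len(experimental)
    pooled_var = (
        (n1 - 1) * variance(control) + (n2 - 1) * variance(experimental)
    ) / (n1 + n2 - 2)
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(experimental) - mean(control)) / standard_error


def approx_two_tailed_p(t_stat):
    """Normal approximation to the two-tailed p-value (assumption, see above)."""
    return 2 * (1 - NormalDist().cdf(abs(t_stat)))


# Hypothetical lapse counts per PVT session, for illustration only.
control = [1, 0, 2, 1, 0, 1]
experimental = [3, 4, 2, 5, 3, 4]

t_stat = pooled_t(control, experimental)
p_value = approx_two_tailed_p(t_stat)
reject_null = p_value < ALPHA

print(f"t = {t_stat:.2f}, approximate p = {p_value:.2g}, reject: {reject_null}")
```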
I'm taking the SAT out of curiosity; I will collect very few samples and will not perform any statistical analysis on them.
Pre-registering my hypotheses
As Krause et al 2017 write in their review for Nature Reviews Neuroscience:
One cognitive ability that is especially susceptible to sleep loss is attention, which serves ongoing goal-directed behaviour. Performance on attentional tasks deteriorates in a dose-dependent manner with the amount of accrued time awake, owing to increasing sleep pressure. The prototypic impairments on such tasks are known as 'lapses' or 'microsleeps', which involve response failures that reflect errors of omission. More specifically, attentional maintenance becomes highly variable and erratic (with attention being sustained, lost, re-established, then lost again), resulting in unstable task performance.
My key hypothesis is that the number of lapses on PVT increases because it's a boring as hell task: it's 10 minutes of just staring at the computer, waiting for the red dot to appear periodically. So I think that sleep-deprived people are just falling asleep during the task but are otherwise functioning normally, and their cognitive function is not compromised. Therefore, I expect that my PVT performance will deteriorate but my Aimgod performance will not, since Aimgod is an interesting task, even though it primarily relies on attention for performance.
I will record the data and will not look at it until after the data collection is over, to try to minimize the effect of my expectations on my performance. I will be able to partially see my performance level in Aimgod while playing, but I will try not to pay attention to it.
Why are you not testing memory, executive function, mental rotation, creative reasoning...?
The more hypotheses I have, the more samples I need to collect for each hypothesis in order to maintain the same false positive probability (https://en.wikipedia.org/wiki/Multiple_comparisons_problem). This is an n=1 study, and I'm barely collecting enough samples to measure medium-to-large effects and will spend 10 hours performing PVT. I'm not in a position to test many hypotheses at once.