As specified by the authors, the books corpus needs to be downloaded from Smashwords. However, there is no easy download option; it seems that the corpus has to be scraped.
The Wikipedia dataset can be downloaded from Wikimedia, but only as an XML dump.
Hugging Face hosts both datasets, which makes them easier to acquire.
The steps are as follows: