@chappy84
Last active April 18, 2024 16:31
Packt Pub Downloader - Quick hacky script to download all your e-books from packtpub.com. This may not always work, they may change their api calls etc.

PacktPub Downloader

This is a script that's intended for bulk downloading all of your books and / or videos listed in the "My Owned Products" section of the PacktPub account pages.

Pre-requisite

If PacktPub are currently double-checking logins using Google's Captcha (this includes the message about trying again to ensure you're not a bot), then you'll need a valid user session that can be refreshed. You can get one either by using dev-tools to read the information from the tokens XHR request / the site cookies, or by using the supplied user.js with TamperMonkey / GreaseMonkey.

Running

If they're not currently using Captcha, the PHP script is all you need, so skip to the "PHP Downloader Script" section.

TamperMonkey / GreaseMonkey Script:

If you've not already got one of these extensions installed, install TamperMonkey. Ensure you use the correct one for your browser. This should install via each browser's official extension website, to allay at least some safety concerns.

Follow the guide on how to install a user script for whichever extension you're using:

Once you've done this, refresh the PacktPub browser tab and log in if you aren't already.

Once logged in, there should now be an "Authentication Details" button in the bottom right-hand corner. Click it and you should get a box which gives you an "Access Token" and a "Refresh Token". You'll need these for the PHP downloader script.
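
Before pasting the tokens into the PHP script, it can be worth checking that the Access Token hasn't already expired. The downloader treats the access token as a JWT and reads its `exp` claim; the sketch below does the same in JavaScript. The function name `accessTokenExpiry` and the fake token are made up for illustration, and it assumes an environment that provides `atob` and `btoa` (a browser console or a recent Node.js).

```javascript
// Hypothetical helper: returns the expiry of a JWT as a Date.
// Assumes three dot-separated parts and an `exp` claim given in seconds.
function accessTokenExpiry(token) {
    const parts = token.split('.');
    if (parts.length !== 3) {
        throw new Error('Not a JWT: expected three dot-separated parts');
    }
    // JWT segments are base64url encoded, so map them back to plain base64 first
    const base64 = parts[1].replace(/-/g, '+').replace(/_/g, '/');
    const padded = base64 + '='.repeat((4 - (base64.length % 4)) % 4);
    const payload = JSON.parse(atob(padded));
    return new Date(payload.exp * 1000);
}

// Build a fake, unsigned token purely for demonstration
const fakePayload = btoa(JSON.stringify({ exp: 1700000000 }));
const fakeToken = `header.${fakePayload}.signature`;
console.log('Expires at', accessTokenExpiry(fakeToken).toISOString());
```

If the date printed for your real token is in the past, the PHP script will try to use the Refresh Token to obtain a new one.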

PHP Downloader Script

The script can be run using:

php packtPubDownloader.php

or if you want to run inside Docker:

docker build . -t chappy84/packt-pub-downloader
docker run --rm -it -v "`pwd`":/mnt chappy84/packt-pub-downloader

This will fail without authentication details. These can be entered into the script, or passed via environment variables.

Depending on which way you're authenticating you'll need to do one of the following:

  1. If you don't need to bypass the captcha...
    • Inside the script: Fill in the $emailAddress and $password variables at the top
    • Environment Vars: Set PP_EMAIL_ADDRESS and PP_PASSWORD
  2. If you do...
    • Inside the script: Fill in the $accessToken and $refreshToken variables at the top
    • Environment Vars: Set PP_ACCESS_TOKEN and PP_REFRESH_TOKEN

After which you can run it.
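
The precedence between the two options can be sketched as follows. This is only an illustrative model in JavaScript, not part of the PHP script; `chooseAuthMode` is a hypothetical name, and `env` stands in for the process environment using the variable names listed above.

```javascript
// Hypothetical sketch of which credentials end up being used.
function chooseAuthMode(env) {
    if (env.PP_EMAIL_ADDRESS && env.PP_PASSWORD) {
        return 'login'; // log in first, which yields a fresh pair of tokens
    }
    if (env.PP_ACCESS_TOKEN && env.PP_REFRESH_TOKEN) {
        return 'tokens'; // reuse the tokens copied from the browser
    }
    return 'missing'; // nothing usable was supplied, so the script stops
}

console.log(chooseAuthMode({ PP_ACCESS_TOKEN: 'aaa', PP_REFRESH_TOKEN: 'bbb' }));
```

In other words, the email address and password win when both are present, and both tokens are needed for the fallback.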

The environment vars can be set on the command line as normal, e.g. via export or by sourcing a file containing them. With Docker they can be passed to the script using docker run's --env, -e or --env-file options.

There are various options that can be passed to the script. These can either be edited in the PHP script, or passed to the script using command line arguments.

The options are:

  • --save-parent-dir - Parent dir of the ebooks and extras directories
  • --ebooks-dir - path of the ebooks directory relative to $saveParentDir
  • --extras-dir - path of the extras directory relative to $saveParentDir
  • --sleep-duration - Time to delay between page requests / different book downloads
  • --books-per-list-page - Number of books to request per list page from the PacktPub API. This can be max 25
  • --file-types-wanted - Different file types you want to download (see script head for available types)
  • --download-front-cover - Whether or not you want the book front cover downloading (if available)
  • --start-index - If set to a number this will be the first book downloaded of a range
  • --end-index - If set to a number this will be the last book downloaded of a range
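
To make the paging and range options more concrete, here is a small JavaScript model of how they appear to interact: books are counted from 1 in list order, `--start-index` and `--end-index` are both inclusive, and the list is fetched in pages of `--books-per-list-page`. `plannedDownloads` is a hypothetical helper written for this illustration; the PHP script remains the actual implementation.

```javascript
// Hypothetical model of --books-per-list-page, --start-index and --end-index.
function plannedDownloads(totalBooks, booksPerListPage, startIndex = null, endIndex = null) {
    const pages = Math.ceil(totalBooks / booksPerListPage);
    const indexes = [];
    for (let i = 1; i <= totalBooks; i++) {
        if (startIndex !== null && i < startIndex) {
            continue; // books before the start of the range are skipped
        }
        indexes.push(i);
        if (endIndex !== null && i >= endIndex) {
            break; // stop once the last book of the range has been handled
        }
    }
    return { pages, indexes };
}

// 60 owned books, 25 per page, keeping only books 3 to 7
console.log(plannedDownloads(60, 25, 3, 7));
```

Leaving both indexes unset selects every book.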

These can be passed on the command line to either the PHP script, or to the docker image, e.g.

php packtPubDownloader.php --ebooks-dir="my-ebooks"

or when using Docker:

docker run --rm -it -v "`pwd`":/mnt chappy84/packt-pub-downloader --start-index=3 --end-index=7

Dockerfile

FROM php:cli-alpine
RUN mkdir /opt/ppd /mnt/ebooks
COPY packtPubDownloader.php /opt/ppd/
RUN sed -e "s#saveParentDir = __DIR__;#saveParentDir = '/mnt';#" -i /opt/ppd/packtPubDownloader.php
VOLUME /mnt
WORKDIR /mnt
ENTRYPOINT ["/usr/local/bin/php", "/opt/ppd/packtPubDownloader.php"]

packtpub.auth-scraper.user.js

// ==UserScript==
// @name PacktPub Auth token Scraper
// @version 0.2.0
// @description Gets the access and refresh tokens for you to use with the packtpub downloader
// @match https://account.packtpub.com/*
// @require https://cdnjs.cloudflare.com/ajax/libs/js-cookie/3.0.5/js.cookie.min.js
// @downloadURL https://gist.github.com/chappy84/c3b1533cec4ac0589b2dcc318b4e6606/raw/packtpub.auth-scraper.user.js
// @updateURL https://gist.github.com/chappy84/c3b1533cec4ac0589b2dcc318b4e6606/raw/packtpub.auth-scraper.user.js
// @supportURL https://gist.github.com/chappy84/c3b1533cec4ac0589b2dcc318b4e6606
// @grant GM_setClipboard
// ==/UserScript==
(function() {
    // I'd prefer these to be constants IN the TokenStore, as that's only where they're used, but JS doesn't support that! :sadpanda:
    const TOKEN_COOKIE_ACCESS_NAME = 'access_token_live';
    const TOKEN_COOKIE_REFRESH_NAME = 'refresh_token_live';

    class TokenStore {
        get access() {
            return Cookies.get(TOKEN_COOKIE_ACCESS_NAME) || '';
        }

        set access(access) {
            this.setError('Access');
        }

        get refresh() {
            return Cookies.get(TOKEN_COOKIE_REFRESH_NAME) || '';
        }

        set refresh(refresh) {
            this.setError('Refresh');
        }

        // We're not changing PacktPub's cookies
        setError(type) {
            throw new Error(`The ${type} token is stored in PacktPub's cookies, thus won't be changed.`);
        }
    }

    // In application code, this'd be split up further into components etc, use jsx, & TS, but it's just a user script
    class TokenUI {
        constructor(store) {
            // define required class properties
            this.dialogBox = null;
            this.displayBtn = null;
            this.accessArea = null;
            this.refreshArea = null;
            this.hiddenClassHash = this.randomHash();

            // Initialise the UI
            this.initBox();
            if (this.isPolyfillRequired()) {
                // The method is called polyfillDialog, calling this.polyfill would throw a TypeError
                this.polyfillDialog(this.dialogBox);
            }
            this.initLauncher();
            this.toggleVisibility();
        }

        // Generates hashes. Used with css classes and element IDs
        randomHash() {
            return (Math.random().toString(36).substr(2) + Math.random().toString(36).substr(2)).replace(/^\d+(.+)$/, '$1');
        }

        // Checks if a polyfill is required for dialog element
        isPolyfillRequired() {
            if (window.HTMLDialogElement) {
                return false;
            }
            const dialogEl = document.createElement('dialog');
            return !dialogEl.showModal;
        }

        // Adds the polyfill script tag, then once loaded decorates the generated element
        polyfillDialog(dialogBox) {
            const polyfillScript = document.createElement('script');
            polyfillScript.src = 'https://cdnjs.cloudflare.com/ajax/libs/dialog-polyfill/0.5.4/dialog-polyfill.min.js';
            polyfillScript.addEventListener('load', () => {
                if (dialogPolyfill && dialogBox) {
                    dialogPolyfill.registerDialog(dialogBox);
                }
            });
            polyfillScript.addEventListener('error', (e) => {
                console.error('couldn\'t polyfill dialog, polyfill failed to load: ', e);
            });
            document.head.appendChild(polyfillScript);
        }

        isLoginPage(urlPath = document.location.pathname) {
            return !!urlPath.match(/^(?:\/(?:[#\?].*)?|\/login[#\?]?.*)$/);
        }

        // Toggles whether or not to display the UI, depending on if it's on the login page or not.
        toggleVisibility(urlPath = document.location.pathname) {
            if (this.displayBtn && this.dialogBox) {
                if (this.isLoginPage(urlPath)) {
                    this.displayBtn.classList.add(this.hiddenClassHash);
                    this.dialogBox.classList.add(this.hiddenClassHash);
                    this.dialogBox.close();
                    if (this.accessArea) {
                        this.accessArea.value = '';
                    }
                    if (this.refreshArea) {
                        this.refreshArea.value = '';
                    }
                } else {
                    this.displayBtn.classList.remove(this.hiddenClassHash);
                }
            }
        }

        // Adds the dialog launch button
        initLauncher() {
            if (this.displayBtn) {
                return;
            }
            const authDetHash = this.randomHash();
            this.displayBtn = document.createElement('button');
            this.displayBtn.id = authDetHash;
            this.displayBtn.textContent = 'Authentication Details';
            this.displayBtn.addEventListener('click', () => {
                if (this.dialogBox && this.dialogBox.showModal) {
                    if (this.accessArea) {
                        this.accessArea.value = store.access;
                    }
                    if (this.refreshArea) {
                        this.refreshArea.value = store.refresh;
                    }
                    this.dialogBox.classList.remove(this.hiddenClassHash);
                    this.dialogBox.showModal();
                }
            });
            this.displayBtn.classList.add(this.hiddenClassHash);
            const style = document.createElement('style');
            style.textContent = `
                #${this.displayBtn.id} {
                    font-family: Montserrat,Helvetica,sans-serif;
                    font-weight: 400;
                    font-size: 16px;
                    color: #3c3c3b;
                    background-color: #ec6611;
                    cursor: pointer;
                    border: 0;
                    padding: 10px 20px;
                    border-radius: 2px;
                    color: #fff;
                    position: fixed;
                    bottom: 50px;
                    right: 50px;
                    cursor: pointer;
                    z-index: 10;
                }
                #${this.displayBtn.id}:hover {
                    background-color: #c85808;
                }
                #${this.displayBtn.id}.${this.hiddenClassHash} {
                    display: none;
                    visibility: hidden;
                }
            `;
            document.head.appendChild(style);
            document.body.appendChild(this.displayBtn);
        }

        // Adds the dialog box to display the access and refresh token
        initBox() {
            if (this.dialogBox) {
                return;
            }
            const xBtnHash = this.randomHash();
            const headerContHash = this.randomHash();
            const copyBtnHash = this.randomHash();
            const accessAreaHash = this.randomHash();
            const refreshAreaHash = this.randomHash();
            const closeBtnHash = this.randomHash();
            const closeContHash = this.randomHash();
            this.dialogBox = document.createElement('dialog');
            this.dialogBox.id = this.randomHash();
            this.dialogBox.classList.add(this.hiddenClassHash);
            this.dialogBox.innerHTML = `
                <button type="button" id="${xBtnHash}" class="mat-icon material-icons">close</button>
                <h4>Authentication Details</h4>
                <div class="${headerContHash}">
                    <label for="${accessAreaHash}">Access Token</label>
                    <button type="button" class="${copyBtnHash}">Copy Access Token to Clipboard</button>
                </div>
                <textarea readonly="readonly" id="${accessAreaHash}"
                    placeholder="If there's no token here, the required cookies have no value, or you're logged out. Log in again."
                ></textarea>
                <div class="${headerContHash}">
                    <label for="${refreshAreaHash}">Refresh Token</label>
                    <button type="button" class="${copyBtnHash}">Copy Refresh Token to Clipboard</button>
                </div>
                <textarea readonly="readonly" id="${refreshAreaHash}"></textarea>
                <div class="${closeContHash}">
                    <button type="button" id="${closeBtnHash}">Close</button>
                </div>
            `;
            const style = document.createElement('style');
            style.textContent = `
                #${this.dialogBox.id} {
                    font-family: Montserrat,Helvetica,sans-serif;
                    font-weight: 400;
                    font-size: 16px;
                    color: #3c3c3b;
                    min-width: 50%;
                    border: 0;
                    box-shadow: 0 11px 15px -7px rgba(0,0,0,.2), 0 24px 38px 3px rgba(0,0,0,.14), 0 9px 46px 8px rgba(0,0,0,.12);
                }
                #${this.dialogBox.id}.${this.hiddenClassHash} {
                    display: none;
                    visibility: hidden;
                }
                #${xBtnHash} {
                    box-shadow: 0 3px 5px -1px rgba(0,0,0,.2), 0 6px 10px 0 rgba(0,0,0,.14), 0 1px 18px 0 rgba(0,0,0,.12);
                    border-radius: 50%;
                    border: 0;
                    width: 40px;
                    height: 40px;
                    cursor: pointer;
                    position: absolute;
                    top: 35px;
                    right: 35px;
                }
                .${closeContHash}, #${this.dialogBox.id} h4, #${this.dialogBox.id} .${headerContHash}, #${this.dialogBox.id} textarea {
                    display: block;
                    width: 95%;
                    margin: 20px auto;
                }
                #${this.dialogBox.id} h4 {
                    padding-right: 60px;
                }
                #${this.dialogBox.id} .${headerContHash} {
                    margin-top: 40px;
                }
                #${this.dialogBox.id} .${headerContHash} label {
                    margin-top: 10px;
                }
                #${this.dialogBox.id} textarea {
                    background-color: #e5eaee;
                    border: 0;
                    padding: 10px;
                }
                #${accessAreaHash} {
                    height: 18em;
                }
                #${closeBtnHash}, .${copyBtnHash} {
                    cursor: pointer;
                    border: 0;
                    padding: 10px 20px;
                    border-radius: 2px;
                }
                #${closeBtnHash} {
                    background-color: #e5eaee;
                    margin: 20px auto;
                }
                #${closeBtnHash}:hover {
                    background-color: #dbe1e5;
                }
                #${this.dialogBox.id} .${copyBtnHash} {
                    background-color: #ec6611;
                    color: #fff;
                }
                #${this.dialogBox.id} .${copyBtnHash}:hover {
                    background-color: #c85808;
                }
                #${this.dialogBox.id} .${headerContHash} {
                    display: flex;
                    justify-content: space-between;
                }
            `;
            document.head.appendChild(style);
            document.body.appendChild(this.dialogBox);
            const closeFn = () => {
                this.dialogBox.close();
                this.dialogBox.classList.add(this.hiddenClassHash);
            };
            document.getElementById(xBtnHash).addEventListener('click', closeFn);
            document.getElementById(closeBtnHash).addEventListener('click', closeFn);
            this.dialogBox.querySelectorAll(`.${copyBtnHash}`).forEach((el) => {
                el.addEventListener('click', (e) => {
                    const textArea = document.getElementById(e.target.parentNode.querySelector('label[for]').htmlFor);
                    if (textArea) {
                        GM_setClipboard(textArea.value, 'text');
                    }
                });
            });
            this.accessArea = document.getElementById(accessAreaHash);
            this.refreshArea = document.getElementById(refreshAreaHash);
        }
    }

    const store = new TokenStore();
    const ui = new TokenUI(store);

    // Clear the token values on logout, and hide the UI
    window.history.pushState = new Proxy(window.history.pushState, {
        apply: (target, thisArg, argumentsList) => {
            // pushState arguments: state, title, url
            const retVal = target.apply(thisArg, argumentsList);
            ui.toggleVisibility(argumentsList[2]); // url from arguments
            return retVal;
        }
    });
})();

packtPubDownloader.php

<?php
define('DS', DIRECTORY_SEPARATOR);
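// Check if in Docker Container
define('CGROUP_FILE', '/proc/1/cgroup');
// preg_match returns 1 on a match, 0 on no match and false on error, so compare against 1 rather than "not false"
define('IN_DOCKER', file_exists(CGROUP_FILE) && 1 === preg_match('#:/docker/#', file_get_contents(CGROUP_FILE)));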
// config values
$saveParentDir = __DIR__; // Parent dir of the ebooks and extras directories
$ebooksDir = 'ebooks'; // path of the ebooks directory relative to $saveParentDir
$extrasDir = 'ebooks' . DS . 'extras'; // path of the extras directory relative to $saveParentDir
$sleepDuration = 4; // Time to delay between page requests / different book downloads
$booksPerListPage = 25; // Book details to try requesting from the PacktPub API. This can be max 25
$fileTypesWanted = ['epub', 'mobi', 'pdf', 'code', 'video']; // Different file types from BOOK_FORMATS_URL you want to download
$downloadFrontCover = true; // Whether or not you want the book front cover downloading (if available)
$startIndex = false; // If set to a number this will be the first book downloaded of a range
$endIndex = false; // If set to a number this will be the last book downloaded of a range
// Fill in your user credentials in the quotes. If bot login is currently protected against by Captcha, make sure these are blank
$emailAddress = ''; // packt-pub username
$password = ''; // packt-pub password
// If bot login is currently protected against by Captcha, put the correct tokens in the quotes
$accessToken = ''; // Access Token. Obtain using associated user.js with TamperMonkey, or use dev-tools console
$refreshToken = ''; // Refresh Token. Obtain using associated user.js with TamperMonkey, or use dev-tools console
define('AUTH_URL', 'https://services.packtpub.com/auth-v1/users/tokens');
define('REFRESH_TOKEN_URL', 'https://services.packtpub.com/auth-v1/users/me/tokens');
define(
    'OWNED_BOOKS_URL',
    'https://services.packtpub.com/entitlements-v1/users/me/products?sort=createdAt:DESC&limit=%d&offset=%d'
);
define('BOOK_FORMATS_URL', 'https://services.packtpub.com/products-v1/products/%d/types');
define('FILE_DOWNLOAD_DETAILS_URL', 'https://services.packtpub.com/products-v1/products/%d/files/%s');
define('BOOK_SUMMARY_URL', 'https://static.packt-cdn.com/products/%d/summary');
$defaultCurlOptions = [
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_USERAGENT => 'Mozilla/5.0 (Windows NT 10.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.54 Safari/537.36',
];
// Used to clear the current output line
// pinched from symfony console: https://github.com/symfony/console/blob/master/Helper/ProgressIndicator.php#L211
define('CLEAR_LINE', "\x0D\x1B[2K");
// I might write this properly, and use symfony console someday!
echo 'Configuration Options:', PHP_EOL;
$cliOptMap = [
    'save-parent-dir:' => 'saveParentDir',
    'ebooks-dir:' => 'ebooksDir',
    'extras-dir:' => 'extrasDir',
    'sleep-duration:' => 'sleepDuration',
    'books-per-list-page:' => 'booksPerListPage',
    'file-types-wanted:' => 'fileTypesWanted',
    'download-front-cover:' => 'downloadFrontCover',
    'start-index:' => 'startIndex',
    'end-index:' => 'endIndex',
];
$cliOptValues = getopt('', array_keys($cliOptMap));
if (IN_DOCKER && !empty($cliOptValues['save-parent-dir'])) {
    echo 'Ignoring --save-parent-dir as running inside docker container. Mount a volume to /mnt instead.', PHP_EOL;
    unset($cliOptValues['save-parent-dir']);
}
foreach ($cliOptMap as $cliOptName => $varName) {
    $nameToUse = str_replace(':', '', $cliOptName);
    // isset rather than !empty, so that values such as "0" aren't silently ignored
    if (isset($cliOptValues[$nameToUse]) && '' !== $cliOptValues[$nameToUse]) {
        $valToAssign = $cliOptValues[$nameToUse];
        if ($varName == 'fileTypesWanted' && !is_array($valToAssign)) {
            $valToAssign = [$valToAssign];
        } elseif ($varName == 'downloadFrontCover') {
            // CLI values are strings, so "false" / "0" / "no" need converting to a real boolean
            $valToAssign = filter_var($valToAssign, FILTER_VALIDATE_BOOLEAN);
        }
        // I know, variable variables are awful, but it's got some protection as they're known
        // variable names, and at-least it's not PHP 4/5's register_globals! I'm just being lazy ;-P
        $$varName = $valToAssign;
    }
    echo $nameToUse, ' = ', var_export($$varName, true), PHP_EOL;
}
// These are env vars rather than arguments due to their sensitive nature. If using docker run, use --env, -e or --env-file
$envVarMap = [
    'PP_EMAIL_ADDRESS' => 'emailAddress',
    'PP_PASSWORD' => 'password',
    'PP_ACCESS_TOKEN' => 'accessToken',
    'PP_REFRESH_TOKEN' => 'refreshToken',
];
foreach ($envVarMap as $envVarName => $varName) {
    // getenv() is used rather than $_ENV, as $_ENV is empty when "E" isn't in php.ini's variables_order
    $envVarValue = getenv($envVarName);
    if (false !== $envVarValue && '' !== $envVarValue) {
        // see above note on variable variables
        $$varName = $envVarValue;
    }
    echo $varName, ' = ', var_export($$varName, true), PHP_EOL;
}
function errorAndDie($message)
{
    echo $message, PHP_EOL;
    die;
}
function getAccessTokenExpiry($accessToken)
{
    // It's a JWT, so we can easily extract it out
    list(, $tokenDataBase64, ) = explode('.', $accessToken);
    // JWT segments are base64url encoded, so swap the URL-safe characters back before decoding
    $tokenData = json_decode(base64_decode(strtr($tokenDataBase64, '-_', '+/')));
    return $tokenData->exp;
}
// Human readable format, taken from here: https://stackoverflow.com/questions/15188033/human-readable-file-size#answer-23888858
function sizeToHuman($bytes)
{
    $size = array('B', 'kB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB');
    $factor = floor((strlen($bytes) - 1) / 3);
    $dec = ($bytes > 0) ? 2 : 0;
    return sprintf("%.{$dec}f %s", $bytes / (1000 ** $factor), @$size[$factor]);
}
$getJson = null;
$refreshTokens = function() use (&$getJson, &$accessToken, &$refreshToken)
{
    if (empty($refreshToken)) {
        errorAndDie('Empty refresh token, cannot refresh access token');
    }
    $tokenInfo = $getJson(
        REFRESH_TOKEN_URL,
        'Token Refresh Failed',
        [
            CURLOPT_HTTPHEADER => ['Content-Type: application/json'],
            CURLOPT_POSTFIELDS => json_encode(['refresh' => $refreshToken]),
        ]
    );
    $accessToken = $tokenInfo->data->access;
    $refreshToken = $tokenInfo->data->refresh;
    echo 'Access Token: ', $accessToken, PHP_EOL, 'Refresh Token: ', $refreshToken, PHP_EOL;
};
$checkAccessTokenExpiry = function() use ($refreshTokens, &$accessToken)
{
    $expiryTimestamp = getAccessTokenExpiry($accessToken);
    if ($expiryTimestamp <= time()) {
        echo 'Current Access Token has expired', PHP_EOL;
        $refreshTokens();
        $expiryTimestamp = getAccessTokenExpiry($accessToken);
        if ($expiryTimestamp <= time()) {
            errorAndDie('Access Token expired, and refresh failed');
        }
        echo 'Access Token expires at ', date('Y/m/d H:i:s T', $expiryTimestamp), PHP_EOL;
    }
    return $expiryTimestamp;
};
$getJson = function ($url, $errorMessage, $extraOptions = []) use ($checkAccessTokenExpiry, $defaultCurlOptions, &$accessToken)
{
    if (!empty($accessToken)) {
        if (!in_array($url, [AUTH_URL, REFRESH_TOKEN_URL])) {
            $checkAccessTokenExpiry();
        }
        if (!isset($extraOptions[CURLOPT_HTTPHEADER])) {
            $extraOptions[CURLOPT_HTTPHEADER] = [];
        }
        $extraOptions[CURLOPT_HTTPHEADER][] = 'Authorization: Bearer ' . $accessToken;
    }
    $ch = curl_init($url);
    curl_setopt_array($ch, $defaultCurlOptions + $extraOptions);
    $response = curl_exec($ch);
    $responseCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);
    if ($responseCode !== 200) {
        // errorAndDie only accepts a single argument, so build the whole message first
        errorAndDie($errorMessage . ': ' . $responseCode);
    }
    if (null === ($decodedJson = json_decode($response))) {
        errorAndDie($errorMessage);
    }
    return $decodedJson;
};
$downloadFile = function ($url, $savePath, $errorMessage) use ($defaultCurlOptions)
{
    $fh = fopen($savePath, 'w+');
    $filesize = 0;
    $ch = curl_init($url);
    curl_setopt_array(
        $ch,
        $defaultCurlOptions + [
            CURLOPT_FOLLOWLOCATION => true,
            CURLOPT_RETURNTRANSFER => false,
            CURLOPT_FILE => $fh,
            CURLOPT_BINARYTRANSFER => true,
            CURLOPT_HEADERFUNCTION => function($ch, $header) use (&$filesize) {
                static $setFilesize = false;
                if (!$setFilesize) {
                    $headerParts = explode(':', $header, 2);
                    if (2 == count($headerParts) && 'content-length' == strtolower(trim($headerParts[0]))) {
                        $filesize = intval(trim($headerParts[1]));
                        echo ' , filesize: ', sizeToHuman($filesize), PHP_EOL;
                        $setFilesize = true;
                    }
                }
                return strlen($header);
            },
            CURLOPT_WRITEFUNCTION => function($ch, $data) use ($fh, &$filesize) {
                if ($filesize > 0) {
                    static $downloadedSize = 0;
                    $downloadedSize += strlen($data);
                    $downloadedPercentage = $downloadedSize / $filesize;
                    echo CLEAR_LINE;
                    if ($downloadedPercentage < 1) {
                        echo 'Downloaded ', number_format($downloadedPercentage * 100, 2, '.', ''), '%';
                    }
                    return fwrite($fh, $data);
                }
                return false;
            }
        ]
    );
    curl_exec($ch);
    $responseCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);
    fclose($fh);
    if ($responseCode !== 200 || !filesize($savePath)) {
        echo $errorMessage, ': ', $responseCode, PHP_EOL;
        return false;
    }
    return true;
};
if (!empty($emailAddress) && !empty($password)) {
    echo 'Logging In', PHP_EOL;
    $tokenInfo = $getJson(
        AUTH_URL,
        'Login Failed',
        [
            CURLOPT_HTTPHEADER => ['Content-Type: application/json'],
            CURLOPT_POSTFIELDS => json_encode(
                [
                    'username' => $emailAddress,
                    'password' => $password,
                ]
            ),
        ]
    );
    unset($password);
    $accessToken = $tokenInfo->data->access;
    $refreshToken = $tokenInfo->data->refresh;
    echo 'Access Token: ', $accessToken, PHP_EOL, 'Refresh Token: ', $refreshToken, PHP_EOL;
} else {
    echo 'Email or Password blank, falling back to Access & Refresh tokens', PHP_EOL;
}
if (empty($accessToken) || empty($refreshToken)) {
    errorAndDie('Missing one of the required tokens. Stopping script.');
}
echo 'Access Token expires at ', date('Y/m/d H:i:s T', getAccessTokenExpiry($accessToken)), PHP_EOL;
echo 'Sleeping for ', $sleepDuration, ' seconds', PHP_EOL;
sleep($sleepDuration);
$bookCount = 0;
echo 'Getting list of eBooks', PHP_EOL;
$booksInfo = $getJson(
    sprintf(OWNED_BOOKS_URL, $booksPerListPage, $bookCount),
    'Couldn\'t retrieve list of books'
);
$totalNumberOfBooks = $booksInfo->count;
$noOfPages = ceil($totalNumberOfBooks / $booksPerListPage);
echo 'Total number of books: ', $totalNumberOfBooks, ', Total number of pages: ', $noOfPages, PHP_EOL;
for ($pageCount = 1; $pageCount <= $noOfPages; $pageCount++) {
    if ($pageCount > 1) {
        $booksInfo = $getJson(
            sprintf(OWNED_BOOKS_URL, $booksPerListPage, $bookCount),
            'Couldn\'t retrieve list of books'
        );
    }
    $pageBooksCount = count($booksInfo->data);
    echo 'Found ', $pageBooksCount, ' books on page ', $pageCount, PHP_EOL;
    if ($pageBooksCount) {
        if (!file_exists($saveParentDir . DS . $ebooksDir)) {
            mkdir($saveParentDir . DS . $ebooksDir);
        }
        if (!file_exists($saveParentDir . DS . $extrasDir)) {
            mkdir($saveParentDir . DS . $extrasDir);
        }
        foreach ($booksInfo->data as $bookData) {
            $bookCount++;
            if ($startIndex !== false && $bookCount < $startIndex) {
                continue;
            }
            $name = $bookData->productName;
            echo $bookCount, '. Examining "', $name, '"', PHP_EOL;
            $fileName = preg_replace(['/[\<\>\:\"\/\\\|\?\*\%]+/', '/\s+/', '/[\[\]]/'], ['-', '_', ''], $name);
            $downloadFormatInfo = $getJson(
                sprintf(BOOK_FORMATS_URL, $bookData->productId),
                'Couldn\'t retrieve available book formats'
            );
            $downloadLinks = [];
            foreach ($downloadFormatInfo->data[0]->fileTypes as $fileType) {
                if (in_array($fileType, $fileTypesWanted)) {
                    $downloadLinks[$fileType] = sprintf(FILE_DOWNLOAD_DETAILS_URL, $bookData->productId, $fileType);
                }
            }
            if (0 === count($downloadLinks)) {
                echo 'No Downloadable Books / Code', PHP_EOL;
                continue;
            }
            foreach ($downloadLinks as $format => $downloadHref) {
                $downloadLinkInfo = $getJson(
                    $downloadHref,
                    'Couldn\'t retrieve book download link'
                );
                $savePath = ('code' === $format)
                    ? $saveParentDir . DS . $extrasDir . DS . $fileName . '.zip'
                    : $saveParentDir . DS . $ebooksDir . DS . $fileName . '.' . (('video' === $format) ? 'zip' : $format);
                echo 'Downloading ', $format, ' to ', $savePath;
                $downloadFile($downloadLinkInfo->data, $savePath, $format . ' download failed');
            }
            if ($downloadFrontCover) {
                $frontCoverLinkInfo = $getJson(
                    sprintf(BOOK_SUMMARY_URL, $bookData->productId),
                    'Couldn\'t retrieve book summary link'
                );
                if (!empty($frontCoverLinkInfo->coverImage)) {
                    $fileExt = preg_replace('/^.+\.([^\.]+)$/', '$1', $frontCoverLinkInfo->coverImage);
                    $savePath = $saveParentDir . DS . $extrasDir . DS . $fileName . '.' . $fileExt;
                    echo 'Downloading Front Cover to: ', $savePath;
                    $downloadFile(
                        $frontCoverLinkInfo->coverImage,
                        $savePath,
                        'Front cover download failed'
                    );
                }
            }
            if ($endIndex !== false && $bookCount >= $endIndex) {
                break 2;
            }
            echo 'Sleeping for ', $sleepDuration, ' seconds', PHP_EOL;
            sleep($sleepDuration);
        }
    }
}
@nneul

nneul commented Dec 19, 2018

No longer works due to changes in site - quick hack I put together here you can look at to see new method against their new REST endpoints which is much simpler than site parsing. https://gist.github.com/nneul/6eda98fd87a58a623b857523247f3471

@chappy84
Author

chappy84 commented Mar 9, 2019

This is now once again working after PacktPub's major site changes

@chappy84
Author

chappy84 commented Sep 7, 2022

PacktPub have just implemented some changes again. The ability to buy/own books seems to be being phased out, in favour of a subscription model to access their whole library. Their API URLs seem to be changing slightly e.g. the prefix https://services.packtpub.com/entitlements-v1 is changing to https://subscription.packtpub.com/api/entitlements.
Ultimately I wouldn't expect to be able to download all of your books using this tool forever, and eventually to have to use their website to read the books.