@jkubecki
Last active June 1, 2024 22:51
Amazon Kindle Export
// The following code should be run in the browser console while viewing the page https://read.amazon.com/
// It will export a CSV file called "download", which can (and should) be renamed with a .csv extension
// If openDatabase fails with a version mismatch, replace '3' with the version named in the error message
var db = openDatabase('K4W', '3', 'thedatabase', 1024 * 1024);
var getAmazonCsv = function() {
  // Set header for CSV export line - change this if you change the fields used
  var csvData = "ASIN,Title,Authors,PurchaseDate\n";
  db.transaction(function(tx) {
    tx.executeSql('SELECT * FROM bookdata;', [], function(tx, results) {
      var len = results.rows.length;
      // Row indexes are zero-based, so start at 0 to include the first book
      for (var i = 0; i < len; i++) {
        // Get the data
        var row = results.rows.item(i);
        var asin = row.asin;
        var title = row.title;
        var authors = JSON.parse(row.authors);
        var purchaseDate = new Date(row.purchaseDate).toLocaleDateString();
        // Remove double quotes from titles to not interfere with CSV double-quotes
        title = title.replace(/"/g, '');
        // Concatenate the authors list - uncomment the next line to get all authors separated by ";"
        // var authorList = authors.join(';');
        // OR Take only first author - comment the next line if you uncommented the previous one
        var authorList = authors[0];
        // Write out the CSV line
        csvData += '"' + asin + '","' + title + '","' + authorList + '","' + purchaseDate + '"\n';
      }
      // "Export" the data
      window.location = 'data:text/csv;charset=utf8,' + encodeURIComponent(csvData);
      // Log the first row as a sample, but only if the table is not empty
      if (len > 0) {
        console.log("Sample Row:");
        console.log(results.rows.item(0));
      }
    });
  });
};
getAmazonCsv();
@Vladyak

Vladyak commented Jun 27, 2021

Hi, great code and is working fine! Is it possible to extend number of fields such as ISBN and Lending?

@darryllee

Hi all - I never ran this in the past, but after changing the database version to 5, I still only get <- undefined after copy/pasting the code.

So I'm not sure the database is there at all for me.

But after clicking Books (so I wouldn't get Samples) and poking around in Developer Tools, I did find this:
https://read.amazon.com/kindle-library/search?query=&libraryType=BOOKS&sortType=recency&resourceType=EBOOK&querySize=50

And as I scrolled through my list of books, I found links like this:
https://read.amazon.com/kindle-library/search?query=&libraryType=BOOKS&paginationToken=47&sortType=recency&resourceType=EBOOK&querySize=50

So basically these are JSON files with all of what used to be in the database. I took the slow brute-force approach of opening those links in separate tabs, and saving them as files. If you look at the end of each file, it will have a new paginationToken, incremented by 50, so, 97, 147, 197, 247, and so on. So I simply updated the paginationToken with the next token, loaded that page, and saved. The last page will not have a paginationToken. That's how you know you're done.

Once I got the JSON files (for me there were 14 files, containing 714 books), I named them books00.json, books01.json, and so on.

I then was able to use the excellent tool, jq (https://stedolan.github.io/jq/) to convert the JSON to a CSV. I used this command:

jq -r '.itemsList[]|([.title, .authors[0], .asin, .productUrl] | @csv)' books*.json > allbooks.csv

The JSON data does not have purchaseDate, but it does have percentageRead AND a link to a picture of the book cover, which I guess is... something. (Actually in Google Sheets you can use the IMAGE(URL) macro to display the covers, so that's kind of cool.)

Anywho, here's a sample record:

{
      "asin": "B002ACNYFY",
      "webReaderUrl": "https://read.amazon.com/?asin=B002ACNYFY",
      "productUrl": "https://m.media-amazon.com/images/I/51+7sA25W-L._SY300_.jpg",
      "title": "The Pains (Mind Over Matter)",
      "percentageRead": 0,
      "authors": [
        "Sundman, John Damien:Cheeseburger Brown:"
      ],
      "resourceType": "EBOOK",
      "originType": "PURCHASE"
    },

I suspect that somebody better versed in Javascript than me could write some code to automatically download each JSON file, and read the paginationToken from that, and then continue downloading the files until complete.

Actually somebody better versed in Javascript could probably download the JSON files, extract the data and export it directly to CSV.

But that's not me. :-}

Hope this helps somebody until Amazon breaks it again.
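
The manual steps described above (load a page, read its paginationToken, load the next page, stop when no token comes back, then convert to CSV) could be sketched in JavaScript roughly as follows. This is only a sketch under assumptions: the URL and the `itemsList` / `paginationToken` field names are taken from this comment, while the function names (`pageUrl`, `csvField`, `toCsv`, `fetchAllBooks`, `defaultFetchJson`) are made up for illustration, and the endpoint may change or behave differently for other accounts. It uses a plain loop rather than recursion, so the number of pages does not add to the call stack.

```javascript
// Sketch only: endpoint and field names come from the comment above;
// all function names here are hypothetical.
const BASE_URL = 'https://read.amazon.com/kindle-library/search';

// Build the URL for one page. The first page has no paginationToken.
function pageUrl(token) {
  const params = new URLSearchParams({
    query: '',
    libraryType: 'BOOKS',
    sortType: 'recency',
    resourceType: 'EBOOK',
    querySize: '50',
  });
  if (token !== undefined && token !== null) {
    params.set('paginationToken', String(token));
  }
  return BASE_URL + '?' + params.toString();
}

// Quote one CSV field, doubling any embedded double quotes.
function csvField(value) {
  return '"' + String(value ?? '').replace(/"/g, '""') + '"';
}

// Same columns as the jq command: title, first author, asin, productUrl.
function toCsv(items) {
  const header = 'Title,Author,ASIN,ProductUrl\n';
  const rows = items.map((book) =>
    [book.title, (book.authors || [])[0], book.asin, book.productUrl]
      .map(csvField)
      .join(',')
  );
  return header + rows.map((row) => row + '\n').join('');
}

// Default loader: assumes it runs in the console of a signed-in
// read.amazon.com tab, so the browser sends the session cookies.
function defaultFetchJson(url) {
  return fetch(url, { credentials: 'include' }).then((res) => res.json());
}

// Follow paginationToken from page to page until a page has none.
async function fetchAllBooks(fetchJson = defaultFetchJson) {
  const items = [];
  let token;
  do {
    const page = await fetchJson(pageUrl(token));
    items.push(...(page.itemsList || []));
    token = page.paginationToken;
  } while (token !== undefined && token !== null);
  return items;
}
```

To try it, paste the block into the console and then run `fetchAllBooks().then((items) => console.log(toCsv(items)))`, and copy the printed text into a .csv file.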

@usayamadx

@darryllee
Hi, can you run this code?
https://gist.github.com/usayamadx/9c638d9b70bc714d6dd6043fcd54085f

I was able to write this code thanks to your good comment (^^)b

@johnnew2019

@usayamadx
Thanks for this. I ran the code. It retrieved the ASIN and title of 2150 of my 10,000 or so books. Then got this message: Uncaught RangeError: Maximum call stack size exceeded.

@heathkat

heathkat commented Aug 17, 2021

@usayamadx the code you posted worked for me! Thank you! I got most of my books. I did get an error near the end of the list: VM1374:10 Uncaught RangeError: Maximum call stack size exceeded.
at getItemsList (:10:7)
at XMLHttpRequest.xhr.onreadystatechange (:31:11) ...

@darryllee

Hey @usayamadx, 1) THIS IS AWESOME!, 2) Yes, I ran it. I found one bug and have some feature requests. I will comment on your gist directly.

@rgla

rgla commented Apr 25, 2022

Hi,

It doesn't work for me. I get the following error when using it on my library:
VM143:4 unable to open database, version mismatch, '3' does not match the currentVersion of '5'
(anonymous) @ VM143:4
VM143:4 Uncaught DOMException: Failed to execute 'openDatabase' on 'Window': unable to open database, version mismatch, '3' does not match the currentVersion of '5'
at :4:10

Would be nice if you could find the time to have a look ...

Best regards
R.

@onimaru

onimaru commented Aug 16, 2022

@darryllee Hi, can you run this code? https://gist.github.com/usayamadx/9c638d9b70bc714d6dd6043fcd54085f

I was able to write this code by your good comment (^^)b

This worked for me, thank you :)

@csells

csells commented Nov 27, 2022

VM50:38 Uncaught DOMException: Failed to execute 'item' on 'SQLResultSetRowList': The index provided (1) is greater than or equal to the maximum bound (1).
at :38:32

@AppMakerSupreme

I made a Chrome extension to solve this problem: Kindle Book List Exporter

You can now get a csv file in 1 click. No need to deal with any code or console.

It should also avoid the error of "Maximum call stack size exceeded."

Please let me know if it works for you.

@heathkat
@johnnew2019
@darryllee

@johnnew2019

@AppMakerSupreme Thanks for this. I like it. It's very easy to use. No need to download the Kindle for PC app. It shows multiple authors nicely in a single cell, separated by a comma. It shows the & symbol as-is if it appears in a book title. A few suggestions: Can you increase the book limit? I've got well over 10,000 books, and that seems to be the limit right now. Also, the KindleSyncMetadataCache.xml file has more information than the extension, for example publication date and purchase date. Can you include that? Great work.

@heathkat

Thank you @AppMakerSupreme! The extension works great for me. Thank you for doing this work and sharing it with us!

@AppMakerSupreme

AppMakerSupreme commented Oct 19, 2023

@johnnew2019 sorry I don't know how to increase the book limit. I haven't had time to look further into this issue.

I included all the info I can get from the amazon url I used. Maybe that's all we can get with this method.

@johnnew2019

@AppMakerSupreme No problem, it's still an excellent extension that many people will find useful.

@AppMakerSupreme

@johnnew2019 @heathkat May I know what you do with your list of Kindle books? I want to understand the need better so when I have time I can develop this project further.

I see a lot of people wanting a csv/excel sheet of their books, but I'm not sure why. I own many Kindle books too but I don't have this need.

@heathkat

heathkat commented Nov 16, 2023 via email

@david-bakin

david-bakin commented Nov 18, 2023 via email

@AppMakerSupreme

@heathkat @david-bakin Thank you for your feedback. I get your use case better now.
@david-bakin that's some hardcore project there, good luck!

@johnnew2019

I've got a lot of books and will get more. I like to keep track of authors, series, and titles. Filtering in Excel is the easiest way to collect all that info. Hope that helps. The extension is pretty good already; any extra features would be a bonus.

@RobertBernstein

I use the CLZ Books app to catalog all the books I own (eBooks, pBooks, audiobooks). This script lets me export them into a format for importing into that database.

@SlyJackHammer

@johnnew2019 @heathkat May I know what you do with your list of Kindle books? I want to understand the need better so when I have time I can develop this project further.

I see a lot of people wanting a csv/excel sheet of their books, but I'm not sure why. I own many Kindle books too but I don't have this need.

Several use cases for me:

  • personal library catalog, able to be imported into other library programs (calibre, for instance). Your csv is a good first step towards this. Amazon's own system isn't always intuitive for a "scanning your shelves for something to read" experience
  • as a base for search. I own digital books in several platforms, a total of thousands, and it becomes a pain when considering buying a new book, to know whether I already own it. Requires login to each of the web platforms, manual search, etc. Have been considering how to write a programmatic solution for this for years. None of the services offer APIs for this use case, though, so I'm stuck with parsing a signed-in account via JS.

@david-bakin

david-bakin commented Nov 22, 2023 via email

@tuforoiff

This sounds like exactly what I have been looking for. A version that works with https://read.amazon.co.uk/kindle-library rather than .com would be great.

One of the key things I want to do is tag and group the books I haven't read yet. Some would be best read in a certain order.
