// Run the following code in the browser console while viewing https://read.amazon.com/
// It exports a CSV file called "download", which can (and should) be renamed with a .csv extension.
// If openDatabase reports a version mismatch, replace '3' with the version named in the error.
var db = openDatabase('K4W', '3', 'thedatabase', 1024 * 1024);
var getAmazonCsv = function() {
    // Header line for the CSV export - change this if you change the fields used
    var csvData = "ASIN,Title,Authors,PurchaseDate\n";
    db.transaction(function(tx) {
        tx.executeSql('SELECT * FROM bookdata;', [], function(tx, results) {
            var len = results.rows.length;
            // Rows are zero-indexed, so start at 0 to include the first book
            for (var i = 0; i < len; i++) {
                // Get the data
                var row = results.rows.item(i);
                var asin = row.asin;
                var title = row.title;
                var authors = JSON.parse(row.authors);
                var purchaseDate = new Date(row.purchaseDate).toLocaleDateString();
                // Remove double quotes from titles so they do not interfere with the CSV double quotes
                title = title.replace(/"/g, '');
                // Concatenate the authors list - uncomment the next line to get all authors separated by ";"
                // var authorList = authors.join(';');
                // OR take only the first author - comment out the next line if you uncommented the previous one
                var authorList = authors[0];
                // Write out the CSV line
                csvData += '"' + asin + '","' + title + '","' + authorList + '","' + purchaseDate + '"\n';
            }
            // "Export" the data
            window.location = 'data:text/csv;charset=utf-8,' + encodeURIComponent(csvData);
            // Log the first row as a sample, but only if there is one
            if (len > 0) {
                console.log("Sample Row:");
                console.log(results.rows.item(0));
            }
        });
    });
};
getAmazonCsv();
Hi, great code, and it is working fine! Is it possible to add more fields, such as ISBN and Lending?
Hi all - I never ran this in the past, but after changing the database version to 5, I still only get <- undefined
after copy/pasting the code.
So I'm not sure the database is there at all for me.
But after clicking Books (so I wouldn't get Samples) and poking around in Developer Tools, I did find this:
https://read.amazon.com/kindle-library/search?query=&libraryType=BOOKS&sortType=recency&resourceType=EBOOK&querySize=50
And as I scrolled through my list of books, I found links like this:
https://read.amazon.com/kindle-library/search?query=&libraryType=BOOKS&paginationToken=47&sortType=recency&resourceType=EBOOK&querySize=50
So basically these are JSON files with all of what used to be in the database. I took the slow brute-force approach of opening those links in separate tabs, and saving them as files. If you look at the end of each file, it will have a new paginationToken, incremented by 50, so, 97, 147, 197, 247, and so on. So I simply updated the paginationToken with the next token, loaded that page, and saved. The last page will not have a paginationToken. That's how you know you're done.
Once I got the JSON files (for me there were 14 files, containing 714 books), I named them books00.json, books01.json, and so on.
I then was able to use the excellent tool, jq (https://stedolan.github.io/jq/) to convert the JSON to a CSV. I used this command:
jq -r '.itemsList[]|([.title, .authors[0], .asin, .productUrl] | @csv)' books*.json > allbooks.csv
The JSON data does not have purchaseDate, but it does have percentageRead AND a link to a picture of the book cover, which I guess is... something. (Actually in Google Sheets you can use the IMAGE(URL) macro to display the covers, so that's kind of cool.)
Anywho, here's a sample record:
{
"asin": "B002ACNYFY",
"webReaderUrl": "https://read.amazon.com/?asin=B002ACNYFY",
"productUrl": "https://m.media-amazon.com/images/I/51+7sA25W-L._SY300_.jpg",
"title": "The Pains (Mind Over Matter)",
"percentageRead": 0,
"authors": [
"Sundman, John Damien:Cheeseburger Brown:"
],
"resourceType": "EBOOK",
"originType": "PURCHASE"
},
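For anyone without jq installed, the conversion above could be approximated in JavaScript. This is only a minimal sketch, not the commenter's method: `csvField` and `booksToCsv` are hypothetical helper names, and it assumes each saved file parses to an object with an `itemsList` array of records shaped like the sample. Every field is quoted, and embedded double quotes are doubled.

```javascript
// Quote one CSV field, doubling any embedded double quotes.
// Missing values become an empty quoted field.
function csvField(value) {
  const text = value === undefined || value === null ? "" : String(value);
  return '"' + text.replace(/"/g, '""') + '"';
}

// pages: array of parsed JSON files, each assumed to look like { itemsList: [...] }.
// Emits the same columns as the jq command: title, first author, asin, productUrl.
function booksToCsv(pages) {
  const lines = [];
  for (const page of pages) {
    for (const book of page.itemsList || []) {
      const firstAuthor = (book.authors || [])[0];
      const fields = [book.title, firstAuthor, book.asin, book.productUrl];
      lines.push(fields.map(csvField).join(",") + "\n");
    }
  }
  return lines.join("");
}
```

In Node.js, each page could be loaded with `JSON.parse(fs.readFileSync("books00.json", "utf8"))` and the result of `booksToCsv` written to allbooks.csv.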
I suspect that somebody better versed in JavaScript than me could write some code to automatically download each JSON file, read the paginationToken from it, and then continue downloading the files until complete.
Actually, somebody better versed in JavaScript could probably download the JSON files, extract the data, and export it directly to CSV.
But that's not me. :-}
Hope this helps somebody until Amazon breaks it again.
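The manual procedure above (load a page, read its paginationToken, load the next page, stop when no token comes back) could be automated roughly as follows. Treat it as a sketch under assumptions rather than a tested tool: `buildPageUrl` and `fetchAllBooks` are hypothetical names, the token is assumed to be a top-level `paginationToken` field next to `itemsList`, and `credentials: "include"` is a guess at what the signed-in session needs. It is meant to be pasted into the console on read.amazon.com, and it uses a loop rather than recursion so that large libraries do not grow the call stack.

```javascript
// Endpoint and query parameters are copied from the URLs quoted above.
const BASE_URL = "https://read.amazon.com/kindle-library/search";

// Build the search URL for one page; leave token undefined for the first page.
function buildPageUrl(token) {
  const params = new URLSearchParams({
    query: "",
    libraryType: "BOOKS",
    sortType: "recency",
    resourceType: "EBOOK",
    querySize: "50",
  });
  if (token !== undefined && token !== null) {
    params.set("paginationToken", String(token));
  }
  return BASE_URL + "?" + params.toString();
}

// Fetch pages one after another until a page carries no paginationToken.
// fetchFn is injectable so the loop can be tried without touching the network.
async function fetchAllBooks(fetchFn = fetch) {
  const books = [];
  let token;
  do {
    const response = await fetchFn(buildPageUrl(token), { credentials: "include" });
    if (!response.ok) {
      throw new Error("Request failed with status " + response.status);
    }
    const page = await response.json();
    for (const book of page.itemsList || []) {
      books.push(book);
    }
    token = page.paginationToken; // assumed field name, absent on the last page
  } while (token !== undefined && token !== null);
  return books;
}
```

In the console, `fetchAllBooks().then(function (books) { console.log(books.length); });` would report how many records were collected.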
@darryllee
Hi, can you run this code?
https://gist.github.com/usayamadx/9c638d9b70bc714d6dd6043fcd54085f
I was able to write this code thanks to your helpful comment (^^)b
@usayamadx
Thanks for this. I ran the code. It retrieved the ASIN and title of 2150 of my 10,000 or so books. Then got this message: Uncaught RangeError: Maximum call stack size exceeded.
@usayamadx the code you posted worked for me! Thank you! I got most of my books. I did get an error near the end of the list: VM1374:10 Uncaught RangeError: Maximum call stack size exceeded.
at getItemsList (<anonymous>:10:7)
at XMLHttpRequest.xhr.onreadystatechange (<anonymous>:31:11) ...
Hey @usayamadx, 1) THIS IS AWESOME!, 2) Yes, I ran it. Found one bug, and had some feature request. I will comment on your gist directly.
Hi,
doesn't work for me ... I get the following error-code for using it for my library:
VM143:4 unable to open database, version mismatch, '3' does not match the currentVersion of '5'
(anonymous) @ VM143:4
VM143:4 Uncaught DOMException: Failed to execute 'openDatabase' on 'Window': unable to open database, version mismatch, '3' does not match the currentVersion of '5'
at <anonymous>:4:10
Would be nice if you could find the time to have a look ...
Best regards
R.
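One possible way around the version mismatch, offered as a sketch only: the Web SQL draft lets `openDatabase` take an empty version string, meaning "open whatever version exists". The helper name `openKindleDb` is hypothetical and not part of the gist, and it assumes the browser still ships Web SQL. If the K4W database no longer exists on the site, as other comments here suspect, this would just create an empty one.

```javascript
// Hypothetical helper: open the K4W database without hard-coding its version.
// openFn is normally window.openDatabase; it is passed in so the call can be
// exercised outside a browser.
function openKindleDb(openFn) {
  // An empty version string asks Web SQL to accept the existing version, whatever it is
  var db = openFn("K4W", "", "thedatabase", 1024 * 1024);
  console.log("Opened K4W, reported version: " + db.version);
  return db;
}
```

In the console this would replace the original `var db = ...` line with `var db = openKindleDb(window.openDatabase.bind(window));`.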
This worked for me, thank you :)
VM50:38 Uncaught DOMException: Failed to execute 'item' on 'SQLResultSetRowList': The index provided (1) is greater than or equal to the maximum bound (1).
at <anonymous>:38:32
I made a Chrome extension to solve this problem: Kindle Book List Exporter
You can now get a csv file in 1 click. No need to deal with any code or console.
It should also avoid the error of "Maximum call stack size exceeded."
Please let me know if it works for you.
@AppMakerSupreme Thanks for this. I like it. It's very easy to use. No need to download the Kindle for PC app. It shows multiple authors nicely in a single cell, separated by a comma. It shows the & symbol as-is if it appears in a book title. A few suggestions: Can you increase the book limit? I've got well over 10,000 books, and that seems to be the limit right now. Also, the KindleSyncMetadataCache.xml file has more information than the extension provides. Can you include it, for example publication date and purchase date? Great work.
Thank you @AppMakerSupreme! The extension works great for me. Thanks for doing this work and sharing it with us!
@johnnew2019 sorry I don't know how to increase the book limit. I haven't had time to look further into this issue.
I included all the info I can get from the amazon url I used. Maybe that's all we can get with this method.
@AppMakerSupreme No problem, it's still an excellent extension that many people will find useful.
@johnnew2019 @heathkat May I know what you do with your list of Kindle books? I want to understand the need better so when I have time I can develop this project further.
I see a lot of people wanting a csv/excel sheet of their books, but I'm not sure why. I own many Kindle books too but I don't have this need.
@heathkat @david-bakin Thank you for your feedback. I get your use case better now.
@david-bakin that's some hardcore project there, good luck!
I've got a lot of books and will get more. I like to keep track of authors, series, and titles. Filtering in Excel is the best way to easily collect all that info. Hope that helps. The extension is pretty good already; any extra features would be a bonus.
I use the CLZ Books app to catalog all the books I own (eBooks, pBooks, audiobooks). This script lets me export them into a format for importing into that database.
Several use cases for me:
- personal library catalog, able to be imported into other library programs (calibre, for instance). Your csv is a good first step towards this. Amazon's own system isn't always intuitive for a "scanning your shelves for something to read" experience
- as a base for search. I own digital books in several platforms, a total of thousands, and it becomes a pain when considering buying a new book, to know whether I already own it. Requires login to each of the web platforms, manual search, etc. Have been considering how to write a programmatic solution for this for years. None of the services offer APIs for this use case, though, so I'm stuck with parsing a signed-in account via JS.
This sounds like exactly what I have been looking for. A version that works with https://read.amazon.co.uk/kindle-library rather than .com would be great.
One of the key things I want to do is tag and group the books I haven't read yet. Some would be best read in a certain order.
Ah yeah, that must be it exactly -- I probably last successfully ran it in mid-Mar so that version is cached, while the current version of the read.kindle.com site uses a new database. Hmph! Well, this tool was so great/genius while it lasted!!