CALISHOT is a specialised search engine for unearthing books on open Calibre servers.
It lets you run full-text searches for ebooks across those servers, or browse the database by facets: authors, language, year, series, tags, and so on. You can even run your own SQL queries.
These servers often go up and down so, for now, the data are regularly updated and new snapshots are posted on ... you know, the first rule of the club.
Here is a list of regular mirrors (keep in mind that they are not always online):
English books:
Non-English books:
Well... It's not so glorious (on our side). This is just a simple SQLite db indexed in full text with this powerful extension.
The web UI and the server are powered by an awesome project that can serve a db as is: Datasette.
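To give an idea of what "indexed in full text" means, here is a minimal sketch using only Python's standard library. It assumes the index is an FTS5 virtual table named summary_fts sitting next to a summary table; the three columns and the sample rows are made up for illustration, and the real db has more fields. It also assumes your Python's SQLite build ships with FTS5, which is usually the case.

```python
import sqlite3

# In-memory stand-in for the real db (hypothetical, simplified schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE summary (title TEXT, authors TEXT, formats TEXT)")
conn.executemany(
    "INSERT INTO summary (title, authors, formats) VALUES (?, ?, ?)",
    [
        ("I, Robot", "Isaac Asimov", "epub"),
        ("Foundation", "Isaac Asimov", "mobi"),
        ("The Caves of Steel", "Isaac Asimov", "epub pdf"),
    ],
)

# External-content FTS5 table: it stores only the index and reads the
# text back from the summary table through the shared rowid.
conn.execute(
    "CREATE VIRTUAL TABLE summary_fts "
    "USING fts5(title, authors, formats, content='summary')"
)
conn.execute(
    "INSERT INTO summary_fts (rowid, title, authors, formats) "
    "SELECT rowid, title, authors, formats FROM summary"
)

# Full-text query, joined back to the source table.
hits = [
    row[0]
    for row in conn.execute(
        "SELECT summary.title FROM summary_fts "
        "JOIN summary ON summary.rowid = summary_fts.rowid "
        "WHERE summary_fts MATCH ? ORDER BY summary.title",
        ("robot",),
    )
]
print(hits)
```

The default tokenizer does no stemming, so searching for "robot" will not match "robots" and vice versa.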
Again, join the club, and download the db!
Then, just use your favorite SQLite client. We strongly advise you to use another gem from this first-class fighter: sqlite-utils (pipx is your friend).
Let's say we want Asimov's works:
sqlite-utils index-eng.db --json-cols 'select title, authors, year from summary where instr(authors, "Asimov") >0 order by title'
Not only can you run regular SQL queries, you can also run full-text queries, and you get your JSON dataset for free:
sqlite-utils index-eng.db --json-cols 'select * from summary_fts where title match("robots")'
sqlite-utils index-eng.db --json-cols 'select * from summary_fts where summary_fts match("robots")'
sqlite-utils index-eng.db --json-cols 'select * from summary_fts where summary_fts match("title:robots AND formats:epub")'
Or, more simply, with recent versions of sqlite-utils:
sqlite-utils search --json-cols index-eng.db summary "robots"
sqlite-utils search --json-cols index-eng.db summary "title:robots AND formats:epub"
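If you would rather query from a script, the same kind of full-text search can be sketched with Python's built-in sqlite3 module. This is only an illustration: it assumes summary_fts is an FTS5 table with title and formats columns, it builds a tiny in-memory table with made-up rows so that it runs on its own, and search is a hypothetical helper name. Point sqlite3.connect at your downloaded db file to try it for real.

```python
import sqlite3

def search(conn, query, limit=20):
    """Run a raw FTS5 query and return (title, formats) rows, best match first."""
    sql = (
        "SELECT title, formats FROM summary_fts "
        "WHERE summary_fts MATCH ? ORDER BY rank LIMIT ?"
    )
    return conn.execute(sql, (query, limit)).fetchall()

# Tiny in-memory stand-in for the real index (hypothetical rows).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE summary_fts USING fts5(title, authors, formats)")
conn.executemany(
    "INSERT INTO summary_fts (title, authors, formats) VALUES (?, ?, ?)",
    [
        ("I, Robot", "Isaac Asimov", "epub"),
        ("The Robots of Dawn", "Isaac Asimov", "mobi"),
        ("Robots and Empire", "Isaac Asimov", "epub"),
    ],
)

# Column filters use the column:term form; boolean operators such as
# AND, OR and NOT are only recognised in upper case.
epub_only = search(conn, "title:robots AND formats:epub")
any_format = search(conn, "robots")
print(epub_only)
print(len(any_format))
```

Passing the query as a bound parameter (the ? placeholder) avoids the shell and SQL quoting games of the one-liners.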
Prefer CSV? No worries (jq is also your friend):
sqlite-utils index-eng.db 'select title, authors, year from summary where instr(authors, "Asimov") >0 order by authors limit 100' --json-cols | jq -r '.[] | [.title.label, .authors[0], .year] | @csv'
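If jq is not at hand, the same flattening can be sketched in plain Python. The shape of the columns is inferred from the jq expression above (title is a JSON object with a label key, authors is a JSON array); rows_to_csv is a hypothetical helper and the sample row, including its href value, is made up.

```python
import csv
import io
import json

def rows_to_csv(rows):
    """Turn (title_json, authors_json, year) rows into CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    for title_json, authors_json, year in rows:
        title = json.loads(title_json)      # assumed: {"label": ..., ...}
        authors = json.loads(authors_json)  # assumed: ["First Author", ...]
        first_author = authors[0] if authors else ""
        writer.writerow([title.get("label", ""), first_author, year])
    return buf.getvalue()

# Made-up row in the assumed shape; real rows would come from a query on summary.
sample = [
    ('{"label": "I, Robot", "href": "http://example.invalid/book/1"}',
     '["Isaac Asimov"]',
     1950),
]
print(rows_to_csv(sample), end="")
```

The csv module takes care of quoting, so a title containing a comma stays in a single field.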
- Install datasette and its plugins in a virtual environment with pip:
python -m venv calishot
. ./calishot/bin/activate
pip install datasette
pip install datasette-json-html
pip install datasette-pretty-json
- Prepare the calishot settings:
Move the sqlite db file to the same directory and then:
cat <<EOF > metadata.json
{
  "databases": {
    "index": {
      "tables": {
        "summary": {
          "sort": "title",
          "searchmode": "raw"
        }
      }
    }
  }
}
EOF
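One thing to watch: the key under "databases" has to match the name Datasette gives the database, which is the file name without its extension ("index" for index.db). If your file is named differently, a small script can generate a matching metadata file. This is a sketch only; build_metadata is a hypothetical helper and index-eng.db is just the file name used in the earlier examples.

```python
import json
from pathlib import Path

def build_metadata(db_path):
    """Return Datasette metadata keyed on the db file name without extension."""
    db_name = Path(db_path).stem
    return {
        "databases": {
            db_name: {
                "tables": {
                    "summary": {"sort": "title", "searchmode": "raw"}
                }
            }
        }
    }

metadata = build_metadata("index-eng.db")
# Redirect this output to metadata.json to use it with datasette.
print(json.dumps(metadata, indent=2))
```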
You can now run a local test:
datasette serve index.db --config sql_time_limit_ms:10000 --config allow_download:off --config max_returned_rows:2000 --config num_sql_threads:10 --metadata metadata.json
Open your browser to http://localhost:8001/ and check the result.
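Besides the browser, the local instance can be checked from a script through Datasette's JSON API. The sketch below assumes the server from the command above is running on localhost:8001, that the database is named index, and that Datasette has detected the full-text index on the summary table (otherwise _search has no effect). build_search_url and fetch_rows are hypothetical helper names; only the URL is built when the snippet runs, so it does not need a live server.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8001"  # assumed local test instance

def build_search_url(query, database="index", table="summary", size=10):
    """Build the table JSON URL with Datasette's _search, _shape and _size parameters."""
    params = urlencode({"_search": query, "_shape": "array", "_size": size})
    return f"{BASE_URL}/{database}/{table}.json?{params}"

def fetch_rows(query):
    """Fetch matching rows as a list of dicts (needs the server to be up)."""
    with urlopen(build_search_url(query)) as response:
        return json.load(response)

url = build_search_url("title:robots AND formats:epub")
print(url)
```

Call fetch_rows("robots") once the server is up to get the rows back as Python dictionaries.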
Install heroku-cli, then:
heroku login -i
datasette publish heroku index.db -n calishot-3 --install=datasette-json-html --install=datasette-pretty-json --extra-options="--config sql_time_limit_ms:10000 --config allow_download:off --config num_sql_threads:10 --config max_returned_rows:500" --metadata metadata.json
First, thank you for your labor of love!
I love your front end in Datasette. I would really like to get a copy of your sqlite db. Per above, I think you have it accessible somewhere, but I can't seem to find it. Please advise!