@bollwyvl
Last active April 2, 2022 18:28
Accessing JupyterLite contents from pyolite
{"metadata":{"language_info":{"codemirror_mode":{"name":"python","version":3},"file_extension":".py","mimetype":"text/x-python","name":"python","nbconvert_exporter":"python","pygments_lexer":"ipython3","version":"3.8"},"kernelspec":{"name":"python","display_name":"Pyolite","language":"python"}},"nbformat_minor":4,"nbformat":4,"cells":[{"cell_type":"markdown","source":"# Using JupyterLite IndexedDB Storage\n\n> Big thanks to `@konwiddak` who [unraveled the first part](https://github.com/jupyterlite/jupyterlite/discussions/91#discussioncomment-1135504) of this!\n\nIf available, the JupyterLite \"Server\" will store its contents in the browser's [IndexedDB](https://developer.mozilla.org/en-US/docs/Web/API/IndexedDB_API#see_also). This API is available to WebWorkers, where `pyolite` kernels run.","metadata":{}},{"cell_type":"code","source":"import asyncio, js, io, pandas, IPython","metadata":{"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"The name of the database is hard-coded.","metadata":{}},{"cell_type":"code","source":"DB_NAME = \"JupyterLite Storage\"","metadata":{"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"As the various APIs are event-driven, we use the `async` and `await` keywords with a `Queue` to unwrap the lifecycle.","metadata":{}},{"cell_type":"code","source":"async def get_contents(path):\n    \"\"\"use the IndexedDB API to access JupyterLite's in-browser (for now) storage\n\n    for documentation purposes, the full names of the JS API objects are used.\n\n    see https://developer.mozilla.org/en-US/docs/Web/API/IDBRequest\n    \"\"\"\n    # we only ever expect one result, either an error _or_ success\n    queue = asyncio.Queue(1)\n\n    IDBOpenDBRequest = js.self.indexedDB.open(DB_NAME)\n    IDBOpenDBRequest.onsuccess = IDBOpenDBRequest.onerror = queue.put_nowait\n\n    await queue.get()\n\n    if IDBOpenDBRequest.result is None:\n        return None\n\n    IDBTransaction = IDBOpenDBRequest.result.transaction(\"files\", \"readonly\")\n    IDBObjectStore = IDBTransaction.objectStore(\"files\")\n    IDBRequest = IDBObjectStore.get(path, \"key\")\n    IDBRequest.onsuccess = IDBRequest.onerror = queue.put_nowait\n\n    await queue.get()\n\n    return IDBRequest.result.to_py() if IDBRequest.result else None","metadata":{"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"With this function, we can now access files that have been saved to IndexedDB, for example, this notebook.","metadata":{}},{"cell_type":"code","source":"IPython.display.JSON(await get_contents(\"pyolite - contents.ipynb\"))","metadata":{"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"For many purposes, only the `content` field will be interesting.\n\n> For this example, _Open With..._ the `iris.csv` file, and make a change and save it, as it is initially only available from the _actual_ HTTP server. Future work may allow hiding this implementation detail.","metadata":{}},{"cell_type":"code","source":"pandas.read_csv(io.StringIO((await get_contents(\"iris.csv\"))[\"content\"]), sep = \"\\t\")","metadata":{"trusted":true},"execution_count":null,"outputs":[]}]}
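The `Queue`-based unwrapping that `get_contents` relies on can be tried outside the browser. The following is a minimal sketch, not part of the notebook: `FakeRequest` and `wait_for` are hypothetical names, and `FakeRequest` only imitates the callback behaviour of an `IDBRequest` (it sets `result`, then calls `onsuccess`). It assumes plain CPython with `asyncio`, with no `js` module involved.

```python
import asyncio


class FakeRequest:
    """Hypothetical stand-in for an IDBRequest: it reports back via a callback."""

    def __init__(self, pending_result):
        self.result = None
        self.onsuccess = None
        self.onerror = None
        self._pending_result = pending_result

    def fire(self):
        # Imitate the browser: populate the result, then call the handler.
        self.result = self._pending_result
        if self.onsuccess is not None:
            self.onsuccess("success-event")


async def wait_for(request):
    # Only one event is expected (success or error), so a queue of size 1 is enough.
    queue = asyncio.Queue(1)
    request.onsuccess = request.onerror = queue.put_nowait
    # Schedule the "event" to happen later, as a real request would.
    asyncio.get_running_loop().call_soon(request.fire)
    # Suspend until one of the handlers has put the event on the queue.
    await queue.get()
    return request.result


result = asyncio.run(wait_for(FakeRequest({"content": "hello"})))
print(result)
```

Assigning `queue.put_nowait` to both handlers means the coroutine resumes on either outcome; the caller then inspects `result`, as `get_contents` does.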
@jmshea

jmshea commented Jan 30, 2022

This is super helpful! Thank you, @bollwyvl. I would like to add a note for anyone else who runs into this issue: if you have a binary file, the 'content' will be a string that is encoded using latin-1. That is, if you want the bytes (say, to do pickle.loads()), you will need to do:

mybytes=data['content'].encode('latin-1')
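
To illustrate the round trip described above, here is a small sketch that runs without JupyterLite. It assumes, as the comment reports, that the stored string maps each byte to the latin-1 character with the same value; the `original` dictionary and the simulated `content` string are made up for the example.

```python
import pickle

# Made-up object standing in for whatever was pickled into the binary file.
original = {"species": "setosa", "sepal_length": 5.1}
raw = pickle.dumps(original)

# Assumption: simulate the 'content' string by mapping each byte to the
# latin-1 character with the same value (every byte 0-255 has one).
content = raw.decode("latin-1")

# Recover the bytes exactly as in the line above, then unpickle them.
mybytes = content.encode("latin-1")
restored = pickle.loads(mybytes)
print(restored == original)
```

Latin-1 maps all 256 byte values one-to-one, so the round trip loses nothing, which would not hold for an encoding such as UTF-8.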

@psychemedia

Is a complementary put_contents(path) function also possible?

@oscar6echo

Yes, see this discussion comment.
It needs to be polished and wrapped for users, I'd say.
