@jamestwebber
Created February 23, 2022 17:29
zarr_colab.ipynb
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "zarr_colab.ipynb",
"provenance": [],
"collapsed_sections": [],
"authorship_tag": "ABX9TyP5YbUXMm202tITYdq6a7bE",
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"language_info": {
"name": "python"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/jamestwebber/41163cab3b4899b75833caa384e21c49/zarr_colab.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"source": [
"Accessing a Zarr array from a public bucket\n",
"\n",
"I just needed to install the dependencies (`zarr`, `fsspec`, and `gcsfs`) and then I could open it. Zarr (and Dask) can open `gs://` URLs directly."
],
"metadata": {
"id": "mZdtZXeZm-ju"
}
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "ykqCByTbRivI",
"outputId": "77b9bcd2-d208-47dc-e91c-1073b7fa2343"
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Collecting zarr\n",
" Downloading zarr-2.11.0-py3-none-any.whl (153 kB)\n",
"\u001b[K |████████████████████████████████| 153 kB 17.0 MB/s \n",
"\u001b[?25hCollecting numcodecs>=0.6.4\n",
" Downloading numcodecs-0.9.1-cp37-cp37m-manylinux2010_x86_64.whl (6.2 MB)\n",
"\u001b[K |████████████████████████████████| 6.2 MB 38.3 MB/s \n",
"\u001b[?25hCollecting asciitree\n",
" Downloading asciitree-0.3.3.tar.gz (4.0 kB)\n",
"Requirement already satisfied: numpy>=1.7 in /usr/local/lib/python3.7/dist-packages (from zarr) (1.21.5)\n",
"Collecting fasteners\n",
" Downloading fasteners-0.17.3-py3-none-any.whl (18 kB)\n",
"Building wheels for collected packages: asciitree\n",
" Building wheel for asciitree (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
" Created wheel for asciitree: filename=asciitree-0.3.3-py3-none-any.whl size=5050 sha256=37217d922908ebe5e5e1dd0119dc179d565115ffbdb6d0eaa0c4fdca61082d30\n",
" Stored in directory: /root/.cache/pip/wheels/12/1c/38/0def51e15add93bff3f4bf9c248b94db0839b980b8535e72a0\n",
"Successfully built asciitree\n",
"Installing collected packages: numcodecs, fasteners, asciitree, zarr\n",
"Successfully installed asciitree-0.3.3 fasteners-0.17.3 numcodecs-0.9.1 zarr-2.11.0\n"
]
}
],
"source": [
"!pip install zarr"
]
},
{
"cell_type": "code",
"source": [
"!pip install fsspec"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "GrzJM2OXTh8v",
"outputId": "ed18b84d-fdbf-47cd-80bf-2c5918efd198"
},
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Collecting fsspec\n",
" Downloading fsspec-2022.2.0-py3-none-any.whl (134 kB)\n",
"\u001b[K |████████████████████████████████| 134 kB 18.2 MB/s \n",
"\u001b[?25hInstalling collected packages: fsspec\n",
"Successfully installed fsspec-2022.2.0\n"
]
}
]
},
{
"cell_type": "code",
"source": [
"!pip install gcsfs"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "3VnZU7blTkWf",
"outputId": "01bc0661-6637-4797-e256-a0b721860016"
},
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Collecting gcsfs\n",
" Downloading gcsfs-2022.2.0-py2.py3-none-any.whl (24 kB)\n",
"Requirement already satisfied: google-auth-oauthlib in /usr/local/lib/python3.7/dist-packages (from gcsfs) (0.4.6)\n",
"Requirement already satisfied: google-cloud-storage in /usr/local/lib/python3.7/dist-packages (from gcsfs) (1.18.1)\n",
"Requirement already satisfied: decorator>4.1.2 in /usr/local/lib/python3.7/dist-packages (from gcsfs) (4.4.2)\n",
"Requirement already satisfied: requests in /usr/local/lib/python3.7/dist-packages (from gcsfs) (2.23.0)\n",
"Requirement already satisfied: fsspec==2022.02.0 in /usr/local/lib/python3.7/dist-packages (from gcsfs) (2022.2.0)\n",
"Requirement already satisfied: google-auth>=1.2 in /usr/local/lib/python3.7/dist-packages (from gcsfs) (1.35.0)\n",
"Collecting aiohttp<4\n",
" Downloading aiohttp-3.8.1-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (1.1 MB)\n",
"\u001b[K |████████████████████████████████| 1.1 MB 35.3 MB/s \n",
"\u001b[?25hRequirement already satisfied: typing-extensions>=3.7.4 in /usr/local/lib/python3.7/dist-packages (from aiohttp<4->gcsfs) (3.10.0.2)\n",
"Requirement already satisfied: attrs>=17.3.0 in /usr/local/lib/python3.7/dist-packages (from aiohttp<4->gcsfs) (21.4.0)\n",
"Collecting asynctest==0.13.0\n",
" Downloading asynctest-0.13.0-py3-none-any.whl (26 kB)\n",
"Collecting yarl<2.0,>=1.0\n",
" Downloading yarl-1.7.2-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (271 kB)\n",
"\u001b[K |████████████████████████████████| 271 kB 69.4 MB/s \n",
"\u001b[?25hCollecting async-timeout<5.0,>=4.0.0a3\n",
" Downloading async_timeout-4.0.2-py3-none-any.whl (5.8 kB)\n",
"Collecting aiosignal>=1.1.2\n",
" Downloading aiosignal-1.2.0-py3-none-any.whl (8.2 kB)\n",
"Collecting frozenlist>=1.1.1\n",
" Downloading frozenlist-1.3.0-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (144 kB)\n",
"\u001b[K |████████████████████████████████| 144 kB 39.4 MB/s \n",
"\u001b[?25hRequirement already satisfied: charset-normalizer<3.0,>=2.0 in /usr/local/lib/python3.7/dist-packages (from aiohttp<4->gcsfs) (2.0.12)\n",
"Collecting multidict<7.0,>=4.5\n",
" Downloading multidict-6.0.2-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (94 kB)\n",
"\u001b[K |████████████████████████████████| 94 kB 4.4 MB/s \n",
"\u001b[?25hRequirement already satisfied: setuptools>=40.3.0 in /usr/local/lib/python3.7/dist-packages (from google-auth>=1.2->gcsfs) (57.4.0)\n",
"Requirement already satisfied: rsa<5,>=3.1.4 in /usr/local/lib/python3.7/dist-packages (from google-auth>=1.2->gcsfs) (4.8)\n",
"Requirement already satisfied: six>=1.9.0 in /usr/local/lib/python3.7/dist-packages (from google-auth>=1.2->gcsfs) (1.15.0)\n",
"Requirement already satisfied: cachetools<5.0,>=2.0.0 in /usr/local/lib/python3.7/dist-packages (from google-auth>=1.2->gcsfs) (4.2.4)\n",
"Requirement already satisfied: pyasn1-modules>=0.2.1 in /usr/local/lib/python3.7/dist-packages (from google-auth>=1.2->gcsfs) (0.2.8)\n",
"Requirement already satisfied: pyasn1<0.5.0,>=0.4.6 in /usr/local/lib/python3.7/dist-packages (from pyasn1-modules>=0.2.1->google-auth>=1.2->gcsfs) (0.4.8)\n",
"Requirement already satisfied: idna>=2.0 in /usr/local/lib/python3.7/dist-packages (from yarl<2.0,>=1.0->aiohttp<4->gcsfs) (2.10)\n",
"Requirement already satisfied: requests-oauthlib>=0.7.0 in /usr/local/lib/python3.7/dist-packages (from google-auth-oauthlib->gcsfs) (1.3.1)\n",
"Requirement already satisfied: oauthlib>=3.0.0 in /usr/local/lib/python3.7/dist-packages (from requests-oauthlib>=0.7.0->google-auth-oauthlib->gcsfs) (3.2.0)\n",
"Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /usr/local/lib/python3.7/dist-packages (from requests->gcsfs) (1.24.3)\n",
"Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.7/dist-packages (from requests->gcsfs) (2021.10.8)\n",
"Requirement already satisfied: chardet<4,>=3.0.2 in /usr/local/lib/python3.7/dist-packages (from requests->gcsfs) (3.0.4)\n",
"Requirement already satisfied: google-resumable-media<0.5.0dev,>=0.3.1 in /usr/local/lib/python3.7/dist-packages (from google-cloud-storage->gcsfs) (0.4.1)\n",
"Requirement already satisfied: google-cloud-core<2.0dev,>=1.0.0 in /usr/local/lib/python3.7/dist-packages (from google-cloud-storage->gcsfs) (1.0.3)\n",
"Requirement already satisfied: google-api-core<2.0.0dev,>=1.14.0 in /usr/local/lib/python3.7/dist-packages (from google-cloud-core<2.0dev,>=1.0.0->google-cloud-storage->gcsfs) (1.26.3)\n",
"Requirement already satisfied: packaging>=14.3 in /usr/local/lib/python3.7/dist-packages (from google-api-core<2.0.0dev,>=1.14.0->google-cloud-core<2.0dev,>=1.0.0->google-cloud-storage->gcsfs) (21.3)\n",
"Requirement already satisfied: googleapis-common-protos<2.0dev,>=1.6.0 in /usr/local/lib/python3.7/dist-packages (from google-api-core<2.0.0dev,>=1.14.0->google-cloud-core<2.0dev,>=1.0.0->google-cloud-storage->gcsfs) (1.54.0)\n",
"Requirement already satisfied: pytz in /usr/local/lib/python3.7/dist-packages (from google-api-core<2.0.0dev,>=1.14.0->google-cloud-core<2.0dev,>=1.0.0->google-cloud-storage->gcsfs) (2018.9)\n",
"Requirement already satisfied: protobuf>=3.12.0 in /usr/local/lib/python3.7/dist-packages (from google-api-core<2.0.0dev,>=1.14.0->google-cloud-core<2.0dev,>=1.0.0->google-cloud-storage->gcsfs) (3.17.3)\n",
"Requirement already satisfied: pyparsing!=3.0.5,>=2.0.2 in /usr/local/lib/python3.7/dist-packages (from packaging>=14.3->google-api-core<2.0.0dev,>=1.14.0->google-cloud-core<2.0dev,>=1.0.0->google-cloud-storage->gcsfs) (3.0.7)\n",
"Installing collected packages: multidict, frozenlist, yarl, asynctest, async-timeout, aiosignal, aiohttp, gcsfs\n",
"Successfully installed aiohttp-3.8.1 aiosignal-1.2.0 async-timeout-4.0.2 asynctest-0.13.0 frozenlist-1.3.0 gcsfs-2022.2.0 multidict-6.0.2 yarl-1.7.2\n"
]
}
]
},
{
"cell_type": "code",
"source": [
"import zarr\n",
"\n",
"# This URL is dead now because I removed the data; it was just an array of random values.\n",
"zarr.open(\"gs://macosko_public/test.zarr\").info"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 275
},
"id": "KIUhRYVmTUL_",
"outputId": "f9320d62-205d-44a5-a30d-11165deec137"
},
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<table class=\"zarr-info\"><tbody><tr><th style=\"text-align: left\">Type</th><td style=\"text-align: left\">zarr.core.Array</td></tr><tr><th style=\"text-align: left\">Data type</th><td style=\"text-align: left\">int64</td></tr><tr><th style=\"text-align: left\">Shape</th><td style=\"text-align: left\">(1000, 100)</td></tr><tr><th style=\"text-align: left\">Chunk shape</th><td style=\"text-align: left\">(1000, 100)</td></tr><tr><th style=\"text-align: left\">Order</th><td style=\"text-align: left\">C</td></tr><tr><th style=\"text-align: left\">Read-only</th><td style=\"text-align: left\">False</td></tr><tr><th style=\"text-align: left\">Compressor</th><td style=\"text-align: left\">Blosc(cname='lz4', clevel=5, shuffle=SHUFFLE, blocksize=0)</td></tr><tr><th style=\"text-align: left\">Store type</th><td style=\"text-align: left\">zarr.storage.FSStore</td></tr><tr><th style=\"text-align: left\">No. bytes</th><td style=\"text-align: left\">800000 (781.2K)</td></tr><tr><th style=\"text-align: left\">No. bytes stored</th><td style=\"text-align: left\">31782 (31.0K)</td></tr><tr><th style=\"text-align: left\">Storage ratio</th><td style=\"text-align: left\">25.2</td></tr><tr><th style=\"text-align: left\">Chunks initialized</th><td style=\"text-align: left\">1/1</td></tr></tbody></table>"
],
"text/plain": [
"Type : zarr.core.Array\n",
"Data type : int64\n",
"Shape : (1000, 100)\n",
"Chunk shape : (1000, 100)\n",
"Order : C\n",
"Read-only : False\n",
"Compressor : Blosc(cname='lz4', clevel=5, shuffle=SHUFFLE, blocksize=0)\n",
"Store type : zarr.storage.FSStore\n",
"No. bytes : 800000 (781.2K)\n",
"No. bytes stored : 31782 (31.0K)\n",
"Storage ratio : 25.2\n",
"Chunks initialized : 1/1"
]
},
"metadata": {},
"execution_count": 9
}
]
},
{
"cell_type": "code",
"source": [
"# I was curious where this notebook was running; it looks like it's nearby.\n",
"!curl ipinfo.io"
],
"metadata": {
"id": "7Olysh2MVQEs",
"outputId": "469a1f4f-5547-4925-f90c-7e3ed5abecb4",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"{\n",
" \"ip\": \"35.221.31.83\",\n",
" \"hostname\": \"83.31.221.35.bc.googleusercontent.com\",\n",
" \"city\": \"Washington\",\n",
" \"region\": \"Washington, D.C.\",\n",
" \"country\": \"US\",\n",
" \"loc\": \"38.8951,-77.0364\",\n",
" \"org\": \"AS396982 Google LLC\",\n",
" \"postal\": \"20004\",\n",
" \"timezone\": \"America/New_York\",\n",
" \"readme\": \"https://ipinfo.io/missingauth\"\n",
"}"
]
}
]
},
{
"cell_type": "code",
"source": [
""
],
"metadata": {
"id": "ZQfcfc27To7e"
},
"execution_count": null,
"outputs": []
}
]
}
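
As a sanity check on the `.info` output in the notebook above, the size figures can be reproduced by hand. This is only a sketch of the arithmetic, not how Zarr computes them internally. The numbers are copied from that output, and the variable names are made up for illustration.

```python
import math

# Values copied from the .info output above (variable names are illustrative).
shape = (1000, 100)        # "Shape"
chunk_shape = (1000, 100)  # "Chunk shape"
itemsize = 8               # int64 uses 8 bytes per element
stored_bytes = 31782       # "No. bytes stored"

# Uncompressed size: number of elements times bytes per element.
n_bytes = math.prod(shape) * itemsize

# Storage ratio: uncompressed size divided by the compressed size in the store.
storage_ratio = n_bytes / stored_bytes

# Number of chunks: ceil(shape / chunk) along each axis, multiplied together.
n_chunks = math.prod(math.ceil(s / c) for s, c in zip(shape, chunk_shape))

print(n_bytes)                  # 800000
print(round(storage_ratio, 1))  # 25.2
print(n_chunks)                 # 1
```

These match the "No. bytes", "Storage ratio", and "Chunks initialized" (1/1) rows: the whole array lives in a single compressed chunk.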