@magland
Created October 31, 2018 18:15
kbucket_technical_info.ipynb
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "kbucket_technical_info.ipynb",
"version": "0.3.2",
"provenance": [],
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/magland/fb2a879975f6e1d44cc624297c1b6656/kbucket_technical_info.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"metadata": {
"id": "vuiRGFstMDd5",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"This is some documentation about the kbucket shares hosted on gordon.\n",
"\n",
"Dylan manages four Docker containers running on gordon. Two of them serve kbucket shares (for downloading data), and the other two run the corresponding upload servers.\n",
"\n",
"The two shares are called spikeforest1 and spikeforest2.\n",
"\n",
"Info for spikeforest1 [is found here](https://kbucketgui.herokuapp.com/?share=7317cea8265b), where we can see that the listen URL for downloading is http://132.249.245.245:24351\n",
"\n",
"The listen URL for uploading is http://132.249.245.245:24341, but clients can determine it automatically by probing the node info for the corresponding share.\n",
"\n",
"The download servers connect out (via WebSocket) to the top-level kbucket hub at https://kbucket.flatironinstitute.org, which is hosted at the Flatiron building.\n",
"\n",
"The Docker containers are instances of an image hosted [on Docker Hub](https://hub.docker.com/r/magland/kbucket/), which is built from [the kbucket source repo](https://github.com/flatironinstitute/kbucket/blob/master/Dockerfile).\n",
"\n",
"(Note: even though some scripts in scripts_inside_docker/ are being added to the docker image, I believe that Dylan is not actually using those.)\n",
"\n",
"KBucket can be configured to point to these servers for upload and download as demonstrated below.\n",
"\n"
]
},
{
"metadata": {
"id": "w4lRxCW_L9VA",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"%%capture\n",
"# Run this cell only on a hosted runtime where the packages are not already\n",
"# installed. Note that %%capture must stay on the first line of the cell.\n",
"!pip install kbucket==0.11.21\n",
"!pip install spikeforest==0.1.6"
],
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"id": "8tbSBW2QOQR-",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"# Import the global kbucket client\n",
"from kbucket import client as kb"
],
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"id": "GCcJ47irObjF",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"The easiest way to download data from kbucket is to give the explicit kbucket URL as follows:"
]
},
{
"metadata": {
"id": "IQ3MIzfBOZUh",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"path=kb.realizeFile('kbucket://spikeforest.spikeforest1/abc.txt')"
],
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"id": "6zexuzXIOvrL",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
},
"outputId": "9fc90851-e998-4859-e992-8d8e9dffa9a9"
},
"cell_type": "code",
"source": [
"%%bash -s \"$path\"\n",
"cat \"$1\""
],
"execution_count": 39,
"outputs": [
{
"output_type": "stream",
"text": [
"test\n"
],
"name": "stdout"
}
]
},
{
"metadata": {
"id": "EV1g5dJ7POkc",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"You might be wondering how the URL resolved to this kbucket share, whose share id is \"7317cea8265b\". The name spikeforest.spikeforest1 is mapped to that id via a pairio pairing:"
]
},
{
"metadata": {
"id": "KJEKVlDuO2ki",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
},
"outputId": "d803d9dc-caa8-4595-e5c4-09d585ed0434"
},
"cell_type": "code",
"source": [
"from pairio import client as pa\n",
"# Use share_id rather than id to avoid shadowing the Python builtin\n",
"share_id=pa.get(collection='spikeforest',key='spikeforest1')\n",
"print(share_id)"
],
"execution_count": 40,
"outputs": [
{
"output_type": "stream",
"text": [
"7317cea8265b\n"
],
"name": "stdout"
}
]
},
{
"metadata": {
"id": "EiQjkwZNPvxU",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"This is a convenience convention for kbucket URLs that is supported by the Python and JavaScript clients. We could instead have used the direct address `kbucket://7317cea8265b/abc.txt`."
]
},
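{
"metadata": {
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"To make the naming convention concrete, the next cell is a minimal sketch of how a client might tell the two address forms apart. It is not taken from the kbucket source: the helper name `split_kbucket_url` is hypothetical, and the rule that a dot in the share name means `collection.key` is an assumption based on the example above."
]
},
{
"metadata": {
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"from urllib.parse import urlparse\n",
"\n",
"\n",
"def split_kbucket_url(url):\n",
"    # Hypothetical helper for illustration; not part of the kbucket package.\n",
"    parts = urlparse(url)\n",
"    if parts.scheme != 'kbucket':\n",
"        raise ValueError('Not a kbucket URL: ' + url)\n",
"    share_name = parts.netloc\n",
"    file_path = parts.path.lstrip('/')\n",
"    # Assumption: a dot marks the collection.key form that is looked up in pairio.\n",
"    if '.' in share_name:\n",
"        collection, key = share_name.split('.', 1)\n",
"        return dict(collection=collection, key=key, share_id=None, path=file_path)\n",
"    # Otherwise treat the name as a direct share id.\n",
"    return dict(collection=None, key=None, share_id=share_name, path=file_path)\n",
"\n",
"\n",
"print(split_kbucket_url('kbucket://spikeforest.spikeforest1/abc.txt'))\n",
"print(split_kbucket_url('kbucket://7317cea8265b/abc.txt'))"
],
"execution_count": 0,
"outputs": []
},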
{
"metadata": {
"id": "10Fxd6tzQJuS",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"The pairio database is also very useful for storing secret tokens that can be unlocked with a password. For example, let's say we had a secret token called \"s3cr3T-y7y7ju873355\" that we wanted to unlock with the password \"shortpass\". We could store it in pairio as follows:"
]
},
{
"metadata": {
"id": "5pkFdKdoRICH",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 50
},
"outputId": "7ce2a78c-19cd-439b-82e0-b07b945c1017"
},
"cell_type": "code",
"source": [
"# getpass prompts for the password without echoing it\n",
"from getpass import getpass\n",
"\n",
"# This commented line was run once with permission to write to pairio under user spikeforest\n",
"# pa.set(key=dict(name='example_secret',password='shortpass'),value='s3cr3T-y7y7ju873355')\n",
"\n",
"# Note that the password is \"shortpass\"\n",
"token=pa.get(\n",
" collection='spikeforest',\n",
" key=dict(name='example_secret',password=getpass())\n",
")\n",
"print('token='+token)"
],
"execution_count": 41,
"outputs": [
{
"output_type": "stream",
"text": [
"··········\n",
"token=s3cr3T-y7y7ju873355\n"
],
"name": "stdout"
}
]
},
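{
"metadata": {
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"One way to picture why the password unlocks the token: if the client reduces a dict key to a hash before the lookup, the stored pair can only be found by someone who can rebuild the same dict, password included. The next cell is a sketch under that assumption. The helper name `lookup_key` is hypothetical, and the choice of SHA-1 over sorted-key JSON is an assumption that this notebook does not confirm."
]
},
{
"metadata": {
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"import hashlib\n",
"import json\n",
"\n",
"\n",
"def lookup_key(key):\n",
"    # Hypothetical helper; assumption: dict keys are reduced to a SHA-1 hash\n",
"    # of their canonical (sorted-key) JSON form before the lookup.\n",
"    if isinstance(key, dict):\n",
"        canonical = json.dumps(key, sort_keys=True, separators=(',', ':'))\n",
"        return hashlib.sha1(canonical.encode('utf-8')).hexdigest()\n",
"    return key\n",
"\n",
"\n",
"right = lookup_key(dict(name='example_secret', password='shortpass'))\n",
"wrong = lookup_key(dict(name='example_secret', password='guess'))\n",
"# Different passwords give different lookup keys, so a wrong guess finds nothing.\n",
"print(right != wrong)"
],
"execution_count": 0,
"outputs": []
},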
{
"metadata": {
"id": "wom2lTK0SfxM",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"`spikeforest.kbucketConfigRemote()` uses this mechanism internally to obtain the credentials needed to write to pairio as the spikeforest user and to upload files and objects to the spikeforest.spikeforest1 (or spikeforest2) kbucket share. So when we run the following cell and enter the master password for the spikeforest system, we are essentially logging in to be able to write spikeforest data."
]
},
{
"metadata": {
"id": "STYCeQ0aRgUT",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 50
},
"outputId": "1283784a-9129-4ef0-ccdf-fb885d785087"
},
"cell_type": "code",
"source": [
"import spikeforest as sf\n",
"from getpass import getpass\n",
"sf.kbucketConfigRemote(share_id='spikeforest.spikeforest1',write=True)"
],
"execution_count": 42,
"outputs": [
{
"output_type": "stream",
"text": [
"Enter the spikeforest password··········\n",
"Pairio user set to spikeforest. Test succeeded.\n"
],
"name": "stdout"
}
]
},
{
"metadata": {
"id": "jiNms6hrTF8-",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"Now let's test uploading an object to the spikeforest1 kbucket share:"
]
},
{
"metadata": {
"id": "c2edIbKrRkVE",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
},
"outputId": "93124e45-db51-4ba5-8f9a-d8de401ffc18"
},
"cell_type": "code",
"source": [
"kb.saveObject(\n",
" key=dict(some='key',withsome='lookup-information'),\n",
" object=dict(\n",
" some='object',\n",
" withsome='data',\n",
" including_urls='kbucket://spikeforest.spikeforest1/abc.txt'\n",
" )\n",
")"
],
"execution_count": 43,
"outputs": [
{
"output_type": "stream",
"text": [
"Already on server.\n"
],
"name": "stdout"
}
]
},
{
"metadata": {
"id": "zdjpvKgRTmFT",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"Later (for now you must wait around 10 seconds), we can retrieve that object from either Python or JavaScript:"
]
},
{
"metadata": {
"id": "yxKpSeCETh8M",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 54
},
"outputId": "64b4fbfe-0b24-4348-b362-31ee8dbb80f2"
},
"cell_type": "code",
"source": [
"obj=kb.loadObject(\n",
" key=dict(some='key',withsome='lookup-information')\n",
")\n",
"print(obj)"
],
"execution_count": 44,
"outputs": [
{
"output_type": "stream",
"text": [
"{'some': 'object', 'withsome': 'data', 'including_urls': 'kbucket://spikeforest.spikeforest1/abc.txt'}\n"
],
"name": "stdout"
}
]
}
]
}