@jbenet
Created January 23, 2012 05:57
d.ocks.org markdown test

datastore

simple, unified API for multiple data stores

datastore is a generic abstraction layer for data store and database access. Its simple API aims to enable datastore-agnostic application development, so that datastores can be swapped seamlessly without changing application code. Thus, one can leverage different datastores with different strengths without committing the application to one datastore throughout its lifetime. It looks like this:

+---------------+
|  application  |    <--- No cumbersome SQL or Mongo specific queries!
+---------------+
        |            <--- simple datastore API calls
+---------------+
|   datastore   |    <--- datastore implementation for underlying db
+---------------+
        |            <--- database specific calls
+---------------+
|  various dbs  |    <--- MySQL, Redis, MongoDB, FS, ...
+---------------+

In addition, grouped datastores significantly simplify interesting data access patterns (such as caching and sharding).

About

Install

For now, until datastore is well-tested and added to PyPI, install it from source:

git clone https://github.com/jbenet/datastore/
cd datastore
sudo python setup.py install

Documentation

The documentation can be found at: http://datastore.readthedocs.org/en/latest/

License

datastore is under the MIT License.

Contact

datastore is written by Juan Batiz-Benet. It was originally part of py-dronestore. In December 2011, it was rewritten as a standalone project.

Project Homepage: https://github.com/jbenet/datastore

Feel free to contact me, but please file issues on GitHub first. Cheers!

Examples

Hello World

>>> import datastore
>>> ds = datastore.basic.DictDatastore()
>>>
>>> hello = datastore.Key('hello')
>>> ds.put(hello, 'world')
>>> ds.contains(hello)
True
>>> ds.get(hello)
'world'
>>> ds.delete(hello)
>>> ds.get(hello) is None
True

Hello memcache

>>> import pylibmc
>>> import datastore
>>> from datastore.impl.memcached import MemcachedDatastore
>>> mc = pylibmc.Client(['127.0.0.1'])
>>> ds = MemcachedDatastore(mc)
>>>
>>> hello = datastore.Key('hello')
>>> ds.put(hello, 'world')
>>> ds.contains(hello)
True
>>> ds.get(hello)
'world'
>>> ds.delete(hello)
>>> ds.get(hello) is None
True

Hello mongo

>>> import pymongo
>>> import datastore
>>> from datastore.impl.mongo import MongoDatastore
>>>
>>> conn = pymongo.Connection()
>>> ds = MongoDatastore(conn.test_db)
>>>
>>> hello = datastore.Key('hello')
>>> ds.put(hello, 'world')
>>> ds.contains(hello)
True
>>> ds.get(hello)
'world'
>>> ds.delete(hello)
>>> ds.get(hello) is None
True

Hello redis

>>> import redis
>>> import datastore
>>> from datastore.impl.redis import RedisDatastore
>>> r = redis.Redis()
>>> ds = RedisDatastore(r)
>>>
>>> hello = datastore.Key('hello')
>>> ds.put(hello, 'world')
>>> ds.contains(hello)
True
>>> ds.get(hello)
'world'
>>> ds.delete(hello)
>>> ds.get(hello) is None
True

Hello filesystem

>>> import datastore
>>> from datastore.impl.filesystem import FileSystemDatastore
>>>
>>> ds = FileSystemDatastore('/tmp/.test_datastore')
>>>
>>> hello = datastore.Key('hello')
>>> ds.put(hello, 'world')
>>> ds.contains(hello)
True
>>> ds.get(hello)
'world'
>>> ds.delete(hello)
>>> ds.get(hello) is None
True

Hello git

>>> import datastore
>>> from datastore.impl.git import GitDatastore
>>>
>>> ds = GitDatastore('/tmp/.test_datastore')
>>>
>>> hello = datastore.Key('hello')
>>> ds.put(hello, 'world')
>>> ds.contains(hello)
True
>>> ds.get(hello)
'world'
>>> ds.delete(hello)
>>> ds.get(hello) is None
True

Hello Tiered Access

>>> import pymongo
>>> import datastore
>>>
>>> from datastore.impl.mongo import MongoDatastore
>>> from datastore.impl.lrucache import LRUCache
>>> from datastore.impl.filesystem import FileSystemDatastore
>>>
>>> conn = pymongo.Connection()
>>> mongo = MongoDatastore(conn.test_db)
>>>
>>> cache = LRUCache(1000)
>>> fs = FileSystemDatastore('/tmp/.test_db')
>>>
>>> ds = datastore.TieredDatastore([cache, mongo, fs])
>>>
>>> hello = datastore.Key('hello')
>>> ds.put(hello, 'world')
>>> ds.contains(hello)
True
>>> ds.get(hello)
'world'
>>> ds.delete(hello)
>>> ds.get(hello) is None
True
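
The tiered pattern above can be pictured with a rough sketch. This is not the library's TieredDatastore; the class name SketchTiered is hypothetical, plain dicts stand in for the cache and the slower stores, and the read-through behaviour (copying a hit back into the faster tiers) is an assumption about how such a tier might reasonably behave.

```python
class SketchTiered(object):
    """Hypothetical tiered store: reads try each tier in order, writes go to all."""

    def __init__(self, tiers):
        # Tiers are ordered fastest first; plain dicts stand in for real stores.
        self.tiers = list(tiers)

    def get(self, key):
        for i, tier in enumerate(self.tiers):
            value = tier.get(key)
            if value is not None:
                # Warm the faster tiers that missed, so the next read is cheaper.
                for faster in self.tiers[:i]:
                    faster[key] = value
                return value
        return None

    def put(self, key, value):
        for tier in self.tiers:
            tier[key] = value

    def delete(self, key):
        for tier in self.tiers:
            tier.pop(key, None)


cache, slow = {}, {'hello': 'world'}
ds = SketchTiered([cache, slow])
value = ds.get('hello')  # misses the cache, hits the slow tier, warms the cache
```

The application only ever talks to ds, so adding or removing a tier does not change application code.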

Hello Sharding

>>> import datastore
>>>
>>> shards = [datastore.DictDatastore() for i in range(0, 10)]
>>>
>>> ds = datastore.ShardedDatastore(shards)
>>>
>>> hello = datastore.Key('hello')
>>> ds.put(hello, 'world')
>>> ds.contains(hello)
True
>>> ds.get(hello)
'world'
>>> ds.delete(hello)
>>> ds.get(hello) is None
True
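
As a rough illustration of what sharding involves, the sketch below routes each key to exactly one shard by hashing it. It is not the library's ShardedDatastore: the class name SketchSharded, the use of zlib.crc32 as the hash, and the dicts standing in for shards are all assumptions made for this example.

```python
import zlib


class SketchSharded(object):
    """Hypothetical sharded store: each key lives in exactly one shard."""

    def __init__(self, shards):
        self.shards = list(shards)

    def shard_for(self, key):
        # crc32 is stable across processes, unlike the built-in hash() on
        # strings, so a key maps to the same shard on every run.
        index = zlib.crc32(key.encode('utf-8')) % len(self.shards)
        return self.shards[index]

    def get(self, key):
        return self.shard_for(key).get(key)

    def put(self, key, value):
        self.shard_for(key)[key] = value

    def delete(self, key):
        self.shard_for(key).pop(key, None)


shards = [{} for _ in range(10)]
ds = SketchSharded(shards)
ds.put('/hello', 'world')
holders = [s for s in shards if '/hello' in s]  # the single shard holding the key
```

Note that a simple modulo scheme like this remaps most keys when the number of shards changes; it is only meant to show the routing idea.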

API

The datastore API places an emphasis on simplicity and elegance. Only four core methods must be implemented (get, put, delete, query).

get(key)

Return the object named by key or None if it does not exist.

Args:
  key: Key naming the object to retrieve

Returns:
  object or None

put(key, value)

Stores the object value named by key. How to serialize and store objects is up to the underlying datastore. It is recommended to use simple objects (strings, numbers, lists, dicts).

Args:
  key: Key naming `value`
  value: the object to store.

delete(key)

Removes the object named by key.

Args:
  key: Key naming the object to remove.

query(query)

Returns an iterable of objects matching the criteria expressed in query. Implementations of query will be the largest differentiating factor amongst datastores. All datastores must implement query, even if only by falling back to query's worst-case scenario; see the Query class for details.

Args:
  query: Query object describing the objects to return.

Returns:
  iterable cursor with all objects matching criteria
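
To make the four-method contract concrete, here is a minimal sketch of a dict-backed store. It does not subclass anything from datastore, and the name SketchDatastore is hypothetical. Because the Query class is not shown in this document, query here takes a plain predicate function instead of a Query object; that is an assumption made purely for illustration, and it shows only the naive scan-everything approach.

```python
class SketchDatastore(object):
    """Hypothetical dict-backed store exposing get, put, delete, and query."""

    def __init__(self):
        self._items = {}

    def get(self, key):
        # Return the stored object, or None if the key is unknown.
        return self._items.get(key)

    def put(self, key, value):
        self._items[key] = value

    def delete(self, key):
        # Deleting a missing key is treated as a no-op in this sketch.
        self._items.pop(key, None)

    def query(self, predicate):
        # Naive query: scan every stored object and keep the matches.
        # `predicate` stands in for the real Query object (an assumption).
        return (v for v in self._items.values() if predicate(v))


ds = SketchDatastore()
ds.put('/hello', 'world')
ds.put('/goodbye', 'moon')
ds.put('/greeting', 'hi')
matches = sorted(ds.query(lambda v: 'o' in v))
```

A store that can push the query down to its backend (an index, a SQL WHERE clause) would do much better than this full scan, which is why query implementations differ so much between datastores.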

Specialized Features

Datastore implementors are free to implement specialized features, pertinent only to a subset of datastores, with the understanding that these should aim for generality and will most likely not be implemented across other datastores.

When implementing such features, please remember the goal of this project: simple, unified API for multiple data stores. When making heavy use of a particular library's specific functionality, perhaps one should not use datastore and should directly use that library.

Key

A Key represents the unique identifier of an object.

Our Key scheme is inspired by file systems and the Google App Engine key model.

Keys are meant to be unique across a system. Keys are hierarchical, incorporating more and more specific namespaces. Thus keys can be deemed 'children' or 'ancestors' of other keys.

Key('/Comedy')
Key('/Comedy/MontyPython')

Also, every namespace can be parametrized to embed relevant object information. For example, the Key name (most specific namespace) could include the object type:

Key('/Comedy/MontyPython/Actor:JohnCleese')
Key('/Comedy/MontyPython/Sketch:CheeseShop')
Key('/Comedy/MontyPython/Sketch:CheeseShop/Character:Mousebender')
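
The hierarchy and the Type:Name convention can be sketched with plain strings. The helper functions below (namespaces, is_ancestor_of, type_and_name) are hypothetical and are not methods of the Key class; they only illustrate how the key strings above relate to one another.

```python
def namespaces(key):
    """Split a key string such as '/Comedy/MontyPython' into its namespaces."""
    return [part for part in key.split('/') if part]


def is_ancestor_of(ancestor, key):
    """True if `ancestor` is a strict prefix of `key`, namespace by namespace."""
    a, k = namespaces(ancestor), namespaces(key)
    return len(a) < len(k) and k[:len(a)] == a


def type_and_name(key):
    """Return the (type, name) pair of the most specific namespace.

    The type is an empty string when the namespace has no 'Type:' prefix.
    """
    last = namespaces(key)[-1]
    kind, _, name = last.rpartition(':')
    return (kind, name)


sketch = '/Comedy/MontyPython/Sketch:CheeseShop'
character = '/Comedy/MontyPython/Sketch:CheeseShop/Character:Mousebender'
related = is_ancestor_of(sketch, character)
```

Comparing whole namespaces rather than raw string prefixes matters: '/Com' is a string prefix of '/Comedy' but is not its ancestor.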

DroneStore

distributed version control for application data

Dronestore is a library that keeps objects and their attributes versioned to allow merging with different versions of the object at a later date. Upon merging two object versions, attribute values are selected according to given rules (e.g. most recent, maximum). Thus, multiple disconnected machines can modify the same object and sync changes at a later date.
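
The merge rules mentioned above ("most recent", "maximum") can be pictured with a small standalone sketch. This is not dronestore's implementation: the function names, the dict layout with 'value' and 'updated' fields, and the integer timestamps are all assumptions chosen to keep the example short.

```python
def merge_latest(local, remote):
    """Keep whichever version of the attribute was written most recently."""
    return remote if remote['updated'] > local['updated'] else local


def merge_max(local, remote):
    """Keep whichever version of the attribute has the larger value."""
    return remote if remote['value'] > local['value'] else local


def merge_objects(local, remote, rules):
    """Merge two versions of an object, attribute by attribute."""
    merged = dict(local)
    for attr, theirs in remote.items():
        if attr in merged:
            # Both sides set this attribute: let its rule pick the winner.
            merged[attr] = rules[attr](merged[attr], theirs)
        else:
            # Only the remote side set it: take it as-is.
            merged[attr] = theirs
    return merged


rules = {'first': merge_latest, 'second': merge_latest, 'score': merge_max}
foo = {'first': {'value': 'Hello', 'updated': 1},
       'score': {'value': 3, 'updated': 5}}
bar = {'first': {'value': 'Hola', 'updated': 2},
       'second': {'value': 'World', 'updated': 2},
       'score': {'value': 7, 'updated': 4}}
merged = merge_objects(foo, bar, rules)
```

Here 'first' takes the newer remote value, 'second' is adopted because only one side set it, and 'score' keeps the larger value even though the local write was more recent.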

Install

sudo python setup.py install

License

Dronestore is under the MIT License.

Hello World

>>> import dronestore
>>> from dronestore import StringAttribute
>>> from dronestore.merge import LatestStrategy
>>>
>>> class MyModel(dronestore.Model):
...   first = StringAttribute(strategy=LatestStrategy)
...   second = StringAttribute(strategy=LatestStrategy)
...
>>> foo = MyModel('FooBar')
>>> foo.first = 'Hello'
>>> foo.commit()
>>>
>>> bar = MyModel('FooBar')
>>> bar.second = 'World'
>>> bar.commit()
>>>
>>> foo.merge(bar)
>>> print foo.first, foo.second
Hello World

Index

Check out these projects:

  • dronestore -- distributed version control for application data
  • datastore -- simple, unified API for multiple data stores
  • TeXchat -- simple web chat service that renders TeX math

TeXchat

About

TeXchat is a simple web chat service that renders TeX math.

The goal is to make it trivially easy to have a conversation with someone else on the web, using mathematical symbols.

Currently, the prettiest math rendering is TeX math, which MathJax (http://www.mathjax.org/) has elegantly brought to the web. This is coupled with a simple chat service on top of socket.io.

If you have any suggestions, feedback, or issues, please start a topic on the [github issues](http://github.com/jbenet/TeXchat/issues) page. Feel free to contact the author directly, but GitHub is preferable.

TeXchat is built by [Juan Batiz-Benet](http://github.com/jbenet).

Install

git clone https://github.com/jbenet/TeXchat/ texchat
cd texchat
npm install
node backend/server.js

Then go to http://localhost:8080/ in your favorite web browser.

License

TeXchat is under the MIT License. Its dependency libraries are each under their own licenses.

Contact

Project Homepage: https://github.com/jbenet/TeXchat

Feel free to contact me, but please file issues on GitHub first. Cheers!
