@eode
Created December 15, 2018 00:20
t4 markdown docs, autogenerated, first run

Welcome to T4’s documentation!

class t4.Bucket(bucket_uri)

Implements Bucket interface for T4.

Bucket.__call__(key)

Shorthand for deserialize(key)

Bucket.config(config_url='https://t4.quiltdata.com/config.json', quiet=False)

Updates this bucket’s search endpoint based on a federation config.

Bucket.delete(key)

Deletes a key from the bucket.

Parameters: key (str) – key to delete

Returns: None

Raises: if delete fails

Bucket.deserialize(key)

Deserializes object at key from bucket.

Parameters: key (str) – key in bucket to get

Returns: deserialized object

Raises:

  • KeyError if key does not exist
  • if deserialization fails

Bucket.fetch(key, path)

Fetches file (or files) at key to path.

If key ends in ‘/’, then all files with the prefix key will match and will be stored in a directory at path.

Otherwise, only one file will be fetched and it will be stored at path.

Parameters:

  • key (str) – key in bucket to fetch
  • path (str) – path in local filesystem to store file or files fetched

Returns: None

Raises:

  • if path doesn’t exist
  • if download fails
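
The two modes described above (prefix versus single key) can be sketched in plain Python. This is a minimal illustration, not T4's implementation: `plan_fetch` is a hypothetical helper, and the assumption that each file lands under path at its key relative to the prefix is not stated in these docs.

```python
import posixpath


def plan_fetch(bucket_keys, key, path):
    """Map each matching bucket key to the local path it would be written to.

    Hypothetical helper: the relative-path layout is an assumption.
    """
    if key.endswith('/'):
        # Prefix mode: every key under the prefix goes into the directory `path`.
        return {
            k: posixpath.join(path, k[len(key):])
            for k in bucket_keys
            if k.startswith(key) and k != key
        }
    # Single-file mode: exactly one key, stored at `path` itself.
    return {key: path} if key in bucket_keys else {}


keys = ['data/a.csv', 'data/sub/b.csv', 'readme.md']
print(plan_fetch(keys, 'data/', 'out'))
print(plan_fetch(keys, 'readme.md', 'local.md'))
```

Returning an empty mapping for a missing key is a choice made for the sketch; the real method is documented to raise if the download fails.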

Bucket.get_meta(key)

Gets the metadata associated with a key in bucket.

Parameters: key (str) – key in bucket to get meta for

Returns: dict of meta

Raises: if download fails

Bucket.keys()

Lists all keys in the bucket.

Returns: list of strings

Bucket.put(key, obj, meta=None)

Stores obj at key in bucket, optionally with user-provided metadata.

Parameters:

  • key (str) – key in bucket to put object to
  • obj (serializable) – serializable object to store at key
  • meta (dict) – optional user-provided metadata to store
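
To show how put, deserialize, get_meta and the call shorthand fit together, here is a sketch built on an in-memory stand-in. `InMemoryBucket` is hypothetical: it only mirrors the method names and signatures documented on this page, does not talk to S3, and the exact shape of the dict returned by the real get_meta may differ.

```python
import copy


class InMemoryBucket:
    """Hypothetical stand-in that mirrors the documented method names only."""

    def __init__(self):
        self._objects = {}  # key -> (object, user metadata)

    def put(self, key, obj, meta=None):
        self._objects[key] = (copy.deepcopy(obj), dict(meta or {}))

    def deserialize(self, key):
        # Raises KeyError if the key does not exist, as documented.
        return copy.deepcopy(self._objects[key][0])

    def __call__(self, key):
        # Shorthand for deserialize(key).
        return self.deserialize(key)

    def get_meta(self, key):
        return dict(self._objects[key][1])

    def set_meta(self, key, meta):
        obj, _ = self._objects[key]
        self._objects[key] = (obj, dict(meta))

    def keys(self):
        return list(self._objects)

    def delete(self, key):
        del self._objects[key]


bucket = InMemoryBucket()
bucket.put('scores.json', {'alice': 3}, meta={'source': 'survey'})
print(bucket('scores.json'), bucket.get_meta('scores.json'))
```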

Bucket.put_dir(key, directory)

Stores all files under directory under the prefix key.

Parameters:

  • key (str) – prefix to store files under in bucket
  • directory (str) – path to local directory to grab files from

Returns: None

Raises:

  • if directory isn’t a valid local directory
  • if writing to bucket fails

Bucket.put_file(key, path)

Stores file at path to key in bucket.

Parameters:

  • key (str) – key in bucket to store file at
  • path (str) – string representing local path to file

Returns: None

Raises:

  • if no file exists at path
  • if copy fails

Bucket.search(query)

Execute a search against the configured search endpoint.

Parameters: query (str) – query string to search

Returns: either the request object (in case of an error) or a list of objects with the following keys:

  • key: key of the object
  • version_id: version_id of object version
  • operation: Create or Delete
  • meta: metadata attached to object
  • size: size of object in bytes
  • text: indexed text of object
  • source: source document for object (what is actually stored in Elasticsearch)
  • time: timestamp for operation
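
Because the return value is either a list of hits or the request object, callers need to check which one they received. The sketch below shows one way to do that. `live_keys` and the sample hits are hypothetical, and it assumes that time values are comparable and that operation holds the literal strings 'Create' or 'Delete'.

```python
def live_keys(result):
    """Return the keys whose most recent operation is a Create.

    `result` is whatever search returned: a list of hit dicts on success,
    or some other object (the request object) on error.
    """
    if not isinstance(result, list):
        raise RuntimeError('search failed: %r' % (result,))
    latest = {}
    # Replay operations in time order so the last one per key wins.
    for hit in sorted(result, key=lambda h: h['time']):
        latest[hit['key']] = hit['operation']
    return sorted(k for k, op in latest.items() if op == 'Create')


hits = [
    {'key': 'a.txt', 'operation': 'Create', 'time': 1},
    {'key': 'b.txt', 'operation': 'Create', 'time': 2},
    {'key': 'a.txt', 'operation': 'Delete', 'time': 3},
]
print(live_keys(hits))
```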

Bucket.select(key, query, raw=False)

Selects data from an S3 object.

Parameters:

  • key (str) – key to query in bucket
  • query (str) – query to execute (SQL by default)
  • query_type (str) – other query type accepted by S3 service
  • raw (bool) – return the raw (but parsed) response

Returns: pandas.DataFrame with results of query

Bucket.set_meta(key, meta)

Sets user metadata on key in bucket.

Parameters:

  • key (str) – key in bucket to set meta for
  • meta (dict) – value to set user metadata to

Returns: None

Raises: if put to bucket fails

class t4.Package

In-memory representation of a package

Package.__contains__(logical_key)

Checks whether the package contains a specified logical_key.

Returns: True or False

Package.__getitem__(logical_key)

Filters the package based on prefix, and returns either a new Package or a PackageEntry.

Parameters: prefix (str) – prefix to filter on

Returns: PackageEntry if prefix matches a logical_key exactly, otherwise Package
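
The exact-match versus prefix behaviour can be sketched over a flat dict of logical keys. This is an illustration under assumptions, not the library's code: `lookup` is hypothetical, a plain dict stands in for a Package, and raising KeyError when nothing matches is a guess.

```python
def lookup(entries, prefix):
    """entries: flat dict of logical_key -> entry (hypothetical representation)."""
    if prefix in entries:
        # Exact match on a logical key: return the single entry.
        return entries[prefix]
    stem = prefix.rstrip('/') + '/'
    # Otherwise keep everything under the prefix, keyed relative to it.
    sub = {k[len(stem):]: v for k, v in entries.items() if k.startswith(stem)}
    if not sub:
        raise KeyError(prefix)  # assumption: the docs do not say what happens here
    return sub


entries = {
    'data/a.csv': 'entry-a',
    'data/b.csv': 'entry-b',
    'README.md': 'entry-readme',
}
print(lookup(entries, 'README.md'))
print(lookup(entries, 'data'))
```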

Package.__repr__(max_lines=20)

String representation of the Package.

classmethod Package.browse(name=None, registry=None, pkg_hash=None)

Load a package into memory from a registry without making a local copy of the manifest.

Parameters:

  • name (string) – name of package to load
  • registry (string) – location of registry to load package from
  • pkg_hash (string) – top hash of package version to load

Package.build(name=None, registry=None, message=None)

Serializes this package to a registry.

Parameters:

  • name – optional name for package
  • registry – registry to build to; defaults to the local registry
  • message – the commit message of the package

Returns: the top hash as a string

Package.delete(logical_key)

Returns the package with logical_key removed.

Returns: self

Raises: KeyError – when logical_key is not present to be deleted

Package.diff(other_pkg)

Returns three lists – added, modified, deleted.

  • Added: present in other_pkg but not in self.
  • Modified: present in both, but different.
  • Deleted: present in self, but not in other_pkg.

Parameters: other_pkg – Package to diff

Returns: added, modified, deleted (all lists of logical keys)
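
The three categories follow directly from set membership. As a minimal sketch, assume each package is reduced to a dict of logical key to content hash (a hypothetical representation; how T4 decides that two entries are "different" is not described here):

```python
def diff(self_entries, other_entries):
    """Return (added, modified, deleted) lists of logical keys."""
    added = sorted(k for k in other_entries if k not in self_entries)
    deleted = sorted(k for k in self_entries if k not in other_entries)
    modified = sorted(
        k for k in self_entries
        if k in other_entries and self_entries[k] != other_entries[k]
    )
    return added, modified, deleted


old = {'a': 'h1', 'b': 'h2', 'c': 'h3'}
new = {'a': 'h1', 'b': 'h2-changed', 'd': 'h4'}
print(diff(old, new))  # (['d'], ['b'], ['c'])
```

Note the direction: the result describes what it takes to get from self to other_pkg, so swapping the arguments swaps added and deleted.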

Package.dump(writable_file)

Serializes this package to a writable file-like object.

Parameters:

writable_file – file-like object to write serialized package.

Returns: None

Raises:

  • fail to create file
  • fail to finish write

Package.fetch(dest)

Copy all descendants to dest. Descendants are written under their logical names relative to self. So if p[a] has two children, p[a][b] and p[a][c], then p[a].fetch("mydir") will produce the following:

    mydir/
        b
        c

Parameters: dest – where to put the files (locally)

Returns: None

Package.get(logical_key)

Gets object from logical_key and returns its physical path. Equivalent to self[logical_key].get().

Parameters: logical_key (string) – logical key of the object to get

Returns: Physical path as a string.

Raises:

  • KeyError – when logical_key is not present in the package
  • ValueError – if the logical_key points to a Package rather than PackageEntry.

Package.get_meta()

Returns user metadata for this Package.

classmethod Package.install(name, registry, pkg_hash=None, dest=None, dest_registry=None)

Installs a named package to the local registry and downloads its files.

Parameters:

  • name (str) – Name of package to install.
  • registry (str) – Registry where package is located.
  • pkg_hash (str) – Hash of package to install. Defaults to latest.
  • dest (str) – Local path to download files to.
  • dest_registry (str) – Registry to install package to. Defaults to local registry.

Returns: A new Package that points to files on your local machine.

Package.keys()

Returns logical keys in the package.

classmethod Package.load(readable_file)

Loads a package from a readable file-like object.

Parameters:

readable_file – readable file-like object to deserialize package from

Returns: a new Package object

Raises:

  • file not found
  • json decode error
  • invalid package exception

Package.manifest

Returns a generator of the dicts that make up the serialized package.

Package.push(name, dest, registry=None, message=None)

Copies objects to dest, then creates a new package that points to those objects. Copies each object in this package to dest according to logical key structure, then adds to the registry a serialized version of this package with physical_keys that point to the new copies.

Parameters:

  • name – name for package in registry
  • dest – where to copy the objects in the package
  • registry – registry where to create the new package
  • message – the commit message for the new package

Returns: A new package that points to the copied objects

Package.set(logical_key, entry, meta=None)

Returns self with the object at logical_key set to entry.

Parameters:

  • logical_key (string) – logical key to update
  • entry (PackageEntry OR string) – new entry to place at logical_key in the package. If entry is a string, it is treated as a URL, and an entry is created based on it.
  • meta (dict) – user level metadata dict to attach to entry

Returns: self

Package.set_dir(lkey, path)

Adds all files from path to the package.

Recursively enumerates every file in path, and adds them to the package according to their relative location to path.

Parameters:

  • lkey (string) – prefix to add to every logical key, use ‘/’ for the root of the package.
  • path (string) – path to scan for files to add to package.

Returns: self

Raises: when path doesn’t exist

Package.set_meta(meta)

Sets user metadata on this Package.

Package.top_hash()

Returns the top hash of the package.

Note that physical keys are not hashed because the package has the same semantics regardless of where the bytes come from.

Returns: A string that represents the top hash of the package
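
The note about physical keys can be made concrete with a small sketch. This is not T4's hashing scheme: the choice of SHA-256, the JSON encoding, and the set of hashed fields are all assumptions made for illustration. The point is only that two packages whose entries differ in physical_key alone get the same top hash.

```python
import hashlib
import json


def top_hash(entries, meta=None):
    """entries: logical_key -> dict with 'physical_key', 'hash', 'size', 'meta'."""
    digest = hashlib.sha256()
    digest.update(json.dumps(meta or {}, sort_keys=True).encode('utf-8'))
    for logical_key in sorted(entries):
        entry = entries[logical_key]
        hashable = {
            'logical_key': logical_key,
            'hash': entry['hash'],
            'size': entry['size'],
            'meta': entry.get('meta', {}),
        }  # 'physical_key' is deliberately left out
        digest.update(json.dumps(hashable, sort_keys=True).encode('utf-8'))
    return digest.hexdigest()


local = {'a.csv': {'physical_key': 'file:///tmp/a.csv', 'hash': 'abc', 'size': 10}}
remote = {'a.csv': {'physical_key': 's3://bucket/a.csv', 'hash': 'abc', 'size': 10}}
changed = {'a.csv': {'physical_key': 'file:///tmp/a.csv', 'hash': 'def', 'size': 10}}
print(top_hash(local) == top_hash(remote))   # True
print(top_hash(local) == top_hash(changed))  # False
```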

Package.update(new_keys_dict, meta=None, prefix=None)

Updates the package with the keys and values in new_keys_dict.

If a metadata dict is provided, it is attached to and overwrites metadata for all entries in new_keys_dict.

Parameters:

  • new_keys_dict (dict) – dict of logical keys to update.
  • meta (dict) – metadata dict to attach to every input entry.
  • prefix (string) – a prefix string to prepend to every logical key.

Returns: self
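
How meta and prefix interact can be sketched on a plain dict. Everything here is an assumption made for illustration: the `update` function below is a stand-in rather than the method itself, the record layout is hypothetical, and the prefix is concatenated as-is (the docs do not say whether a separator is inserted).

```python
def update(entries, new_keys_dict, meta=None, prefix=None):
    """entries: logical_key -> {'entry': ..., 'meta': {...}} (hypothetical layout)."""
    prefix = prefix or ''
    for logical_key, entry in new_keys_dict.items():
        record = {'entry': entry, 'meta': {}}
        if meta is not None:
            # The same metadata dict is attached to every incoming entry.
            record['meta'] = dict(meta)
        entries[prefix + logical_key] = record
    return entries  # the documented method likewise returns self


pkg = {}
update(pkg, {'a.csv': 's3://bucket/a.csv'}, meta={'owner': 'ops'}, prefix='raw/')
print(pkg)
```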

classmethod Package.validate_package_name(name)

Verify that a package name is two alphanumeric strings separated by a slash.
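
Read literally, that rule corresponds to a simple regular expression. The sketch below is an assumption about the rule, not the library's check: `is_valid_package_name` is a hypothetical helper that returns a bool, whereas the real classmethod may raise instead and may accept additional characters such as underscores or hyphens.

```python
import re

# Two runs of ASCII letters/digits separated by exactly one slash.
_NAME_RE = re.compile(r'[A-Za-z0-9]+/[A-Za-z0-9]+')


def is_valid_package_name(name):
    return isinstance(name, str) and _NAME_RE.fullmatch(name) is not None


for candidate in ['alice/sales', 'sales', 'a/b/c', 'alice/']:
    print(candidate, is_valid_package_name(candidate))
```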

Package.walk()

Generator that traverses all entries in the package tree and returns tuples of (key, entry), with keys in alphabetical order.
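
A traversal of this kind can be sketched as a recursive generator over a nested dict. The representation is hypothetical (dict values stand for sub-packages, anything else for an entry), and sorting names at each level is an assumption about what "alphabetical order" means here.

```python
def walk(tree, prefix=''):
    """Yield (key, entry) for every leaf of a nested dict, names sorted per level."""
    for name in sorted(tree):
        child = tree[name]
        key = prefix + name
        if isinstance(child, dict):
            # Descend into the sub-package, extending the logical key.
            yield from walk(child, key + '/')
        else:
            yield key, child


tree = {'b.txt': 'entry-b', 'a': {'z.txt': 'entry-z', 'c.txt': 'entry-c'}}
for key, entry in walk(tree):
    print(key, entry)
```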

t4.config(*autoconfig_url, **config_values)

Set or read the T4 configuration

To retrieve the current config, call directly, without arguments:

    >>> import t4 as he
    >>> he.config()

To trigger autoconfiguration, call with just the navigator URL:

    >>> he.config('https://example.com')

To set config values, call with one or more key=value pairs:

    >>> he.config(navigator_url='http://example.com',
    ...           elastic_search_url='http://example.com/queries')

When setting config values, unrecognized values are rejected. Acceptable config values can be found in t4.util.CONFIG_TEMPLATE

Parameters:

  • autoconfig_url – URL indicating a location to configure from
  • **config_values – key=value pairs to set in the config

Returns: HeliumConfig object (an ordered Mapping)
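
The three call styles map naturally onto a `*args, **kwargs` signature. The following is a sketch of that dispatch, not the library's code: the template keys are taken from the example above (the real list lives in t4.util.CONFIG_TEMPLATE), autoconfiguration is reduced to recording the URL, and treating a URL plus key=value pairs as an error is an assumption.

```python
from collections import OrderedDict

# Hypothetical template; the real one is t4.util.CONFIG_TEMPLATE.
CONFIG_TEMPLATE = OrderedDict([('navigator_url', None), ('elastic_search_url', None)])
_current = OrderedDict(CONFIG_TEMPLATE)


def config(*autoconfig_url, **config_values):
    if autoconfig_url and config_values:
        raise ValueError('pass either a URL or key=value pairs, not both')
    if len(autoconfig_url) > 1:
        raise ValueError('expected at most one autoconfig URL')
    # Validate before changing anything, so a bad key leaves the config intact.
    unknown = [k for k in config_values if k not in CONFIG_TEMPLATE]
    if unknown:
        raise KeyError('unrecognized config keys: %s' % ', '.join(unknown))
    if autoconfig_url:
        # The real function would load settings from this location;
        # the sketch only records the URL.
        _current['navigator_url'] = autoconfig_url[0]
    _current.update(config_values)
    return OrderedDict(_current)  # a copy, so callers cannot mutate the state
```

With no arguments the function falls through every branch and simply returns the current values, which matches the "call directly, without arguments" form.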

t4.copy(src, dest)

Copies src object from T4 to dest

Either of src and dest may be S3 paths (starting with s3://) or local file paths (starting with file:///).

Parameters:

  • src (str) – a path to retrieve
  • dest (str) – a path to write to

t4.delete(target)

Delete an object.

Parameters: target (str) – URI of the object to delete

t4.delete_package(name, registry=None)

Delete a package. Deletes only the manifest entries and not the underlying files.

Parameters:

  • name (str) – Name of the package
  • registry (str) – The registry the package will be removed from

t4.get(src)

Retrieves src object from T4 and loads it into memory.

An optional version may be specified.

Parameters: src (str) – A URI specifying the object to retrieve

Returns: (data, metadata). Does not work on all objects.

Return type: tuple

t4.list_packages(registry=None)

Lists Packages in the registry.

Returns a list of all named packages in a registry. If the registry is None, default to the local registry.

Parameters:

registry (string) – location of registry to load package from.

Returns: A list of strings containing the names of the packages

t4.ls(target, recursive=False)

List data from the specified path.

Parameters:

  • target (str) – URI to list
  • recursive (bool) – show subdirectories and their contents as well

Returns: The return value structure has not yet been permanently decided. Currently, it is a tuple of list objects containing the following:

  • result[0]: directory info
  • result[1]: file/object info
  • result[2]: delete markers

Return type: list

t4.put(obj, dest, meta=None)

Write an in-memory object to the specified T4 dest

You may pass a dict to meta to store it with obj at dest.

See User Docs for more info on object Serialization and Metadata.

Parameters:

  • obj – a serializable object
  • dest (str) – A URI
  • meta (dict) – Optional. metadata dict to store with obj at dest

t4.search(query)

Searches your bucket. query can contain plaintext, and can also contain clauses like $key:”$value” that search for exact matches on specific keys.

Returns: either the request object (in case of an error) or a list of objects with the following keys:

  • key: key of the object
  • version_id: version_id of object version
  • operation: Create or Delete
  • meta: metadata attached to object
  • size: size of object in bytes
  • text: indexed text of object
  • source: source document for object (what is actually stored in Elasticsearch)
  • time: timestamp for operation
