@polettix
Created February 19, 2023 19:59

NAME

Data::Resolver::Asset - representation of a resolved asset

SYNOPSIS

use Data::Resolver::Asset;

# can create assets from files, filehandles, raw data. Add a key
# in the latter two cases if we want to eventually get a file.
my $asset_from_file = Data::Resolver::Asset->new(file => $path);
my $asset_from_fh   = Data::Resolver::Asset->new(
   fh => $fh, key => 'foo-bar.pdf');
my $asset_from_raw  = Data::Resolver::Asset->new(
   raw => $raw_octets, key => 'galook.png');

# get a filehandle
my $filehandle = $asset->filehandle; # also ->fh

# if filehandles from in-memory scalars are not viable...
my $file_filehandle = $asset->filehandle(not_from_memory => 1);

# binmode can be passed too
my $utf8_fh = $asset->filehandle(binmode => ':encoding(utf8)');

# get a file
my $path = $asset->file;
my $alt_path = $asset->save_as('/path/to/somefile.txt');

# get a reference to raw data
my $rref = $asset->raw_ref;

# get the raw data
my $raw_octets = $asset->raw_data;

# get slightly less raw data
my $characters_not_octets = $asset->decoded_as_utf8;
my $chars_not_octets = $asset->decoded_as($some_encoding);

# parse JSON on the fly
my $data_structure = $asset->parsed_as_json;

# initializing with a filehandle and then getting a filehandle back makes
# the asset unuseable for further actions
my $asset = Data::Resolver::Asset->new(
   fh => $fh, key => 'foo-bar.pdf');
my $fh_back = $asset->fh;
die 'whatever' if ! $asset->is_useable; # dies
$asset->assert_useable;                 # dies too

# call ->file or ->raw_ref as the first method to cache data if you need
# the object to stay useable

DESCRIPTION

This class, based on Moo, provides a representation for a resolved asset. As such, it's not generally meant to be used directly, although it can be leveraged to provide transformation across different representations (in-memory data, file in the filesystem, filehandles) in case of need.

Input formats

An asset supports three main input interfaces: file, filehandle, and raw_data (provided either as a reference to the raw data, or as a plain string of octets).

Output formats

The class provides a wide range of ways to access the data in the asset.

At a basic level there are the following:

  • A "file", as a path in the filesystem.
  • A "filehandle", which can be opened either from in-memory data or from the filesystem (with some control over it, see "filehandle").
  • "raw_data" as a plain scalar. For large assets, prefer the next option, which avoids copying the data around:
  • "raw_ref" as a reference to a buffer with the raw data.
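
The relationship between raw data, a reference to it, and an in-memory filehandle can be illustrated with core Perl alone. The following is a minimal sketch that does not use this module; the payload and variable names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical payload standing in for an asset's raw octets.
my $raw_octets = "line one\nline two\n";

# A reference points at the same buffer instead of copying it.
my $raw_ref = \$raw_octets;

# Core Perl can open a read-only filehandle directly on an in-memory scalar.
open my $fh, '<', $raw_ref or die "cannot open in-memory handle: $!";
my @lines = <$fh>;
close $fh;

printf "%d lines, %d octets\n", scalar(@lines), length ${$raw_ref};
```

Passing the reference around keeps a single copy of the data, which is the motivation for preferring "raw_ref" with big assets.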

The class also supports some additional convenience accessors for interpreting the raw data in a specific way:

  • "decoded_as" returns a string of characters, decoded according to a specific encoding, instead of a stream of octets;
  • "decoded_as_utf8" returns a string of characters obtained by decoding the raw data as UTF-8;
  • "parsed_as_json" decodes the data as JSON and returns the resulting data structure (JSON::PP is used for parsing).
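
The difference between octets, characters, and parsed data can be sketched with the core Encode and JSON::PP modules. This is an illustration of the concepts only, assuming a hypothetical payload; it is not necessarily how the accessors are implemented.

```perl
use strict;
use warnings;
use Encode qw(decode);
use JSON::PP qw(decode_json);

# Hypothetical raw octets: a JSON document whose value ends with
# LATIN SMALL LETTER E WITH GRAVE, encoded as two UTF-8 octets.
my $raw_octets = qq({"name":"caff\xc3\xa8"});

# Conceptually what "decoded_as_utf8" gives back: characters, not octets.
my $characters = decode('UTF-8', $raw_octets);

# Conceptually what "parsed_as_json" gives back: decode_json takes UTF-8 octets.
my $data = decode_json($raw_octets);

printf "%d octets, %d characters, name has %d characters\n",
   length($raw_octets), length($characters), length($data->{name});
```

The octet count exceeds the character count by one because the accented letter takes two octets in UTF-8.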

Useability

When an object is created starting from a filehandle, getting the filehandle as the first representation makes the object unuseable. This is because the requesting code might consume some of the filehandle, and this class is not meant to do any kind of synchronization in this regard.

If it is important that the object continues to be useable, it's necessary to first cache the data from the filehandle, either in-memory (by calling "raw_ref" or "raw_data") or in the filesystem (with "file"). All following calls to "filehandle" will leverage the cache from this point on. It's possible to query the useability status with "is_useable" or the exception-throwing counterpart "assert_useable".
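
The reason behind this rule can be shown with plain filehandles. The sketch below uses only core Perl and hypothetical data: it first shows a partially consumed handle, then the caching approach, where independent handles are opened on a buffer.

```perl
use strict;
use warnings;

my $source = "first\nsecond\nthird\n";

# Without caching: a consumer reads part of the one and only handle...
open my $shared_fh, '<', \$source or die "cannot open: $!";
my $consumed = <$shared_fh>;
# ...and whoever reads it next silently misses that part.
my @remaining = <$shared_fh>;

# With caching: slurp once into a buffer, then open independent handles on it.
open my $input_fh, '<', \$source or die "cannot open: $!";
my $cache = do { local $/; <$input_fh> };
open my $fh_a, '<', \$cache or die "cannot open: $!";
open my $fh_b, '<', \$cache or die "cannot open: $!";
my ($line_a, $line_b) = (scalar <$fh_a>, scalar <$fh_b>);

print "remaining after partial read: ", scalar(@remaining), " lines\n";
print "independent handles agree: ", ($line_a eq $line_b ? "yes" : "no"), "\n";
```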

Setting a key

When it's necessary to get a "file" back, it might be important that a "key" is set in "new". This is inferred directly from the filename in case the object is initialized with a file; otherwise, the saved file will have a random name without a specific extension, which might upset some libraries when they expect to infer the file type from the file name.

In these cases, it's possible to pass a value for key upon initialization with "new".
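
The effect of a key on the saved file name can be approximated with the core File::Temp module. This is a sketch under assumptions (hypothetical payload and key, no use of this module), contrasting a keyed file name with a random temporary one.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir tempfile);
use File::Spec;
use File::Basename qw(basename);

my $raw_octets = "\x89PNG\r\n\x1a\n";   # hypothetical payload
my $key        = 'galook.png';          # hypothetical key

# With a key: save under that exact name inside a temporary directory.
my $dir        = tempdir(CLEANUP => 1);
my $keyed_path = File::Spec->catfile($dir, $key);
open my $out, '>:raw', $keyed_path or die "cannot write $keyed_path: $!";
print {$out} $raw_octets;
close $out or die "cannot close $keyed_path: $!";

# Without a key: a random temporary name, with no meaningful extension.
my ($tmp_fh, $random_path) = tempfile(DIR => $dir);
close $tmp_fh;

print "keyed:  ", basename($keyed_path), "\n";
print "random: ", basename($random_path), "\n";
```

A library that guesses the file type from the extension can work with the first path but has nothing to go on with the second.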

Exceptions

Errors generally lead to an exception. By default it is thrown with Carp's croak function; it's possible to optionally use Ouch by passing a true value to the ouch initialization option, or by setting the package variable $Data::Resolver::RoleComplain::use_ouch.

If the value passed/set is an array reference, it is used to call Ouch's import method, passing the array contents as the initialization list. Otherwise, any true value just loads the module. In both cases, Ouch's ouch function will be used to throw exceptions.
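
With the default behaviour, callers can trap errors with a plain eval block. The sketch below covers only the Carp case, using a hypothetical function in place of a failing method; handling exceptions thrown through Ouch is not shown.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Hypothetical stand-in for a method that fails the default way (via Carp).
sub fetch_or_complain {
   my ($name) = @_;
   croak "not found: $name";
}

# Exceptions thrown with croak are plain strings, caught with eval.
my $ok    = eval { fetch_or_complain('missing.txt'); 1 };
my $error = $@;
print "caught: $error" unless $ok;
```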

INTERFACE

The module provides an object-oriented interface, supporting the following methods.

assert_useable

$asset->assert_useable; # might throw an exception

Throw an exception if the asset object is not useable (see "Useability" and "Exceptions").

complain

$asset->complain($code, $message, $data);

Throw an exception, possibly through Ouch. See "Exceptions".

decoded_as

my $characters_not_octets = $asset->decoded_as($specific_encoding);

Decode the raw data according to the $specific_encoding and get back a string of characters (not plain octets).

decoded_as_utf8

my $characters_not_octets = $asset->decoded_as_utf8;

Decode the raw data according to the UTF-8 encoding and get back a string of characters (not plain octets).

fh

Alias for "filehandle".

file

my $path = $asset->file;

Get the asset as a file path in the filesystem.

If the asset is initialized with something other than a file, the file name will be determined according to "key":

  • if defined, it is used as the file name, saved in a temporary directory;
  • otherwise, a temporary file name is used.

See also "Setting a key".

filehandle

my $fh = $asset->filehandle(%options);  # OR
my $fh = $asset->filehandle(\%options);

Get a filehandle suitable for reading data.

Supported options:

  • binmode, a string passed to binmode() on the filehandle;
  • not_from_memory, a boolean option that prevents generating a filehandle by opening the "raw_ref", if available. This might be important if your consumer needs a filehandle from the filesystem and would complain otherwise.

Getting a filehandle from an instance that has been initialized with a filehandle might lead to "Useability" problems. If you plan to reuse the asset object instance multiple times, it's better to first call either "file" or "raw_ref" (depending on your preference about where to position the cache).
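
Both options map onto standard Perl behaviour, which the following core-only sketch illustrates with a hypothetical payload. An I/O layer set with binmode turns octets into characters on read, and an in-memory handle has no operating system file descriptor, which is one reason a consumer might insist on a handle from the filesystem.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my $raw_octets = "caff\xc3\xa8\n";   # hypothetical UTF-8 encoded payload

# In-memory handle with an encoding layer, akin to passing a binmode option.
open my $mem_fh, '<', \$raw_octets or die "cannot open: $!";
binmode $mem_fh, ':encoding(UTF-8)';
my $line = <$mem_fh>;
chomp $line;

# In-memory handles have no OS-level descriptor: fileno reports -1.
my $mem_fileno = fileno $mem_fh;

# A handle backed by a real file does have one.
my ($file_fh, $path) = tempfile(UNLINK => 1);
my $file_fileno = fileno $file_fh;

printf "%d characters; in-memory fileno %d, file fileno %d\n",
   length($line), $mem_fileno, $file_fileno;
```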

key

my $key = $asset->key;

Get the key associated with the asset. In case the data need to be saved on the disk, this is used as the last part of the full path (i.e. the file name part).

See "Setting a key".

is_useable

my $bool = $asset->is_useable;

Test whether the object is useable or not; see "Useability".

new

my $asset = Data::Resolver::Asset->new(%args);   # OR
my $asset = Data::Resolver::Asset->new(\%args);

Construct a new object instance. The constructor can receive either key-value pairs, or a reference to a hash.

Supported keys:

  • file, a path to a file in the filesystem. Initializing with a file also allows automatic inference of the key ("key"), using the basename of the file.

  • filehandle, a handle that supports reading. When initializing an object with a filehandle, make sure to read the notes in "Useability".

  • key, a string that provides a filename in case a file is needed. This is inferred automatically when the object is initialized with a file, but SHOULD be passed in the constructor if a "file" will be needed later and it is important that the file has a specific filename (that is the same as the key). This might be e.g. important if a library uses the filename to infer other metadata about the file, e.g. a specific graphic format.

    See also "Setting a key".

  • raw, a buffer of raw data. It is assumed to contain octets/bytes, not decoded characters; no effort is made to check this, though, so don't pass non-raw data.

    The buffer can be passed either as a plain scalar, or as a reference to a scalar; the latter approach helps avoid too much copying around.

  • ouch, an option to turn on using Ouch instead of complaining with plain Carp.

    Any true value will activate Ouch; passing a reference to an array allows calling Ouch's import method with the provided elements, for finer tuning of the import (e.g. to pass option :trytiny_var).

    See also "Exceptions" for alternative ways of setting this behaviour.

There is no effort to check that only one representation is passed, nor that different representations are consistent with one another. Just don't do it.
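
The two ways of passing the raw option differ in how much copying happens. The sketch below uses a hypothetical helper, as_raw_ref, to show the difference in plain Perl; it is not the constructor's actual code.

```perl
use strict;
use warnings;

# Hypothetical helper: accept octets as a scalar or as a scalar reference.
sub as_raw_ref {
   my ($raw) = @_;
   return ref($raw) eq 'SCALAR' ? $raw : \$raw;
}

my $buffer = 'x' x 1_000_000;

my $from_ref    = as_raw_ref(\$buffer);   # points at $buffer itself
my $from_scalar = as_raw_ref($buffer);    # points at a copy made for the call

print "same buffer when passing a reference: ",
   ($from_ref == \$buffer ? "yes" : "no"), "\n";
print "same buffer when passing a scalar: ",
   ($from_scalar == \$buffer ? "yes" : "no"), "\n";
```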

not_found

$asset->not_found($something);

Wrapper around "complain" to raise an error 404 regarding the unavailability of $something.

parsed_as_json

my $perl_data_structure = $asset->parsed_as_json;

Treat the raw data as a JSON string and get back the data it represents.

raw_data

my $octets = $asset->raw_data;

Get the raw data representation (octets, not characters).

raw_ref

my $ref_to_octets = $asset->raw_ref;

Get a reference to the raw data representation (see "raw_data"). Getting a reference helps avoid copying data around, which might be important with big assets.

save_as

my $save_path = $asset->save_as($some_path);

Save a copy of the data at the path indicated by $some_path. The method returns the path itself, for easy chaining.

use_ouch

my $bool = $asset->use_ouch;

Read-only accessor indicating whether Ouch is used for "Exceptions".

AUTHOR

Flavio Poletti flavio@polettix.it

COPYRIGHT AND LICENSE

Copyright 2023 by Flavio Poletti flavio@polettix.it

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
