@netsensei
Last active September 5, 2017 12:28
Installing the Datahub::Factory::Arthub modules

Introduction

The Datahub::Factory is a Catmandu-based toolkit for setting up and managing ETL pipelines easily and efficiently. A pipeline transforms and transports data between two systems. The toolkit was conceived primarily for use cases in the GLAM (Galleries, Libraries, Archives & Museums) domain.

Out of the box, the Datahub::Factory is a generic, extensible toolkit. While you can use the importer and exporter modules that are included with the core app, you can extend the functionality with your own custom modules.

The Arthub Flanders platform is a digital platform governed by the Flemish Art Collection non-profit. The Datahub::Factory is used to ingest metadata from a wide array of content providers into the platform.

To accommodate the particularities of the platform, a separate set of modules called Datahub::Factory::Arthub was created. These modules contain pre-processing logic specific to the Arthub ecosystem.

Prerequisites

You will need the Datahub::Factory and all its dependencies installed. We'll assume you have a functional Perl environment and that all system dependencies are met.

You will also need OpenSSL. On a Mac with Homebrew:

$ brew install openssl

On Ubuntu:

$ apt-get install openssl
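Before installing, it can help to confirm that the tools this guide relies on are actually on your PATH. The following is a minimal sketch, not part of the Datahub::Factory tooling: `check_tools` is a hypothetical helper name, and the list of tools (perl, cpanm, openssl) is an assumption drawn from the commands used in this guide.

```shell
# Hypothetical helper: report which of the given commands are missing from PATH.
# Prints one "missing: NAME" line per absent tool and returns non-zero if any are missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Tools assumed by this guide; adjust the list to your setup.
if check_tools perl cpanm openssl; then
    echo "all prerequisites found"
fi
```

The check only confirms that the commands exist. It does not verify versions or that the OpenSSL development headers are present.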

Installation

On a Mac, you'll need to run this command if you installed OpenSSL through Homebrew. Change the paths so they point to the location of your own OpenSSL installation.

$ OPENSSL_INCLUDE=/usr/local/Cellar/openssl/1.0.2l/include OPENSSL_LIB=/usr/local/Cellar/openssl/1.0.2l/lib/ cpanm --notest Datahub::Factory::Arthub
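Because the Cellar path contains a version number, it changes whenever Homebrew upgrades OpenSSL. One way to avoid hard-coding it is to ask Homebrew for the prefix with `brew --prefix openssl`. The sketch below only prints the resulting command line so you can inspect it before running it. `openssl_install_cmd` is a hypothetical helper name, and the assumption that the headers and libraries live in `include` and `lib` under the prefix follows the layout of the command above.

```shell
# Hypothetical helper: build the cpanm command line for a given OpenSSL prefix.
openssl_install_cmd() {
    prefix=${1%/}   # drop a trailing slash, if any
    printf 'OPENSSL_INCLUDE=%s/include OPENSSL_LIB=%s/lib cpanm --notest Datahub::Factory::Arthub\n' \
        "$prefix" "$prefix"
}

# Ask Homebrew where OpenSSL lives and print the command to run.
if command -v brew >/dev/null 2>&1; then
    openssl_install_cmd "$(brew --prefix openssl)"
else
    echo "Homebrew not found; set the OpenSSL paths by hand" >&2
fi
```

Review the printed line, then paste it into your shell to run the installation.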

On other platforms:

$ cpanm --notest Datahub::Factory::Arthub

This should install these modules:

  • Datahub::Factory::Arthub
  • Datahub::Factory
  • Catmandu
  • ... All the sub-dependencies.
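To confirm that the modules listed above ended up somewhere Perl can find them, you can try loading each one with `perl -M`. This is a minimal sketch under the assumption that the modules were installed into the Perl on your PATH; `module_ok` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the named Perl module can be loaded.
module_ok() {
    perl -M"$1" -e 1 >/dev/null 2>&1
}

# Modules this guide expects after a successful install.
for module in Datahub::Factory::Arthub Datahub::Factory Catmandu; do
    if module_ok "$module"; then
        echo "ok: $module"
    else
        echo "not loadable: $module"
    fi
done
```

A "not loadable" line usually means the module went into a different Perl installation or a local library path that is not in `PERL5LIB`.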

Installation of the Arthub modules from Git

To run the bleeding-edge version from GitHub, use Carton. If Carton has not been installed yet:

$ cpanm --notest Carton

Then:

$ git clone https://github.com/thedatahub/Datahub-Factory-Arthub
$ cd Datahub-Factory-Arthub
$ carton install
$ carton exec dhconveyor transport -p <my-pipeline.ini>

You can also install the Arthub modules manually in a Perl environment.

$ git clone https://github.com/thedatahub/Datahub-Factory-Arthub
$ cd Datahub-Factory-Arthub
$ cpanm --notest --installdeps .
$ perl Build.PL
$ ./Build && ./Build install
$ dhconveyor transport -p <my-pipeline.ini>

Installation of the Datahub Factory from Git without the Arthub modules

$ git clone https://github.com/thedatahub/Datahub-Factory
$ cd Datahub-Factory
$ cpanm --notest --installdeps .
$ perl Build.PL
$ ./Build && ./Build install
$ dhconveyor transport -p <my-pipeline.ini>