@sevein

sevein/README.md Secret

Last active May 25, 2017 10:20
Binder + Vagrant

Set up Vagrant environment

$ mkdir binder && cd binder
$ wget http://git.io/vfWwe -O Vagrantfile
$ vagrant up

Install Archivematica

Log in

$ vagrant ssh archivematica

Add PPA repositories

$ sudo add-apt-repository ppa:archivematica/release && \
  sudo add-apt-repository ppa:archivematica/externals && \
  (sudo wget -O - http://packages.elasticsearch.org/GPG-KEY-elasticsearch | sudo apt-key add -) && \
  sudo add-apt-repository "deb http://packages.elasticsearch.org/elasticsearch/0.90/debian stable main"

Enable the APT multiverse repository

$ sudo sed -i "/^# deb.*multiverse/ s/^# //" /etc/apt/sources.list
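
To preview what that sed expression changes before editing the real /etc/apt/sources.list, it can be run without -i against a scratch copy. This is only a minimal sketch: the sample lines and the temporary file are made up for illustration and are not taken from a real trusty box.

```shell
# Scratch file standing in for /etc/apt/sources.list (hypothetical sample content)
sources=$(mktemp)
cat > "$sources" <<'EOF'
deb http://archive.ubuntu.com/ubuntu trusty main restricted
# deb http://archive.ubuntu.com/ubuntu trusty multiverse
# deb-src http://archive.ubuntu.com/ubuntu trusty multiverse
# deb http://archive.canonical.com/ubuntu trusty partner
EOF

# Same expression as above, minus -i: print the result instead of editing in place.
# It strips the leading "# " from commented lines starting with "deb" that mention multiverse.
preview=$(sed "/^# deb.*multiverse/ s/^# //" "$sources")
printf '%s\n' "$preview"

# Count how many multiverse lines would end up enabled
multiverse_enabled=$(printf '%s\n' "$preview" | grep -c '^deb.*multiverse')
```

Note that the pattern also matches commented deb-src lines, so both the binary and source multiverse entries are enabled, while unrelated commented entries stay untouched.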

Update APT cache and distro

$ sudo apt-get update && sudo apt-get dist-upgrade

Install necessary packages from APT

$ sudo apt-get install \
  archivematica-storage-service \
  elasticsearch \
  archivematica-mcp-server \
  archivematica-mcp-client \
  archivematica-dashboard

Configure the web servers and start the services

$ sudo wget -q https://raw.githubusercontent.com/artefactual/archivematica/stable/1.3.x/localDevSetup/apache/apache.default -O /etc/apache2/sites-available/default.conf && \
  sudo rm -f /etc/apache2/sites-enabled/000-default.conf && \
  sudo ln -s /etc/apache2/sites-available/default.conf /etc/apache2/sites-enabled/default.conf && \
  sudo rm -f /etc/nginx/sites-enabled/default && \
  sudo ln -s /etc/nginx/sites-available/storage /etc/nginx/sites-enabled/storage && \
  sudo ln -s /etc/uwsgi/apps-available/storage.ini /etc/uwsgi/apps-enabled/storage.ini && \
  sudo service uwsgi restart && \
  sudo service nginx restart && \
  sudo /etc/init.d/apache2 restart && \
  sudo freshclam && \
  sudo /etc/init.d/clamav-daemon start && \
  sudo /etc/init.d/elasticsearch restart && \
  sudo /etc/init.d/gearman-job-server restart && \
  sudo start archivematica-mcp-server && \
  sudo start archivematica-mcp-client && \
  sudo start fits

Install Binder

Log in

$ vagrant ssh binder

Add PPA repositories

$ sudo add-apt-repository ppa:webupd8team/java && \
  (sudo wget -O - http://packages.elasticsearch.org/GPG-KEY-elasticsearch | sudo apt-key add -) && \
  sudo add-apt-repository "deb http://packages.elasticsearch.org/elasticsearch/1.5/debian stable main" && \
  sudo add-apt-repository ppa:nginx/stable && \
  sudo add-apt-repository ppa:archivematica/externals && \
  sudo add-apt-repository ppa:chris-lea/node.js

Update APT cache and distro

$ sudo apt-get update && sudo apt-get dist-upgrade

Install necessary packages from APT

Be aware that some of the following packages are large: oracle-java8-installer alone downloads more than 150 MB.

$ sudo apt-get install \
  oracle-java8-installer elasticsearch mysql-server-5.5 nginx \
  memcached gearman-job-server \
  php5-cli php5-fpm php5-curl php5-mysql php5-xsl php5-json php5-ldap \
  php5-memcache php-apc php5-readline \
  imagemagick ghostscript poppler-utils ffmpeg \
  git nodejs build-essential

Several of these packages prompt for input during installation:

  • mysql-server asks you to set a password for the root account. Leave it empty in this development environment, because the later steps connect as root without a password. The prompt may repeat several times when the field is left empty.
  • oracle-java8-installer asks you to accept the terms of the Oracle Binary Code License Agreement.

Elasticsearch has to be enabled and started manually as follows:

$ sudo update-rc.d elasticsearch defaults 95 10 && \
  sudo /etc/init.d/elasticsearch start
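
Elasticsearch can take a few seconds to accept connections after the init script returns, so later steps that talk to it may fail if run immediately. One way to wait is a small retry helper. This is a sketch, not part of the original setup: the retry function is a hypothetical helper, and the usage line assumes Elasticsearch listens on its default HTTP port, 9200.

```shell
# retry ATTEMPTS DELAY COMMAND...
# Run COMMAND until it succeeds, at most ATTEMPTS times, sleeping DELAY seconds
# between tries. Returns 0 on success, 1 once the attempts are exhausted.
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while ! "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
  return 0
}

# Usage on the binder box (assumes the default port 9200):
#   retry 30 2 curl -fsS -o /dev/null http://localhost:9200/
```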

Install necessary packages from NPM

$ sudo npm install -g grunt-cli

Download the sources

$ git clone -b qa/0.8.x http://github.com/artefactual/binder.git $HOME/binder

Build front-end assets

$ cd $HOME/binder/plugins/arDrmcPlugin/frontend/ && \
  sudo chown -R vagrant:vagrant $HOME/.npm && \
  npm install && \
  grunt build

Create MySQL database

$ mysql -hlocalhost -uroot -e "CREATE DATABASE binder CHARACTER SET utf8 COLLATE utf8_unicode_ci;"

Configure PHP5-FPM pool

$ sudo bash -c "curl -Ls https://gist.githubusercontent.com/sevein/e0b1d036721435add3cd/raw/php5-fpm.binder.conf > /etc/php5/fpm/pool.d/binder.conf"
$ sudo restart php5-fpm

Configure Nginx

$ sudo bash -c "curl -Ls https://gist.githubusercontent.com/sevein/e0b1d036721435add3cd/raw/nginx.binder.conf > /etc/nginx/nginx.conf"
$ sudo /etc/init.d/nginx restart

Initialize

$ curl -Ls https://gist.githubusercontent.com/sevein/e0b1d036721435add3cd/raw/atom.config.php > $HOME/binder/config/config.php
$ cd $HOME/binder
$ touch config/propel.ini
$ cat apps/qubit/config/settings.yml.tmpl | sed "/^[[:space:]]\+no_script_name:/ s/false/true/" > apps/qubit/config/settings.yml
$ php symfony tools:purge
$ php symfony binder:bootstrap
$ sudo /etc/init.d/memcached restart
$ cd $HOME/binder && php symfony search:populate
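
The sed step in the sequence above copies the settings template while switching no_script_name from false to true. The following sketch shows its effect on a made-up fragment; the real settings.yml.tmpl contains many more keys, and the sample below is only an assumption about its general shape.

```shell
# Hypothetical fragment standing in for apps/qubit/config/settings.yml.tmpl
tmpl=$(mktemp)
cat > "$tmpl" <<'EOF'
prod:
  .settings:
    no_script_name:         false
    logging_enabled:        false
EOF

# Same expression as above: only the indented no_script_name line is rewritten
result=$(sed "/^[[:space:]]\+no_script_name:/ s/false/true/" "$tmpl")
printf '%s\n' "$result"
```

Because the substitution is restricted to the line matching the address, other keys whose value is false are left alone.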

Configuring DIP upload

This is based on: https://www.archivematica.org/en/docs/archivematica-1.3/admin-manual/installation/dashboard-config/#atom-server-configuration

Open the Archivematica Dashboard and go to Administration » AtoM DIP upload. In the arguments fields, use:

--url="http://192.168.123.123/index.php" \
--email="demo@example.com" \
--password="demo" \
--uuid="%SIPUUID%" \
--rsync-target="archivematica@192.168.123.123:/tmp" \
--version=2 \
--debug

Generate the SSH keys in the Archivematica box

$ vagrant ssh archivematica
$ sudo -H -u archivematica ssh-keygen

Copy the contents of /var/lib/archivematica/.ssh/id_rsa.pub somewhere handy. You will need it later.

Now log in to the AtoM box (the binder VM):

$ vagrant ssh binder

Then complete the installation by running the following commands:

$ sudo apt-get install rssh && \
  sudo useradd -d /home/archivematica -m -s /usr/bin/rssh archivematica && \
  sudo passwd -l archivematica && \
  sudo sed -i "/^#allowrsync/ s/^#//" /etc/rssh.conf

Install the SSH key:

$ sudo mkdir /home/archivematica/.ssh
$ sudo chmod 700 /home/archivematica/.ssh/
$ sudo vim /home/archivematica/.ssh/authorized_keys # Paste here the contents of id_rsa.pub
$ sudo chown -R archivematica:archivematica /home/archivematica/.ssh
$ sudo chmod 600 /home/archivematica/.ssh/authorized_keys 
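
If you would rather not paste the key in an editor, the same layout can be produced non-interactively. This sketch runs against a temporary directory so that it is safe to try anywhere. The key string is a fake placeholder, and on the binder box the commands would need sudo, the real /home/archivematica path, and the chown step shown above.

```shell
# Stand-ins: a temporary directory instead of /home/archivematica, and a fake key
# instead of the real contents of id_rsa.pub from the Archivematica box.
home_dir=$(mktemp -d)
pubkey='ssh-rsa AAAAB3NzaC1yc2EXAMPLEONLY archivematica@archivematica'

# sshd ignores keys when these permissions are too open, so set them explicitly
mkdir -p "$home_dir/.ssh"
chmod 700 "$home_dir/.ssh"
printf '%s\n' "$pubkey" >> "$home_dir/.ssh/authorized_keys"
chmod 600 "$home_dir/.ssh/authorized_keys"

# Verify the resulting modes (GNU stat, as shipped with Ubuntu)
dir_mode=$(stat -c '%a' "$home_dir/.ssh")
file_mode=$(stat -c '%a' "$home_dir/.ssh/authorized_keys")
echo "$dir_mode $file_mode"
```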

Back in the Archivematica box, let's test the configuration:

$ vagrant ssh archivematica
$ sudo -u archivematica ssh archivematica@192.168.123.123 rsync -h

... and accept the fingerprint of the host.

TODO

Important

  • Add documentation to configure Binder worker(s)

  • Create Upload to Binder option in Archivematica. Temporary solution: add artwork record from the command line (with provided script) and return slug that the user types later in Archivematica under the option "Upload to AtoM".

  • Fix code that breaks with Elasticsearch 1.x. I think that this is important. Elasticsearch 0.90.13 was released on March 25, 2014.

    • script.disable_dynamic: false is now required.

    • Total size facets in browsers don't work. Incompatible or illegal Groovy syntax?

    • $query->setFields(...) breaks with non-leaf fields like i18n, should use i18n.*.

    • if (in_array($this->level_of_description_id, $componentLevels)) { ... BLOCK ... } in arElasticSearchInformationObjectPdo breaks badly!
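
For the script.disable_dynamic item above, the change amounts to a single line in the Elasticsearch configuration. Here is a sketch of an idempotent way to add it, demonstrated on a temporary file. The real target is assumed to be /etc/elasticsearch/elasticsearch.yml (the usual location for the Debian package), and add_setting is a hypothetical helper.

```shell
# Stand-in for /etc/elasticsearch/elasticsearch.yml (assumed Debian package location)
es_conf=$(mktemp)
setting='script.disable_dynamic: false'

# Append the setting only if the file has no script.disable_dynamic line yet
add_setting() {
  grep -q '^script\.disable_dynamic:' "$1" || printf '%s\n' "$setting" >> "$1"
}

add_setting "$es_conf"
add_setting "$es_conf"   # the second call is a no-op

count=$(grep -c '^script\.disable_dynamic: false$' "$es_conf")
```

On the real box the append would need root privileges, and Elasticsearch has to be restarted for the setting to take effect.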

Nice to have

  • We should use Bower instead of NPM to manage front-end deps. That would avoid having to install build-essential, which d3 requires under npm :(

  • Rename arDrmcPlugin -> arBinderPlugin

  • config/ProjectConfiguration.class.php also mentions "DRMC"

Other questions

  • What does binder:bootstrap do? Are we managing database migrations properly? Can we pull migrations from AtoM and have them coexist with Binder-specific migrations?

atom.config.php

<?php
return array (
  'all' =>
  array (
    'propel' =>
    array (
      'class' => 'sfPropelDatabase',
      'param' =>
      array (
        'encoding' => 'utf8',
        'persistent' => true,
        'pooling' => true,
        'dsn' => 'mysql:dbname=binder;host=localhost;port=3306',
        'username' => 'root'
      ),
    ),
  ),
  'dev' =>
  array (
    'propel' =>
    array (
      'param' =>
      array (
        'classname' => 'DebugPDO',
        'debug' =>
        array (
          'realmemoryusage' => true,
          'details' =>
          array (
            'time' =>
            array (
              'enabled' => true,
            ),
            'slow' =>
            array (
              'enabled' => true,
              'threshold' => 0.10000000000000001,
            ),
            'mem' =>
            array (
              'enabled' => true,
            ),
            'mempeak' =>
            array (
              'enabled' => true,
            ),
            'memdelta' =>
            array (
              'enabled' => true,
            ),
          ),
        ),
      ),
    ),
  ),
  'test' =>
  array (
    'propel' =>
    array (
      'param' =>
      array (
        'classname' => 'DebugPDO',
      ),
    ),
  ),
);

nginx.binder.conf

user vagrant;
worker_processes 1;
pid /run/nginx.pid;

events {
  worker_connections 768;
  # multi_accept on;
}

http {

  ##
  # Basic Settings
  ##

  sendfile on;
  tcp_nopush on;
  tcp_nodelay on;
  keepalive_timeout 65;
  types_hash_max_size 2048;
  # server_tokens off;

  # server_names_hash_bucket_size 64;
  # server_name_in_redirect off;

  include /etc/nginx/mime.types;
  default_type application/octet-stream;

  ##
  # SSL Settings
  ##

  ssl_protocols TLSv1 TLSv1.1 TLSv1.2; # Dropping SSLv3, ref: POODLE
  ssl_prefer_server_ciphers on;

  ##
  # Logging Settings
  ##

  access_log /var/log/nginx/access.log;
  error_log /var/log/nginx/error.log;

  ##
  # Virtual Host Configs
  ##

  include /etc/nginx/conf.d/*.conf;

  upstream binder_backend {
    server unix:/var/run/php5-fpm.binder.sock;
  }

  server {

    listen 80;
    root /home/vagrant/binder;

    server_name _;

    set $alt_request_uri $request_uri;

    location /drmc/ {
      error_page 404 = @drmc;
      log_not_found off;
      set $alt_request_uri /drmc/index;
    }

    location @drmc {
      rewrite ^/drmc/(.*)$ /index.php last;
    }

    location / {
      try_files $uri /index.php?$args;
    }

    location /sf/ {
      alias /home/vagrant/atom/vendor/symfony/data/web/sf/;
    }

    location ~ /\. {
      deny all;
      return 404;
    }

    location ~* (\.yml|\.ini|\.tmpl)$ {
      deny all;
      return 404;
    }

    location ~* /(?:uploads|files)/.*\.php$ {
      deny all;
      return 404;
    }

    location ~ /uploads/r/(.*)/conf { }

    location ~ ^/uploads/r/(.*)$ {
      include /etc/nginx/fastcgi_params;
      set $index /index.php;
      fastcgi_param SCRIPT_FILENAME $document_root$index;
      fastcgi_param SCRIPT_NAME $index;
      fastcgi_pass binder_backend;
    }

    location ~ /private/ {
      internal;
      root /home/vagrant/atom/;
    }

    location ~ ^/(index|qubit_dev)\.php(/|$) {
      include /etc/nginx/fastcgi_params;
      fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
      fastcgi_param REQUEST_URI $alt_request_uri;
      fastcgi_split_path_info ^(.+\.php)(/.*)$;
      fastcgi_pass binder_backend;
    }

    location ~* \.php$ {
      deny all;
      return 404;
    }

  }

}

php5-fpm.binder.conf

[binder]
user = vagrant
group = vagrant
listen = /var/run/php5-fpm.binder.sock
listen.owner = vagrant
listen.group = vagrant
listen.mode = 0600
# The following directives should be tweaked based on your hardware resources
pm = dynamic
pm.max_children = 30
pm.start_servers = 10
pm.min_spare_servers = 10
pm.max_spare_servers = 10
pm.max_requests = 200
chdir = /
# Some defaults for your PHP production environment
# A full list here: http://www.php.net/manual/en/ini.list.php
php_admin_value[expose_php] = off
php_admin_value[allow_url_fopen] = on
php_admin_value[memory_limit] = 512M
php_admin_value[max_execution_time] = 120
php_admin_value[post_max_size] = 72M
php_admin_value[upload_max_filesize] = 64M
php_admin_value[max_file_uploads] = 10
php_admin_value[cgi.fix_pathinfo] = 0
php_admin_value[display_errors] = off
php_admin_value[display_startup_errors] = off
php_admin_value[html_errors] = off
php_admin_value[session.use_only_cookies] = 0
php_admin_value[apc.enabled] = 0
php_admin_value[opcache.enable] = 0
env[ATOM_DEBUG_IP] = "192.168.123.1,10.0.2.2,127.0.0.1"
env[ATOM_READ_ONLY] = "off"

Vagrantfile

# -*- mode: ruby -*-
# vi: set ft=ruby :

Vagrant.configure(2) do |config|

  config.vm.define "binder" do |m|
    m.vm.box = "ubuntu/trusty64"
    m.vm.hostname = "binder"
    m.vm.network "private_network", ip: "192.168.123.123"
    m.vm.provider "virtualbox" do |vb|
      vb.memory = "2048"
      vb.cpus = 2
    end
  end

  config.vm.define "archivematica" do |m|
    m.vm.box = "ubuntu/trusty64"
    m.vm.hostname = "archivematica"
    m.vm.network "private_network", ip: "192.168.123.124"
    m.vm.provider "virtualbox" do |vb|
      vb.memory = "2048"
      vb.cpus = 2
    end
  end

end