@robkooper
Last active October 23, 2019 20:30
Deploying Clowder

A quick guide to what is needed, and the steps to take, to deploy a new clowder instance.

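There are two options, Docker Compose and Shared Infrastructure, plus a Kubernetes setup that is still a work in progress: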

Docker Compose

This is the easiest way, but it does not share resources across instances of clowder, so every instance duplicates its own services and wastes resources.

This will get clowder up and running. It assumes you have docker and docker-compose installed.

Grab the 4 files from below and modify them as required:

  • docker-compose.yml : the main docker setup file (same as in develop branch as of writing)
  • docker-compose.override.yml : changes to the docker-compose file, for example where to write the data
  • docker-compose.extractors.yml : list of extractors you want to start with clowder
  • .env : any environment variables changed that are used in the docker-compose files
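Before starting anything it can help to confirm that all four files are in place. The following is a minimal sketch, not part of the original setup: the function name `check_deploy_files` is hypothetical, and it assumes the files keep the names listed above.

```shell
# Report which of the expected deployment files are missing from a directory.
# Prints one missing file name per line; returns non-zero if any are missing.
check_deploy_files() {
    dir="${1:-.}"
    status=0
    for f in docker-compose.yml docker-compose.override.yml \
             docker-compose.extractors.yml .env; do
        if [ ! -f "$dir/$f" ]; then
            echo "$f"
            status=1
        fi
    done
    return "$status"
}
```

Calling `check_deploy_files /home/clowder` prints nothing and returns success when every file is present.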

Once you have the files in place, make the folder(s) for the volumes before starting the containers:

mkdir -p /home/clowder/volumes/{custom,data,elasticsearch,mongo,rabbitmq,traefik}
chmod 777 /home/clowder/volumes/{custom,data,elasticsearch,mongo,rabbitmq,traefik} 
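The two commands above rely on brace expansion, which is a bash feature. As a sketch for other shells, or for a different base folder, the same folders can be created in a loop. The function name `make_volume_dirs` is hypothetical; the folder names and the 777 mode are copied from the commands above.

```shell
# Create the bind-mount folders under a configurable base directory.
# Usage: make_volume_dirs [BASE_DIR]   (defaults to /home/clowder/volumes)
make_volume_dirs() {
    base="${1:-/home/clowder/volumes}"
    for name in custom data elasticsearch mongo rabbitmq traefik; do
        mkdir -p "$base/$name" || return 1
        chmod 777 "$base/$name" || return 1
    done
}
```

If a different base directory is used, the `device:` paths in docker-compose.override.yml have to be changed to match.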

Start clowder:

docker-compose -f docker-compose.yml \
               -f docker-compose.override.yml \
               -f docker-compose.extractors.yml \
               up -d

Wait for the containers to start, then connect to the server: https://clowder-docker.ncsa.illinois.edu
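The waiting step can be scripted. This is a small sketch of a generic retry helper; `retry_until_ok` is a hypothetical name, and the attempt count and delay in the commented example are arbitrary choices, not values from the original setup.

```shell
# Retry a command until it succeeds, up to a maximum number of attempts.
# Usage: retry_until_ok ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry_until_ok() {
    attempts="$1"
    delay="$2"
    shift 2
    n=1
    while :; do
        if "$@"; then
            return 0
        fi
        # give up once the last allowed attempt has failed
        [ "$n" -ge "$attempts" ] && return 1
        n=$((n + 1))
        sleep "$delay"
    done
}

# Example (assumes curl is installed): poll the server for up to 5 minutes.
# retry_until_ok 30 10 curl -fsS -o /dev/null https://clowder-docker.ncsa.illinois.edu
```

The first start can take a while because the images have to be pulled and, with ACME enabled, a certificate has to be requested.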

.env

This has most of the configuration options that you will want to change.

# This file will override the configuration options in the docker-compose
# file. Copy this file to the same folder as docker-compose as .env

# ----------------------------------------------------------------------
# GENERAL CONFIGURATION
# ----------------------------------------------------------------------

# project name (-p flag for docker-compose)
#COMPOSE_PROJECT_NAME=dev

# ----------------------------------------------------------------------
# TRAEFIK CONFIGURATION
# ----------------------------------------------------------------------

# hostname of server
TRAEFIK_HOST=Host:clowder-docker.ncsa.illinois.edu;

# only allow access from localhost and NCSA
#TRAEFIK_IPFILTER=172.16.0.0/12, 141.142.0.0/16

# Run traefik on port 80 (http) and port 443 (https)
TRAEFIK_HTTP_PORT=80
TRAEFIK_HTTPS_PORT=443
TRAEFIK_HTTPS_OPTIONS=TLS

# enable SSL certificate generation
TRAEFIK_ACME_ENABLE=true

# Use your real email address here to be notified if the cert expires
TRAEFIK_ACME_EMAIL=devnull@example.com

# Always use https, traffic to http is redirected to https
TRAEFIK_HTTP_REDIRECT=Redirect.EntryPoint:https

# ----------------------------------------------------------------------
# CLOWDER CONFIGURATION
# ----------------------------------------------------------------------

# what version of clowder to use
CLOWDER_VERSION=develop

# path for clowder
#CLOWDER_CONTEXT=/clowder/

# list of initial admins
#CLOWDER_ADMINS=admin@example.com

# require approval of the clowder admins before user can login
#CLOWDER_REGISTER=true

# secret used to encrypt cookies for example
CLOWDER_SECRET=this_is_something_you_should_change

# admin key to clowder
CLOWDER_KEY=super_secret

# use SSL for login pages (set this if you enable ACME)
CLOWDER_SSL=true

# should clowder mock sending email (false means send email using the smtp server)
SMTP_MOCK=false

# name of the smtp server that will handle the emails from clowder
SMTP_SERVER=smtp

# ----------------------------------------------------------------------
# RABBITMQ CONFIGURATION
# ----------------------------------------------------------------------

# RabbitMQ username and password
#RABBITMQ_DEFAULT_USER=clowder
#RABBITMQ_DEFAULT_PASS=cats

# create the correct URI with above username and password
#RABBITMQ_URI=amqp://clowder:cats@rabbitmq/%2F

# exchange to be used
#RABBITMQ_EXCHANGE=clowder

# in case of external rabbitmq, the url to clowder
RABBITMQ_CLOWDERURL=https://clowder-docker.ncsa.illinois.edu/
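
The values for CLOWDER_SECRET and CLOWDER_KEY above are placeholders and should be replaced. One way to produce random values is sketched below; `random_hex` is a hypothetical helper, and the sketch assumes that clowder accepts arbitrary hex strings for both settings.

```shell
# Print a random hex string built from the given number of bytes (default 32),
# read from /dev/urandom. The output is twice as many characters as bytes.
random_hex() {
    bytes="${1:-32}"
    head -c "$bytes" /dev/urandom | od -An -tx1 | tr -d ' \n'
}

# Emit replacement lines that can be pasted over the placeholders in .env.
echo "CLOWDER_SECRET=$(random_hex 32)"
echo "CLOWDER_KEY=$(random_hex 16)"
```

Keep the generated .env out of version control, since it now holds real secrets.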

docker-compose.yml

This is the same file that is included in the clowder repository at https://github.com/clowder-framework/clowder/blob/master/docker-compose.yml.

version: '3.5'

services:
  # ----------------------------------------------------------------------
  # SINGLE ENTRYPOINT
  # ----------------------------------------------------------------------
  # webserver to handle all traffic. This can use let's encrypt to generate a SSL cert.
  traefik:
    image: traefik:1.7
    command:
      - --loglevel=INFO
      - --api
      # Entrypoints
      - --defaultentrypoints=https,http
      - --entryPoints=Name:http Address::80 ${TRAEFIK_HTTP_REDIRECT:-""}
      - --entryPoints=Name:https Address::443 ${TRAEFIK_HTTPS_OPTIONS:-TLS}
      # Configuration for acme (https://letsencrypt.org/)
      - --acme=${TRAEFIK_ACME_ENABLE:-false}
      #- --acme.caserver=https://acme-staging-v02.api.letsencrypt.org/directory
      - --acme.email=${TRAEFIK_ACME_EMAIL:-""}
      - --acme.entrypoint=https
      - --acme.onhostrule=true
      - --acme.storage=/config/acme.json
      - --acme.httpchallenge.entrypoint=http
      - --acme.storage=/config/acme.json
      - --acme.acmelogging=true
      # DOCKER
      - --docker=true
      - --docker.endpoint=unix:///var/run/docker.sock
      - --docker.exposedbydefault=false
      - --docker.watch=true
    restart: unless-stopped
    networks:
      - clowder
    ports:
      - "${TRAEFIK_HTTP_PORT-8000}:80"
      - "${TRAEFIK_HTTPS_PORT-8443}:443"
    labels:
      - "traefik.enable=true"
      - "traefik.backend=traefik"
      - "traefik.port=8080"
      - "traefik.frontend.rule=${TRAEFIK_HOST:-}PathPrefixStrip: /traefik"
      - "traefik.website.frontend.whiteList.sourceRange=${TRAEFIK_IPFILTER:-172.16.0.0/12}"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - traefik:/config

  # ----------------------------------------------------------------------
  # CLOWDER APPLICATION
  # ----------------------------------------------------------------------

  # main clowder application
  clowder:
    image: clowder/clowder:${CLOWDER_VERSION:-latest}
    restart: unless-stopped
    networks:
      - clowder
    depends_on:
      - mongo
    environment:
      - CLOWDER_ADMINS=${CLOWDER_ADMINS:-admin@example.com}
      - CLOWDER_REGISTER=${CLOWDER_REGISTER:-false}
      - CLOWDER_CONTEXT=${CLOWDER_CONTEXT:-/}
      - CLOWDER_SSL=${CLOWDER_SSL:-false}
      - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
      - RABBITMQ_EXCHANGE=${RABBITMQ_EXCHANGE:-clowder}
      - RABBITMQ_CLOWDERURL=${RABBITMQ_CLOWDERURL:-http://clowder:9000}
      - SMTP_MOCK=${SMTP_MOCK:-true}
      - SMTP_SERVER=${SMTP_SERVER:-smtp}
    labels:
      - "traefik.enable=true"
      - "traefik.backend=clowder"
      - "traefik.port=9000"
      - "traefik.frontend.rule=${TRAEFIK_HOST:-}PathPrefix: ${CLOWDER_CONTEXT:-/}"
    volumes:
      - clowder-custom:/home/clowder/custom
      - clowder-data:/home/clowder/data

  # ----------------------------------------------------------------------
  # CLOWDER DEPENDENCIES
  # ----------------------------------------------------------------------

  # database to hold metadata (required)
  mongo:
    image: mongo:3.4
    restart: unless-stopped
    networks:
      - clowder
    volumes:
      - mongo:/data/db

  # message broker (optional but needed for extractors)
  rabbitmq:
    image: rabbitmq:management-alpine
    restart: unless-stopped
    networks:
      - clowder
    environment:
      - RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS=-rabbitmq_management path_prefix "/rabbitmq"
      - RABBITMQ_DEFAULT_USER=${RABBITMQ_DEFAULT_USER:-guest}
      - RABBITMQ_DEFAULT_PASS=${RABBITMQ_DEFAULT_PASS:-guest}
    labels:
      - "traefik.enable=true"
      - "traefik.backend=rabbitmq"
      - "traefik.port=15672"
      - "traefik.frontend.rule=${TRAEFIK_HOST:-}PathPrefix: /rabbitmq"
      - "traefik.website.frontend.whiteList.sourceRange=${TRAEFIK_IPFILTER:-172.16.0.0/12}"
    volumes:
      - rabbitmq:/var/lib/rabbitmq

  # search index (optional, needed for search and sorting future) 
  elasticsearch:
    image: elasticsearch:2
    command: elasticsearch -Des.cluster.name="clowder"
    networks:
      - clowder
    restart: unless-stopped
    environment:
      - cluster.name=clowder
    volumes:
      - elasticsearch:/usr/share/elasticsearch/data

  # monitor clowder extractors
  monitor:
    image: clowder/extractors-monitor:${CLOWDER_VERSION:-latest}
    restart: unless-stopped
    networks:
      - clowder
    depends_on:
      - rabbitmq
    environment:
      - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
      - RABBITMQ_MGMT_PORT=15672
      - RABBITMQ_MGMT_PATH=/rabbitmq
    labels:
      - "traefik.enable=true"
      - "traefik.backend=monitor"
      - "traefik.port=9999"
      - "traefik.frontend.rule=${TRAEFIK_FRONTEND_RULE:-}PathPrefixStrip:/monitor"

# ----------------------------------------------------------------------
# NETWORK FOR CONTAINER COMMUNICATION
# ----------------------------------------------------------------------
networks:
  clowder:

# ----------------------------------------------------------------------
# VOLUMES FOR PERSISTENT STORAGE
# ----------------------------------------------------------------------
volumes:
  traefik:
  clowder-data:
  clowder-custom:
  mongo:
  rabbitmq:
  elasticsearch:
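
The compose file above uses two forms of default value, `${VAR:-default}` and `${VAR-default}`. Docker Compose borrows both from the shell, so the difference can be shown in plain shell. The sketch below is only an illustration; `show_port` is a hypothetical helper.

```shell
# ${VAR:-default}: the default is used when VAR is unset OR empty.
# ${VAR-default}:  the default is used only when VAR is unset.
show_port() {
    echo "colon-dash=${TRAEFIK_HTTP_PORT:-8000} dash=${TRAEFIK_HTTP_PORT-8000}"
}

unset TRAEFIK_HTTP_PORT
show_port    # prints: colon-dash=8000 dash=8000

TRAEFIK_HTTP_PORT=""
show_port    # prints: colon-dash=8000 dash=

TRAEFIK_HTTP_PORT=80
show_port    # prints: colon-dash=80 dash=80
```

In practice this means a line such as `TRAEFIK_HTTP_PORT=` in .env (set but empty) is taken as-is by the port mappings, which use the `-` form. Running `docker-compose config` with the same `-f` flags prints the file with every variable resolved, which is a quick way to check the result.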

docker-compose.override.yml

This is used to change the docker-compose file so that the data is stored in persistent volumes in a local folder.

version: "3.5"

volumes:
  traefik:
    driver_opts:
      type: none
      o: bind
      device: /home/clowder/volumes/traefik
  clowder-data:
    driver_opts:
      type: none
      o: bind
      device: /home/clowder/volumes/data
  clowder-custom:
    driver_opts:
      type: none
      o: bind
      device: /home/clowder/volumes/custom
  mongo:
    driver_opts:
      type: none
      o: bind
      device: /home/clowder/volumes/mongo
  rabbitmq:
    driver_opts:
      type: none
      o: bind
      device: /home/clowder/volumes/rabbitmq
  elasticsearch:
    driver_opts:
      type: none
      o: bind
      device: /home/clowder/volumes/elasticsearch
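
With `type: none` and `o: bind`, Docker generally expects each `device:` folder to exist already, so a missing folder tends to surface only as a mount error when the container starts. The sketch below lists the folders named in an override file that do not exist yet. `missing_devices` is a hypothetical helper, and it assumes one `device:` entry per line and paths without whitespace.

```shell
# List the `device:` paths in an override file that are not existing directories.
# Prints one missing path per line; returns non-zero if any are missing.
missing_devices() {
    file="$1"
    status=0
    # strip the "device:" key and keep only the path on each matching line
    for path in $(sed -n 's/^[[:space:]]*device:[[:space:]]*//p' "$file"); do
        if [ ! -d "$path" ]; then
            echo "$path"
            status=1
        fi
    done
    return "$status"
}
```

For example, `missing_devices docker-compose.override.yml` prints nothing once all folders have been created.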

docker-compose.extractors.yml

This is a list of all extractors that are started. You can add additional extractors here.

version: '3.5'

# to use the extractors start with
# docker-compose -f docker-compose.yml -f docker-compose.override.yml -f docker-compose.extractors.yml up -d

services:
  # ----------------------------------------------------------------------
  # EXTRACTORS
  # ----------------------------------------------------------------------

  # extract checksum
  filedigest:
    image: clowder/extractors-digest:latest
    restart: unless-stopped
    networks:
      - clowder
    depends_on:
      - rabbitmq
      - clowder
    environment:
      - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}

  # extract preview image
  imagepreview:
    image: clowder/extractors-image-preview:latest
    restart: unless-stopped
    networks:
      - clowder
    depends_on:
      - rabbitmq
      - clowder
    environment:
      - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}

  # extract image metadata
  imagemetadata:
    image: clowder/extractors-image-metadata:latest
    restart: unless-stopped
    networks:
      - clowder
    depends_on:
      - rabbitmq
      - clowder
    environment:
      - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}

  # extract preview image from audio spectrogram
  audiopreview:
    image: clowder/extractors-audio-preview:${CLOWDER_VERSION:-latest}
    restart: unless-stopped
    networks:
      - clowder
    depends_on:
      - rabbitmq
      - clowder
    environment:
      - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}

  # extract pdf preview image
  pdfpreview:
    image: clowder/extractors-pdf-preview:${CLOWDER_VERSION:-latest}
    restart: unless-stopped
    networks:
      - clowder
    depends_on:
      - rabbitmq
      - clowder
    environment:
      - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}

  # extract video preview image as well as smaller video
  videopreview:
    image: clowder/extractors-video-preview:${CLOWDER_VERSION:-latest}
    restart: unless-stopped
    networks:
      - clowder
    depends_on:
      - rabbitmq
      - clowder
    environment:
      - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
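
Every extractor above follows the same pattern, so a stanza for an additional one can be generated rather than copied by hand. This is a sketch only: `extractor_stanza` is a hypothetical helper, and the service name and image in the example call are placeholders, not real extractors.

```shell
# Print a compose service stanza for one extractor, indented for use
# under the `services:` key of docker-compose.extractors.yml.
# Usage: extractor_stanza SERVICE_NAME IMAGE
extractor_stanza() {
    cat <<EOF
  $1:
    image: $2
    restart: unless-stopped
    networks:
      - clowder
    depends_on:
      - rabbitmq
      - clowder
    environment:
      - RABBITMQ_URI=\${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
EOF
}

# Placeholder example: replace the name and image with a real extractor.
extractor_stanza myextractor example/my-extractor:latest
```

The `\$` in the here-document keeps `${RABBITMQ_URI:-...}` literal, so docker-compose resolves it later instead of the shell resolving it now.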

Kubernetes

This is a work in progress and is not yet ready for production.

This assumes you are using helm to deploy the application to kubernetes. To begin with you will need to add the repository that houses all of the applications developed at NCSA using:

helm repo add ncsa https://opensource.ncsa.illinois.edu/charts/

Next you can deploy the actual application using:

helm upgrade --install clowder ncsa/clowder

To see the values and the documentation for the helm chart you can use:

helm inspect ncsa/clowder
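
The steps above can be combined into one function that also applies a customized values file. This is a sketch under assumptions: `deploy_clowder` and the `HELM` override variable are hypothetical, and the values file is expected to have been created first, for example with `helm inspect values ncsa/clowder > values.yaml`, and then edited.

```shell
# HELM can be overridden (for example HELM=echo) to preview the commands
# without running them.
HELM="${HELM:-helm}"

# Add the NCSA chart repository, then install or upgrade the clowder release
# using the given values file (defaults to values.yaml).
deploy_clowder() {
    values_file="${1:-values.yaml}"
    "$HELM" repo add ncsa https://opensource.ncsa.illinois.edu/charts/ &&
    "$HELM" upgrade --install clowder ncsa/clowder -f "$values_file"
}
```

Running `HELM=echo; deploy_clowder my-values.yaml` prints the two helm invocations instead of executing them.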

Shared Infrastructure

This is a more complex setup, but it has the advantage of leveraging shared infrastructure. Once the kubernetes setup is more stable this will be significantly easier to install.

The assumption is that we are leveraging a replicated mongo database, a shared rabbitMQ, shared extractors, and potentially a shared elasticsearch. What is needed is to install clowder and change the default custom.conf file to point to the elasticsearch (in the future this might be configurable from the .env file).

Grab the 3 files from below and modify them as required:

  • docker-compose.yml : the main docker setup file (same as in develop branch as of writing)
  • docker-compose.override.yml : changes to the docker-compose file, for example where to write the data
  • .env : any environment variables changed that are used in the docker-compose files

Once you have the files in place, make the folder(s) for the volumes before starting the containers:

mkdir -p /home/clowder/volumes/{custom,data,traefik}
chmod 777 /home/clowder/volumes/{custom,data,traefik}

Start clowder:

docker-compose -f docker-compose.yml \
               -f docker-compose.override.yml \
               up -d

Wait for the containers to start, then connect to the server: https://clowder-docker.ncsa.illinois.edu

.env

This has most of the configuration options that you will want to change.

# This file will override the configuration options in the docker-compose
# file. Copy this file to the same folder as docker-compose as .env

# ----------------------------------------------------------------------
# GENERAL CONFIGURATION
# ----------------------------------------------------------------------

# project name (-p flag for docker-compose)
#COMPOSE_PROJECT_NAME=dev

# ----------------------------------------------------------------------
# TRAEFIK CONFIGURATION
# ----------------------------------------------------------------------

# hostname of server
TRAEFIK_HOST=Host:clowder-docker.ncsa.illinois.edu;

# only allow access from localhost and NCSA
#TRAEFIK_IPFILTER=172.16.0.0/12, 141.142.0.0/16

# Run traefik on port 80 (http) and port 443 (https)
TRAEFIK_HTTP_PORT=80
TRAEFIK_HTTPS_PORT=443
TRAEFIK_HTTPS_OPTIONS=TLS

# enable SSL certificate generation
TRAEFIK_ACME_ENABLE=true

# Use your real email address here to be notified if the cert expires
TRAEFIK_ACME_EMAIL=devnull@example.com

# Always use https, traffic to http is redirected to https
TRAEFIK_HTTP_REDIRECT=Redirect.EntryPoint:https

# ----------------------------------------------------------------------
# CLOWDER CONFIGURATION
# ----------------------------------------------------------------------

# what version of clowder to use
CLOWDER_VERSION=develop

# path for clowder
#CLOWDER_CONTEXT=/clowder/

# list of initial admins
#CLOWDER_ADMINS=admin@example.com

# require approval of the clowder admins before user can login
#CLOWDER_REGISTER=true

# secret used to encrypt cookies for example
CLOWDER_SECRET=this_is_something_you_should_change

# admin key to clowder
CLOWDER_KEY=super_secret

# use SSL for login pages (set this if you enable ACME)
CLOWDER_SSL=true

# should clowder mock sending email (false means send email using the smtp server)
SMTP_MOCK=false

# name of the smtp server that will handle the emails from clowder
SMTP_SERVER=smtp

# ----------------------------------------------------------------------
# RABBITMQ CONFIGURATION
# ----------------------------------------------------------------------

# URI of the shared rabbitmq server, including its username and password
RABBITMQ_URI=amqp://clowder:cats@rabbitmq/%2F

# exchange to be used
RABBITMQ_EXCHANGE=clowder

# in case of external rabbitmq, the url to clowder
RABBITMQ_CLOWDERURL=https://clowder-docker.ncsa.illinois.edu/

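# ----------------------------------------------------------------------
# MONGODB CONFIGURATION
# ----------------------------------------------------------------------

# URI of the mongo database (the value shown is the default from docker-compose.yml)
#MONGO_URI=mongodb://mongo:27017/clowder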
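
The `%2F` at the end of RABBITMQ_URI is the percent-encoded form of the default vhost `/`. When pointing at a shared rabbitmq server, the URI can be assembled as sketched below. `rabbitmq_uri` is a hypothetical helper; it only encodes `%` and `/` in the vhost and assumes the username and password contain no characters that need encoding.

```shell
# Build an AMQP URI; the vhost is percent-encoded so that "/" becomes "%2F".
# Usage: rabbitmq_uri USER PASSWORD HOST [VHOST]   (VHOST defaults to "/")
rabbitmq_uri() {
    user="$1"
    pass="$2"
    host="$3"
    vhost="${4:-/}"
    # encode "%" first so the "%2F" produced for "/" is not encoded again
    encoded=$(printf '%s' "$vhost" | sed 's|%|%25|g; s|/|%2F|g')
    echo "amqp://$user:$pass@$host/$encoded"
}

rabbitmq_uri clowder cats rabbitmq    # prints: amqp://clowder:cats@rabbitmq/%2F
```

The printed value can be used for RABBITMQ_URI in the .env file.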

docker-compose.yml

This is based on the file that is included in the clowder repository at https://github.com/clowder-framework/clowder/blob/master/docker-compose.yml.

version: '3.5'

services:
  # ----------------------------------------------------------------------
  # SINGLE ENTRYPOINT
  # ----------------------------------------------------------------------
  # webserver to handle all traffic. This can use let's encrypt to generate a SSL cert.
  traefik:
    image: traefik:1.7
    command:
      - --loglevel=INFO
      - --api
      # Entrypoints
      - --defaultentrypoints=https,http
      - --entryPoints=Name:http Address::80 ${TRAEFIK_HTTP_REDIRECT:-""}
      - --entryPoints=Name:https Address::443 ${TRAEFIK_HTTPS_OPTIONS:-TLS}
      # Configuration for acme (https://letsencrypt.org/)
      - --acme=${TRAEFIK_ACME_ENABLE:-false}
      #- --acme.caserver=https://acme-staging-v02.api.letsencrypt.org/directory
      - --acme.email=${TRAEFIK_ACME_EMAIL:-""}
      - --acme.entrypoint=https
      - --acme.onhostrule=true
      - --acme.storage=/config/acme.json
      - --acme.httpchallenge.entrypoint=http
      - --acme.storage=/config/acme.json
      - --acme.acmelogging=true
      # DOCKER
      - --docker=true
      - --docker.endpoint=unix:///var/run/docker.sock
      - --docker.exposedbydefault=false
      - --docker.watch=true
    restart: unless-stopped
    networks:
      - clowder
    ports:
      - "${TRAEFIK_HTTP_PORT-8000}:80"
      - "${TRAEFIK_HTTPS_PORT-8443}:443"
    labels:
      - "traefik.enable=true"
      - "traefik.backend=traefik"
      - "traefik.port=8080"
      - "traefik.frontend.rule=${TRAEFIK_HOST:-}PathPrefixStrip: /traefik"
      - "traefik.website.frontend.whiteList.sourceRange=${TRAEFIK_IPFILTER:-172.16.0.0/12}"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - traefik:/config

  # ----------------------------------------------------------------------
  # CLOWDER APPLICATION
  # ----------------------------------------------------------------------

  # main clowder application
  clowder:
    image: clowder/clowder:${CLOWDER_VERSION:-latest}
    restart: unless-stopped
    networks:
      - clowder
    environment:
      - CLOWDER_ADMINS=${CLOWDER_ADMINS:-admin@example.com}
      - CLOWDER_REGISTER=${CLOWDER_REGISTER:-false}
      - CLOWDER_CONTEXT=${CLOWDER_CONTEXT:-/}
      - CLOWDER_SSL=${CLOWDER_SSL:-false}
      - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
      - RABBITMQ_EXCHANGE=${RABBITMQ_EXCHANGE:-clowder}
      - RABBITMQ_CLOWDERURL=${RABBITMQ_CLOWDERURL:-http://clowder:9000}
      - SMTP_MOCK=${SMTP_MOCK:-true}
      - SMTP_SERVER=${SMTP_SERVER:-smtp}
      - MONGO_URI=${MONGO_URI:-mongodb://mongo:27017/clowder}
    labels:
      - "traefik.enable=true"
      - "traefik.backend=clowder"
      - "traefik.port=9000"
      - "traefik.frontend.rule=${TRAEFIK_HOST:-}PathPrefix: ${CLOWDER_CONTEXT:-/}"
    volumes:
      - clowder-custom:/home/clowder/custom
      - clowder-data:/home/clowder/data

# ----------------------------------------------------------------------
# NETWORK FOR CONTAINER COMMUNICATION
# ----------------------------------------------------------------------
networks:
  clowder:

# ----------------------------------------------------------------------
# VOLUMES FOR PERSISTENT STORAGE
# ----------------------------------------------------------------------
volumes:
  traefik:
  clowder-data:
  clowder-custom:

docker-compose.override.yml

This is used to change the docker-compose file so that the data is stored in persistent volumes in a local folder.

version: "3.5"

volumes:
  traefik:
    driver_opts:
      type: none
      o: bind
      device: /home/clowder/volumes/traefik
  clowder-data:
    driver_opts:
      type: none
      o: bind
      device: /home/clowder/volumes/data
  clowder-custom:
    driver_opts:
      type: none
      o: bind
      device: /home/clowder/volumes/custom