@kung-foo
Last active August 29, 2015 14:06
Elasticsearch + marvel + wikipedia river
# build image
docker build -t es-wiki .

# access bash to poke around
docker run --rm -it es-wiki bash

# start a container from the image in the background
ID=$(docker run -d -P es-wiki)

# find the address
IP=$(docker inspect -f '{{ .NetworkSettings.IPAddress }}' $ID)

# activate wikipedia river
curl -XPUT $IP:9200/_river/wikipedia/_meta -d '
{
    "type" : "wikipedia",
    "wikipedia" : {
        "url" : "http://dumps.wikimedia.org/simplewiki/latest/simplewiki-latest-pages-articles.xml.bz2"
    }
}'

# {"_index":"_river","_type":"wikipedia","_id":"_meta","_version":1,"created":true}

# access the Marvel web UI; if Docker runs locally, use the container address
# (with -P, port 9200 is not published on localhost:9200): http://$IP:9200/_plugin/marvel
# or, from a remote machine, find the NAT'd port:
docker port $ID 9200

# http://DOCKER_HOST:NAT_PORT/_plugin/marvel

# kill and remove container
docker rm -f $ID
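
# The NAT'd-port lookup above can be scripted. This is a minimal sketch, not part
# of the original gist: marvel_url is a hypothetical helper, and it assumes that
# `docker port $ID 9200` prints a mapping of the form 0.0.0.0:49153.

```shell
# Hypothetical helper: build the Marvel URL from a host name and the
# mapping printed by `docker port <container> 9200` (assumed form: 0.0.0.0:49153).
marvel_url() {
    host=$1
    mapping=$2
    # Drop everything up to and including the last colon, leaving only the port.
    port=${mapping##*:}
    printf 'http://%s:%s/_plugin/marvel\n' "$host" "$port"
}
```

# Usage, while the container is still running:
#   marvel_url DOCKER_HOST "$(docker port $ID 9200)"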

# Dockerfile
FROM dockerfile/ubuntu
ENV DEBIAN_FRONTEND noninteractive
RUN sed -i 's/http:\/\/archive\.ubuntu\.com\/ubuntu/mirror:\/\/mirrors\.ubuntu.com\/mirrors\.txt/g' /etc/apt/sources.list
# Install Java.
RUN \
  echo debconf shared/accepted-oracle-license-v1-1 select true | debconf-set-selections && \
  echo debconf shared/accepted-oracle-license-v1-1 seen true | debconf-set-selections && \
  add-apt-repository -y ppa:webupd8team/java && \
  apt-get update && \
  apt-get install -y oracle-java7-installer --no-install-recommends
WORKDIR /elasticsearch
# Install ElasticSearch
ENV ES_BASE elasticsearch-1.3.2
RUN curl -L https://download.elasticsearch.org/elasticsearch/elasticsearch/$ES_BASE.tar.gz | \
  tar xz --strip-components=1 -C .
ADD elasticsearch.yml /elasticsearch/config/elasticsearch.yml
# Define default command.
CMD ["/elasticsearch/bin/elasticsearch"]
EXPOSE 9200
RUN mkdir -p /data/plugins && mkdir -p /data/data && mkdir -p /data/logs
RUN /elasticsearch/bin/plugin -install elasticsearch/elasticsearch-river-wikipedia/2.3.0
RUN /elasticsearch/bin/plugin -install elasticsearch/marvel/latest
# Define working directory.
WORKDIR /data

# elasticsearch.yml
path:
  data: /data/data
  # matches the /data/logs directory created in the Dockerfile
  logs: /data/logs
  plugins: /data/plugins
  work: /data/work