@joshuarobinson
Created July 1, 2019 09:57
FROM openjdk:8-slim
ARG HADOOP_VERSION=3.2.0
RUN apt-get update && apt-get install -y curl --no-install-recommends \
    && rm -rf /var/lib/apt/lists/*
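# Download and extract the Hadoop binary package, then drop the bundled docs to
# keep the image small. -f makes curl fail on HTTP errors instead of piping an
# error page into tar.
RUN curl -fsSL https://archive.apache.org/dist/hadoop/core/hadoop-${HADOOP_VERSION}/hadoop-${HADOOP_VERSION}.tar.gz \
    | tar xvz -C /opt/ \
    && ln -s /opt/hadoop-${HADOOP_VERSION} /opt/hadoop \
    && rm -r /opt/hadoop/share/doc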
# Put the S3A connector and AWS SDK jars on the default Hadoop classpath by
# symlinking them from tools/lib into common/lib (a workaround).
RUN ln -s /opt/hadoop/share/hadoop/tools/lib/hadoop-aws* /opt/hadoop/share/hadoop/common/lib/ \
    && ln -s /opt/hadoop/share/hadoop/tools/lib/aws-java-sdk* /opt/hadoop/share/hadoop/common/lib/
# Set necessary environment variables.
# Note: /opt/spark/bin is on PATH, but this Dockerfile does not install Spark.
ENV HADOOP_HOME="/opt/hadoop"
ENV PATH="/opt/spark/bin:/opt/hadoop/bin:${PATH}"
# Download and install the standalone metastore binary. Fetched from the Apache
# archive over HTTPS, since mirrors tend to drop old releases.
RUN curl -fsSL https://archive.apache.org/dist/hive/hive-standalone-metastore-3.0.0/hive-standalone-metastore-3.0.0-bin.tar.gz \
    | tar xvz -C /opt/ \
    && ln -s /opt/apache-hive-metastore-3.0.0-bin /opt/hive-metastore
# Download the MySQL Connector/J JDBC driver and link it into both the Hadoop
# and the metastore lib directories.
RUN curl -fsSL https://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.47.tar.gz \
    | tar xvz -C /opt/ \
    && ln -s /opt/mysql-connector-java-5.1.47/mysql-connector-java-5.1.47.jar /opt/hadoop/share/hadoop/common/lib/ \
    && ln -s /opt/mysql-connector-java-5.1.47/mysql-connector-java-5.1.47.jar /opt/hive-metastore/lib/
# Copy the metastore and Hadoop configuration files from the build context.
COPY metastore-site.xml /opt/hive-metastore/conf/
COPY core-site.xml /opt/hadoop/etc/hadoop/
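# Usage sketch (an assumption, not part of the original gist). The image tag
# "hive-metastore" is hypothetical, and metastore-site.xml is assumed to point
# at a reachable MySQL database. This file sets no CMD or ENTRYPOINT, so the
# command is passed explicitly:
#
#   docker build -t hive-metastore .
#   # One-time schema initialization in the backing database:
#   docker run --rm hive-metastore /opt/hive-metastore/bin/schematool -initSchema -dbType mysql
#   # Start the metastore service (9083 is the default Thrift port):
#   docker run -d -p 9083:9083 hive-metastore /opt/hive-metastore/bin/start-metastore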