@danpaldev
Forked from derlin/DockerCassandra-InitDb.md
Last active October 9, 2020 17:27
Dockerfile and entrypoint example in order to easily initialize a Cassandra container using *.sh/*.cql scripts in `/docker-entrypoint-initdb.d`

Initializing a Cassandra Docker container with keyspace and data

This gist shows how to build a Cassandra image that creates its keyspace and initial data automatically.

It is generic: entrypoint.sh runs any *.sh or *.cql file located in /docker-entrypoint-initdb.d/, much like the init mechanism of the MySQL image.

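You can add any *.sh or *.cql scripts inside /docker-entrypoint-initdb.d, but note that:

  • *.sh files are sourced first, as the container starts; only files with the executable bit set are picked up, and because the init script runs in the background they are not strictly guaranteed to finish before Cassandra launches
  • *.cql files are executed (with cqlsh -f) AFTER Cassandra has started and accepts connections

Within each group, files are executed in name order (the output of find piped through sort).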
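
Because the ordering is purely lexicographic, numeric prefixes need zero-padding to behave as expected. The sketch below lists files the way the generated init script does (find piped through sort). The file names are hypothetical, and LC_ALL=C is an addition made here only so the order is reproducible across locales.

```shell
#!/usr/bin/env bash
# Sketch: list init scripts in the order the entrypoint would run them.
# Mirrors the `find . -type f -name "*.cql" -print | sort` call in entrypoint.sh;
# LC_ALL=C is an addition for a locale-independent order.
list_init_scripts() {
    local dir="$1" ext="$2"
    (cd "$dir" && find . -type f -name "*.${ext}" -print | LC_ALL=C sort)
}

# Hypothetical file names, created in a throwaway directory.
demo_dir="$(mktemp -d)"
touch "$demo_dir/01-keyspace.cql" "$demo_dir/02-tables.cql" "$demo_dir/10-data.cql"

list_init_scripts "$demo_dir" cql
```

Without the leading zeros (1-, 2-, 10-), the file starting with 10- would sort before the one starting with 2-.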

How to use

  1. download the Dockerfile and entrypoint.sh
  2. edit the Dockerfile in order to copy your init scripts inside /docker-entrypoint-initdb.d/
  3. build the image: docker build -t my-cassandra-image .
  4. run the image: docker run --rm -p 9042:9042 --name cassandra-container -d my-cassandra-image
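
Since the *.cql files run in the background once Cassandra is up, the keyspace may not exist yet when docker run returns. A minimal sketch of a polling helper for the host side follows. wait_until is a hypothetical helper name, not part of the gist, and the attempt count and delay are arbitrary.

```shell
#!/usr/bin/env bash
# Sketch: retry a command until it succeeds or the attempts run out.
# wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_until() {
    local attempts="$1" delay="$2"
    shift 2
    local i
    for ((i = 0; i < attempts; i++)); do
        if "$@" > /dev/null 2>&1; then
            return 0
        fi
        sleep "$delay"
    done
    return 1
}

# Intended use against the container from step 4 (not executed here):
# the init script touches /var/lib/cassandra/_init.done after the last *.cql file.
#   wait_until 30 6 docker exec cassandra-container test -f /var/lib/cassandra/_init.done
```

Polling for the marker file rather than for a working cqlsh connection avoids racing against scripts that are still running.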

Note that the scripts in /docker-entrypoint-initdb.d are only executed on the first startup. If you decide to persist the data using a volume, this works as expected: the scripts won't be executed again when you boot your container a second time. By using a volume, I mean, e.g.:

docker run --rm -d \
    -p 9042:9042 \
    -v $PWD/data:/var/lib/cassandra \
    --name cassandra-container \
    my-cassandra-image
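
The reason a volume prevents re-initialization is the marker file /var/lib/cassandra/_init.done, which the init script creates once the *.cql files have run. The pattern can be sketched in isolation as follows. run_once and the temporary directory are hypothetical stand-ins used for illustration only.

```shell
#!/usr/bin/env bash
# Sketch of the "run once" guard: do the work only if the marker file is absent.
# run_once MARKER_FILE COMMAND [ARGS...]
run_once() {
    local marker="$1"
    shift
    if [ -f "$marker" ]; then
        echo "@@ Initialization already performed."
        return 0
    fi
    # Create the marker only if the command succeeded.
    "$@" && touch "$marker"
}

# Stand-in for the persisted data directory (/var/lib/cassandra in the gist).
data_dir="$(mktemp -d)"
run_once "$data_dir/_init.done" echo "initializing"   # first boot: runs the command
run_once "$data_dir/_init.done" echo "initializing"   # later boots: skipped
```

Without a volume, the marker lives in the container's own filesystem and disappears with it, so a fresh container initializes again.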

Dockerfile

# NOTE: will also work with other cassandra version tags
FROM cassandra:3.11
# Fix UTF-8 accents in init scripts
ENV LANG=C.UTF-8
# Here, you can add any *.sh or *.cql scripts inside /docker-entrypoint-initdb.d
# *.sh files are sourced first, at container start (only if they are executable)
# *.cql files are executed with cqlsh -f AFTER cassandra has started
# Files are executed in name order (find | sort)
COPY *.cql /docker-entrypoint-initdb.d/
# This script patches the existing entrypoint of the cassandra image
COPY entrypoint.sh /
# Give the script execute permission
RUN ["chmod", "+x", "/entrypoint.sh"]
# Override ENTRYPOINT, keep CMD
ENTRYPOINT ["/entrypoint.sh"]
CMD ["cassandra", "-f"]

entrypoint.sh

#!/usr/bin/env bash
##
## This script will generate a patched docker-entrypoint.sh that:
## - executes any *.sh script found in /docker-entrypoint-initdb.d
## - boots cassandra up
## - executes any *.cql script found in docker-entrypoint-initdb.d
##
## It is compatible with any cassandra:* image
##
## Create script that executes files found in /docker-entrypoint-initdb.d/
## (written with '>' so that a container restart does not append a second copy)
cat <<'EOF' > /run-init-scripts.sh
#!/usr/bin/env bash
LOCK=/var/lib/cassandra/_init.done
INIT_DIR=/docker-entrypoint-initdb.d

if [ -f "$LOCK" ]; then
    echo "@@ Initialization already performed."
    exit 0
fi

# abort if the init directory is missing
cd "$INIT_DIR" || exit 1

echo "@@ Executing bash scripts found in $INIT_DIR"
# source executable *.sh scripts found in INIT_DIR, in name order
for f in $(find . -type f -name "*.sh" -executable -print | sort); do
    echo "$0: sourcing $f"
    . "$f"
    echo "$0: $f executed."
done

# wait for cassandra to be ready and execute cql in background
(
    while ! cqlsh -e 'describe cluster' > /dev/null 2>&1; do sleep 6; done
    echo "$0: Cassandra cluster ready: executing cql scripts found in $INIT_DIR"
    for f in $(find . -type f -name "*.cql" -print | sort); do
        echo "$0: running $f"
        cqlsh -f "$f"
        echo "$0: $f executed"
    done
    # mark things as initialized (in case /var/lib/cassandra was mapped to a local folder)
    touch "$LOCK"
) &
EOF
## Patch existing entrypoint to call our script in the background
# This has been inspired by https://www.thetopsites.net/article/51594713.shtml
EP=/patched-entrypoint.sh
# drop the last line of the stock entrypoint (expected to be: exec "$@"), then append ours
sed '$ d' /docker-entrypoint.sh > "$EP"
cat <<'EOF' >> "$EP"
/run-init-scripts.sh &
exec "$@"
EOF

# Make both scripts executable
chmod +x /run-init-scripts.sh
chmod +x "$EP"

# Call the new entrypoint, replacing this shell so that signals reach cassandra
exec "$EP" "$@"
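
The patching step relies on the stock entrypoint ending with exec "$@": dropping that last line and appending a hook plus a fresh exec "$@" yields an entrypoint that starts the hook before handing over to the command. Here is a sketch of the same technique on a stand-in file. The file contents are made up for the demonstration, and the hook is a plain echo in the foreground rather than the backgrounded /run-init-scripts.sh.

```shell
#!/usr/bin/env bash
# Sketch: patch a stand-in entrypoint the same way entrypoint.sh does.
work_dir="$(mktemp -d)"
original="$work_dir/docker-entrypoint.sh"
patched="$work_dir/patched-entrypoint.sh"

# Made-up stand-in for the image's entrypoint; only the last line matters here.
cat > "$original" <<'EOF'
#!/usr/bin/env bash
echo "stock setup"
exec "$@"
EOF

# Drop the last line, then append the hook and a new exec.
sed '$ d' "$original" > "$patched"
cat >> "$patched" <<'EOF'
echo "init hook started"
exec "$@"
EOF
chmod +x "$patched"

# Run the patched copy; the trailing arguments play the role of CMD.
bash "$patched" echo "cassandra would start here"
```

If the stock entrypoint ever stops ending with exec "$@", the sed call would silently remove a different line, so checking the last line before patching may be worthwhile.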

Example *.cql init script

-- Here, you can execute any CQL commands, e.g.
CREATE KEYSPACE some_keyspace WITH REPLICATION = {'class': 'SimpleStrategy', 'replication_factor': 1};
CREATE TABLE some_keyspace.some_table (
    id int,
    month text,
    timestamp timestamp,
    value text,
    PRIMARY KEY ((id, month), timestamp)
) WITH CLUSTERING ORDER BY (timestamp ASC);
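
To populate the table as well, a second *.cql file can be placed next to the first one, as long as its name sorts after the file that creates the keyspace and table. The sketch below generates such a file from the shell. The file name 02-data.cql and the row values are hypothetical examples, and the file is written to a temporary directory here rather than to a real build context.

```shell
#!/usr/bin/env bash
# Sketch: write a second init script that seeds some_keyspace.some_table.
# File name and values are hypothetical examples.
out_dir="$(mktemp -d)"
seed_file="$out_dir/02-data.cql"

cat > "$seed_file" <<'EOF'
INSERT INTO some_keyspace.some_table (id, month, timestamp, value)
VALUES (1, '2020-10', '2020-10-09 17:27:00+0000', 'first value');
INSERT INTO some_keyspace.some_table (id, month, timestamp, value)
VALUES (1, '2020-10', '2020-10-09 18:00:00+0000', 'second value');
EOF

echo "wrote $seed_file"
```

Once the container has initialized, the rows can be read back by supplying the full partition key, for example SELECT * FROM some_keyspace.some_table WHERE id = 1 AND month = '2020-10';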