@coderjo
Last active December 29, 2015 12:29
ArchiveTeam project worker EC2 user-data script.
This is a user-data script for starting EC2 instances that run an ArchiveTeam project downloader.
Steps:
1. Save the file to your drive.
2. Change YOURNICKHERE to the name you want shown on the tracker.
3. Pick the 32-bit Debian Wheezy 7.2 AMI for the region you plan to use.
4. Supply the ENTIRE contents of your file as the user-data field.
- If you are using the command-line EC2 tools, pass --user-data-file YOURFILENAME.
- If you are using the web console, on page 3 ("Configure Instance"), expand the "Advanced Details" section and either paste the contents into the "user data" box, or pick "as file" and click Browse to select your file.
5. Finish setting up your instance. Don't forget to configure a security group that allows you SSH access.
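Step 2 can be scripted if you launch nodes often. The following is a minimal sketch and not part of the original script: set_nick is a hypothetical helper, and it assumes the nickname contains only letters, digits, underscores, and hyphens.

```shell
#!/bin/sh
# Hypothetical helper (not part of the gist): replace the YOURNICKHERE
# placeholder in a saved copy of the user-data file.
set_nick() {
    userdata_file="$1"
    nick="$2"
    # Reject empty nicknames and any character that could break the sed
    # expression or the quoting inside the user-data script.
    case "$nick" in
        ''|*[!A-Za-z0-9_-]*)
            echo "unsupported nickname: $nick" >&2
            return 1
            ;;
    esac
    # Write to a temporary file first so a failed edit never truncates
    # the original.
    tmp_file="${userdata_file}.tmp"
    sed "s/YOURNICKHERE/${nick}/" "$userdata_file" > "$tmp_file" &&
        mv "$tmp_file" "$userdata_file"
}
```

For example, `set_nick YOURFILENAME mynick` (mynick is a placeholder) edits the file in place, after which it can be supplied as user data as described in step 4.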
The account you will use to SSH in is named admin, with the SSH key you selected while creating the instance.
To make a node stop gracefully and then shut down, run /home/admin/stop-and-shutdown as admin.
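The graceful stop works by creating a STOP file in the project directory and then polling until the pipeline process is gone. The sketch below shows that pattern in isolation. It is an illustration under assumptions, not the script's own code: request_stop_and_wait is a hypothetical name, and it waits on a PID with kill -0, whereas the script itself polls with pgrep and then calls shutdown.

```shell
#!/bin/sh
# Illustrative only: the function name and its arguments are hypothetical.
request_stop_and_wait() {
    project_dir="$1"   # directory the worker watches for a STOP file
    worker_pid="$2"    # PID of the worker to wait for
    poll_seconds="$3"  # delay between checks
    # Ask the worker to finish up; it is expected to exit on its own.
    touch "${project_dir}/STOP" || return 1
    # kill -0 sends no signal; it only reports whether the PID still exists.
    while kill -0 "$worker_pid" 2>/dev/null; do
        sleep "$poll_seconds"
    done
}
```

Polling instead of killing the process lets the worker finish the items it has already claimed before it exits.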
From nobody Sun Nov 24 03:27:28 2013
Content-Type: multipart/mixed; boundary="===============1397231222=="
MIME-Version: 1.0
Number-Attachments: 2

--===============1397231222==
Content-Type: text/x-shellscript
MIME-Version: 1.0
Content-Disposition: attachment; filename="archiveteamstart"

#!/bin/sh
# user-data script for EC2 that starts up an ArchiveTeam seesaw project.
# this is intended to be used with a Debian Wheezy 7.2 AMI
# see https://wiki.debian.org/Cloud/AmazonEC2Image/Wheezy#A7.2
# note that this is in a mime container because the AMIs have both
# a standalone ec2-run-user-data script and cloud-init, which will
# both attempt to run the script if it starts with the normal shebang.
# the idea comes from a script by ByMe
# at: http://internalexception.tl.byme.at/ec2.user-data.hyves.txt
###### NOTE: change these to match your requirements!
downloader="YOURNICKHERE"
project="hyves-grab"
items=2
webserveropts="--disable-web-server"
###### END config
# add the archiveteam user
adduser --system --group --shell /bin/bash archiveteam
# save off the options from above
cat > /home/archiveteam/pipelineconfig << :EOF
#!/bin/bash
downloader="${downloader}"
project="${project}"
items=${items}
webserveropts="${webserveropts}"
:EOF
chmod 0644 /home/archiveteam/pipelineconfig
chown archiveteam:archiveteam /home/archiveteam/pipelineconfig
# start script for the pipeline
cat > /home/archiveteam/start-pipeline << ":EOF"
#!/bin/bash
# this script is to be run as archiveteam
cd ~
if pgrep -c run-pipeline >/dev/null; then
echo the pipeline is already running. exiting
exit 1
fi
# wait for our signal to get moving (really only happens on first boot)
while [ ! -f /home/archiveteam/okgo ]; do
sleep 20
done
. ~/pipelineconfig
# if we don't yet have the project
if [ ! -d "${project}" ]; then
# fetch the project
git clone "https://github.com/ArchiveTeam/${project}.git"
cd "${project}"
# build wget-lua, if we need it
[ -f get-wget-lua.sh ] && ./get-wget-lua.sh
fi
# ensure we're in the project directory
# (the tilde must stay outside the quotes, or it is not expanded)
cd ~/"${project}"
# remove stale stop file
[ -f STOP ] && rm STOP
# ensure we're up to date
git pull
# start the pipeline under screen
screen -d -m -S pipeline run-pipeline pipeline.py --concurrent ${items} ${webserveropts} "${downloader}"
exit 0
:EOF
chmod +x /home/archiveteam/start-pipeline
chown archiveteam:archiveteam /home/archiveteam/start-pipeline
# create a script to cause it to stop (so we can just run this via ssh)
cat > /home/archiveteam/stop-pipeline << ":EOF"
#!/bin/sh
. ~/pipelineconfig
cd ~/"${project}"
touch STOP
:EOF
chmod 0700 /home/archiveteam/stop-pipeline
chown archiveteam:archiveteam /home/archiveteam/stop-pipeline
# create an admin script that waits for the pipeline to exit and then shuts down the system
cat > /home/admin/shutdownloop << :EOF
#!/bin/bash
while pgrep -c run-pipeline >/dev/null; do
sleep 60
done
sudo shutdown -hP now
:EOF
chmod 0700 /home/admin/shutdownloop
chown admin:admin /home/admin/shutdownloop
# create an admin script to stop the pipeline and shut down
cat > /home/admin/stop-and-shutdown << :EOF
#!/bin/sh
# tell the pipeline to stop gracefully
# (-H sets HOME to archiveteam's home so ~ resolves correctly in stop-pipeline)
sudo -H -u archiveteam /home/archiveteam/stop-pipeline
# and now spawn off the background script that watches for it to exit and then shuts down
nohup /home/admin/shutdownloop >/dev/null 2>/dev/null </dev/null &
:EOF
chmod 0700 /home/admin/stop-and-shutdown
chown admin:admin /home/admin/stop-and-shutdown
# turn off atime, barrier (would set data=writeback, but we can't do that with a remount)
mount -o remount,noatime,barrier=0 /
sed -i 's/defaults/defaults,noatime,barrier=0/' /etc/fstab
# Update the system
apt-get -q update && apt-get -qy upgrade && apt-get -qy install unattended-upgrades
# install dependencies (run as one && chain so that any failure skips creating the okgo file, which keeps the runner from starting)
apt-get install -qy build-essential git-core libgnutls-dev lua5.1 liblua5.1-0 liblua5.1-0-dev screen python-pip bzip2 openssh-server curl zlib1g-dev wget && pip install seesaw && touch /home/archiveteam/okgo
# everything is fine. exit with 0.
exit 0
--===============1397231222==
Content-Type: text/cloud-boothook
MIME-Version: 1.0
Content-Disposition: attachment; filename="archiveteamstart"

#!/bin/sh
# rewrite rc.local before we need it
cat > /etc/rc.local << ":EOF"
#!/bin/sh -e
# wait for the script to be executable
while [ ! -x /home/archiveteam/start-pipeline ]; do
sleep 20
done
su -c /home/archiveteam/start-pipeline archiveteam &
exit 0
:EOF
chmod +x /etc/rc.local
exit 0
--===============1397231222==--