@ashmchiu
Last active August 23, 2024 04:46
EECS Course Website Cron Job (Jekyll Site)

[NEW] SU24: Beginning of semester setup

Due to updates in university policies, course websites need to run behind CalNet auth. The best way to do so is to run our current site at https://inst.eecs.berkeley.edu/~cs161/$SEMESTER. To set up, after cloning the new semester site:

  1. Log into the AWS Project 3 server.
  2. Make a new SSH key pair. You can do this by running
    ssh-keygen -t ed25519 -C "cs161-staff@berkeley.edu"
    in the ~/.ssh/ folder. Please name this $SEM_161_deploy (for example, su24_161_deploy) to avoid overwriting another key. Add the public key (PK) to your semester repository's list of deployment keys. The nice thing about deployment keys is that they are specific to a repository, not a user. Do not add the public key to your personal GitHub keys: anyone with access to the server could then inadvertently modify anything your GitHub account can (least privilege).
    • Notably, only grant this deployment key read access. Do all your website updates as you normally would from your local machine. After this setup, you do not need to access the Project 3 server for anything except Project 3 deployment and accessing grades.
  3. cd course-website within the Project 3 server.
  4. If you ls inside course-website, you should see something like this:
    deploy-site.sh  su24-site 
    • deploy-site.sh: This is the Bash script that, when run, pulls down the most recent changes to the course site repository, builds the Jekyll site, and then securely copies the HTML over to the instructional EECS server as the public, student-facing HTML under https://inst.eecs.berkeley.edu/~cs161/$SEMESTER.
    • su24-site: This is the previous semester's repository (or at time of writing, is the current lol).
  5. (Optional) rm -rf the previous semester's repository so as not to overload the server (goodbye).
  6. Clone the new semester's repository into the Project 3 server under the course-website directory. Remember to clone over SSH (since you just added the public key as a deployment key). This allows you to git pull from the Project 3 server, but not push. You may need to run the clone as GIT_SSH_COMMAND="ssh -i ~/.ssh/$SEM_161_deploy" git clone <clone link> if you get a fatal no access error.
  7. In deploy-site.sh, add a semester key following the template SEMESTER_KEYS["$SEM"]="~/.ssh/$SEM_161_deploy" where $SEM matches (sp|su|fa)[0-9][0-9] (mimicking your repo name). If you removed the previous semester's repository in Step 5, please also remove that semester's key line.
  8. Now edit the cron table:
    crontab -e
    
    # Now edit the semester in this line
    */5 * * * * /home/cs161/course-website/deploy-site.sh 2>&1 | logger -t cs161-$SEMESTER
    All you need to do is change the semester. This is mainly helpful for logging (since we run other courses' cron jobs as well, so you can isolate issues). This cron job runs the deploy-site.sh script every five minutes, indefinitely.
  9. Finally, you'll need to access the cs161@hiveX... instructional account. Navigate to ~/public_html and make a folder for the new semester. Once the folder exists, please also run
    chmod -R a+rX $SEMESTER
    to ensure that all the HTML in that semester is accessible/readable to the public.
  10. In the instructional account, add a .htaccess file (ensuring the right permissions) that has the following contents:
ErrorDocument 404 /~cs161/$SEMESTER/404.html

This ensures that any 404 error that occurs within /~cs161/$SEMESTER/ will lead to the 404.html page. Apache's ErrorDocument directive takes the error code (here 404) before the document path. You may need to change the file's permissions with chmod 644 .htaccess.
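The per-semester lines from steps 7, 8, and 10 all follow from the semester name, so they can be generated rather than typed by hand. Below is a minimal sketch under a few assumptions: semester_lines is a hypothetical helper (it is not part of deploy-site.sh), the key is named after the $SEM_161_deploy template from step 2, and the ErrorDocument line includes the 404 error code that Apache's directive syntax expects.

```shell
#!/bin/bash
# Sketch only: semester_lines is a hypothetical helper, not part of deploy-site.sh.
# It prints the three per-semester lines from steps 7, 8, and 10.
semester_lines() {
    local sem="$1"
    # Semester names follow (sp|su|fa)[0-9][0-9], e.g. su24.
    case "$sem" in
        sp[0-9][0-9]|su[0-9][0-9]|fa[0-9][0-9]) ;;
        *) echo "invalid semester: $sem" >&2; return 1 ;;
    esac
    # Step 7: entry for the SEMESTER_KEYS array (key name assumes the step 2 template).
    printf 'SEMESTER_KEYS["%s"]="~/.ssh/%s_161_deploy"\n' "$sem" "$sem"
    # Step 8: crontab line, tagged per semester for logging.
    printf '*/5 * * * * /home/cs161/course-website/deploy-site.sh 2>&1 | logger -t cs161-%s\n' "$sem"
    # Step 10: .htaccess contents.
    printf 'ErrorDocument 404 /~cs161/%s/404.html\n' "$sem"
}

semester_lines su24
```

The output is meant to be pasted into deploy-site.sh, crontab -e, and .htaccess respectively; the helper does not edit any of those files itself.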

Logging Tips

To debug, on the Project 3 server, you can run a combination of

   grep CRON /var/log/syslog
   grep $SEMESTER /var/log/syslog

The sample output below was produced with the tag set to just $SEMESTER, as opposed to cs161-$SEMESTER (grep $SEMESTER matches either). The first command will show you when the cron job was run:

Jun  4 23:25:01 ec2-54-219-214-175 CRON[1842862]: (cs161) CMD (/home/cs161/course-website/deploy-site.sh 2>&1 | logger -t su24)
Jun  4 23:30:01 ec2-54-219-214-175 CRON[1842949]: (cs161) CMD (/home/cs161/course-website/deploy-site.sh 2>&1 | logger -t su24)
Jun  4 23:35:01 ec2-54-219-214-175 CRON[1843049]: (cs161) CMD (/home/cs161/course-website/deploy-site.sh 2>&1 | logger -t su24)
Jun  4 23:40:01 ec2-54-219-214-175 CRON[1843071]: (cs161) CMD (/home/cs161/course-website/deploy-site.sh 2>&1 | logger -t su24)

From this, we can see that the cron job ran four times, five minutes apart, and that each CMD is the deploy-site.sh script.
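To eyeball whether the job is still firing every five minutes, it can help to strip the CRON lines down to their timestamps. The following is a rough sketch: cron_runs is a hypothetical helper, and it takes the log path as an argument so it can be tried on a copy of the log instead of /var/log/syslog. The sample input reuses two of the lines shown above.

```shell
#!/bin/bash
# Sketch only: cron_runs is a hypothetical helper.
# It prints the timestamp of every cron invocation of deploy-site.sh in a syslog-style file.
cron_runs() {
    local logfile="$1"
    # Keep CRON lines that launch deploy-site.sh, then print month, day, and time.
    grep 'CRON\[' "$logfile" | grep -F 'deploy-site.sh' | awk '{print $1, $2, $3}'
}

# Try it on a small sample taken from the log excerpt above.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
Jun  4 23:25:01 ec2-54-219-214-175 CRON[1842862]: (cs161) CMD (/home/cs161/course-website/deploy-site.sh 2>&1 | logger -t su24)
Jun  4 23:30:01 ec2-54-219-214-175 CRON[1842949]: (cs161) CMD (/home/cs161/course-website/deploy-site.sh 2>&1 | logger -t su24)
EOF
cron_runs "$sample"
rm -f "$sample"
```

On the server, cron_runs /var/log/syslog | tail -n 5 would show the five most recent runs.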

The second command will show any of the logs from the bash script (since you added logger -t $SEMESTER to the cron table). For example,

Jun  5 00:10:01 ec2-54-219-214-175 CRON[1843532]: (cs161) CMD (/home/cs161/course-website/deploy-site.sh 2>&1 | logger -t su24)
Jun  5 00:10:01 ec2-54-219-214-175 su24: Starting [SYN]...
Jun  5 00:10:02 ec2-54-219-214-175 su24: From github.com:cs161-staff/su24-site
Jun  5 00:10:02 ec2-54-219-214-175 su24:  * branch            main       -> FETCH_HEAD
Jun  5 00:10:03 ec2-54-219-214-175 su24: Already up to date.
Jun  5 00:10:03 ec2-54-219-214-175 su24: Pulled the current repository
Jun  5 00:10:03 ec2-54-219-214-175 su24: Bundle complete! 2 Gemfile dependencies, 95 gems now installed.
Jun  5 00:10:03 ec2-54-219-214-175 su24: Use `bundle info [gemname]` to see where a bundled gem is installed.
Jun  5 00:10:03 ec2-54-219-214-175 su24: Installed ruby dependencies
Jun  5 00:10:04 ec2-54-219-214-175 su24: /home/cs161/.gem/gems/jekyll-3.9.5/lib/jekyll.rb:28: warning: csv was loaded from the standard library, but will no longer be part of the default gems since Ruby 3.4.0. Add csv to your Gemfile or gemspec. Also contact author of jekyll-3.9.5 to add csv into its gemspec.
Jun  5 00:10:04 ec2-54-219-214-175 su24: Configuration file: /home/cs161/course-website/su24-site/_config.yml
Jun  5 00:10:04 ec2-54-219-214-175 su24: To use retry middleware with Faraday v2.0+, install `faraday-retry` gem
Jun  5 00:10:05 ec2-54-219-214-175 su24:             Source: /home/cs161/course-website/su24-site
Jun  5 00:10:05 ec2-54-219-214-175 su24:        Destination: /home/cs161/course-website/su24-site/_site
Jun  5 00:10:05 ec2-54-219-214-175 su24:  Incremental build: disabled. Enable with --incremental
Jun  5 00:10:05 ec2-54-219-214-175 su24:       Generating... 
Jun  5 00:10:05 ec2-54-219-214-175 su24:       Remote Theme: Using theme just-the-docs/just-the-docs
Jun  5 00:10:16 ec2-54-219-214-175 su24:                     done in 10.992 seconds.
Jun  5 00:10:16 ec2-54-219-214-175 su24:  Auto-regeneration: disabled. Use --watch to enable.
Jun  5 00:10:16 ec2-54-219-214-175 su24: Built site
Jun  5 00:10:25 ec2-54-219-214-175 su24: Copied HTML files
Jun  5 00:10:26 ec2-54-219-214-175 su24: Added perms
Jun  5 00:10:26 ec2-54-219-214-175 su24: ... [ACK] Finished!

If something goes wrong, we can walk through these logs: they contain every message echo'd by the deploy-site.sh script as well as the output of the commands the script runs.
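Since deploy-site.sh prefixes every failure message with [FAILED] and ends a successful run with "... [ACK] Finished!", those two markers are enough for a quick health check. Here is a minimal sketch, assuming the logger tag is su24 as in the sample output; deploy_failures and last_run_finished are hypothetical helper names.

```shell
#!/bin/bash
# Sketch only: both helpers are hypothetical. Each takes a logger tag and a syslog-style file.

# Print every "[FAILED]" line logged under the tag (exit status 1 if there are none).
deploy_failures() {
    local tag="$1" logfile="$2"
    # logger lines look like "<date> <host> <tag>: <message>".
    grep -F " $tag: " "$logfile" | grep -F '[FAILED]'
}

# Succeed only if the newest line under the tag is the script's closing echo.
last_run_finished() {
    local tag="$1" logfile="$2"
    grep -F " $tag: " "$logfile" | tail -n 1 | grep -qF '[ACK] Finished!'
}

# Example call, assuming the tag su24:
if last_run_finished su24 /var/log/syslog 2>/dev/null; then
    echo "last su24 deploy finished"
else
    echo "last su24 deploy did not finish (or the log is unreadable)"
fi
```

Matching on " su24: " rather than plain su24 skips the CRON lines, which mention the tag only inside the logged command.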

Note: If you need to jump to the cs188 account via cs161 in the future to run any command, run this: ssh -i <insert 188 hive key> -J cs161@instgw.eecs.berkeley.edu -l cs188 cory.eecs.berkeley.edu "<insert command here>".

This will execute the command as cs188 using cs161 as a bastion jump.
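If you find yourself running that jump command often, it can be wrapped in a small function. This is only a sketch: run_as_cs188, HIVE_188_KEY, and DRY_RUN are names made up for illustration, and the key path in the example is a placeholder for wherever the cs188 hive key actually lives.

```shell
#!/bin/bash
# Sketch only: run_as_cs188, HIVE_188_KEY, and DRY_RUN are hypothetical names.
# Runs a command as cs188 on cory, jumping through cs161@instgw.
run_as_cs188() {
    if [ -z "${HIVE_188_KEY:-}" ]; then
        echo "set HIVE_188_KEY to the cs188 hive key path" >&2
        return 1
    fi
    # With DRY_RUN set, the command is echoed instead of executed.
    ${DRY_RUN:+echo} ssh -i "$HIVE_188_KEY" -J cs161@instgw.eecs.berkeley.edu \
        -l cs188 cory.eecs.berkeley.edu "$*"
}

# Dry run with a placeholder key path (not a real file):
HIVE_188_KEY="$HOME/.ssh/hypothetical_188_key"
DRY_RUN=1
run_as_cs188 hostname
```

Unset DRY_RUN to actually open the connection.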

# this cron job runs the bash script `deploy-site.sh` every five minutes, and pipes all output to logging with tag cs161-su24
*/5 * * * * /home/cs161/course-website/deploy-site.sh 2>&1 | logger -t cs161-su24
#!/bin/bash

### GITHUB KEYS
declare -A SEMESTER_KEYS
SEMESTER_KEYS["su24"]="~/.ssh/su24_deployment_key"
# Feel free to add more semesters here if you want to keep deployments up

SITE_BASE_DIR=/home/cs161/course-website

### HIVE KEY: the key you'll need to proxyjump into the hive (this is the same across all semesters)
DEPLOYMENT_KEY=~/.ssh/id_ed25519

echo "Starting [SYN]..."

deploy_site() {
    SEMESTER=$1
    SITE_DIR="$SITE_BASE_DIR/$SEMESTER-site"
    SSH_KEY="${SEMESTER_KEYS[$SEMESTER]}"

    if [ ! -d "$SITE_DIR" ]; then
        echo "[FAILED] Site directory $SITE_DIR does not exist"
        exit 1
    fi

    cd "$SITE_DIR"

    GIT_SSH_COMMAND="ssh -i $SSH_KEY" git pull origin main
    if [ $? -eq 0 ]; then
        echo "[$SEMESTER] Pulled the current repository"
    else
        echo "[FAILED][$SEMESTER] Could not pull the current repository"
        exit 1
    fi

    /snap/bin/ruby /snap/ruby/350/bin/bundle install
    if [ $? -eq 0 ]; then
        echo "[$SEMESTER] Installed ruby dependencies"
    else
        echo "[FAILED][$SEMESTER] Could not install ruby dependencies"
        exit 1
    fi

    /snap/bin/ruby /snap/ruby/350/bin/bundle exec jekyll build
    if [ $? -eq 0 ]; then
        echo "[$SEMESTER] Built site"
    else
        echo "[FAILED][$SEMESTER] Could not build site"
        exit 1
    fi

    scp -i $DEPLOYMENT_KEY -r -o ProxyJump=cs161@instgw.eecs.berkeley.edu _site/* cs161@cory.eecs.berkeley.edu:/home/ff/cs161/public_html/$SEMESTER/
    if [ $? -eq 0 ]; then
        echo "[$SEMESTER] Copied HTML files"
    else
        echo "[FAILED][$SEMESTER] Could not copy HTML files"
        exit 1
    fi

    ssh -i $DEPLOYMENT_KEY -J cs161@instgw.eecs.berkeley.edu -l cs161 cory.eecs.berkeley.edu chmod -R a+rX /home/ff/cs161/public_html/$SEMESTER/*
    if [ $? -eq 0 ]; then
        echo "[$SEMESTER] Added perms"
    else
        echo "[FAILED][$SEMESTER] Could not add perms"
        exit 1
    fi

    cd $SITE_BASE_DIR
}

for SEMESTER in "${!SEMESTER_KEYS[@]}"; do
    deploy_site "$SEMESTER"
done

echo "... [ACK] Finished!"
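
The script above repeats the same "run a command, check $?, echo a status line" pattern five times. One possible way to shorten it is a small wrapper. This is a sketch of that alternative, not how deploy-site.sh is written: run_step is a hypothetical helper, and true and false stand in for the real git, bundle, scp, and ssh commands so the example runs anywhere.

```shell
#!/bin/bash
# Sketch only: run_step is a hypothetical helper, not part of the original deploy-site.sh.
# Usage: run_step <semester> <success message> <failure message> <command> [args...]
run_step() {
    local semester="$1" ok_msg="$2" fail_msg="$3"
    shift 3
    # Run the remaining arguments as the command and log in the script's format.
    if "$@"; then
        echo "[$semester] $ok_msg"
    else
        echo "[FAILED][$semester] $fail_msg"
        return 1
    fi
}

# Stand-in commands (true/false) so the sketch runs without git, bundle, or ssh.
run_step su24 "Pulled the current repository" "Could not pull the current repository" true
run_step su24 "Built site" "Could not build site" false || echo "stopping early"
```

In the real script each call would end with || exit 1. A step that needs an environment variable, like the git pull, would have to go through env (for example env GIT_SSH_COMMAND="ssh -i $SSH_KEY" git pull origin main), since a bare VAR=value prefix is not treated as an assignment once it is passed as an argument.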