@ashmchiu
Last active January 10, 2025 01:26
EECS Course Website Cron Job (Jekyll Site)

[NEW] SU24: Beginning of semester setup

Due to updates in university policies, course websites need to run behind CalNet authentication. The best way to do so is to serve our current site at https://inst.eecs.berkeley.edu/~cs161/$SEMESTER. To set this up after cloning the new semester site:

  1. Log into the AWS Project 3 server.
  2. Make a new SSH key pair. You can do this by running
     ssh-keygen -t ed25519 -C "cs161-staff+$SEMESTER-161-deploy@berkeley.edu" -f "${SEMESTER}_161_deploy"
    in the ~/.ssh/ folder. Name the key ${SEMESTER}_161_deploy to avoid overwriting another key (the braces matter: bash reads $SEMESTER_161_deploy as a single, unset variable). Add the public key (PK) to your semester repository's list of deploy keys. The nice thing about deploy keys is that they are specific to a repository, not a user. Do not add the public key to your personal GitHub keys, or anyone with access to the server could inadvertently modify anything your GitHub account can (least privilege).
    • Notably, grant this deploy key read access only. Do all your website updates as you normally would from your local machine. After this setup, you do not need to access the Project 3 server for anything except Project 3 deployment/accessing grades.
  3. cd into course-website on the Project 3 server.
  4. If you run ls there, you should see something like this
    deploy-site.sh  su24-site 
    • deploy-site.sh: This is the Bash script that, when run, pulls down the most recent changes to the course site repository, builds the Jekyll site, and then securely copies the HTML over to the instructional EECS server as the public, student-facing HTML under https://inst.eecs.berkeley.edu/~cs161/$SEMESTER.
    • su24-site: This is the previous semester's repository (or at time of writing, is the current lol).
  5. (Optional) rm -rf the previous semester's repository so as not to overload the server (goodbye).
  6. Clone the new semester's repository into the Project 3 server under the course-website directory. Remember to clone over SSH (since you just added the public key as a deploy key). This allows you to git pull from the Project 3 server, but not push. You may need to run the clone as GIT_SSH_COMMAND="ssh -i ~/.ssh/${SEMESTER}_161_deploy" git clone <clone link> if you get a fatal no access error.
  7. In deploy-site.sh, add a semester key following the template SEMESTER_KEYS["$SEM"]="$HOME/.ssh/${SEM}_161_deploy", where $SEM matches (sp|su|fa)[0-9][0-9] (mimicking your repo name). If you removed the previous semester's repository in Step 5, please also remove that semester's key line.
  8. Now edit the cron table:
    crontab -e
    
    # Now edit the semester in this line
    */5 * * * * /home/cs161/course-website/deploy-site.sh 2>&1 | logger -t cs161-$SEMESTER
    All you need to do is change the semester: replace $SEMESTER with the literal tag (for example su24), since cron does not define that variable. The tag is mainly helpful for logging (we run other courses' cron jobs as well, so it lets you isolate issues). This cron job runs the deploy-site.sh script every five minutes, indefinitely.
  9. Finally, you'll need to access the cs161@hiveX... instructional account. If you do not have direct access, you can run, from the Project 3 server,
    ssh -i ~/.ssh/id_ed25519 -J cs161@instgw.eecs.berkeley.edu cs161@cory.eecs.berkeley.edu
    for CS161, or
    ssh -i ~/.ssh/hive_188 -J cs161@instgw.eecs.berkeley.edu cs188@cory.eecs.berkeley.edu
    for CS188.

Navigate to ~/public_html and make a folder for the new semester. Then run

chmod -R a+rX $SEMESTER

to ensure that all the HTML in that semester is accessible/readable to the public.

  10. In the instructional account, add a .htaccess file (with the right permissions) that has the following contents

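ErrorDocument 404 /~cs161/$SEMESTER/404.html

This ensures that any 404 error that occurs within /~cs161/$SEMESTER/ leads to the 404.html page; Apache's ErrorDocument directive needs the status code before the path. Write the semester out literally (for example su24), since .htaccess does not expand shell variables. You may need to change the file's permissions with chmod 644 .htaccess.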
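
Several of the steps above reuse the same semester tag (key filename, repository folder, logger tag, site URL). As a minimal sketch, assuming a hypothetical helper named semester_names that is not part of deploy-site.sh, the tag can be validated against the (sp|su|fa)[0-9][0-9] pattern from Step 7 and the derived names printed for a quick sanity check before editing anything:

```shell
#!/bin/bash
# Hypothetical helper, not part of deploy-site.sh: checks that a semester tag
# matches (sp|su|fa)[0-9][0-9] and prints the names the setup steps derive from it.
semester_names() {
    local sem=$1
    if [[ ! $sem =~ ^(sp|su|fa)[0-9][0-9]$ ]]; then
        echo "invalid semester tag: $sem" >&2
        return 1
    fi
    # Braces keep bash from reading "$sem_161_deploy" as one variable name.
    echo "key=$HOME/.ssh/${sem}_161_deploy"
    echo "repo_dir=/home/cs161/course-website/${sem}-site"
    echo "logger_tag=cs161-${sem}"
    echo "site_url=https://inst.eecs.berkeley.edu/~cs161/${sem}"
}

semester_names su24
```

The paths and the cs161 prefix are copied from the steps above; other courses would need their own values.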

Common Errors

If you run into CSS errors (where your site does not have the correct styling), you may see errors in the browser console such as

Loading failed for the <script> with source “https://inst.eecs.berkeley.edu/assets/js/vendor/lunr.min.js”. [sp25:25:49](https://inst.eecs.berkeley.edu/~cs188/sp25/)
Loading failed for the <script> with source “https://inst.eecs.berkeley.edu/assets/js/just-the-docs.js”. [sp25:28:45](https://inst.eecs.berkeley.edu/~cs188/sp25/)
Uncaught ReferenceError: jtd is not defined
<anonymous> https://inst.eecs.berkeley.edu/~cs188/sp25/:62
[sp25:62:3](https://inst.eecs.berkeley.edu/~cs188/sp25/)
Loading failed for the <script> with source “https://inst.eecs.berkeley.edu/assets/katex/katex.min.js”. [sp25:93:48](https://inst.eecs.berkeley.edu/~cs188/sp25/)
Loading failed for the <script> with source “https://inst.eecs.berkeley.edu/assets/katex/contrib/auto-render.min.js”. [sp25:106:86](https://inst.eecs.berkeley.edu/~cs188/sp25/)
Loading failed for the <script> with source “https://inst.eecs.berkeley.edu/assets/js/jquery.min.js”. [sp25:341:40](https://inst.eecs.berkeley.edu/~cs188/sp25/)
Uncaught ReferenceError: $ is not defined
<anonymous> https://inst.eecs.berkeley.edu/~cs188/sp25/:345
[sp25:345:1](https://inst.eecs.berkeley.edu/~cs188/sp25/)
Loading failed for the <script> with source “https://inst.eecs.berkeley.edu/assets/js/tocbot.min.js”. [sp25:357:40](https://inst.eecs.berkeley.edu/~cs188/sp25/)
Uncaught ReferenceError: tocbot is not defined
<anonymous> https://inst.eecs.berkeley.edu/~cs188/sp25/:360
[sp25:360:1](https://inst.eecs.berkeley.edu/~cs188/sp25/)

You may need to update your _config.yml file so that baseurl is set to /~cs188/$SEMESTER (with the leading slash, as in the sed statements in deploy-site.sh). For courses like CS188 that deploy only to inst.eecs, you can do so directly in the repository. However, if, like CS161, inst.eecs only serves as a mirror for the actual course website (cs161.org), you can instead uncomment the two blocks of sed statements that are commented out in deploy-site.sh.
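
To see what the baseurl change does, here is a minimal sketch that applies the same two substitutions as the commented-out sed lines in deploy-site.sh, but as a filter over standard input so it can be tried without touching a real repository. The function name rewrite_for_inst and the sample config contents are assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical filter: reads a _config.yml on stdin and rewrites baseurl/url
# for the inst.eecs mirror, using the same patterns as the sed lines in
# deploy-site.sh. It writes to stdout and never edits a file in place.
rewrite_for_inst() {
    local sem=$1
    sed -e "s|baseurl: ''|baseurl: '/~cs161/$sem'|g" \
        -e "s|url: https://$sem.cs161.org|url: https://inst.eecs.berkeley.edu|g"
}

# Made-up sample config, for illustration only.
printf "title: Example Course\nbaseurl: ''\nurl: https://su24.cs161.org\n" \
    | rewrite_for_inst su24
# Prints:
# title: Example Course
# baseurl: '/~cs161/su24'
# url: https://inst.eecs.berkeley.edu
```

With an empty baseurl the theme requests /assets/js/..., which does not exist on inst.eecs; with the rewritten baseurl the same asset is requested from /~cs161/su24/assets/js/....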

Logging Tips

To debug, on the Project 3 server, you can run a combination of

   grep CRON /var/log/syslog
   grep $SEMESTER /var/log/syslog

The sample output below comes from a cron entry whose logger tag is just $SEMESTER, as opposed to cs161-$SEMESTER; grep $SEMESTER matches either tag. The first command will show you when the cron job was run:

Jun  4 23:25:01 ec2-54-219-214-175 CRON[1842862]: (cs161) CMD (/home/cs161/course-website/deploy-site.sh 2>&1 | logger -t su24)
Jun  4 23:30:01 ec2-54-219-214-175 CRON[1842949]: (cs161) CMD (/home/cs161/course-website/deploy-site.sh 2>&1 | logger -t su24)
Jun  4 23:35:01 ec2-54-219-214-175 CRON[1843049]: (cs161) CMD (/home/cs161/course-website/deploy-site.sh 2>&1 | logger -t su24)
Jun  4 23:40:01 ec2-54-219-214-175 CRON[1843071]: (cs161) CMD (/home/cs161/course-website/deploy-site.sh 2>&1 | logger -t su24)

For instance, from this, we can see that the cron job ran four times (5 minutes apart), since the CMD is the deploy-site.sh script.

The second command will show any of the logs from the bash script (since you added logger -t $SEMESTER to the cron table). For example,

Jun  5 00:10:01 ec2-54-219-214-175 CRON[1843532]: (cs161) CMD (/home/cs161/course-website/deploy-site.sh 2>&1 | logger -t su24)
Jun  5 00:10:01 ec2-54-219-214-175 su24: Starting [SYN]...
Jun  5 00:10:02 ec2-54-219-214-175 su24: From github.com:cs161-staff/su24-site
Jun  5 00:10:02 ec2-54-219-214-175 su24:  * branch            main       -> FETCH_HEAD
Jun  5 00:10:03 ec2-54-219-214-175 su24: Already up to date.
Jun  5 00:10:03 ec2-54-219-214-175 su24: Pulled the current repository
Jun  5 00:10:03 ec2-54-219-214-175 su24: Bundle complete! 2 Gemfile dependencies, 95 gems now installed.
Jun  5 00:10:03 ec2-54-219-214-175 su24: Use `bundle info [gemname]` to see where a bundled gem is installed.
Jun  5 00:10:03 ec2-54-219-214-175 su24: Installed ruby dependencies
Jun  5 00:10:04 ec2-54-219-214-175 su24: /home/cs161/.gem/gems/jekyll-3.9.5/lib/jekyll.rb:28: warning: csv was loaded from the standard library, but will no longer be part of the default gems since Ruby 3.4.0. Add csv to your Gemfile or gemspec. Also contact author of jekyll-3.9.5 to add csv into its gemspec.
Jun  5 00:10:04 ec2-54-219-214-175 su24: Configuration file: /home/cs161/course-website/su24-site/_config.yml
Jun  5 00:10:04 ec2-54-219-214-175 su24: To use retry middleware with Faraday v2.0+, install `faraday-retry` gem
Jun  5 00:10:05 ec2-54-219-214-175 su24:             Source: /home/cs161/course-website/su24-site
Jun  5 00:10:05 ec2-54-219-214-175 su24:        Destination: /home/cs161/course-website/su24-site/_site
Jun  5 00:10:05 ec2-54-219-214-175 su24:  Incremental build: disabled. Enable with --incremental
Jun  5 00:10:05 ec2-54-219-214-175 su24:       Generating... 
Jun  5 00:10:05 ec2-54-219-214-175 su24:       Remote Theme: Using theme just-the-docs/just-the-docs
Jun  5 00:10:16 ec2-54-219-214-175 su24:                     done in 10.992 seconds.
Jun  5 00:10:16 ec2-54-219-214-175 su24:  Auto-regeneration: disabled. Use --watch to enable.
Jun  5 00:10:16 ec2-54-219-214-175 su24: Built site
Jun  5 00:10:25 ec2-54-219-214-175 su24: Copied HTML files
Jun  5 00:10:26 ec2-54-219-214-175 su24: Added perms
Jun  5 00:10:26 ec2-54-219-214-175 su24: ... [ACK] Finished!

If something fails, walk through these logs: they contain every message echoed by deploy-site.sh as well as the output of the commands the script runs.
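
Scanning the log by hand gets tedious, so here is a minimal sketch of a filter that reports how the most recent run for one tag ended. It assumes the syslog layout shown above (tag in the fifth whitespace-separated field) and the [SYN], [FAILED], and [ACK] markers that deploy-site.sh echoes; the function name last_run_status is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: reads syslog-style lines on stdin and reports the state
# of the most recent deploy run logged under the given tag (e.g. su24).
last_run_status() {
    local tag=$1
    awk -v tag="$tag:" '
        # Field 5 is the logger tag in lines like
        # "Jun  5 00:10:01 host su24: Starting [SYN]..."
        $5 == tag && /Starting \[SYN\]/  { status = "started but not finished" }
        $5 == tag && /\[FAILED\]/        { status = "failed: " $0 }
        $5 == tag && /\[ACK\] Finished!/ { status = "finished" }
        END { print (status == "" ? "no runs found" : status) }
    '
}

# Example on a two-line excerpt of the log above.
printf '%s\n' \
    'Jun  5 00:10:01 ec2-54-219-214-175 su24: Starting [SYN]...' \
    'Jun  5 00:10:26 ec2-54-219-214-175 su24: ... [ACK] Finished!' \
    | last_run_status su24
```

On the server this would be run as last_run_status su24 < /var/log/syslog, passing cs161-su24 instead if that is the tag in your cron table.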

Note: If you need to jump to the cs188 account via cs161 in the future to run any command, run this: ssh -i <insert 188 hive key> -J cs161@instgw.eecs.berkeley.edu -l cs188 cory.eecs.berkeley.edu "<insert command here>".

This will execute the command as cs188 using cs161 as a bastion jump.

# this cron job runs the bash script `deploy-site.sh` every five minutes, and pipes all output to logging with tag cs161-su24
*/5 * * * * /home/cs161/course-website/deploy-site.sh 2>&1 | logger -t cs161-su24
#!/bin/bash

### GITHUB KEYS
declare -A SEMESTER_KEYS
# $HOME instead of ~ here: bash does not expand a tilde inside double quotes
SEMESTER_KEYS["su24"]="$HOME/.ssh/su24_deployment_key"
# Feel free to add more semesters here if you want to keep deployments up

SITE_BASE_DIR=/home/cs161/course-website

### HIVE KEY: the key you'll need to proxyjump into the hive (this is the same across all semesters)
DEPLOYMENT_KEY=~/.ssh/id_ed25519

echo "Starting [SYN]..."

deploy_site() {
    SEMESTER=$1
    SITE_DIR="$SITE_BASE_DIR/$SEMESTER-site"
    SSH_KEY="${SEMESTER_KEYS[$SEMESTER]}"

    if [ ! -d "$SITE_DIR" ]; then
        echo "[FAILED] Site directory $SITE_DIR does not exist"
        exit 1
    fi
    cd "$SITE_DIR"

    # Pull with the semester's read-only deploy key
    GIT_SSH_COMMAND="ssh -i $SSH_KEY" git pull origin main
    if [ $? -eq 0 ]; then
        echo "[$SEMESTER] Pulled the current repository"
    else
        echo "[FAILED][$SEMESTER] Could not pull the current repository"
        exit 1
    fi

    ### sed statements
    #####
    # make sure that your repository's _config.yml has the following attributes:
    # baseurl: ''
    # url: https://$SEMESTER.cs161.org
    #####
    # sed -i "s|baseurl: ''|baseurl: '/~cs161/$SEMESTER'|g" _config.yml
    # sed -i "s|url: https://$SEMESTER.cs161.org|url: https://inst.eecs.berkeley.edu|g" _config.yml

    /snap/bin/ruby /snap/ruby/350/bin/bundle install
    if [ $? -eq 0 ]; then
        echo "[$SEMESTER] Installed ruby dependencies"
    else
        echo "[FAILED][$SEMESTER] Could not install ruby dependencies"
        exit 1
    fi

    /snap/bin/ruby /snap/ruby/350/bin/bundle exec jekyll build
    if [ $? -eq 0 ]; then
        echo "[$SEMESTER] Built site"
    else
        echo "[FAILED][$SEMESTER] Could not build site"
        exit 1
    fi

    # Copy the built HTML to the instructional server through the gateway
    scp -i "$DEPLOYMENT_KEY" -r -o ProxyJump=cs161@instgw.eecs.berkeley.edu _site/* "cs161@cory.eecs.berkeley.edu:/home/ff/cs161/public_html/$SEMESTER/"
    if [ $? -eq 0 ]; then
        echo "[$SEMESTER] Copied HTML files"
    else
        echo "[FAILED][$SEMESTER] Could not copy HTML files"
        exit 1
    fi

    # Make the copied files world-readable; the quoted glob is expanded on the remote side
    ssh -i "$DEPLOYMENT_KEY" -J cs161@instgw.eecs.berkeley.edu -l cs161 cory.eecs.berkeley.edu "chmod -R a+rX /home/ff/cs161/public_html/$SEMESTER/*"
    if [ $? -eq 0 ]; then
        echo "[$SEMESTER] Added perms"
    else
        echo "[FAILED][$SEMESTER] Could not add perms"
        exit 1
    fi

    ### sed statements
    # sed -i "s|baseurl: '/~cs161/$SEMESTER'|baseurl: ''|g" _config.yml
    # sed -i "s|url: https://inst.eecs.berkeley.edu|url: https://$SEMESTER.cs161.org|g" _config.yml

    cd "$SITE_BASE_DIR"
}

for SEMESTER in "${!SEMESTER_KEYS[@]}"; do
    deploy_site "$SEMESTER"
done

echo "... [ACK] Finished!"
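
The script repeats the same "run a command, then echo success or [FAILED]" check after every step. One possible way to shorten it, shown here as a sketch rather than as how the script is actually written, is a small wrapper. The name run_step is hypothetical, and the demonstration uses true and false as stand-ins for git, bundle, and scp.

```shell
#!/bin/bash
# Hypothetical helper, not in the original script: runs a command and echoes
# the same "[SEMESTER] ..." / "[FAILED][SEMESTER] ..." messages the script uses.
run_step() {
    local semester=$1 ok_msg=$2 fail_msg=$3
    shift 3
    if "$@"; then
        echo "[$semester] $ok_msg"
    else
        echo "[FAILED][$semester] $fail_msg"
        return 1
    fi
}

# Demonstration with harmless stand-in commands.
run_step su24 "Built site" "Could not build site" true
run_step su24 "Copied HTML files" "Could not copy HTML files" false || echo "stopping early"
```

Inside deploy_site, a call might look like run_step "$SEMESTER" "Built site" "Could not build site" /snap/bin/ruby /snap/ruby/350/bin/bundle exec jekyll build || return 1. Because the git pull relies on a per-command environment variable, it would need to go through env, as in env GIT_SSH_COMMAND="ssh -i $SSH_KEY" git pull origin main. Returning 1 instead of calling exit 1 is a design choice: it would let the loop move on to the next semester rather than ending the whole script.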

peyrin commented Jan 6, 2025

Suggestion: The ssh-keygen command can also include a comment in the email address, and also explicitly define filename output:

ssh-keygen -t ed25519 -C "cs161-staff+sp25-188-deploy@berkeley.edu" -f sp25_188_deploy

peyrin commented Jan 6, 2025

Suggestion: For TAs without access to the cs161 or cs188 accounts on the hive, the SSH can be done directly in the box server like this:

ssh -i ~/.ssh/hive_188 -J cs161@instgw.eecs.berkeley.edu cs188@cory.eecs.berkeley.edu

peyrin commented Jan 6, 2025

Common bug: If you follow these steps exactly on course-site-template, the website will look weird with no CSS loaded. If you open the console, you'll see errors like this:

Loading failed for the <script> with source “https://inst.eecs.berkeley.edu/assets/js/vendor/lunr.min.js”. [sp25:25:49](https://inst.eecs.berkeley.edu/~cs188/sp25/)
Loading failed for the <script> with source “https://inst.eecs.berkeley.edu/assets/js/just-the-docs.js”. [sp25:28:45](https://inst.eecs.berkeley.edu/~cs188/sp25/)
Uncaught ReferenceError: jtd is not defined
    <anonymous> https://inst.eecs.berkeley.edu/~cs188/sp25/:62
[sp25:62:3](https://inst.eecs.berkeley.edu/~cs188/sp25/)
Loading failed for the <script> with source “https://inst.eecs.berkeley.edu/assets/katex/katex.min.js”. [sp25:93:48](https://inst.eecs.berkeley.edu/~cs188/sp25/)
Loading failed for the <script> with source “https://inst.eecs.berkeley.edu/assets/katex/contrib/auto-render.min.js”. [sp25:106:86](https://inst.eecs.berkeley.edu/~cs188/sp25/)
Loading failed for the <script> with source “https://inst.eecs.berkeley.edu/assets/js/jquery.min.js”. [sp25:341:40](https://inst.eecs.berkeley.edu/~cs188/sp25/)
Uncaught ReferenceError: $ is not defined
    <anonymous> https://inst.eecs.berkeley.edu/~cs188/sp25/:345
[sp25:345:1](https://inst.eecs.berkeley.edu/~cs188/sp25/)
Loading failed for the <script> with source “https://inst.eecs.berkeley.edu/assets/js/tocbot.min.js”. [sp25:357:40](https://inst.eecs.berkeley.edu/~cs188/sp25/)
Uncaught ReferenceError: tocbot is not defined
    <anonymous> https://inst.eecs.berkeley.edu/~cs188/sp25/:360
[sp25:360:1](https://inst.eecs.berkeley.edu/~cs188/sp25/)

Notice it's trying to load JS scripts from https://inst.eecs.berkeley.edu/assets/js/vendor/lunr.min.js, but the correct URL is https://inst.eecs.berkeley.edu/~cs188/sp25/assets/js/vendor/lunr.min.js.

This is because you didn't update baseurl in _config.yml in the website repo. If you update that to something like ~cs188/sp25, the JS will get loaded properly.

This isn't an issue on the CS161 version of deploy-site.sh because you have a sed line to take care of it, but on the version in this Gist, which I think assumes deploying to ~cs188/sp25 only (and not some other domain like cs161.org), I think you have to hard-code _config.yml properly to get things to work.

ashmchiu commented Jan 6, 2025

@peyrin Nice call out -- I think it already existed in the previous 188 websites, but not in course-site-template. Thanks!

Completely forgot this was a Gist. Might push it out to our general repositories and hopefully it is maintained. Ngl my script probably does not use best practices, so people more proficient in scripting might find it helpful to redo if needed.
