@n-a-t-e
Last active February 22, 2023 17:59
ERDDAP dataset logs in GoAccess
sudo cat /var/log/httpd/ssl_access_log* |
# keep only requests to these ERDDAP endpoints that name a dataset
grep -E -i " /erddap/(tabledap|griddap|rss|files|info)/[a-zA-Z0-9_]+[\.\/\?]" |
# drop index, allDatasets, and documentation pages, which are not individual datasets
grep -v -E " /erddap/\w+/(index|allDatasets|documentation)" |
# rewrite requests like /erddap/tabledap/DFO_MEDS_BUOYS.htmlTable?STN_ID&time>=now-1week&distinct() to /DFO_MEDS_BUOYS, so every request for a dataset is counted under one path
sed -r 's/(.*) \/erddap\/\w+\/([a-zA-Z0-9_]+)[\.\/\?].* HTTP/\1 \/\2 HTTP/g' |
sudo goaccess - -o /var/www/html/erddap_logs/index3_no_crawlers.html --log-format=COMMON --ignore-crawlers
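
To check the filters and the sed rewrite without touching the real logs or running goaccess, the same three stages can be run on a single line. This is only a sketch: the sample line below is made up (documentation-range IP, arbitrary timestamp, status, and size) and assumes Common Log Format and GNU grep/sed, as the pipeline above does.

```shell
# Hypothetical log line in Common Log Format; every field except the request path is invented
sample='192.0.2.1 - - [22/Feb/2023:17:59:00 +0000] "GET /erddap/tabledap/DFO_MEDS_BUOYS.htmlTable?STN_ID&time>=now-1week&distinct() HTTP/1.1" 200 1234'

# Same grep/grep -v/sed stages as the pipeline above, without sudo or goaccess
rewritten=$(printf '%s\n' "$sample" |
  grep -E -i " /erddap/(tabledap|griddap|rss|files|info)/[a-zA-Z0-9_]+[\.\/\?]" |
  grep -v -E " /erddap/\w+/(index|allDatasets|documentation)" |
  sed -r 's/(.*) \/erddap\/\w+\/([a-zA-Z0-9_]+)[\.\/\?].* HTTP/\1 \/\2 HTTP/g')

# Print the line as goaccess would receive it
printf '%s\n' "$rewritten"
```

The request path in the printed line should be reduced to the dataset ID alone, which is what lets GoAccess group all query variants of a dataset under one entry.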