@pradyumnac
Created January 30, 2023 12:34
Get all google alerts in newsboat url format (Bash)
#!/usr/bin/env bash
# Customisations/granular instructions need to be followed
# Refer [1], [2], [3]
# Written for Linux. May need some customisation for Windows
# Dependencies:
# - htmlq: https://github.com/mgdm/htmlq
# - jq: https://github.com/stedolan/jq
# - html
#
# - my chrome cookie stealer - https://github.com/pradyumnac/chrome-cookies
# or
# use chrome extension for manually downloading cookies
# - getcookies.txt - https://chrome.google.com/webstore/detail/get-cookiestxt/bgaddhkoddajcdgocldbbfleckgcbcid?hl=en
#
# [1] Pass in your Google Alerts url
# If you use multiple Google accounts and the alerts belong to the third account (index 2), pass in
# https://www.google.com/alerts?authuser=2
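# Abort early when no url is given; the rest of the script cannot work without it
if [ -z "$1" ]; then
  echo "Usage: $0 <google-alerts-url>" >&2
  exit 1
fi
url="$1"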
# [2]: Using my python app to steal google chrome cookies - for linux/gnome-keyring
# Alternate easier but manual way is to save your cookie text to the below location
# using getcookies extension
chromecookies "$url" > /tmp/ccookie-alerts
# curlua recreated here
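# base_url: scheme and host of the url, e.g. https://www.google.com
base_url="$(echo "$url" | grep -Eo '(http|https)://[a-zA-Z0-9.-]+')"
# base_name: base_url with "://" (and any "/") removed; only used to name the temporary cookie jar
base_name="$(echo "$base_url" | sed 's,://,,; s,/,,')"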
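# Illustration only, not part of the original script: a minimal sketch of
# splitting a url into its scheme+host prefix and its bare host name with
# parameter expansion instead of grep/sed. The function name split_url is
# hypothetical and nothing below calls it.
split_url() {
  local url="$1" scheme rest host
  scheme="${url%%://*}"     # text before the first "://"
  rest="${url#*://}"        # text after the first "://"
  host="${rest%%[/?#]*}"    # cut at the first "/", "?" or "#"
  printf '%s://%s %s\n' "$scheme" "$host" "$host"
}
# Example: split_url 'https://www.google.com/alerts?authuser=2'
# prints:  https://www.google.com www.google.com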
curl -sS "$url" --user-agent "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36" \
--header "accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9" \
-L \
--referer "$base_url" \
-c "/tmp/curlua-cookie-$base_name.txt" \
-b /tmp/ccookie-alerts | \
htmlq ".main-page script" | sed -E 's/.*\{window\.STATE\=(.*);\}.*/\1/g' | \
jq '.[0][0]|.[]| "https://www.google.com/alerts/feeds/\(.[2])/\(.[1][5][0][10]) ~\(.[1][2][0])" ' | \
sed 's/^"\(.*\)"$/\1/' | sed 's/\\\"//g' | sed 's/~/"~/' | sed 's/$/"/' > ~/repos/newsboat/urlcategory/alerts
# [3] The redirect target at the end of the pipeline is your newsboat urls file. Modify accordingly.
# The pipeline fetches the script tag and extracts the JSON from the window.STATE assignment.
# jq wraps each result in double quotes, and any " typed when creating the alert shows up escaped as \".
# The sed chain removes both, then wraps the ~name part in double quotes.
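# Illustration only, not part of the original script: the same sed chain
# wrapped in a function so it can be tried on one line of jq output. The
# function name alerts_to_newsboat and the sample feed url are hypothetical.
alerts_to_newsboat() {
  sed 's/^"\(.*\)"$/\1/' |   # drop the outer quotes added by jq
    sed 's/\\\"//g' |        # drop escaped quotes from the alert query
    sed 's/~/"~/' |          # open a quote before the ~name part
    sed 's/$/"/'             # close the quote at end of line
}
# Example:
#   printf '%s\n' '"https://www.google.com/alerts/feeds/1/2 ~\"foo bar\""' | alerts_to_newsboat
# prints:
#   https://www.google.com/alerts/feeds/1/2 "~foo bar"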
# Clean up the temporary cookie files
rm -f "/tmp/curlua-cookie-$base_name.txt" /tmp/ccookie-alerts