@Strubbl
Created March 25, 2022 21:10
Send links from an RSS feed to archive.org
#!/bin/bash
set -u
start_time=$(date)
feeds=$(xargs -a feeds.txt)
archived_file="archived.txt"
new_articles=""
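for f in $feeds
do
  # Collect article links from the feed, skipping the bare home-page link.
  links=$(curl -s "$f" | grep "<link>http://www.tageszeitung-fiktiv.xyz/" | grep -v "<link>http://www.tageszeitung-fiktiv.xyz/</link>" | sed 's/\s*<link>//g; s/<\/link>//g' | xargs)
  for i in $links
  do
    # Follow redirects to get the final article URL.
    r=$(curl -sL -o /dev/null -w '%{url_effective}' "$i")
    # Exit status 0 means the paywall marker was found.
    curl -sL "$r" | grep "paywall-string" > /dev/null
    is_content_blocked=$?
    # Use the last path segment of the URL as the article identifier.
    article_title=$(echo "$r" | rev | cut -d "/" -f 1 | rev)
    grep -- "$article_title" "$archived_file" > /dev/null
    is_already_fetched=$?
    # ">Lesbar umsonst bis" ("free to read until") marks a time-limited free article.
    curl -sL "$r" | grep ">Lesbar umsonst bis" > /dev/null
    is_limited_free=$?
    # Archive only articles that are not paywalled, not yet archived, and free for a limited time.
    if [ $is_content_blocked -ne 0 ] && [ $is_already_fetched -ne 0 ] && [ $is_limited_free -eq 0 ]
    then
      echo "$r" >> "$archived_file"
      new_articles="$r\n$new_articles"
    fi
  done
done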
if [ "$new_articles" != "" ]
then
  # go install github.com/wabarc/archive.org/cmd/archive.org@latest
  /home/strubbl/go/bin/archive.org $(echo -e "$new_articles" | xargs)
  # echo "start time: $start_time"
  # echo "end time: $(date)"
fi
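
The per-article checks in the script (link extraction, slug lookup in the archive file, and the two page markers) could be factored into small helpers so that each page only has to be downloaded once. The following is a minimal sketch, not part of the original gist: the function names `article_slug`, `is_archived`, `extract_links` and `is_temporarily_free` are hypothetical, and `article_slug` assumes a trailing slash should be ignored, which differs slightly from the `rev | cut | rev` approach above.

```shell
#!/bin/bash
# Hypothetical helpers; the names are illustrative and do not appear in the original script.

# Print the last path segment of a URL, ignoring one trailing slash (assumption).
article_slug() {
  local url="${1%/}"
  printf '%s\n' "${url##*/}"
}

# Succeed if the slug already appears, as a fixed string, in the archive file.
is_archived() {
  local slug="$1" file="$2"
  [ -f "$file" ] && grep -qF -- "$slug" "$file"
}

# Read feed XML on stdin and print the article links that start with the
# given base URL, skipping the bare home-page link.
extract_links() {
  local base="$1"
  grep -F "<link>$base" \
    | grep -vF "<link>$base</link>" \
    | sed 's/^[[:space:]]*<link>//; s/<\/link>.*$//'
}

# Read already-downloaded HTML on stdin and succeed only if the paywall
# marker is absent and the "free until" marker is present.
is_temporarily_free() {
  local page
  page=$(cat)
  ! printf '%s\n' "$page" | grep -qF "paywall-string" \
    && printf '%s\n' "$page" | grep -qF ">Lesbar umsonst bis"
}
```

Inside the loop, the page could then be fetched once with `page=$(curl -sL "$r")` and passed on with `printf '%s\n' "$page" | is_temporarily_free`, instead of requesting the same URL twice.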