@git2samus
Created December 9, 2011 16:32
curl streaming-api archiver with throttling
#!/bin/bash
endpoint=https://stream.twitter.com/1/statuses/filter.json
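# Replace the placeholders below. They are quoted so that bash does not parse < and > as redirections.
proxy='<your-proxy>'
auth='<your-username>:<your-password>'
data='track=<your-comma-separated-keywords>'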
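while true; do
    # Each connection attempt gets its own directory under $1 (default: current directory).
    tmpdir=$(mktemp -d --tmpdir="${1:-$PWD}") || exit
    # -f makes curl exit with status 22 on an HTTP error instead of saving the error body.
    curl "$endpoint" -x "$proxy" -d "$data" -u "$auth" -f | gzip -9 > "$tmpdir"/stream.gz
    curl_status=${PIPESTATUS[0]}
    # gzip writes a header even for empty input, so the archive is never zero bytes;
    # check whether the decompressed stream holds at least one byte instead.
    if (( $(gzip -cd "$tmpdir"/stream.gz | head -c 1 | wc -c) > 0 )); then
        # Once a valid connection drops, reconnect immediately.
        unset tcp_delay_ms http_delay_s
        continue
    fi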
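    if ((curl_status == 22)); then
        # HTTP error: exponential backoff, starting at 10 s and doubling up to 240 s.
        unset tcp_delay_ms
        if [[ $http_delay_s ]]; then
            ((http_delay_s < 240)) && http_delay_s=$((http_delay_s * 2))
            ((http_delay_s > 240)) && http_delay_s=240
        else
            http_delay_s=10
        fi
        delay_s=$http_delay_s
    else
        # Any other failure is treated as a network error: linear backoff in 250 ms steps up to 16 s.
        unset http_delay_s
        if [[ $tcp_delay_ms ]]; then
            ((tcp_delay_ms < 16 * 1000)) && tcp_delay_ms=$((tcp_delay_ms + 250))
        else
            tcp_delay_ms=250
        fi
        delay_s=$(bc <<<"scale=3; $tcp_delay_ms / 1000")
    fi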
    echo "sleep ${delay_s}s"
    sleep "$delay_s"
done
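
The two throttling schedules in the loop above can be tried out without opening a connection. The following is a minimal sketch, not part of the original script: the function names `next_http_delay`, `next_tcp_delay_ms`, and `schedule` are hypothetical, and it assumes the delays are meant to be capped at exactly 240 s (HTTP errors) and 16000 ms (network errors).

```shell
#!/bin/bash
# Hypothetical helpers (not in the original script) that compute the next delay.

# Exponential backoff for HTTP errors: 10 s on the first failure,
# then double the previous delay, capped at 240 s (assumed cap).
next_http_delay() {
    local current=$1
    if [[ -z $current ]]; then
        echo 10
    elif ((current * 2 > 240)); then
        echo 240
    else
        echo $((current * 2))
    fi
}

# Linear backoff for network errors: 250 ms on the first failure,
# then add 250 ms per attempt, capped at 16000 ms (assumed cap).
next_tcp_delay_ms() {
    local current=$1
    if [[ -z $current ]]; then
        echo 250
    elif ((current + 250 > 16000)); then
        echo 16000
    else
        echo $((current + 250))
    fi
}

# Print the first COUNT delays produced by repeatedly applying STEP,
# starting from an unset (empty) delay.
schedule() {
    local step=$1 count=$2 delay= i
    local out=()
    for ((i = 0; i < count; i++)); do
        delay=$("$step" "$delay")
        out+=("$delay")
    done
    echo "${out[*]}"
}

schedule next_http_delay 7      # prints: 10 20 40 80 160 240 240
schedule next_tcp_delay_ms 4    # prints: 250 500 750 1000
```

Keeping the step functions free of side effects makes the schedule easy to check in isolation; the archiver loop would still need to reset the other counter whenever the failure type changes, as the script does with `unset`.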