@BaiGang
Created July 13, 2012 07:37
A one-liner for extracting Baidu and Sogou's hot search keywords
# Baidu real-time hot searches
# The URL is quoted so the shell does not treat "?" as a glob character.
curl 'http://top.baidu.com/buzz.php?p=top10' \
| perl -MEncode -pe '$_=encode_utf8(decode(gb2312=>$_))' \
| grep 'td class="key"' | sed -e 's/^.*_blank">//' -e 's/<.*$//'
# Sogou top queries
# Quoting the URL leaves the [0-3] range to curl's own URL globbing
# (hotword0.html through hotword3.html) rather than to the shell.
curl 'http://top.sogou.com/hotword[0-3].html' \
| perl -MEncode -pe '$_=encode_utf8(decode(gb2312=>$_))' \
| perl -ne 'print "$1\n" while /title="(.*?)"/g'
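
The title-extraction step can be tried offline, without fetching either site. The following is a minimal sketch: `extract_titles` is a hypothetical helper name, and the sample markup is made up for illustration rather than copied from Sogou's pages.

```shell
# Hypothetical helper: print the value of every title="..." attribute
# found on stdin, one value per line.
extract_titles() {
  perl -ne 'print "$1\n" while /title="(.*?)"/g'
}

# Made-up sample markup, not real Sogou output.
printf '%s\n' '<a href="#" title="first query">x</a><a href="#" title="second query">y</a>' \
  | extract_titles
```

Because the match is non-greedy and global, several `title` attributes on the same line are each printed on their own line.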