@tripleee
Created April 22, 2018 11:25
Metasmoke Tumblr hits
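#!/usr/bin/env python3
# Summarize Tumblr blog links found in Metasmoke search results (JSON).
import json, fileinput

# Maps 'scheme://host' to {'count': ..., 'date': ...}
tumblrs = dict()
for line in fileinput.input():
    # Each input line is expected to hold a JSON array of post records
    data = json.loads(line)
    for rec in data:
        for field in 'title', 'body':
            if '.tumblr.com/' in rec[field]:
                # Crude link extraction: split on the start of each anchor tag
                splits = rec[field].split('<a href="')
                for item in splits[1:]:
                    url = item.split('"')[0]
                    if '.tumblr.com/' in url:
                        # Keep only 'scheme://host', dropping the path
                        tumblr = '/'.join(url.split('/')[0:3])
                        if tumblr in tumblrs:
                            tumblrs[tumblr]['count'] += 1
                        else:
                            # Date is taken from the first record seen for this blog
                            tumblrs[tumblr] = {
                                'count': 1,
                                'date': rec['created_at']}

# Print most frequent first, ties broken by most recent date;
# blogs dated 2015 or 2016 are only counted, not listed.
skipped = 0
for tumblr in reversed(sorted(tumblrs, key=lambda rec: (tumblrs[rec]['count'], tumblrs[rec]['date']))):
    if tumblrs[tumblr]['date'].startswith(('2015-', '2016-')):
        skipped += tumblrs[tumblr]['count']
        continue
    print('%i %s %s' % (
        tumblrs[tumblr]['count'], tumblrs[tumblr]['date'], tumblr))
print('skipped %i' % skipped)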
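The script above pulls links out of each post by splitting on `<a href="`, which works for the markup in these results but would miss anchors with other attribute orders or quoting. As a hedged alternative (not what the script does), here is a minimal sketch using the standard library's `html.parser` and `urllib.parse`. The names `LinkCollector` and `tumblr_blogs` are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            for name, value in attrs:
                if name == 'href' and value:
                    self.hrefs.append(value)


def tumblr_blogs(html):
    """Return 'scheme://host' for each link to a *.tumblr.com host."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    blogs = []
    for href in parser.hrefs:
        parts = urlsplit(href)
        host = parts.hostname or ''
        if host.endswith('.tumblr.com'):
            blogs.append('%s://%s' % (parts.scheme, host))
    return blogs
```

Note one difference from the original: this matches on the host name, so a link to a blog's bare front page with no trailing slash is also counted, whereas the script requires `.tumblr.com/` to appear in the URL.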
@tripleee

Hi, Drew, and sorry for being away over the weekend.

The search I linked to is exhaustive for the last 3 years, and contains a link where you can download the search results as JSON. But here, for your convenience, is a summary. It shows the recent Tumblr blog links, summarized by number of occurrences, with the latest occurrence date for each. The entire result set is 89 spam posts; the summary excludes the 23 records from before 2017, and one which simply links to www.tumblr.com.

The majority of these are from what we believe to be a single operation of pharma spammers operating out of India, but I have not filtered by type of spam.

5 2018-03-10T10:54:18.000Z https://healthsupplementzoneusa.tumblr.com
4 2018-04-19T10:52:46.000Z https://reviewcrazybulkusa.tumblr.com
3 2017-11-28T06:15:30.000Z https://buytestoultra.tumblr.com
2 2018-03-19T05:31:34.000Z https://juniviveserum.tumblr.com
2 2018-03-17T11:45:30.000Z https://healthcaresau.tumblr.com
2 2018-02-13T12:55:25.000Z https://suplementodiet.tumblr.com
2 2018-02-06T06:38:31.000Z http://healthsupplementsreviews.tumblr.com
2 2018-01-10T07:13:55.000Z https://testrot3male.tumblr.com
2 2017-12-21T11:00:48.000Z https://testrot3maleenhancement.tumblr.com
2 2017-12-08T11:53:09.000Z https://tone360garcinia.tumblr.com
2 2017-08-14T08:00:53.000Z https://interiorrenovation.tumblr.com
2 2017-07-17T07:20:19.000Z https://seocompanydelhi.tumblr.com
2 2017-06-13T11:27:08.000Z https://tboostmaxsite.tumblr.com
2 2017-01-28T04:49:31.000Z https://zyntixblog.tumblr.com
1 2018-04-10T06:27:55.000Z https://healthsupplementzoneblog.tumblr.com
1 2018-03-24T10:38:47.000Z https://vivraxrev.tumblr.com
1 2018-03-21T06:37:26.000Z https://eyeserummagic.tumblr.com
1 2018-03-20T05:57:30.000Z https://goldenhealthyreviews.tumblr.com
1 2018-03-17T12:31:57.000Z https://zyrectestosterone.tumblr.com
1 2018-03-13T08:47:23.000Z https://junivivecream-fr.tumblr.com
1 2018-02-26T11:03:04.000Z https://yourclarissaalvarez.tumblr.com
1 2018-02-21T09:42:24.000Z https://corawaltney.tumblr.com
1 2018-02-20T11:55:04.000Z https://maxtrimfxreview.tumblr.com
1 2018-02-20T11:36:07.000Z https://dailyhealthview.tumblr.com
1 2018-02-20T09:10:14.000Z https://alinbones.tumblr.com
1 2018-02-17T09:40:12.000Z https://dorothyblackk.tumblr.com
1 2018-02-16T09:13:57.000Z https://maarryblack.tumblr.com
1 2018-02-02T11:58:06.000Z https://buyephamere.tumblr.com
1 2018-01-29T05:10:48.000Z https://pinkdiamondskincarefact.tumblr.com
1 2018-01-27T14:04:13.000Z https://kitgrhm.tumblr.com
1 2018-01-24T06:02:03.000Z https://celuraidextreme.tumblr.com
1 2018-01-24T05:21:45.000Z https://pinkdiamondskincare.tumblr.com
1 2018-01-16T06:13:36.000Z https://supplementskingpro.tumblr.com
1 2018-01-14T09:36:30.000Z http://howtorecoverexcelfilepassword.tumblr.com
1 2018-01-06T09:29:40.000Z https://carolebrewington.tumblr.com
1 2018-01-03T04:45:27.000Z https://jouliagecream.tumblr.com
1 2018-01-02T07:06:20.000Z https://vexanmaleenhancement.tumblr.com
1 2018-01-02T05:37:00.000Z https://fitnessfactsau.tumblr.com
1 2017-12-30T06:35:57.000Z https://maxxboostreview.tumblr.com
1 2017-12-28T07:33:52.000Z https://supplement6274.tumblr.com
1 2017-12-22T09:19:20.000Z https://vastushastraechitech.tumblr.com
1 2017-12-14T06:02:06.000Z https://krasacream.tumblr.com
1 2017-11-29T05:52:36.000Z https://digitalmarketingtactics.tumblr.com
1 2017-11-15T07:50:22.000Z https://testoampxbuy.tumblr.com
1 2017-11-07T05:11:01.000Z https://smartrimforskolin.tumblr.com
1 2017-09-07T12:41:12.000Z https://corintdesign.tumblr.com
1 2017-08-30T09:01:47.000Z https://wholesalemascots.tumblr.com
1 2017-08-22T06:35:50.000Z https://vtrexmaleenhancement.tumblr.com
1 2017-08-04T10:32:36.000Z https://healthyminihubus.tumblr.com
1 2017-08-01T15:21:41.000Z https://myappleinfo.tumblr.com
1 2017-07-23T01:13:33.000Z https://rob-zed-draw.tumblr.com
1 2017-06-06T07:12:37.000Z https://issrahtein.tumblr.com
1 2017-04-25T12:21:27.000Z https://marvinreagle.tumblr.com
1 2017-04-08T11:35:08.000Z https://noxor-fr.tumblr.com
1 2017-04-04T12:28:34.000Z https://headlock-usa.tumblr.com

For your convenience, here's that search URL again: https://metasmoke.erwaysoftware.com/search?body=.tumblr.com%2F&feedback_filter=tp
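For anyone re-running the summary on a fresh JSON download: the list above pairs each blog with an occurrence count and its latest date. The sketch below shows one way to compute that pair without depending on the order of the input records. It is an illustration only; `summarize` and the `(blog, created_at)` input shape are assumptions, not part of the gist's script, and it relies on the timestamps all being in the same ISO 8601 UTC format so that string comparison matches chronological order.

```python
def summarize(hits):
    """Aggregate (blog, created_at) pairs, given in any order.

    Returns (count, latest_date, blog) tuples, most frequent first,
    ties broken by most recent date.
    """
    summary = {}
    for blog, created_at in hits:
        entry = summary.setdefault(blog, {'count': 0, 'date': created_at})
        entry['count'] += 1
        # Same-format ISO 8601 timestamps compare correctly as strings
        entry['date'] = max(entry['date'], created_at)
    return sorted(
        ((e['count'], e['date'], blog) for blog, e in summary.items()),
        reverse=True)


# Hypothetical sample data, not taken from the real search results
hits = [
    ('https://a.tumblr.com', '2018-01-02T00:00:00.000Z'),
    ('https://b.tumblr.com', '2018-03-01T00:00:00.000Z'),
    ('https://a.tumblr.com', '2018-02-05T00:00:00.000Z'),
]
for count, date, blog in summarize(hits):
    print('%i %s %s' % (count, date, blog))
```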

@tripleee

Added the tag #tumblr-deleted to these, and added #drugs to most of them. Many aren't actually drugs, though -- random stuff, typically with a single spam post each.
