@lorenzhs
Last active September 17, 2018 06:20
Twitter cryptoscam detection proof of concept. https://twitter.com/TinkerSec/status/961233575516389376
#!/usr/bin/env python3
# encoding: utf-8
# author: Lorenz Hübschle-Schneider
# This is really really simple. Twitter, you have no excuse for not doing something like this!
import codecs
import json
import re
from unicodedata import normalize
eth_regex = re.compile("0x[a-fA-F0-9]{40}")
btc_regex = re.compile("[13][a-km-zA-HJ-NP-Z1-9]{25,34}")
# from: https://stackoverflow.com/a/32558749/3793885
def levenshteinDistance(s1, s2):
    if len(s1) > len(s2):
        s1, s2 = s2, s1
    distances = range(len(s1) + 1)
    for i2, c2 in enumerate(s2):
        distances_ = [i2+1]
        for i1, c1 in enumerate(s1):
            if c1 == c2:
                distances_.append(distances[i1])
            else:
                distances_.append(1 + min((distances[i1], distances[i1 + 1], distances_[-1])))
        distances = distances_
    return distances[-1]
# return edit distance between ascii-normalized versions of the two input strings
def normalized_distance(original, query):
    normalized_original = normalize('NFKD', original).encode('ascii', 'ignore')
    normalized_query = normalize('NFKD', query).encode('ascii', 'ignore')
    return levenshteinDistance(normalized_original, normalized_query)
# compute a score for how likely a tweet is to be a cryptocurrency scam
# higher means more suspicious: an ETH/BTC address in the text adds 0.5, and
# name similarity to the original poster adds up to 1.0 (identical names),
# so the value lies in (0.0, 1.5]
def classify_scam(tweet, original_tweet):
    score = 0.0
    text = tweet["full_text"]
    if eth_regex.search(text) or btc_regex.search(text):
        score += 0.5
    displayname_distance = normalized_distance(original_tweet["user"]["name"],
                                               tweet["user"]["name"])
    username_distance = normalized_distance(original_tweet["user"]["screen_name"],
                                            tweet["user"]["screen_name"])
    score += 1.0/(displayname_distance + username_distance + 1)
    return score
if __name__ == '__main__':
    import sys
    if len(sys.argv) == 1:
        print('Usage: {} twarc-replies-dump.json'.format(sys.argv[0]))
        print('')
        print('This tool parses the output of "twarc replies <tweet-id>"')
        print('See https://github.com/DocNow/twarc for more information on twarc')
        sys.exit(1)
    filename = sys.argv[1]
    with codecs.open(filename, 'r', 'utf8') as inputfile:
        lines = inputfile.readlines()
    print('Read {} lines'.format(len(lines)))
    # the first line is the original tweet, the remaining lines are its replies
    original_tweet = json.loads(lines[0])
    suspects = []
    for line in lines[1:]:
        tweet = json.loads(line)
        score = classify_scam(tweet, original_tweet)
        if score > 0.2:
            suspects.append((score, tweet))
    # sort on the score alone: on a tie, Python 3 cannot compare the tweet dicts
    for (score, tweet) in sorted(suspects, key=lambda s: s[0], reverse=True):
        print('Found a likely scammy tweet, score {}:'.format(score))
        # print the str values directly; .encode() would print b'...' under Python 3
        print('\tfrom: {user} – {name}'.format(
            user = tweet["user"]["screen_name"],
            name = tweet["user"]["name"]))
        print('\ttext: {}'.format(tweet["full_text"]))
        print('---------------------------------------------------------------')

lorenzhs commented Feb 8, 2018

Here's the output when applied to the first 753 replies to https://twitter.com/elonmusk/status/961083704230674438:

Read 754 lines
Found a likely scammy tweet, score 1.0:
	from: elonmuski – Elon Musk

	text: @elonmusk Hi guys! I'm donating 250 Ethereum to the ETH community! First 250 transactions with 0.2 ETH sent to the address below will receive 1.0 ETH in the address the 0.2 ETH came from.

0xbdd74eab3839ca7da2992f653f9f9a2992b172dd

The promotion will last 24 hours! Hurry!
---------------------------------------------------------------
Found a likely scammy tweet, score 1.0:
	from: elonmuski – Elon Musk

	text: @elonmusk Hi guys! I'm donating 250 BITCOIN! to the BTC community! First 250 transactions with 0.2  BTC sent to the address below will receive 1.0 BTC in the address the 0.2 BTC came from.

1MHrzbRwb9LtwqRvSCd5zJ1ifDZ7auo2Y3

The promotion will last 24 hours! Hurry!
---------------------------------------------------------------
Found a likely scammy tweet, score 0.833333333333:
	from: eIonmus_ – Elon Musk

	text: @elonmusk Hi guys! I'm donating 300 Ethereum to the ETH community! First 300 transactions with 0.25 ETH sent to the address below will receive 1.0 ETH in the address the 0.25 ETH came from.

0x5ef6529fe12eff3af926a1adb2335d7f16471eb1

The promotion will last 48 hours! Hurry!
---------------------------------------------------------------
Found a likely scammy tweet, score 0.833333333333:
	from: alon_musk – Elon Musk

	text: @elonmusk Hi guys! I'm donating 250 Ethereum to the ETH community! First 250 transactions with 0.2 ETH sent to the address below will receive 1.0 ETH in the address the 0.2 ETH came from.

0x10aF9cd8096EA75a62007b616BC999536CE2A6fB

The promotion will last 24 hours! Hurry!
---------------------------------------------------------------
Found a likely scammy tweet, score 0.75:
	from: elomnosk – Elon Musk

	text: @elonmusk By the way: I'm giving away 125 BTC to my followers. Just send 0.025 BTC to the address below and I'll send you 0.5 BTC back, through the same address you used in the transaction.

19WV26bnvvEcrLnT8Wn9EqpHhAgcS7H8vB

This is my way of thanking all my fans and friends. Thank you!
---------------------------------------------------------------
Found a likely scammy tweet, score 0.7:
	from: eeIIon_musk – Elon Musk

	text: @elonmusk Hi guys! I'm donating 250 Ethereum to the ETH community! First 250 transactions with 0.2 ETH sent to the address below will receive 1.0 ETH in the address the 0.2 ETH came from.

0x90057cd3240625f81992917371a94e5c644da266

The promotion will last 24 hours! Hurry!
---------------------------------------------------------------
Found a likely scammy tweet, score 0.7:
	from: ElloonMusk – Elon Musk

	text: @elonmusk I'm happy and giving to my followers 100 Ethereum, send 0.2 Eth to the address below and you will receive 2.0 Ethereum.

0x46d09749BeA7989e4D68548C446CbEc2dc07992B

Act fast! you don't want to miss out!!
---------------------------------------------------------------
Found a likely scammy tweet, score 0.666666666667:
	from: ElonMuskkkk – Elon Musk

	text: @elonmusk Guys please beware of scammers on my page! The only address to send your Eth is 

0x3f0e13bd489Ad2C962C66Eb7275Ffbe168d5C191

For every 0.2 Eth sent I'll send 2 Eth back! But hurry, this offer is limited!
---------------------------------------------------------------
Found a likely scammy tweet, score 0.538461538462:
	from: VitalikButtiren – Vitalik Buterin

	text: @elonmusk Hi guys! I'm donating 500 ETH to the ETH community.  First 2500 transaction. Just send 0.2 ETH to the address below and you will receive 2.0 ETH.
 
0x7c7565ca3AF76E44031e811Fa896Cd8e0A1ADE88
---------------------------------------------------------------
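
As a rough check of how these scores come about, the sketch below recomputes a few of them by hand. It assumes the original account has the screen name `elonmusk` and the display name `Elon Musk` (inferred from the tweet URL, not read from the dump), and it reuses the same edit-distance and NFKD-to-ASCII approach as the script. The helper names `to_ascii` and `name_score` are made up for this illustration.

```python
from unicodedata import normalize


def levenshtein(s1, s2):
    """Edit distance counting insertions, deletions and substitutions."""
    if len(s1) > len(s2):
        s1, s2 = s2, s1
    previous = list(range(len(s1) + 1))
    for i2, c2 in enumerate(s2):
        current = [i2 + 1]
        for i1, c1 in enumerate(s1):
            if c1 == c2:
                current.append(previous[i1])
            else:
                current.append(1 + min(previous[i1], previous[i1 + 1], current[-1]))
        previous = current
    return previous[-1]


def to_ascii(s):
    # NFKD splits an accented letter into base letter + combining mark;
    # encoding to ASCII with 'ignore' then drops everything non-ASCII.
    return normalize('NFKD', s).encode('ascii', 'ignore').decode('ascii')


def name_score(original_user, original_name, user, name):
    # Hypothetical helper: the name-similarity part of the score only.
    distance = (levenshtein(to_ascii(original_name), to_ascii(name))
                + levenshtein(to_ascii(original_user), to_ascii(user)))
    return 1.0 / (distance + 1)


# Screen names taken from the output above; all use the display name "Elon Musk".
for user in ['elonmuski', 'eIonmus_', 'elomnosk', 'ElonMuskkkk']:
    # 0.5 for the ETH/BTC address found in the text, plus the name similarity.
    total = 0.5 + name_score('elonmusk', 'Elon Musk', user, 'Elon Musk')
    print('{:12s} {:.3f}'.format(user, total))
```

The comparison is case-sensitive, which is why `ElonMuskkkk` (two case changes plus three extra letters, distance 5) scores lower than `eIonmus_` (two substitutions, distance 2).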
