{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Detecting Instagram liker bots (or fans?)\n",
"## by comparing who has liked a user's recent posts\n",
"\n",
"If the same set of users likes every post of an Instagram user, this may be bot activity (or just loyal fans). We will detect those common users in 5 minutes."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Fetch the recent posts from the profile page of the user"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [],
"source": [
"import re\n",
"import json\n",
"import requests # pip install requests\n",
"\n",
"# raw string so that the regex escape \\. is not treated as a string escape\n",
"profile_pat = re.compile(r'<script type=\"text/javascript\">window\\._sharedData = (.*);</script>')\n",
"username = 'onurmatik' # instagram username to check\n",
"\n",
"# The exception handling is excluded for brevity\n",
"response = requests.get('https://www.instagram.com/%s/' % username)\n",
"match = profile_pat.findall(response.text)\n",
"parsed = json.loads(match[0])\n",
"shortcodes = [\n",
"    m['code'] for m in parsed['entry_data']['ProfilePage'][0]['user']['media']['nodes']\n",
"]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Fetch the likers of each post\n",
"Now we have the shortcodes (IDs) of the latest posts displayed on the user's profile page. It is time to fetch the users who have liked them."
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [],
"source": [
"# Instagram endpoint that returns the users who have liked a post\n",
"endpoint = 'https://www.instagram.com/graphql/query/?query_id=17864450716183058&variables=%s'\n",
"\n",
"likers = []\n",
"\n",
"for shortcode in shortcodes:\n",
"    url = endpoint % json.dumps({\n",
"        'shortcode': shortcode,\n",
"        'first': 1000,  # fetch up to 1,000 likers per post; paginate through the results to fetch all likers\n",
"    })\n",
"    response = requests.get(url)\n",
"    data = json.loads(response.text)\n",
"    # collect the usernames of everyone who liked this post\n",
"    likers.append([\n",
"        edge['node']['username'] for edge in data['data']['shortcode_media']['edge_liked_by']['edges']\n",
"    ])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Detect the users who have liked all of the posts\n",
"A simple set intersection gives us the users who appear in the liker list of every post."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"set.intersection(*map(set, likers))"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 2",
"language": "python",
"name": "python2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.11"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
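
The intersection step in the last cell can be tried offline on made-up data. The sketch below wraps it in a hypothetical helper, `common_likers` (not part of the notebook), and assumes `likers` is a list of username lists, one per post, as built in the previous cell. The sample usernames are invented. The helper also returns an empty set when no posts were fetched, because calling `set.intersection` with no arguments raises a `TypeError`.

```python
def common_likers(likers):
    """Return the usernames that appear in every per-post liker list.

    `likers` is assumed to be a list of lists of usernames, one inner list per post.
    """
    if not likers:
        # set.intersection() needs at least one set, so handle "no posts" explicitly
        return set()
    return set.intersection(*map(set, likers))


# Invented sample data: three posts and the users who liked each of them
sample_likers = [
    ['alice', 'bob', 'carol'],
    ['bob', 'carol', 'dave'],
    ['carol', 'bob'],
]
print(sorted(common_likers(sample_likers)))  # ['bob', 'carol']
```

Only `bob` and `carol` appear in all three lists, so they are the candidates for bots (or fans).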