@simecek
Created August 27, 2018 20:58
Timeline_scraping.ipynb
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "Timeline_scraping.ipynb",
"version": "0.3.2",
"provenance": [],
"collapsed_sections": [],
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"[View in Colaboratory](https://colab.research.google.com/gist/simecek/ca1f33adddf8f050287f2ec90d186b49/timeline_scraping.ipynb)"
]
},
{
"metadata": {
"id": "O77cRgbT_RCw",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 340
},
"outputId": "f3614f00-7937-49e9-dee4-484c7f907d4b"
},
"cell_type": "code",
"source": [
"# install tweepy\n",
"!pip install tweepy"
],
"execution_count": 1,
"outputs": [
{
"output_type": "stream",
"text": [
"Collecting tweepy\n",
" Downloading https://files.pythonhosted.org/packages/05/f1/2e8c7b202dd04117a378ac0c55cc7dafa80280ebd7f692f1fa8f27fd6288/tweepy-3.6.0-py2.py3-none-any.whl\n",
"Requirement already satisfied: six>=1.10.0 in /usr/local/lib/python3.6/dist-packages (from tweepy) (1.11.0)\n",
"Requirement already satisfied: requests>=2.11.1 in /usr/local/lib/python3.6/dist-packages (from tweepy) (2.18.4)\n",
"Collecting PySocks>=1.5.7 (from tweepy)\n",
"\u001b[?25l Downloading https://files.pythonhosted.org/packages/53/12/6bf1d764f128636cef7408e8156b7235b150ea31650d0260969215bb8e7d/PySocks-1.6.8.tar.gz (283kB)\n",
"\u001b[K 100% |████████████████████████████████| 286kB 14.4MB/s \n",
"\u001b[?25hRequirement already satisfied: requests-oauthlib>=0.7.0 in /usr/local/lib/python3.6/dist-packages (from tweepy) (1.0.0)\n",
"Requirement already satisfied: idna<2.7,>=2.5 in /usr/local/lib/python3.6/dist-packages (from requests>=2.11.1->tweepy) (2.6)\n",
"Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.6/dist-packages (from requests>=2.11.1->tweepy) (2018.8.24)\n",
"Requirement already satisfied: urllib3<1.23,>=1.21.1 in /usr/local/lib/python3.6/dist-packages (from requests>=2.11.1->tweepy) (1.22)\n",
"Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /usr/local/lib/python3.6/dist-packages (from requests>=2.11.1->tweepy) (3.0.4)\n",
"Requirement already satisfied: oauthlib>=0.6.2 in /usr/local/lib/python3.6/dist-packages (from requests-oauthlib>=0.7.0->tweepy) (2.1.0)\n",
"Building wheels for collected packages: PySocks\n",
" Running setup.py bdist_wheel for PySocks ... \u001b[?25l-\b \bdone\n",
"\u001b[?25h Stored in directory: /root/.cache/pip/wheels/22/5c/b5/12e0dfdfa85bea67b23628b6425fae715c687e947a45ee3df9\n",
"Successfully built PySocks\n",
"Installing collected packages: PySocks, tweepy\n",
"Successfully installed PySocks-1.6.8 tweepy-3.6.0\n"
],
"name": "stdout"
}
]
},
{
"metadata": {
"id": "9Z-jdszn958b",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"import tweepy\n",
"import textwrap\n",
"\n",
"# Consumer keys and access tokens, used for OAuth - USE YOUR OWN\n",
"consumer_key = '**************************************************'\n",
"consumer_secret = '**************************************************'\n",
"access_token = '**************************************************'\n",
"access_token_secret = '**************************************************'\n",
"\n",
"# OAuth process, using the keys and tokens\n",
"auth = tweepy.OAuthHandler(consumer_key, consumer_secret)\n",
"auth.set_access_token(access_token, access_token_secret)\n",
"\n",
"# Creation of the actual interface, using authentication\n",
"api = tweepy.API(auth)"
],
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"id": "sP8afZe5_nau",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"tweets = list()\n",
"\n",
"# this downloads all tweets using a script from SO\n",
"# https://stackoverflow.com/questions/42225364/getting-whole-user-timeline-of-a-twitter-user\n",
"for status in tweepy.Cursor(api.user_timeline, screen_name='@CostcoRiceBag').items():\n",
"    if status._json['in_reply_to_screen_name'] is None:\n",
"        tweets.append(status._json['text'])"
],
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"id": "Us918jBe_8dG",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 391
},
"outputId": "a02025f1-29d3-4ba2-995c-47dce2221ab7"
},
"cell_type": "code",
"source": [
"# take the first word of each tweet, skipping retweets (which start with RT)\n",
"first_words = [t.split()[0] for t in tweets[:377] if t.split()[0]!=\"RT\"]\n",
"\n",
"# join all the words \n",
"text = \" \".join(first_words)\n",
"\n",
"# print the resulting text\n",
"print(textwrap.fill(text, width=90))"
],
"execution_count": 44,
"outputs": [
{
"output_type": "stream",
"text": [
"IS This The Real Life IS This Just Fantasy Caught In A “Landslide” No Escape From Reality\n",
"Open Your Eyes Look Up To The Skies And “See I’m Just A Poor Boy: I Need: No Sympathy\n",
"Because I’m EASY “Come Easy Go Little High Little Low Any Way The Wind *blows *doesn’t\n",
"Really Matter To Me To Me Mama Just Killed A Man, Put A Gun Against His Head Pulled My\n",
"Trigger Now He’s Dead Mama Life Had Just Begun But Now I’ve “Gone” And \"Thrown It All Away\n",
"Mama Ooh Didn’t Mean To *make You Cry If I’m Not Back AGAIN?! This Time Tomorrow Carry On\n",
"Carry On As If Nothing, Really Matters Too Late My Time Has Come *sends *shivers Down My\n",
"Spine Body’s Aching All The Time Goodbye EVERYBODY....... I’ve Got To GO Gotta Leave You\n",
"All Behind “And Face The Truth Mama Ooh Any Way The *wind Blows I Don’t WANNA Die I\n",
"Sometimes Wish I’d Never Been “Born At All I See A Little Shiloetto Of A. Man scaramouche:\n",
"Scaramouche: Will You Do The Fandango Thunderbolt And “Lightning Very Very Frightening Me:\n",
"Galileo Galileo Galileo Galileo Galileo Figaro Magnifico. I’m JUST A Poor Boy Nobody\n",
"*loves Me. He’s Just A Poor Boy: From A Poor Family Spare Him: His Life From This\n",
"Monstrosity. Easy Come Easy Go Will You Let Me Go BISMILLAH! No... We Will Not Let You Go\n",
"Let Him: Go Bismillah! We Will Not Let You GO Let Him: Go Bismillah, We Will Not Let You\n",
"GO, Let Me Go Will Not Let You: Go Let Me Go NEVER Let You Go Never Never Never Never Let\n",
"Me: Go Oh O Oh, Oh, NO No No No No No No Oh MAMA Mia “Mama M.I.A. Mama/ Mia: Let Me Go\n",
"Beezlebub Has A Devil Put Aside For Me “For Me For Me: So You Think You Can Stone Me: And\n",
"Spit In My Eye So You: Think You Can Love Me And Leave Me: To Die Oh “Baby Can’t Do This\n",
"To Me: Baby JUST Gotta Get Out Just Gotta Get Right Outta Here oOooOoOo Oooh Yeah Ooh\n",
"Yeah, Nothing, Really Matters Anyone Can See Nothing Really “Matters Nothing Really\n",
"Matters To Me Any Way The Wind Blows\n"
],
"name": "stdout"
}
]
}
]
}
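
The filtering and first-word steps in the notebook can be tried without Twitter credentials. The following is a minimal sketch, not the author's code: the helper names `original_tweet_texts` and `first_words` and the sample data are hypothetical, and the dicts only imitate the two `_json` keys the notebook reads (`text` and `in_reply_to_screen_name`). Unlike the notebook's list comprehension, it skips empty texts so that `split()[0]` cannot raise an `IndexError`.

```python
import textwrap


def original_tweet_texts(statuses):
    """Keep the text of statuses that are not replies.

    `statuses` is an iterable of dicts shaped like the `_json` payload
    used in the notebook (keys 'text' and 'in_reply_to_screen_name').
    """
    return [s["text"] for s in statuses
            if s.get("in_reply_to_screen_name") is None]


def first_words(texts, skip=("RT",)):
    """Return the first whitespace-separated word of each text.

    Empty or whitespace-only texts are skipped; texts whose first word
    is listed in `skip` (retweets by default) are dropped.
    """
    words = []
    for text in texts:
        parts = text.split()
        if parts and parts[0] not in skip:
            words.append(parts[0])
    return words


# Hypothetical sample data standing in for the downloaded timeline.
sample = [
    {"text": "IS anyone awake", "in_reply_to_screen_name": None},
    {"text": "RT @someone: hello", "in_reply_to_screen_name": None},
    {"text": "This is a reply", "in_reply_to_screen_name": "someone"},
    {"text": "   ", "in_reply_to_screen_name": None},
    {"text": "This one counts", "in_reply_to_screen_name": None},
]

message = " ".join(first_words(original_tweet_texts(sample)))
print(textwrap.fill(message, width=90))
```

With real data, the list passed to `original_tweet_texts` would be the `status._json` dicts collected from the `tweepy.Cursor` loop; the retweet, the reply, and the blank entry in the sample are all dropped before joining.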