#To run the Shiny app and parts of this code, you must have ggvis and shiny installed on your computer.
#This requires R 3.1.0 or higher.
#If you are using Ubuntu trusty, you may need to add a line to your /etc/apt/sources.list file.
#You can do this by following the instructions below:
#First, type the following into your terminal: sudo gedit /etc/apt/sources.list
#
#Then, paste "deb http://cran.wustl.edu/bin/linux/ubuntu trusty/" on a new line at the end of the
#  sources.list file in the text editor.
#
"","Short_Key","Long_Name","Year","Country_GDP","Country_Population","Country_Pop_Dens","Region","Income","Income_Group_GDP","Income_Group_Population","Income_Group_Pop_Dens","Region_GDP","Region_Population","Region_Pop_Dens"
"1","AF","Afghanistan",2001,2461666314.78366,21347782,32.6988665257482,"South Asia","Low income",168320464483.076,651625175,43.8567060376896,638745671205.06,1405746086,294.504402860938
"2","AF","Afghanistan",2002,4128818042.61749,22202806,34.0413749750855,"South Asia","Low income",176332502923.58,666831547,44.3116864018666,672494121092.634,1429513552,299.505846646499
"3","AF","Afghanistan",2003,4583648921.74369,23116142,35.4417030802018,"South Asia","Low income",193372966826.881,682231310,45.3350175141828,784882714578.081,1453111368,304.449964769898
"4","AF","Afghanistan",2004,5285461998.97866,24018682,36.8254787421615,"South Asia","Low income",216569844116.367,697754397,46.3669133771839,911124327836.882,1476314527,309.420502370042
"5","AF","Afghanistan",2005,6275076015.72254,24860855,38
"","Title","Earnings_in_Mil","Released","Year","Month","Month_Num","Qtr_Num","Rated","Runtime_Minutes","Metascore","imdbRating","imdbVotes","Language_1","Language_2","Language_3","Language_4","Language_5","Language_6","Language_7","Language_8","Language_9","Country_1","Country_2","Country_3","Country_4","Country_5","Country_6","Country_7","Country_8","Country_9","Country_10","Country_11","Country_12","Country_13","Writer_1","Writer_2","Writer_3","Writer_4","Writer_5","Writer_6","Writer_7","Writer_8","Writer_9","Writer_10","Writer_11","Writer_12","Writer_13","Writer_14","Writer_15","Writer_16","Writer_17","Writer_18","Writer_19","Writer_20","Writer_21","Writer_22","Writer_23","Writer_24","Writer_25","Writer_26","Writer_27","Writer_28","Writer_29","Writer_30","Writer_31","Director_1","Director_2","Director_3","Director_4","Director_5","Director_6","Director_7","Director_8","Director_9","Director_10","Director_11","Director_12","Director_13","Director_14","Director_15","Director_16","Director_17","Director_18","
#First, load the libraries and the data
#setwd('Set working directory here.')
library(Hmisc)
library(ggvis)
library(dplyr)
library(car)
movie_df <- read.csv('Movie_DF_wBusinessData.csv')
import os
import pandas as pd
#os.chdir('Change your directory here ...')
'''
Take the output from IMDB_Initial_WebScraping.py and use Excel to separate out the
languages, countries, writers, genres, directors, and actors into separate
cells by using the "Text to Columns" feature with commas as the delimiter, making sure
'''
This program was written to scrape movie data from the IMDB website
in conjunction with omdbapi.com
'''
#Import the necessary libraries
import urllib2  # Python 2 only; in Python 3 this functionality lives in urllib.request
from bs4 import BeautifulSoup as bs
import pandas as pd
import os
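
The docstring above says the scraper works in conjunction with omdbapi.com. As a rough illustration of what one lookup request might look like, here is a minimal sketch that only builds the query URL and does not send it. It is written for Python 3 (`urllib.parse`), unlike the Python 2 imports above. The function name `build_omdb_url`, the placeholder key, and the query parameters `t` (title), `y` (year), `r` (response format), and `apikey` are assumptions about OMDb's query-string interface and are not taken from the original script; check them against the current OMDb documentation before relying on them.

```python
# Sketch only (Python 3): build an OMDb lookup URL for one movie title.
# Nothing here is taken from the original scraper; names are hypothetical.
from urllib.parse import urlencode

# Base URL assumed from the site named in the docstring (omdbapi.com).
OMDB_ENDPOINT = "http://www.omdbapi.com/"


def build_omdb_url(title, api_key, year=None):
    """Return a query URL for a single title lookup (no request is sent)."""
    # Assumed parameter names: t = title, apikey = key, r = response format.
    params = {"t": title, "apikey": api_key, "r": "json"}
    if year is not None:
        # Assumed parameter name: y = release year.
        params["y"] = str(year)
    # urlencode escapes spaces and special characters in the title.
    return OMDB_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    # "YOUR_KEY" is a placeholder, not a real API key.
    print(build_omdb_url("The Matrix", "YOUR_KEY", year=1999))
```

Keeping URL construction in its own function makes it easy to check the encoded query without touching the network; the actual fetch and JSON parsing would happen in a separate step.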