# Extract the list of ticker symbols in the Madrid Stock Exchange section of the Spanish edition of Yahoo Finance
ScrapedAux = sourceCode_tickers.split('components":{"components":')[1].split('],')[0].split(',')[0:-1]
print(ScrapedAux[:10])
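The chained `split` calls above leave the opening bracket and the quotes attached to each element, and the `[0:-1]` slice discards the last comma-separated element. As a hedged alternative, the sketch below pulls the same array out with a regular expression and parses it as JSON. The `sample_source` string and the `extract_tickers` helper are hypothetical stand-ins; the real page markup may differ.

```python
import json
import re

# Hypothetical excerpt standing in for the page source; the real markup may differ.
sample_source = 'x"components":{"components":["SAN.MC","BBVA.MC","ITX.MC"],"maxAge":1}y'

def extract_tickers(source):
    """Return the list found right after 'components":{"components":', or [] if absent."""
    match = re.search(r'components":\{"components":(\[.*?\])', source)
    if match is None:
        return []
    # The captured text is a JSON array of strings, so json.loads yields clean tickers.
    return json.loads(match.group(1))

tickers = extract_tickers(sample_source)
print(tickers)
```

Parsing the array as JSON keeps every element and removes the need to strip brackets and quotes afterwards.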
# Download the page source that lists the index components (Spanish edition of Yahoo Finance)
sourceCode_tickers = str(urlopen('https://es.finance.yahoo.com/quote/IGBM.MA/components?p=IGBM.MA').read())
print(sourceCode_tickers)
# List of fields we will scrape
list_of_fields = ['Market Cap', 'Enterprise Value', 'Trailing P/E', 'Forward P/E', 'PEG Ratio', 'Price/Sales', 'Price/Book', 'Enterprise Value/Revenue', 'Enterprise Value/EBITDA', 'Fiscal Year Ends', 'Most Recent Quarter', 'Profit Margin', 'Operating Margin', 'Return on Assets', 'Return on Equity', 'Revenue', 'Revenue Per Share', 'Quarterly Revenue Growth', 'Gross Profit', 'EBITDA', 'Net Income Avi to Common', 'Diluted EPS', 'Quarterly Earnings Growth', 'Total Cash', 'Total Cash Per Share', 'Total Debt', 'Total Debt/Equity', 'Current Ratio', 'Book Value Per Share', 'Operating Cash Flow', 'Levered Free Cash Flow', 'Beta', '52-Week Change', 'S&P500 52-Week Change', '52 Week High', '52 Week Low', '50-Day Moving Average', '200-Day Moving Average', 'Avg Vol (3 month)', 'Avg Vol (10 day)', 'Shares Outstanding', 'Float', '% Held by Insiders', '% Held by Institutions', 'Shares Short', 'Short Ratio', 'Short % of Float', 'Shares Short (prior month)', 'Forward Annual Dividend Rate', ' |
# Import libraries
from urllib.request import urlopen
import pandas as pd
import numpy as np
import time
import re
# Build the list of movie titles rated by user 21 so it can be passed to the recommendation function.
# The ratings DataFrame is filtered by user and its movie_title column is taken.
user_film_21_list = ratings[ratings.user_id==user_21].movie_title.tolist()
recomendations_21_user = get_movie_recomendations(user_film_21_list)
recomendations_21_user.Titulo.head(20)
user_21 = 21
ratings[ratings.user_id==user_21].sort_values(by=['rating'], ascending=False)
# Return the correlation vector for a movie
def get_similar_movie(movie):
    corr_matrix = np.corrcoef(ratings_matriz.T)
    movie_idx = list(movie_index).index(movie)
    return corr_matrix[movie_idx]

# Return the movies most similar to a user's tastes.
# To recommend movies to a user, take the list of movies they have watched and add up the correlations
# of those movies with all the others; the movies with the highest total correlation are returned.
def get_movie_recomendations(user):
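The body of `get_movie_recomendations` is cut off here. The following is only a sketch of the approach the comments describe, run on a hypothetical toy matrix, and is not the original implementation. The names `toy_matrix`, `toy_index`, `similar_movies`, `recommend` and the `correlation` column are assumptions; the `Titulo` column name mirrors the one read from `recomendations_21_user`.

```python
import numpy as np
import pandas as pd

# Hypothetical toy ratings (rows = users, columns = movies); 0 means "not rated".
toy_matrix = pd.DataFrame(
    {"A": [5, 4, 0, 1], "B": [4, 5, 1, 0], "C": [0, 1, 5, 4]},
    index=[1, 2, 3, 4],
)
toy_index = toy_matrix.columns

def similar_movies(movie):
    """Correlation of one movie's rating column with every movie's column."""
    corr_matrix = np.corrcoef(toy_matrix.T)
    return corr_matrix[list(toy_index).index(movie)]

def recommend(watched):
    """Sum the correlation vectors of the watched movies and rank the rest."""
    total = np.zeros(len(toy_index))
    for title in watched:
        total += similar_movies(title)
    result = pd.DataFrame({"Titulo": toy_index, "correlation": total})
    # Assumption: movies the user has already watched are not recommended again.
    result = result[~result["Titulo"].isin(watched)]
    return result.sort_values("correlation", ascending=False).reset_index(drop=True)

recs = recommend(["A"])
print(recs)
```

In this toy data, users who rate A highly also rate B highly and C poorly, so B ranks above C for someone who watched A.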
# Build the user x movie matrix of ratings
ratings_matriz = ratings.pivot_table(values='rating', index='user_id', columns='movie_title')
# Fill the NaN values with 0
ratings_matriz.fillna(0, inplace=True)
movie_index = ratings_matriz.columns
ratings_matriz.head()
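To make the pivot step concrete, here is a small sketch on hypothetical data (`toy_ratings` and `toy_matriz` are illustrative names, not part of the original notebook). It shows that user/movie pairs with no rating become NaN after the pivot and then 0 after `fillna`, so later correlation steps treat "not rated" as a rating of 0.

```python
import pandas as pd

# Hypothetical ratings in long format: one row per (user, movie) rating.
toy_ratings = pd.DataFrame({
    "user_id": [1, 1, 2, 3],
    "movie_title": ["A", "B", "A", "B"],
    "rating": [5, 3, 4, 2],
})

# One row per user, one column per movie; missing pairs come out as NaN.
toy_matriz = toy_ratings.pivot_table(values="rating", index="user_id", columns="movie_title")

# Replace the NaN entries with 0, as in the snippet above.
toy_matriz = toy_matriz.fillna(0)
print(toy_matriz)
```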
# Load the ratings file
ratings = pd.read_table('data/ratings.dat', header=None, sep='::', engine='python', names=['user_id', 'movie_id', 'rating', 'timestamp'])
# Drop the timestamp of when the rating was created
del ratings['timestamp']
# Add the title of the film
ratings = pd.merge(ratings, movies_df, on='movie_id')[['user_id', 'movie_title', 'movie_id','rating']]
ratings.head()
# Get the genres of, for example, the first film, Toy Story
toy_story_features = movies_df.loc[0][movie_categories]
print(toy_story_features)
# Calculate the score of the film for the user as the dot product of the two vectors
toy_story_user_predicted_score = dot_product(toy_story_features, user_preferences.values())
toy_story_user_predicted_score
# 5
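`dot_product` and `user_preferences` are defined outside this excerpt. A minimal sketch of how such a score could be computed follows, assuming `dot_product` is a plain sum of element-wise products and `user_preferences` is a dict keyed by genre in the same order as `movie_categories`. The genre list and all values below are invented for illustration.

```python
def dot_product(vector_1, vector_2):
    """Sum of element-wise products of two equally long sequences."""
    return sum(a * b for a, b in zip(vector_1, vector_2))

# Hypothetical genre flags for one film (1 = the film belongs to the genre).
movie_categories = ["Animation", "Children's", "Comedy", "Drama"]
toy_story_features = [1, 1, 1, 0]

# Hypothetical user weights per genre, in the same order as movie_categories.
user_preferences = {"Animation": 2, "Children's": 1, "Comedy": 2, "Drama": 4}

score = dot_product(toy_story_features, user_preferences.values())
print(score)  # 2*1 + 1*1 + 2*1 + 4*0 = 5
```

Only the genres the film belongs to contribute to the score, so a strong preference for a genre the film lacks (Drama here) has no effect.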