@badalnabizade
Created August 10, 2019 21:25
import numpy as np
from keras import models  # assumption: the gist does not show which Keras import provides `models`

# Assumes `movies` and `ratings` are pandas DataFrames (ratings loaded from ratings.csv),
# and that `m_inp` (movie input) and `m_b` (movie bias output) come from the trained model.
movie_names = movies.set_index('movieId')['title'].to_dict()
g = ratings.groupby('movieId')['rating'].count()
# topMovies holds the ids of the 3000 movies with the most user ratings.
topMovies = g.sort_values(ascending=False).index.values[:3000]
uniq = ratings.movieId.unique()  # Unique movie ids, in order of first appearance.
name2idx = {o: i for i, o in enumerate(uniq)}  # movieId -> contiguous index
topMovieIdx = np.array([name2idx[o] for o in topMovies])  # Indices of the most-rated movies in ratings.csv
get_movie_bias = models.Model(m_inp, m_b)  # Sub-model exposing the trained bias of each movie in ratings.csv
movie_bias = get_movie_bias.predict(topMovieIdx)  # Look up the bias weights of the most-rated movies in the bias embedding
movie_ratings = [(b[0], movie_names[i]) for i, b in zip(topMovies, movie_bias)]  # --> [(movie bias, movie name), ...]
best_movies = sorted(movie_ratings, key=lambda o: o[0], reverse=True)[:48]  # The 48 movies with the highest bias
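
The same count, index, look up, and rank steps can be tried without a trained model. The following is only a toy sketch in plain Python: the ratings, titles, and bias values are made up for illustration, and `id2idx` and `bias_by_idx` are hypothetical stand-ins for `name2idx` and the output of `get_movie_bias.predict`.

```python
from collections import Counter

# Toy stand-ins (hypothetical data, not taken from ratings.csv).
ratings = [  # (userId, movieId)
    (1, 30),
    (1, 10), (2, 10), (3, 10),
    (1, 20), (2, 20),
]
movie_names = {10: "Movie A", 20: "Movie B", 30: "Movie C"}

# Count ratings per movie and keep the ids of the 2 most-rated movies.
counts = Counter(movie_id for _, movie_id in ratings)
top_movies = [movie_id for movie_id, _ in counts.most_common(2)]

# Map each raw movieId to a contiguous index, in order of first appearance
# (this mirrors enumerate(ratings.movieId.unique())).
ids_in_order = list(dict.fromkeys(movie_id for _, movie_id in ratings))
id2idx = {movie_id: idx for idx, movie_id in enumerate(ids_in_order)}

# Hypothetical learned biases, one per contiguous index.
bias_by_idx = [-0.1, 0.3, 0.9]

# Pair each most-rated movie with its bias, then rank by bias, highest first.
movie_ratings = [(bias_by_idx[id2idx[m]], movie_names[m]) for m in top_movies]
best_movies = sorted(movie_ratings, key=lambda o: o[0], reverse=True)
print(best_movies)
```

Note that the most-rated movie is not necessarily the one with the highest bias: the rating count only decides which movies are considered, and the bias decides their final order.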