<html>
<head>
<!-- Plotly.js -->
<script src="https://cdn.plot.ly/plotly-latest.min.js"></script>
<style>
#myDIV {
  border: 1px solid black;
  background-color: lightblue;
  width: auto;
  overflow: auto;
import plotly
import plotly.express as px


def generate_div(prediction_distribution):
    """
    Generate div HTML tags from a model prediction distribution dictionary.
    :param prediction_distribution: dictionary keyed by model name, where each value is a dictionary
    mapping class names to their values. It should look like:
    {'1.0': {'Class 1': 23,
import math
import scipy.stats as st


def bayesian_rating_products(n, confidence=0.95):
    """
    Calculate the Wilson score for an N-star rating system.
    :param n: Array of star-rating counts, where index i holds the votes for the (i+1)-star rating, e.g. [3, 5, 6, 7, 10]
    means 3 votes for the 1-star rating and 5 votes for the 2-star rating.
    :param confidence: Confidence level
    :return: Score
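The listing stops inside the docstring, so the body of `bayesian_rating_products` is not shown. As a minimal sketch of one common formulation (an assumption, not necessarily the author's code), the score can be taken as the lower credible bound on the expected star rating under a uniform Dirichlet prior, which adds one pseudo-vote to every star level. The name `bayesian_rating_sketch` is hypothetical, and `statistics.NormalDist` from the standard library stands in for `scipy.stats` to keep the example dependency-free.

```python
import math
from statistics import NormalDist


def bayesian_rating_sketch(n, confidence=0.95):
    """Lower bound on the expected rating (hypothetical helper, assumed formulation)."""
    K = len(n)   # number of star levels
    N = sum(n)   # total number of votes
    # Two-sided z-score for the requested confidence level
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Posterior mean and second moment of the rating, with one pseudo-vote per level
    mean = sum((k + 1) * (n_k + 1) for k, n_k in enumerate(n)) / (N + K)
    second = sum((k + 1) ** 2 * (n_k + 1) for k, n_k in enumerate(n)) / (N + K)
    # Guard against tiny negative values caused by floating-point rounding
    variance = max(0.0, (second - mean * mean) / (N + K + 1))
    return mean - z * math.sqrt(variance)


# Usage with the example counts from the docstring
score = bayesian_rating_sketch([3, 5, 6, 7, 10])
```

With few votes the pseudo-votes pull the score toward the middle of the scale, and the subtracted term shrinks as the vote count grows, so well-reviewed items with many ratings rank above items with a handful of perfect ratings.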
import math
import scipy.stats as st


def wilson_lower_bound(pos, n, confidence=0.95):
    """
    Return the lower bound of the Wilson score interval.
    :param pos: Number of positive ratings
    :param n: Total number of ratings
    :param confidence: Confidence level, 95% by default
    :return: Wilson lower bound score
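The body of `wilson_lower_bound` is cut off in the listing. The standard Wilson score interval gives a lower bound that can be sketched as follows; the function name `wilson_lower_bound_sketch` and the choice to return 0.0 when there are no ratings are assumptions, and the standard library's `statistics.NormalDist` is used in place of `scipy.stats` so the example runs without extra packages.

```python
import math
from statistics import NormalDist


def wilson_lower_bound_sketch(pos, n, confidence=0.95):
    """Lower bound of the Wilson score interval (hypothetical helper)."""
    if n == 0:
        return 0.0  # assumption: no ratings means no evidence, so score 0
    # Two-sided z-score for the requested confidence level
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    phat = pos / n  # observed fraction of positive ratings
    centre = phat + z * z / (2 * n)
    margin = z * math.sqrt((phat * (1 - phat) + z * z / (4 * n)) / n)
    return (centre - margin) / (1 + z * z / n)


# Usage: the same 90% positive rate scores higher with more ratings behind it
few = wilson_lower_bound_sketch(9, 10)
many = wilson_lower_bound_sketch(90, 100)
```

Sorting by this bound rather than by the raw positive fraction penalizes items whose rating rests on very few votes.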
from sklearn.metrics.pairwise import cosine_similarity


def maximal_marginal_relevance(sentence_vector, phrases, embedding_matrix, lambda_constant=0.5, threshold_terms=10):
    """
    Return ranked phrases using MMR. Cosine similarity is used as the similarity measure.
    :param sentence_vector: Query vector
    :param phrases: list of candidate phrases
    :param embedding_matrix: matrix indexed by phrase, with the phrase vectors as values
    :param lambda_constant: trade-off between accuracy and diversity, 0.5 by default. A high lambda_constant favours accuracy; a low lambda_constant favours diversity.
    :param threshold_terms: number of terms to include in the result set
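The listing ends before the body of `maximal_marginal_relevance`. The usual greedy MMR loop picks, at each step, the candidate that maximizes `lambda * sim(query, candidate) - (1 - lambda) * max sim(candidate, already selected)`. The sketch below illustrates that loop under stated assumptions: `mmr_sketch` and `cosine` are hypothetical names, the embeddings are passed as a plain dict from phrase to vector rather than the indexed matrix the docstring describes, and a pure-Python cosine replaces scikit-learn's `cosine_similarity` to keep the example self-contained.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def mmr_sketch(query_vec, phrases, vectors, lambda_constant=0.5, threshold_terms=10):
    """Greedy MMR ranking; `vectors` maps phrase -> vector (assumed layout)."""
    selected = []
    candidates = list(phrases)
    while candidates and len(selected) < threshold_terms:
        best, best_score = None, -math.inf
        for p in candidates:
            relevance = cosine(query_vec, vectors[p])
            # Similarity to the closest phrase already chosen (0.0 on the first pass)
            redundancy = max((cosine(vectors[p], vectors[s]) for s in selected), default=0.0)
            score = lambda_constant * relevance - (1 - lambda_constant) * redundancy
            if score > best_score:
                best, best_score = p, score
        selected.append(best)
        candidates.remove(best)
    return selected


# Usage: "b" is nearly a duplicate of "a", so a balanced lambda ranks "c" ahead of it
vectors = {"a": [1.0, 0.8], "b": [1.0, 0.7], "c": [0.2, 1.0]}
balanced = mmr_sketch([1.0, 1.0], ["a", "b", "c"], vectors, lambda_constant=0.5)
relevance_only = mmr_sketch([1.0, 1.0], ["a", "b", "c"], vectors, lambda_constant=1.0)
```

Setting `lambda_constant` to 1.0 reduces the loop to a plain relevance ranking, which makes it easy to compare the two behaviours on the same candidates.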
from sklearn.metrics.pairwise import cosine_similarity


def club_similar_keywords(emb_mat, sim_score=0.9):
    """
    :param emb_mat: matrix of vectors indexed by word
    :param sim_score: similarity threshold, 0.9 by default
    :return: list of unique words from the index after combining words whose similarity score exceeds
    sim_score
    """
    if len(emb_mat) == 0:
        return 'NA'
    # Pairwise cosine similarity between all rows of emb_mat
    xx = cosine_similarity(emb_mat)
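The listing ends right after the similarity matrix is computed, so how the near-duplicate words are combined is not shown. One simple way to do it, offered only as an assumed sketch, is to walk the words in order and keep a word only when it is not too similar to any word already kept. The names `club_similar_sketch` and `cosine` are hypothetical, the embeddings are passed as a dict from word to vector instead of an indexed matrix, and an empty input returns an empty list here rather than the `'NA'` string used in the original.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def club_similar_sketch(vectors, sim_score=0.9):
    """Keep the first word of every group of near-duplicates (assumed strategy)."""
    kept = []
    for word, vec in vectors.items():
        # Keep the word only if no previously kept word exceeds the threshold
        if all(cosine(vec, vectors[k]) <= sim_score for k in kept):
            kept.append(word)
    return kept


# Usage: "cars" is almost identical to "car", so only "car" and "tree" survive
words = {"car": [1.0, 0.0], "cars": [0.99, 0.05], "tree": [0.0, 1.0]}
unique_words = club_similar_sketch(words)
```

This greedy pass depends on input order: the first word seen in each cluster is the one that represents it.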