Aaradhya Saxena AaradhyaSaxena

Jest-Introduction

Mock functions, also known as spies, are special functions that allow us to track how a particular function is called by external code.

By using mock functions, we can know the following:

  • The number of calls it received.
  • Argument values used on each invocation.
  • The “context” or this value on each invocation.
  • How the function exited and what values were produced.
  • We can also provide an implementation to override the original function behavior. And we can describe specific return values to suit our tests.
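Jest is a JavaScript framework, but the bookkeeping listed above is not specific to it. As a rough sketch, the same ideas are shown here with Python's standard `unittest.mock.Mock`, used as a stand-in rather than Jest's own API. The names `send_welcome` and `notify` are hypothetical and exist only for illustration.

```python
from unittest.mock import Mock


def send_welcome(notify, users):
    """Hypothetical code under test: calls `notify` once per user."""
    return [notify(user, "welcome") for user in users]


# The mock replaces the real notifier and always returns a fixed value.
notify = Mock(return_value="sent")

results = send_welcome(notify, ["ann", "bob"])

print(notify.call_count)      # how many calls the mock received
print(notify.call_args_list)  # the arguments used on each call
print(results)                # the values the mock produced
```

Because the mock records every call, a test can assert on how the external code used it without running the real notifier.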
AaradhyaSaxena / 1. BDD.md
Last active February 27, 2023 11:31
Cucumber BDD

BDD

BDD is a way for software teams to work that closes the gap between business and tech people by encouraging collaboration between roles to build shared understanding of desired behavior of the system.

It also produces system documentation that guides development and is automatically checked against the system's behavior.

BDD requires 3 things to be done in small rapid iterations:

  1. Take a small upcoming change to the system (a User Story).
    • Have a conversation to decide system behavior with concrete examples.
  2. Document those examples in a way that could be automated, and check for agreement.
AaradhyaSaxena / 1. Metrics.md
Last active November 3, 2022 11:59
Machine Learning

Precision and Recall

The terrorist detection task is an imbalanced classification problem: we have two classes we need to identify (terrorists and not terrorists), with one category (non-terrorists) representing the overwhelming majority of the data points. Another imbalanced classification problem occurs in disease detection when the rate of the disease in the public is very low. In both these cases, the positive class (disease or terrorist) is greatly outnumbered by the negative class. These types of problems are examples of the fairly common case in data science when accuracy is not a good measure for assessing model performance.

Recall

Intuitively, we know that proclaiming all data points as negative (not a terrorist) in the terrorist detection problem isn’t helpful, and, instead, we should focus on identifying the positive cases. The metric our intuition tells us we should maximize is known in statistics as recall, or the ability of a model to find all the relevant cases within a data set.
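To make this concrete, here is a minimal sketch using made-up counts (1,000 people, 10 of them actual positives; these numbers are illustrative assumptions, not data from the text). It shows how a model that labels everyone negative scores high accuracy but zero recall, where recall is TP / (TP + FN).

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that were correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def recall(tp, fn):
    """Fraction of actual positives that the model found."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# A model that labels every data point as negative:
# it misses all 10 positives and gets all 990 negatives right.
all_negative = {"tp": 0, "tn": 990, "fp": 0, "fn": 10}

model_accuracy = accuracy(**all_negative)
model_recall = recall(all_negative["tp"], all_negative["fn"])

print(model_accuracy)  # high, despite the model being useless
print(model_recall)    # zero: no positive case was found
```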

AaradhyaSaxena / Top-N Recommendation with Missing Implicit Feedback.md
Last active February 27, 2023 11:30
Research-Papers-Recommendation-Engine

In implicit feedback datasets, non-interaction of a user with an item does not necessarily indicate that an item is irrelevant for the user. Thus, evaluation measures computed on the observed feedback may not accurately reflect performance on the complete data.

Many collaborative filtering approaches attempt to identify user preferences based on explicit feedback such as user ratings. However, implicit feedback, in which a user's preferences are expressed through item interactions such as views or purchases, is often more common than explicit feedback.

In both explicit and implicit feedback systems, the presence of missing data poses a challenge to the evaluation of a recommendation system.

  • In explicit feedback datasets, ratings can be Missing-not-at-Random (MNAR), so systems trained only on observed ratings may give biased predictions.
  • On the other hand, in implicit feedback datasets, non-interaction of a user with an item does not necessarily indicate that the item is irrelevant for the user.
AaradhyaSaxena / Alibaba RE.md
Last active November 27, 2023 18:12
Alibaba RE

Recommendation Engine

Building item embeddings for candidate retrieval.

  • In the offline environment, session-level user-item interactions are mined to construct a weighted, bidirectional item graph.
  • The graph is then used to generate item sequences via random walks.
  • Item embeddings are then learned via representation learning (i.e., word2vec skip-gram), doing away with the need for labels.
  • Finally, with the item embeddings, they get the nearest neighbors for each item and store them in their item-to-item (i2i) similarity map (i.e., a key-value store).
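The first two steps above can be sketched in plain Python. This is an illustrative toy under stated assumptions, not Alibaba's implementation: the sessions are invented, "bidirectional" is read as adding an edge in both directions, and edge weights are simple co-occurrence counts. The resulting walks are the item "sentences" that would then be fed to a skip-gram model; that training step is omitted here.

```python
import random
from collections import defaultdict


def build_item_graph(sessions):
    """Weight each edge by how often two items are adjacent in a session."""
    graph = defaultdict(lambda: defaultdict(int))
    for session in sessions:
        for a, b in zip(session, session[1:]):
            graph[a][b] += 1
            graph[b][a] += 1  # assumption: store the reverse edge as well
    return graph


def random_walk(graph, start, length, rng):
    """Walk the graph, picking each next item in proportion to edge weight."""
    walk = [start]
    while len(walk) < length:
        neighbors = graph.get(walk[-1])
        if not neighbors:
            break
        items = list(neighbors)
        weights = [neighbors[item] for item in items]
        walk.append(rng.choices(items, weights=weights, k=1)[0])
    return walk


# Hypothetical session data: each list is one user's session, in order.
sessions = [
    ["phone", "case", "charger"],
    ["phone", "charger"],
    ["laptop", "mouse"],
]

graph = build_item_graph(sessions)
rng = random.Random(0)  # seeded so the walks are reproducible
walks = [random_walk(graph, item, 5, rng) for item in list(graph)]

for walk in walks:
    print(walk)
```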

[Figure: item-embedding]

Personalization - RIL

Introduction

The key idea is to use the opinions and behaviors of users to suggest and personalize relevant and interesting content for them.

E.g., let's say there are two users, U1 with a history of buying Samsung products and U2 with a history of buying Apple products. When each of them searches for "phone", U1 should see Samsung phones at the top of the results and U2 should see iPhones at the top.

What we want is to get the user embedding and the embeddings of the products in the search results, calculate which product embeddings are closest to that user, and then boost each product's popularity score based on its similarity score.
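A minimal sketch of that re-ranking step follows. Everything specific in it is an assumption made for illustration: the two-dimensional embeddings, the popularity scores, cosine similarity as the closeness measure, and the multiplicative boost formula.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rerank(user_vec, products, weight=1.0):
    """Boost each product's popularity by its similarity to the user.

    `products` is a list of (name, popularity, embedding) tuples.
    The boost formula is a hypothetical choice, not taken from the text.
    """
    scored = [
        (name, popularity * (1.0 + weight * cosine_similarity(user_vec, emb)))
        for name, popularity, emb in products
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy search results for the query "phone".
products = [
    ("samsung_phone", 0.80, [0.9, 0.1]),
    ("iphone", 0.85, [0.1, 0.9]),
]

samsung_fan = [1.0, 0.0]  # U1: history of Samsung purchases
apple_fan = [0.0, 1.0]    # U2: history of Apple purchases

u1_ranking = [name for name, _ in rerank(samsung_fan, products)]
u2_ranking = [name for name, _ in rerank(apple_fan, products)]

print(u1_ranking)
print(u2_ranking)
```

The same search results come back in a different order for each user, because the boost depends on how close each product vector is to the user vector.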

What do I mean by embeddings? We feed the product/user data into a model that projects it into an N-dimensional space. A product is then represented by an N-dimensional vector, which gives the coordinates of its point in that space. The model is trained so that two products which are similar to each other end up close to each other. The emb…

AaradhyaSaxena / Decorators.md
Last active February 27, 2023 11:30
Python

Decorators in Python

A decorator is a design pattern in Python that allows a user to add new functionality to an existing object without modifying its structure. Decorators are usually applied just before the definition of the function you want to decorate.

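def uppercase_decorator(function):
    def wrapper():
        # Call the wrapped function and upper-case its string result.
        func = function()
        make_uppercase = func.upper()
        return make_uppercase

    # Return the wrapper so it replaces the decorated function.
    return wrapper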
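As a usage sketch, the decorator can be applied with the `@` syntax. The block below restates the decorator so it runs on its own. The function `say_hi` is a hypothetical example, and `functools.wraps` is an optional addition that preserves the wrapped function's name and docstring.

```python
import functools


def uppercase_decorator(function):
    @functools.wraps(function)  # keep the original name and docstring
    def wrapper():
        # Call the wrapped function and upper-case its string result.
        return function().upper()

    return wrapper


@uppercase_decorator  # equivalent to: say_hi = uppercase_decorator(say_hi)
def say_hi():
    return "hello there"


print(say_hi())
```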

Django Project

Initial setup: auto-generate the code that establishes a Django project.

  • A project is a collection of settings for an instance of Django, including database configuration, Django-specific options and application-specific settings.
  • A project may contain multiple apps.
django-admin startproject mysite

Redis

Redis is an in-memory data structure store that can be used as a caching engine. Since it keeps data in RAM, Redis can deliver it very quickly.

Keys: The key in Redis is a binary-safe String, with a max size of 512 MB.

Values: Supports String, List, Set, Sorted Set, Hash.

Commands:

  • SET key value
AaradhyaSaxena / gensim.md
Last active February 27, 2023 11:24
gensim

Gensim

List all the keys

list(model.wv.index_to_key)

List n most similar products

sims = model.wv.most_similar('computer', topn=10) # get other similar words