Bastin Robin BastinRobin

@BastinRobin
BastinRobin / Question.md
Created October 21, 2021 09:52
COVID-19
  1. Which district has the most cases?
  2. How many people in total have survived across the country?
  3. What is the geographical pattern of the pandemic's spread?
  4. What is the survival rate across districts?
  5. What is the recovery rate across countries, states, and districts?
  6. What is the rate of new cases across districts?
  7. How many active cases are there in India in total?
  8. How many new cases are there in India by date?
  9. What is the relationship between new cases and deaths?
BastinRobin / cluster.md
Created May 19, 2021 07:17
Hierarchical Clustering

Implementation of Hierarchical Clustering (Agglomerative Clustering)

Hierarchical clustering algorithms group similar objects into groups called clusters. There are two types of hierarchical clustering algorithms:

  • Agglomerative: bottom-up approach. Start with many small clusters and merge them to create bigger clusters.
  • Divisive: top-down approach. Start with a single cluster, then break it up into smaller clusters.
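
The agglomerative (bottom-up) approach can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the tutorial's implementation: the function names `single_linkage` and `agglomerative`, the choice of single linkage, and the sample points are all assumptions made here.

```python
from itertools import combinations
from math import dist


def single_linkage(a, b):
    """Distance between two clusters: the distance of their closest pair of points."""
    return min(dist(p, q) for p in a for q in b)


def agglomerative(points, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [[p] for p in points]  # start with one cluster per point
    while len(clusters) > n_clusters:
        # find the pair of clusters (i < j) with the smallest linkage distance
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: single_linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i].extend(clusters[j])  # merge cluster j into cluster i
        del clusters[j]                  # j > i, so index i stays valid
    return clusters


# Hypothetical sample points, chosen only for illustration
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
clusters = agglomerative(points, 3)
print(clusters)
```

Single linkage is the simplest merge criterion; complete or average linkage would only change the `single_linkage` function.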

Create a hierarchical clustering model and experiment with the following tutorial

BastinRobin / ensemble.md
Created May 12, 2021 08:21
Ensemble Technique

Implement Ensemble Techniques on the Given Dataset

Dataset: The given dataset consists of the following fields:

  • id - unique ID for excerpt
  • url_legal - URL of source - this is blank in the test set.
  • license - license of source material - this is blank in the test set.
  • excerpt - text to predict reading ease of
  • target - reading ease
BastinRobin / kmean.md
Created May 5, 2021 08:36
Implementation of K-Means to Analyse Crime in the US

# K-Means Clustering - U.S. Crime Data

We'll use k-means to discover clusters in a data set using unsupervised learning. The original data can be found here: https://ufile.io/39bbkph1

Crime statistics are available for public review through the Uniform Crime Reporting Statistics, a collaboration of the U.S. Department of Justice and the Federal Bureau of Investigation. The following data set has information on crime rates and totals for states across the United States over a wide range of years. The crime reports are divided into two main categories: property crime and violent crime. Property crime refers to burglary, larceny, and motor-related crime, while violent crime refers to assault, murder, rape, and robbery. The reports cover 1960 to 2012.

The analysis consists of the following steps.

  • I. Importing necessary libraries and downloading the data
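
Before working with the real crime data, the k-means idea itself can be sketched in plain Python. This is a minimal sketch under stated assumptions: the function name `kmeans`, the seeding rule (the first `k` points become the initial centroids), and the sample values are hypothetical and are not taken from the data set.

```python
from math import dist


def kmeans(points, k, n_iter=100):
    """Lloyd's algorithm; the first k points seed the centroids (an assumption)."""
    centroids = [tuple(p) for p in points[:k]]
    labels = []
    for _ in range(n_iter):
        # assignment step: each point joins its nearest centroid
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # update step: move each centroid to the mean of its members
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep the centroid of an empty cluster
        if new_centroids == centroids:  # converged: centroids stopped moving
            break
        centroids = new_centroids
    return centroids, labels


# Hypothetical (violent rate, property rate) pairs, for illustration only
points = [(1.0, 2.0), (9.0, 8.0), (1.5, 1.8), (8.0, 8.0), (1.0, 0.6), (9.0, 11.0)]
centroids, labels = kmeans(points, k=2)
print(labels)
print(centroids)
```

In practice the features should be scaled before clustering, and a library implementation with several random restarts is usually preferable to fixed seeding.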
BastinRobin / mail.php
Created January 10, 2014 05:05
PHP Mail Snippet [Table View]
<?php
// Compose the mail
$to = "imail@example.co.uk";
$subject = "Enquiry from Example";
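// Validate the sender address: strip_tags() alone does not prevent header injection via CR/LF.
// Falling back to $to for an invalid address is an assumption; adjust as needed.
$email = filter_var($_POST['email'] ?? '', FILTER_VALIDATE_EMAIL) ?: $to;
$headers = "From: " . $email . "\r\n";
$headers .= "Reply-To: " . $email . "\r\n";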
$headers .= "CC: email@gmail.com\r\n";
$headers .= "MIME-Version: 1.0\r\n";
$headers .= "Content-Type: text/html; charset=ISO-8859-1\r\n";
BastinRobin / naive.md
Last active March 24, 2021 04:48
Implement Naive Bayes

Implement Naive Bayes Classifier

Naive Bayes methods are a set of supervised learning algorithms based on applying Bayes' theorem with the “naive” assumption of conditional independence between every pair of features given the value of the class variable.
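
The conditional independence assumption means the class score is just the log prior plus a sum of per-feature log likelihoods. A minimal sketch for categorical features follows; the class name `CategoricalNaiveBayes`, the scikit-learn-style `fit`/`predict` interface, the add-one smoothing, and the toy weather data are all assumptions made for illustration, not the exercise's reference dataframe.

```python
from collections import Counter, defaultdict
from math import log


class CategoricalNaiveBayes:
    """Naive Bayes for categorical features with Laplace (add-one) smoothing."""

    def fit(self, X, y):
        self.classes_ = sorted(set(y))
        self.class_counts_ = Counter(y)
        n_features = len(X[0])
        # counts_[feature index][class][value] -> number of occurrences
        self.counts_ = [defaultdict(Counter) for _ in range(n_features)]
        self.values_ = [set() for _ in range(n_features)]
        for row, label in zip(X, y):
            for j, value in enumerate(row):
                self.counts_[j][label][value] += 1
                self.values_[j].add(value)
        return self

    def _log_posterior(self, row, label):
        # log P(class) + sum over features of log P(x_j | class)
        score = log(self.class_counts_[label] / sum(self.class_counts_.values()))
        for j, value in enumerate(row):
            num = self.counts_[j][label][value] + 1
            den = self.class_counts_[label] + len(self.values_[j])
            score += log(num / den)
        return score

    def predict(self, X):
        return [max(self.classes_, key=lambda c: self._log_posterior(row, c)) for row in X]


# Hypothetical toy data: (outlook, temperature) -> play
X = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"),
     ("rainy", "cool"), ("overcast", "hot"), ("overcast", "cool")]
y = ["no", "no", "yes", "yes", "yes", "yes"]

model = CategoricalNaiveBayes().fit(X, y)
preds = model.predict([("sunny", "hot"), ("rainy", "cool")])
print(preds)
```

Working in log space avoids underflow when many features are multiplied together.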

[Two images, not reproduced here]

Use the above dataframe as a reference and build a Naive Bayes classifier in Python. Follow these guidelines:

  1. Build a production-ready classifier following the API interfaces.
BastinRobin / mds.md
Last active March 17, 2021 08:36
Multi-Dimensional Scaling

Singular Value Decomposition

In linear algebra, the singular value decomposition (SVD) is a factorization of a real or complex matrix that generalizes the eigendecomposition of a square normal matrix to any m × n matrix via an extension of the polar decomposition.
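
As a quick check of the factorization, `numpy.linalg.svd` can decompose a non-square matrix and the factors can be multiplied back together. The matrix values below are illustrative assumptions, not from the original notes.

```python
import numpy as np

# Any m x n matrix works; this 2x3 example is illustrative only
A = np.array([[3.0, 1.0, 1.0],
              [-1.0, 3.0, 1.0]])

# Reduced SVD: A = U @ diag(s) @ Vt, with singular values s in descending order
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Rebuild A from its factors to confirm the decomposition
A_rebuilt = U @ np.diag(s) @ Vt
print(s)
print(np.allclose(A, A_rebuilt))
```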


Multidimensional Scaling

Multidimensional scaling is a powerful technique for visualizing the (dis)similarity among objects in 2-dimensional space. The idea is that the geometric distance between two objects in the 2D plane indicates how similar they are.
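
One way to obtain such a 2D layout is classical (Torgerson) MDS, which works directly from a matrix of pairwise distances. The sketch below is one variant of MDS, not necessarily the one intended by these notes; the function name `classical_mds` and the sample points are assumptions.

```python
import numpy as np


def classical_mds(D, n_components=2):
    """Classical (Torgerson) MDS from a symmetric matrix of pairwise distances."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n         # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                 # double-centered Gram matrix
    eigvals, eigvecs = np.linalg.eigh(B)        # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))  # guard against tiny negatives
    return eigvecs[:, order] * scale


# Hypothetical objects: corners of a 3x4 rectangle
points = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0], [3.0, 4.0]])
D = np.linalg.norm(points[:, None] - points[None, :], axis=-1)

Y = classical_mds(D)
D_embedded = np.linalg.norm(Y[:, None] - Y[None, :], axis=-1)
print(np.allclose(D, D_embedded))
```

Because the input distances here come from points that already lie in a plane, the embedding reproduces them up to rotation and reflection; with real dissimilarity data the 2D distances are only an approximation.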

PCA

There is no pca() function in NumPy, but we can easily calculate the Principal Component Analysis step by step using NumPy functions. The example below defines a small 3×2 matrix, centers the data in the matrix, calculates the covariance matrix of the centered data, and then computes the eigendecomposition of the covariance matrix. The eigenvectors are taken as the principal components and the eigenvalues as the variance along each component; the eigenvectors are then used to project the original data.

from numpy import array
from numpy import mean
from numpy import cov
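
The preview above stops after its imports. A minimal sketch of the steps described (center, covariance, eigendecomposition, projection) might look like the following; the matrix values and variable names are assumptions, and `numpy.linalg.eigh` is used here because a covariance matrix is symmetric.

```python
import numpy as np

# Small 3x2 data matrix (rows are samples); the values are illustrative
A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

C = A - A.mean(axis=0)                 # center each column
V = np.cov(C, rowvar=False)            # 2x2 covariance matrix of the columns

# eigh returns eigenvalues in ascending order, so sort them descending
eigvals, eigvecs = np.linalg.eigh(V)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

P = C @ eigvecs                        # project the centered data onto the components
print(eigvals)
print(P)
```

The sign of each eigenvector is arbitrary, so the projected columns may come out negated from one run or library version to another.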

BastinRobin / PCA.py
Last active March 10, 2021 08:36
Manually Calculate Principal Component Analysis
## PCA
There is no pca() function in NumPy, but we can easily calculate the Principal Component Analysis step by step using NumPy functions.
The example below defines a small 3×2 matrix, centers the data in the matrix, calculates the covariance matrix of the centered data, and then computes the eigendecomposition of the covariance matrix. The eigenvectors are taken as the principal components and the eigenvalues as the variance along each component; the eigenvectors are then used to project the original data.
from numpy import array
from numpy import mean
from numpy import cov
BastinRobin / Regression.md
Last active March 3, 2021 05:28
Regression Analysis Lab

Regression Analysis

This exercise is designed to build your confidence in creating multiple regression models using spreadsheets.

This dataset contains information collected by the U.S. Census Service concerning housing in the area of Boston, Mass. It was obtained from the StatLib archive and has been used extensively throughout the literature to benchmark algorithms. However, these comparisons were primarily done outside of Delve and are thus somewhat suspect. The dataset is small, with only 506 cases.

  • Download the dataset using this link https://gofile.io/d/YZUprY
  • Use MS Excel to build a multiple regression model.
  • Audit the regression summary and eliminate features that are not significant.
  • Predict the house price (MEDV).
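
To cross-check the spreadsheet's coefficients, the same least-squares fit can be reproduced with NumPy. The sketch below uses synthetic data rather than the housing file, since the download's column layout is not shown here; the feature count, the true coefficients, and all variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the housing data: two hypothetical features
X = rng.normal(size=(50, 2))
y = 3.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1]      # noiseless target with known coefficients

# Add an intercept column, then solve the least-squares problem
X1 = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(X1, y, rcond=None)

pred = X1 @ coef
r2 = 1.0 - ((y - pred) ** 2).sum() / ((y - y.mean()) ** 2).sum()
print(coef)
print(r2)
```

With the real data, `coef` should match the intercept and coefficients in the Excel regression summary, and `r2` should match its R Square value.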