Pradeep Singh mepsrajput

@mepsrajput
mepsrajput / web_development_resources.md
Last active February 26, 2020 03:29
Free Online Resources for Web Developers
@mepsrajput
mepsrajput / index.html
Created July 21, 2018 19:50
HTML5_Basic_Markup_Structure_&_CSS_Reset
<!doctype html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="description" content="Website Name">
<meta name="author" content="Pradeep Singh">
<title>WebPage Title</title>
@mepsrajput
mepsrajput / data_science.md
Last active November 20, 2021 06:18
Data Science skills
@mepsrajput
mepsrajput / ds_python_notes.md
Last active January 30, 2020 10:52
Personal Notes on Data Science and ML
@mepsrajput
mepsrajput / Data_Set_Operations.md
Last active November 14, 2021 02:59
My SAS Notes

1. Read Raw Data

1.1 Reading an ASCII (Text) Data Set

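/* Read the text file into TEMP, starting at the second line of the file */
DATA TEMP;
   INFILE '/folders/myfolders/World Happiness/practice text dataset.txt' FIRSTOBS=2;
   /* ID starts at column 1, Name occupies columns 5-17, Location is the next character value */
   INPUT @1 ID @5 Name $ 5-17 Location $;
RUN;
/* Print TEMP to check the imported rows */
PROC PRINT DATA=TEMP;
RUN;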
@mepsrajput
mepsrajput / statistics.md
Last active September 13, 2020 09:11
Statistics Notes for Data Science and ML

Exploratory data analysis

anecdotal evidence: Evidence, often personal, that is collected casually rather than by a well-designed study.

population: A group we are interested in studying. “Population” often refers to a group of people, but the term is used for other subjects, too.

cross-sectional study: A study that collects data about a population at a particular point in time.

cycle: In a repeated cross-sectional study, each repetition of the study is called a cycle.

@mepsrajput
mepsrajput / pyspark.md
Last active December 12, 2021 10:43
PySpark Notes

PySpark Subpackages

  • pyspark.sql module
  • pyspark.streaming module
  • pyspark.ml package
  • pyspark.mllib package

Important classes of the pyspark.sql package

  • pyspark.sql.SparkSession: Main entry point for DataFrame and SQL functionality.
  • pyspark.sql.DataFrame: A distributed collection of data grouped into named columns.
  • pyspark.sql.Column: A column expression in a DataFrame.
@mepsrajput
mepsrajput / NLP.md
Last active March 29, 2020 16:13
Natural Language Processing

A high-level standard workflow for any NLP project

Text Document -> Text pre-processing -> Text parsing & Exploratory Data Analysis -> Text Representation & Feature Engineering -> Modeling and/or Pattern Mining -> Evaluation & Deployment
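
The first stages of the workflow above can be sketched with the Python standard library alone. This is a minimal illustration, not a recommended pipeline: the stop-word list, the sample documents, and the helper names `preprocess` and `bag_of_words` are all hypothetical, and a bag-of-words count is only one of many possible text representations.

```python
import re
from collections import Counter

# Hypothetical stop-word list, far smaller than one used in practice.
STOP_WORDS = {"the", "a", "an", "is", "and", "of", "to"}


def preprocess(text):
    """Text pre-processing: lowercase, keep alphabetic tokens, drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [token for token in tokens if token not in STOP_WORDS]


def bag_of_words(tokens):
    """Text representation: map each token to the number of times it occurs."""
    return Counter(tokens)


# Hypothetical documents standing in for the "Text Document" stage.
documents = [
    "The movie is a delight and the cast is great.",
    "The plot of the movie is dull.",
]

# One feature mapping per document, ready to pass to a modeling step.
features = [bag_of_words(preprocess(doc)) for doc in documents]
```

The resulting `features` list holds one token-count mapping per document, which is the kind of input a later modeling or pattern-mining stage would consume.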

NLP Uses

  1. Machine Translation
  2. Speech Recognition
  3. Sentiment Analysis
@mepsrajput
mepsrajput / blogscraping.py
Created April 4, 2020 04:39 — forked from bradtraversy/blogscraping.py
Simple scraping of a blog
import requests
from bs4 import BeautifulSoup
from csv import writer
response = requests.get('http://codedemos.com/sampleblog/')
soup = BeautifulSoup(response.text, 'html.parser')
posts = soup.find_all(class_='post-preview')
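
The preview above stops after the posts are selected. As a rough sketch of how the remaining steps might look, the example below extracts titles from elements with the `post-preview` class and writes them to CSV. Note the assumptions: it swaps BeautifulSoup for the standard library's `html.parser` so that it runs without third-party packages or network access, the `SAMPLE_HTML` markup and the `<h2>` title element are hypothetical, and the output goes to an in-memory buffer rather than a file.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical markup standing in for the blog page; only the
# 'post-preview' class name comes from the snippet above.
SAMPLE_HTML = """
<div class="post-preview"><h2>First post</h2></div>
<div class="post-preview"><h2>Second post</h2></div>
<div class="sidebar"><h2>About</h2></div>
"""


class PostTitleParser(HTMLParser):
    """Collect <h2> text found inside elements with class 'post-preview'.

    Limitation: depth tracking assumes every start tag has a matching end
    tag, so void elements such as <br> inside a post would confuse it.
    """

    def __init__(self):
        super().__init__()
        self.depth = 0          # nesting depth inside a post-preview element
        self.in_title = False   # True while inside an <h2> of a post
        self.titles = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self.depth or "post-preview" in classes:
            self.depth += 1
        if self.depth and tag == "h2":
            self.in_title = True
            self.titles.append("")

    def handle_endtag(self, tag):
        if tag == "h2":
            self.in_title = False
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.in_title:
            self.titles[-1] += data


parser = PostTitleParser()
parser.feed(SAMPLE_HTML)

# Write one CSV row per post title, preceded by a header row.
buffer = io.StringIO()
csv_writer = csv.writer(buffer)
csv_writer.writerow(["Title"])
for title in parser.titles:
    csv_writer.writerow([title.strip()])
```

The sidebar heading is skipped because it sits outside any `post-preview` element, which mirrors what filtering by `class_='post-preview'` does in the original snippet.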