Pradeep Singh mepsrajput

@mepsrajput
mepsrajput / index.html
Created July 21, 2018 19:50
HTML5_Basic_Markup_Structure_&_CSS_Reset
<!doctype html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="description" content="Website Name">
<meta name="author" content="Pradeep Singh">
<title>WebPage Title</title>
@mepsrajput
mepsrajput / ds_python_notes.md
Last active January 30, 2020 10:52
Personal Notes of Data Science and ML
@mepsrajput
mepsrajput / web_development_resources.md
Last active February 26, 2020 03:29
Free Online Resources for Web Developers
@mepsrajput
mepsrajput / NLP.md
Last active March 29, 2020 16:13
Natural Language Processing

A high-level standard workflow for any NLP project

Text Document -> Text pre-processing -> Text parsing & Exploratory Data Analysis -> Text Representation & Feature Engineering -> Modeling and/or Pattern Mining -> Evaluation & Deployment
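The first stages of this workflow (pre-processing, then text representation) can be sketched in a few lines of plain Python. This is only an illustrative sketch: the stop-word list, the helper names `preprocess` and `bag_of_words`, and the sample sentence are assumptions made for the example, not part of the original notes, and a real project would normally use a library such as spaCy or NLTK for these steps.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (illustration only)
STOP_WORDS = {"a", "an", "the", "is", "of", "and", "to", "in"}


def preprocess(text):
    """Text pre-processing: lowercase, tokenize, and drop stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def bag_of_words(tokens):
    """Text representation: a simple bag-of-words count per token."""
    return Counter(tokens)


# Sample text document (made up for this sketch)
doc = "The cat sat in the hat, and the cat purred."
tokens = preprocess(doc)
features = bag_of_words(tokens)
print(tokens)
print(features.most_common(3))
```

The resulting counts are the kind of feature vector that the modeling stage would consume.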

NLP Uses

  1. Machine Translation
  2. Speech Recognition
  3. Sentiment Analysis
@mepsrajput
mepsrajput / blogscraping.py
Created April 4, 2020 04:39 — forked from bradtraversy/blogscraping.py
Simple scraping of a blog
import requests
from bs4 import BeautifulSoup
from csv import writer

# Fetch the blog page and parse its HTML
response = requests.get('http://codedemos.com/sampleblog/')
soup = BeautifulSoup(response.text, 'html.parser')

# Collect every element that has the 'post-preview' class
posts = soup.find_all(class_='post-preview')

1. Processing A Line of Text

# Import the English language class
from spacy.lang.en import English

# Create the nlp object
nlp = English()

MongoDB Cheat Sheet

Show All Databases

show dbs

Show Current Database

@mepsrajput
mepsrajput / as.md
Last active May 5, 2020 11:39
assignment
  • ACCOUNT_TABLE;

    Data ACCOUNT_TABLE;
    infile DATALINES delimiter=',';
    INPUT FirstName $ LastName $ Age Gender $;
    DATALINES;
    x,y,23,Male
    z,w,45,Female
    a,b,64,Male

@mepsrajput
mepsrajput / big_data.md
Last active May 30, 2020 10:26
Big Data

Hadoop Vocabulary

Here is a list of some terms associated with Hadoop. You'll learn more about these terms and how they relate to Spark in the rest of the lesson.

  • Hadoop - an ecosystem of tools for big data storage and data analysis. Hadoop is an older system than Spark but is still used by many companies. The major difference between Spark and Hadoop is how they use memory. Hadoop writes intermediate results to disk whereas Spark tries to keep data in memory whenever possible. This makes Spark faster for many use cases.
  • Hadoop MapReduce - a system for processing and analyzing large data sets in parallel.
  • Hadoop YARN - a resource manager that schedules jobs across a cluster. The manager keeps track of what computer resources are available and then assigns those resources to specific tasks.
  • Hadoop Distributed File System (HDFS) - a big data storage system that splits data into chunks and stores the chunks across a cluster of computers.
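To make the MapReduce and HDFS entries above more concrete, here is a minimal single-machine sketch of a word count in the map, shuffle, reduce style. It is an assumption-laden toy, not Hadoop's API: the function names (`split_into_chunks`, `map_phase`, `shuffle`, `reduce_phase`) and the sample lines are hypothetical, and everything runs sequentially in one process, whereas Hadoop would store the chunks on different machines and run the map tasks in parallel.

```python
from collections import defaultdict
from itertools import chain


def split_into_chunks(lines, chunk_size):
    """Mimic HDFS-style splitting: break the data into fixed-size chunks."""
    return [lines[i:i + chunk_size] for i in range(0, len(lines), chunk_size)]


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk."""
    return [(word.lower(), 1) for line in chunk for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


# Made-up input data for this sketch
lines = [
    "spark keeps data in memory",
    "hadoop writes data to disk",
    "spark and hadoop",
]
chunks = split_into_chunks(lines, chunk_size=2)
mapped = chain.from_iterable(map_phase(chunk) for chunk in chunks)
word_counts = reduce_phase(shuffle(mapped))
print(word_counts)
```

Each chunk is mapped independently of the others, which is the property that lets a real cluster process chunks in parallel.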

As Hadoop matured, other tools were developed t