Parvathy Sarat parvathysarat

@parvathysarat
parvathysarat / bert_qa.ipynb
Created July 22, 2020 19:16
BERT_QA.ipynb
@parvathysarat
parvathysarat / ulmfit_classify.ipynb
Last active July 19, 2020 21:25
ULMFiT_classify.ipynb
@parvathysarat
parvathysarat / blogpost_generation.ipynb
Created March 12, 2019 18:11
Blogpost_generation.ipynb
@parvathysarat
parvathysarat / BlightViolation_ComplianceModel.py
Last active May 17, 2018 09:11
The task is to predict whether a given blight ticket will be paid on time. Blight violations are issued by the city of Detroit to individuals who allow their properties to remain in a deteriorated condition. The data was obtained from the Detroit Open Data Portal.
# coding: utf-8
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score
from datetime import datetime
df=pd.read_csv("train.csv")
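The preview above stops after loading the data. As a minimal sketch of how the imported pieces (RandomForestClassifier, train_test_split, roc_auc_score) are typically combined for this kind of task, the example below uses a small synthetic table instead of train.csv. The column names fine_amount, late_fee and compliance, the toy labelling rule, and all parameter values are assumptions for illustration, not the author's actual features or settings.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for train.csv; the column names are hypothetical.
rng = np.random.default_rng(0)
n = 500
demo = pd.DataFrame({
    "fine_amount": rng.uniform(50, 1000, n),
    "late_fee": rng.uniform(0, 100, n),
})
# Toy label (assumption): smaller fines are more likely to be paid on time.
demo["compliance"] = (demo["fine_amount"] + rng.normal(0, 200, n) < 400).astype(int)

X = demo[["fine_amount", "late_fee"]]
y = demo["compliance"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, random_state=0, stratify=y
)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

# ROC AUC is computed from the predicted probability of the positive class,
# not from hard 0/1 predictions.
proba = clf.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, proba)
print(f"ROC AUC: {auc:.3f}")
```

Scoring with predicted probabilities rather than class labels is what makes ROC AUC meaningful here, since the metric ranks tickets by how likely they are to be paid.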
@parvathysarat
parvathysarat / RandomForest_final.py
Last active March 28, 2020 19:35
This Random Forest model was developed to detect fraudulent login access to a firewall server. It is a two-class (binary) classification.
import glob,struct,os
import pandas as pd
import numpy as np
#names of the columns
names=["Timestamp","Customer ID","Host","Log file","Log sequence no.","Entry type","Entry identifier","User,if","Reporting IP/host","Source IP,if","Source port,if","Destination IP, if","Destination Port, if","Text field1","Text field2","Text field3","Numeric field1","Numeric field2"]
# defining path to the dataset folder
path=r'C:/Users/PARVATHY SARAT/Desktop/FIREWALL'
@parvathysarat
parvathysarat / Scraping.py
Last active July 4, 2018 10:35
Uses the Google Maps API to scrape details of safe spots in cities. Google only allows results from the first 3 pages of its search results to be retrieved. The number of API requests is also restricted per key.
import requests
import pandas as pd
key= " #key "
#iteration to get ids
i=0
#iterate three times (i = 0, 1, 2) to get all 60 search results from 3 pages. Google restricts the number of results
#that can be retrieved to the first three pages of search results
while(i<=2) :
    if (i==0):
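The preview is cut off here, so the rest of the loop is not shown. As a rough sketch of the paging idea described above (three pages of 20 results, each page handing back a token for the next), the example below separates the paging logic from the HTTP call. The helper names collect_results and fake_fetch are hypothetical, and fake_fetch is a stand-in that returns canned pages in place of a real request; the field names results and next_page_token follow the Places API response format.

```python
import time
from typing import Callable, Dict, List, Optional


def collect_results(fetch_page: Callable[[Optional[str]], Dict],
                    max_pages: int = 3,
                    delay_s: float = 0.0) -> List[Dict]:
    """Gather results across up to max_pages pages by following next_page_token."""
    results: List[Dict] = []
    token: Optional[str] = None
    for _ in range(max_pages):
        page = fetch_page(token)
        results.extend(page.get("results", []))
        token = page.get("next_page_token")
        if not token:
            break  # no further pages
        # With the real API a freshly issued token needs a short pause
        # before it becomes valid; delay_s is 0 here because the data is fake.
        time.sleep(delay_s)
    return results


def fake_fetch(token: Optional[str]) -> Dict:
    """Hypothetical stand-in for the HTTP request: three pages of 20 results."""
    pages = {None: ("p2", 0), "p2": ("p3", 20), "p3": (None, 40)}
    next_token, start = pages[token]
    page = {"results": [{"place_id": f"id{start + k}"} for k in range(20)]}
    if next_token:
        page["next_page_token"] = next_token
    return page


places = collect_results(fake_fetch)
print(len(places))
```

Passing the fetch function in as an argument keeps the loop testable without an API key; a real fetcher would wrap requests.get and return the parsed JSON.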
@parvathysarat
parvathysarat / Income Classification_DecisionTreeClassifier.py
Created September 18, 2017 08:51
Analytics Vidhya workshop problem: classify income as <50K or >=50K. Accuracy: 0.8051716725016891
import pandas as pd
train=pd.read_csv("train.csv")
test=pd.read_csv("test.csv")
train.dtypes
#continuous variables
train.describe()
#categorical variables
categorical=train.dtypes.loc[train.dtypes=="object"].index
categorical
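The preview ends once the categorical columns have been identified. One common next step, shown here only as a sketch and not necessarily what the gist does, is to one-hot encode those columns and fit a DecisionTreeClassifier. The toy table and its column names (Age, Workclass, Income.Group) are assumptions made up for the example.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Tiny made-up table; the column names are hypothetical.
toy = pd.DataFrame({
    "Age": [25, 47, 38, 52, 29, 61],
    "Workclass": ["Private", "State-gov", "Private",
                  "Self-emp", "Private", "State-gov"],
    "Income.Group": ["<=50K", ">50K", "<=50K", ">50K", "<=50K", ">50K"],
})

target = "Income.Group"
features = toy.drop(columns=[target])

# Non-numeric columns are treated as categorical.
categorical = features.select_dtypes(exclude="number").columns

# One-hot encode the categorical columns so the tree receives numeric input.
encoded = pd.get_dummies(features, columns=list(categorical), dtype=int)

model = DecisionTreeClassifier(random_state=0)
model.fit(encoded, toy[target])

# Accuracy on the training rows only; a held-out set is needed for a fair estimate.
train_accuracy = model.score(encoded, toy[target])
print(train_accuracy)
```

When train.csv and test.csv are encoded separately, the resulting dummy columns can differ, so the test frame usually needs to be reindexed to the training columns before predicting.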
@parvathysarat
parvathysarat / Spot Checking Classifier
Last active September 18, 2017 03:12
Spot-checking a Decision Tree Classifier on firewall logs. Original data (18 columns) trimmed to 10.
# CART Classification
import pandas as pd
from sklearn import model_selection
from sklearn.tree import DecisionTreeClassifier
dataframe = pd.read_csv("data.csv", names=['ID', 'No.', 'Smth', 'Number', 'Count', 'Count2', 'UDP/TCP', 'RandomNo',
                                           'IP', 'AUDIT/ALLOW/BLOCK'])
array = dataframe.values
X = array[:,0:9]
Y = array[:,9]
seed = 7
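The preview stops at the seed. Spot checking usually means scoring a model with k-fold cross-validation, so a minimal sketch of that step follows. It assumes 10 shuffled folds and uses random synthetic arrays (X_demo, Y_demo, both hypothetical) with the same shape idea as above, 9 feature columns and 1 label column, instead of data.csv.

```python
import numpy as np
from sklearn import model_selection
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the firewall data: 9 features, binary label.
rng = np.random.default_rng(7)
X_demo = rng.normal(size=(200, 9))
Y_demo = (X_demo[:, 0] + X_demo[:, 1] > 0).astype(int)

seed = 7
# Shuffled 10-fold split; the fold count is an assumption.
kfold = model_selection.KFold(n_splits=10, shuffle=True, random_state=seed)
model = DecisionTreeClassifier(random_state=seed)

# One accuracy score per fold; the mean is the spot-check figure.
scores = model_selection.cross_val_score(model, X_demo, Y_demo, cv=kfold)
print(f"mean accuracy: {scores.mean():.3f}")
```

The real columns include identifiers and IP addresses, which would need to be encoded as numbers before a tree can be fitted to them.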