Sudev sudevschiz

  • Soon enough!
  • Tokyo, Japan
@sudevschiz
sudevschiz / read_from_sql.R
Created December 7, 2018 06:47
How to read directly from .sql files into R.
#https://stackoverflow.com/a/44886192/3498721
#Function found in the linked Stack Overflow answer.
getSQL <- function(filepath){
  con = file(filepath, "r")
  sql.string <- ""
  while (TRUE){
    line <- readLines(con, n = 1)
    if ( length(line) == 0 ){
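The preview above is cut off. As a rough sketch of the same idea in Python (not the Stack Overflow function itself), the hypothetical helper `get_sql` below reads a .sql file line by line, drops `--` line comments, and joins what is left into a single query string. It assumes `--` never appears inside a quoted string literal.

```python
import os
import tempfile


def get_sql(filepath):
    """Read a .sql file and return its contents as one query string.

    Naive sketch: anything after "--" on a line is treated as a comment,
    so a "--" inside a quoted string literal would be cut off too.
    """
    parts = []
    with open(filepath, "r", encoding="utf-8") as handle:
        for line in handle:
            # Keep only the text before a line comment, if any.
            code = line.split("--", 1)[0].strip()
            if code:
                parts.append(code)
    return " ".join(parts)


if __name__ == "__main__":
    # Write a small throwaway query file to demonstrate the helper.
    with tempfile.NamedTemporaryFile("w", suffix=".sql", delete=False) as tmp:
        tmp.write("SELECT id, name -- columns\nFROM users\n\nWHERE id = 1;\n")
    print(get_sql(tmp.name))
    os.remove(tmp.name)
```

Joining with single spaces keeps the query on one line, which is usually what a database driver expects when it receives the string.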
@sudevschiz
sudevschiz / nps_soup.py
Created October 26, 2018 09:47
Code to scrape npsbenchmark.com website
from urllib.request import urlopen
from bs4 import BeautifulSoup
import time
import pandas as pd
import re
def open_list_page(url):
    try:
        html = urlopen(url).read()
@sudevschiz
sudevschiz / knn_sample.R
Created September 28, 2018 04:31
Simple code for running knn in R
train_vec <- sample(1:nrow(iris),size = 0.7*nrow(iris))
iris[,-length(iris)] <- lapply(iris[,-length(iris)], as.numeric)
tr <- iris[train_vec, -length(iris) ]
tr_label <- iris[train_vec, "Species"]
test <- iris[-train_vec, -length(iris)]
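The preview stops before the classification step. For illustration only, here is a minimal kNN sketch in plain Python. The helper `knn_predict` and the toy two-cluster data are assumptions standing in for iris; this is not the rest of the gist.

```python
import math
import random
from collections import Counter


def knn_predict(train_rows, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training rows."""
    # Euclidean distance from the query to every training row.
    distances = [
        (math.dist(row, query), label)
        for row, label in zip(train_rows, train_labels)
    ]
    distances.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    random.seed(0)
    # Toy stand-in for iris: two well-separated 2-D clusters.
    rows = [(random.gauss(0, 0.5), random.gauss(0, 0.5)) for _ in range(20)]
    rows += [(random.gauss(5, 0.5), random.gauss(5, 0.5)) for _ in range(20)]
    labels = ["a"] * 20 + ["b"] * 20

    # 70/30 split by sampling row indices, mirroring the R snippet.
    train_idx = set(random.sample(range(len(rows)), k=(7 * len(rows)) // 10))
    train_rows = [rows[i] for i in sorted(train_idx)]
    train_labels = [labels[i] for i in sorted(train_idx)]
    test_idx = [i for i in range(len(rows)) if i not in train_idx]

    hits = sum(
        knn_predict(train_rows, train_labels, rows[i]) == labels[i]
        for i in test_idx
    )
    print(f"accuracy: {hits / len(test_idx):.2f}")
```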
@sudevschiz
sudevschiz / copy_to_gcloud_vm.sh
Created September 4, 2018 07:19
Copy files from a local machine (Windows or Linux) to a VM in Google Cloud Compute Engine
#The easy way is to navigate to the folder that contains your files or folders and then run:
gcloud compute scp FILENAME sudevchirappat@gcmbox:"/home/sudevchirappat/"
#Note: change FILENAME to the file you need to upload.
#If an entire folder needs to be copied, use the --recurse flag.
gcloud compute scp --recurse FOLDERNAME sudevchirappat@gcmbox:"/home/sudevchirappat/"
@sudevschiz
sudevschiz / convert_date.R
Created September 4, 2018 07:01
Convert Excel time to POSIXct (R date-time object)
#x is the column that holds the Excel date-time serial numbers.
#A value generally looks like 43144 (February 13, 2018).
#The timezone should be the one Excel was opened with, IST in this specific example.
as.POSIXct(x*60*60*24, tz = "IST", origin = "1899-12-30")
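The same conversion can be sketched in Python, assuming the workbook uses Excel's default 1900 date system: the serial number counts days from 1899-12-30, and the fractional part is the time of day. The function name `excel_serial_to_datetime` is made up for this example, and the result is a naive datetime with no timezone attached.

```python
from datetime import datetime, timedelta

# Same origin the R call above passes to as.POSIXct.
EXCEL_ORIGIN = datetime(1899, 12, 30)


def excel_serial_to_datetime(serial):
    """Convert an Excel serial date (days since 1899-12-30) to a naive datetime."""
    return EXCEL_ORIGIN + timedelta(days=serial)


if __name__ == "__main__":
    print(excel_serial_to_datetime(43144))    # 2018-02-13 00:00:00
    print(excel_serial_to_datetime(43144.5))  # 2018-02-13 12:00:00
```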
@sudevschiz
sudevschiz / sentiment_extraction.R
Created August 22, 2018 08:02
Sentiment analysis code
library(coreNLP)
library(qdap)
#Load the textData from csv
textData <- read.csv("comments.csv")
initCoreNLP(libLoc = "C:/Users/schirappat/Documents/R/win-library/3.3/coreNLP/extdata/stanford-corenlp-full-2016-10-31",mem = "8g", type = "english")
#Restart the machine if a memory allocation error occurs
@sudevschiz
sudevschiz / twitter_scrape.py
Last active August 10, 2018 11:47
Better OOPed tweets extraction code
import tweepy
import jsonpickle
import json
import unicodecsv as csv
import pandas as pd
import numpy as np
import os
from datetime import date
@sudevschiz
sudevschiz / free_email_provider_domains.txt
Created July 10, 2018 06:25 — forked from tbrianjones/free_email_provider_domains.txt
A list of free email provider domains. Some of these are probably not around anymore. I've combined a dozen lists from around the web. Current "major providers" should all be in here as of the date this is created.
1033edge.com
11mail.com
123.com
123box.net
123india.com
123mail.cl
123qwe.co.uk
150ml.com
15meg4free.com
163.com
@sudevschiz
sudevschiz / ubuntu16_tensorflow_cuda8.sh
Created July 7, 2018 14:43 — forked from ksopyla/ubuntu16_tensorflow_cuda8.sh
How to set up tensorflow with CUDA 8 cuDNN 5.1 in virtualenv with Python 3.5 on Ubuntu 16.04 http://ksopyla.com/2017/02/tensorflow-gpu-virtualenv-python3/
# This is a shortened version of the blog post
# http://ksopyla.com/2017/02/tensorflow-gpu-virtualenv-python3/
# update packages
sudo apt-get update
sudo apt-get upgrade
#Add the ppa repo for NVIDIA graphics driver
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt-get update
@sudevschiz
sudevschiz / get_top10_datatable.R
Created June 26, 2018 06:41
Finding a data table metric (Average, Median, SD etc.) of top 10 rows
#This is pretty specific to the use case,
#but the method is worth documenting.
#Hopefully it will be useful sometime in the future.
require(data.table)
#dt is the data table available
#First find counts
temp1 <- dt[,.(COUNT = .N),by = Agent]
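The preview ends after the counting step. As a guess at the overall pattern (count rows per agent, keep the ten largest counts, then summarise them), here is a plain-Python sketch. The helper `top_n_count_stats` and the choice to summarise the counts themselves are assumptions, not the gist's actual code.

```python
import statistics
from collections import Counter


def top_n_count_stats(agents, n=10):
    """Count rows per agent, keep the n largest counts, and summarise them."""
    counts = Counter(agents)  # rows per agent, like .N grouped by Agent
    top = [count for _, count in counts.most_common(n)]
    return {
        "mean": statistics.mean(top),
        "median": statistics.median(top),
        # The sample standard deviation needs at least two values.
        "sd": statistics.stdev(top) if len(top) > 1 else 0.0,
    }


if __name__ == "__main__":
    # One entry per row of the table; each value is that row's agent.
    agent_column = ["ann"] * 5 + ["bob"] * 3 + ["cho"] * 1
    print(top_n_count_stats(agent_column, n=2))
```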