Surya Dutta suryadutta

View GitHub Profile
@suryadutta
suryadutta / trino_release_notes_scraper.py
Created December 1, 2022 22:23
Scrapes all release notes and links from Trino release notes
import sys
import requests
from bs4 import BeautifulSoup
from dataclasses import dataclass
from typing import List
SECTIONS_TO_SCRAPE = [
    "general",
    "security",
suryadutta / salesTaxByState.JSON
Created September 19, 2018 13:32
State Sales Tax Values
[
  {
    "State": "Alabama",
    "Abbreviation": "AL",
    "State Tax Rate": 0.04,
    "State Tax Rank": 40,
    "Local Tax Rate": 0.051,
    "Combined Tax Rate": 0.091,
    "Combined Rank": 5
  },
suryadutta / Error Bar Log-log
Created March 28, 2018 19:43
Log Log plots for matplotlib
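import matplotlib.pyplot as plt

# temp, mag and magstdev are assumed to be equal-length sequences defined earlier
fig = plt.figure()
ax1 = fig.add_subplot(111)
ax1.errorbar(temp, mag, yerr=magstdev, fmt='o')
ax1.set_xscale("log")
ax1.set_yscale("log")
plt.show()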
suryadutta / binning.py
Created March 3, 2018 20:37
How to bin your data
import numpy as np

def calculate_and_save_values(Msamp, Esamp, spin, num_analysis, index, temp, data_filename, corr_filename):
    try:
        # statistics use only the last num_analysis samples of each series
        M_mean = np.average(Msamp[-num_analysis:])
        E_mean = np.average(Esamp[-num_analysis:])
        M_std = np.std(Msamp[-num_analysis:])
        E_std = np.std(Esamp[-num_analysis:])
        M_std_array = []
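The preview above is cut off before the binning step itself, so the following is only a minimal sketch of one common binning approach, not the gist's actual code. It assumes the goal is to estimate the standard error of a mean from correlated samples: the series is split into non-overlapping bins, each bin is averaged, and the spread of the bin averages is used. The function name `binned_standard_error` and the sample values are hypothetical.

```python
import statistics

def binned_standard_error(samples, bin_size):
    """Standard error of the mean estimated from non-overlapping bin averages.

    Hypothetical helper: samples left over after the last full bin are dropped.
    """
    n_bins = len(samples) // bin_size
    if n_bins < 2:
        raise ValueError("need at least two full bins")
    bin_means = [
        statistics.fmean(samples[i * bin_size:(i + 1) * bin_size])
        for i in range(n_bins)
    ]
    # sample standard deviation of the bin means, divided by sqrt(number of bins)
    return statistics.stdev(bin_means) / n_bins ** 0.5

# Made-up, strongly correlated series: neighbouring values come in equal pairs.
samples = [1.0, 1.0, 3.0, 3.0, 1.0, 1.0, 3.0, 3.0]
naive = binned_standard_error(samples, 1)   # treats every sample as independent
binned = binned_standard_error(samples, 2)  # one bin per correlated pair
```

With this made-up series the binned estimate comes out larger than the naive one, which is the usual reason for binning: treating correlated samples as independent understates the error.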
suryadutta / corr_pseudo.py
Created February 22, 2018 01:10
Pseudocode to calculate nu (the correlation-length critical exponent)
correlation_lengths = []
correlation_uncertainties = []
for temp in temp_range:
    # read corr file, extract values for each temp
    # perform fit for values vs distance, extract correlation length and uncertainty
    # append correlation length and uncertainty to arrays
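The loop above can be sketched concretely. This is a minimal sketch, assuming the correlation function decays as C(r) ≈ A·exp(−r/ξ), so that a straight-line fit of ln C against r gives ξ = −1/slope. The helper name `fit_correlation_length`, the in-memory `corr_by_temp` dictionary (standing in for the correlation file, whose format the gist does not show), and the synthetic data are all hypothetical.

```python
import math

def fit_correlation_length(distances, correlations):
    """Least-squares line fit of ln(C) against r; returns (xi, xi_uncertainty).

    Hypothetical helper: assumes positive correlations and more than two points.
    """
    xs = list(distances)
    ys = [math.log(c) for c in correlations]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # standard error of the slope from the fit residuals
    residual_ss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    slope_err = math.sqrt(residual_ss / (n - 2) / sxx)
    xi = -1.0 / slope
    xi_err = slope_err / slope ** 2  # first-order error propagation for xi = -1/slope
    return xi, xi_err

# Synthetic stand-in for the per-temperature correlation data (made-up values).
distances = [1, 2, 3, 4, 5]
corr_by_temp = {
    2.0: [math.exp(-r / 2.0) for r in distances],  # true xi = 2.0
    2.5: [math.exp(-r / 1.0) for r in distances],  # true xi = 1.0
}

correlation_lengths = []
correlation_uncertainties = []
for temp in sorted(corr_by_temp):
    xi, xi_err = fit_correlation_length(distances, corr_by_temp[temp])
    correlation_lengths.append(xi)
    correlation_uncertainties.append(xi_err)
```

Extracting nu itself would need a further fit of ξ against |T − Tc|, since ξ scales as |T − Tc|^(−ν) near the critical temperature; that step is not shown here.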
suryadutta / async_series.js
Last active December 11, 2017 22:07
Async series
async.series([
  function(callback) {
    // add function here
    callback();
  },
], function(err, results) {
  response.redirect('');
});
suryadutta / async.js
Created December 4, 2017 23:12
async.js
router.get('/', function(req, res, next) {
  if (req.query.id) {
    asyncStuff.series([
      function(callback) {
        moduleModel.getPageItems(req.query.id, function(Data) {
          pageData = Data[0];
          console.log('task 1');
          callback();
        });
      },
suryadutta / reduceDTM.r
Created November 14, 2017 03:23
Reduce the size of a document-term matrix (DTM) to make computation faster
reduceDTM <- function(dtm) {
  term_tfidf <-
    tapply(dtm$v / row_sums(dtm)[dtm$i], dtm$j, mean) *
    log2(nDocs(dtm) / col_sums(dtm > 0))
  dtm <- dtm[, term_tfidf >= median(term_tfidf)]
  dtm <- dtm[row_sums(dtm) > 0, ]
  return(dtm)
}
suryadutta / vocab_extract.r
Last active November 13, 2017 02:20
Extract 10,000 useful words from CSV
#install.packages('tm')
library(tm)
#install.packages('slam')
library("slam")
#import data
alldata <- read.csv('stackexchange/20161215StatsPostsMerged.csv', header = TRUE, stringsAsFactors = FALSE)
#make corpus