shantanuo /
Created Jul 17, 2020
rewrite the code using pandas
import pandas as pd
import numpy as np
df = pd.read_csv("")
df = df.set_index(list(df.columns[:4]))
df = df.stack().reset_index()
df.columns = ["province", "country", "lat", "lon", "date", "n_death"]
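The reshape above can be checked by eye on a table small enough to read. The sketch below is only an illustration: the sample rows, the wide column labels, and the helper name `wide_to_long` are assumptions, and the CSV path left empty in the gist is not filled in.

```python
import pandas as pd


def wide_to_long(wide: pd.DataFrame) -> pd.DataFrame:
    """Turn one-column-per-date data into one row per (location, date)."""
    # The first four columns identify a location; every other column is a date.
    tidy = wide.set_index(list(wide.columns[:4])).stack().reset_index()
    tidy.columns = ["province", "country", "lat", "lon", "date", "n_death"]
    return tidy


# Hypothetical sample data, not taken from the original CSV.
wide = pd.DataFrame(
    {
        "Province/State": ["A", "B"],
        "Country/Region": ["X", "Y"],
        "Lat": [1.0, 2.0],
        "Long": [3.0, 4.0],
        "1/22/20": [0, 1],
        "1/23/20": [2, 3],
    }
)

tidy = wide_to_long(wide)
print(tidy)
```

Each input row yields one output row per date column, so two locations with two dates become four rows.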
shantanuo / spacy_error.txt
Created Jul 12, 2020
install spacy on ARM processor
# /root/miniforge3/bin/pip install spacy
Collecting spacy
Using cached spacy-2.3.1.tar.gz (5.9 MB)
Installing build dependencies ... error
ERROR: Command errored out with exit status 1:
command: /root/miniforge3/bin/python3.7 /root/miniforge3/lib/python3.7/site-packages/pip install --ignore-installed --no-user --prefix /tmp/pip-build-env-ahxo0t0p/overlay --no-warn-script-location --no-binary :none: --only-binary :none: -i -- setuptools wheel 'cython>=0.25' 'cymem>=2.0.2,<2.1.0' 'preshed>=3.0.2,<3.1.0' 'murmurhash>=0.28.0,<1.1.0' thinc==7.4.1
cwd: None
Complete output (196 lines):
Collecting setuptools
Downloading setuptools-49.1.2-py3-none-any.whl (789 kB)
shantanuo /
Created Jun 4, 2020
Extract google search history
import pandas as pd
import json
with open("BrowserHistory.json", "r") as read_file:
    developer = json.load(read_file)
df = pd.DataFrame(developer["Browser History"])
df["UNIXTIME"] = pd.to_datetime(df["time_usec"], unit="us")
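The timestamp conversion can be tried without an export file. In this minimal sketch the two records and their `title` field are made up for illustration; only the `"Browser History"` key and the `time_usec` field come from the gist.

```python
import json

import pandas as pd

# Hypothetical two-record export; a real file carries more fields per record.
raw = (
    '{"Browser History": ['
    '{"title": "a", "time_usec": 0}, '
    '{"title": "b", "time_usec": 1591228800000000}'
    "]}"
)

history = json.loads(raw)
df = pd.DataFrame(history["Browser History"])

# time_usec counts microseconds since the Unix epoch, hence unit="us".
df["UNIXTIME"] = pd.to_datetime(df["time_usec"], unit="us")
print(df)
```

The resulting timestamps are timezone-naive and expressed in UTC.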
shantanuo / bruhadkosh.txt
Created May 4, 2020
elasticsearch commands for better search results of marathi shabdakosh
DELETE bruhadkosh/
PUT bruhadkosh
{ "mappings": {
"properties": {
"kosh": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword"
shantanuo / out.txt
Created Apr 19, 2020
npm run develop
> amazon-kinesis-video-streams-webrtc@1.0.4 develop /tmp/amazon-kinesis-video-streams-webrtc-sdk-js
> webpack-dev-server --config
Package version: 1.0.4
Starting type checking service...
ℹ 「wds」: Project is running at http://localhost:3001/
ℹ 「wds」: webpack output is served from /
ℹ 「wds」: Content not from webpack is served from /tmp/amazon-kinesis-video-streams-webrtc-sdk-js/examples
Type checking in progress...
ℹ 「wdm」: Hash: 87997616ccb8b085f416
shantanuo / isitfit.txt
Created Jan 28, 2020
isitfit output for test account
(base) root@080ae74773e0:/# isitfit cost analyze
Profiles in AWS credential file:
- default
(use `AWS_PROFILE=myprofile isitfit ...` or `isitfit command --profile=myprofile ...` to skip this prompt)
Profile to use [default]:
Number of days to lookback (between 1 and 90, use `isitfit cost --ndays=7 ...` to skip this prompt) [7]:
EC2 instances, counting in all regions : 100%|███████████████████████████████████████████████████████████████████████████████| 18/18 [00:10<00:00, 2.49it/s]
Cloudtrail events in all regions : 100%|█████████████████████████████████████████████████████████████████████████████████| 1/1 [00:06<00:00, 6.73s/it]
shantanuo /
Created Dec 11, 2019
Find duplicate strings
import pandas as pd
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.cluster import KMeans
df = pd.read_excel('final_dupes_all.xlsx', sheet_name = 'all_records')
df.columns = [' xyz', ... ' flg_univ ', ]
df['mylen'] = df.college_name.str.len()
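The preview stops before any matching logic, so the following is not the gist's method. It is one possible sketch of flagging near-duplicate strings with the `TfidfVectorizer` the gist imports, using character n-grams and cosine similarity. The helper name `near_duplicate_pairs`, the 0.8 threshold, and the sample names are all assumptions.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def near_duplicate_pairs(names, threshold=0.8):
    """Return index pairs (i, j), i < j, whose TF-IDF cosine similarity >= threshold."""
    # Character n-grams tolerate small spelling and punctuation differences.
    vectorizer = TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 3))
    tfidf = vectorizer.fit_transform(names)
    sim = cosine_similarity(tfidf)
    return [
        (i, j)
        for i in range(len(names))
        for j in range(i + 1, len(names))
        if sim[i, j] >= threshold
    ]


# Hypothetical sample names, not taken from the spreadsheet in the gist.
names = [
    "Indian Institute of Technology Bombay",
    "Indian Institute of Technology, Bombay",
    "Pune University",
]
pairs = near_duplicate_pairs(names)
print(pairs)
```

The pairwise comparison is quadratic in the number of strings, so on a large sheet it may be worth grouping records first (for example with the `KMeans` import above) and comparing only within each group.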
shantanuo /
Created Nov 15, 2019
audit trail query to pandas dataframe
import pandas as pd
import numpy as np
import elasticsearch
from elasticsearch import helpers
myquery = 'your kibana query here...'
es_client = elasticsearch.Elasticsearch(
shantanuo /
Created Oct 28, 2019 — forked from treuille/
This demonstrates the st.cache function
import streamlit as st
import pandas as pd
# Reuse this data across runs!
read_and_cache_csv = st.cache(pd.read_csv)
data = read_and_cache_csv(BUCKET + "labels.csv.gz", nrows=1000)
desired_label = st.selectbox('Filter to:', ['car', 'truck'])
st.write(data[data.label == desired_label])
clf = Pipeline([("dct", DictVectorizer()), ("svc", LinearSVC())])
params = {
    "svc__C": [1e15, 1e13, 1e11, 1e9, 1e7, 1e5, 1e3, 1e1, 1e-1, 1e-3, 1e-5]
}
gs = GridSearchCV(clf, params, cv=10, verbose=2, n_jobs=-1)
gs.fit(X, y)  # X is an assumed name for the training features; only y survives in the source
model = gs.best_estimator_