Alexey Grigorev alexeygrigorev

@alexeygrigorev
alexeygrigorev / Dockerfile
Created February 15, 2019 12:53
Build libwebp for AWS Lambda
FROM amazonlinux:2017.03
RUN yum -y install git \
python36 \
python36-pip \
python36-devel \
zip \
gcc \
gcc-c++ \
cmake \
@alexeygrigorev
alexeygrigorev / tqdm_pool.py
Created December 6, 2018 15:36
Track progress of ProcessPoolExecutor with tqdm
from glob import glob
import multiprocessing
from concurrent.futures import ProcessPoolExecutor
import cv2
from PIL import Image
import imagehash
from tqdm import tqdm
@alexeygrigorev
alexeygrigorev / create_python_files.sh
Last active August 16, 2022 15:34
tf.make_tensor_proto
pip install grpcio-tools
wget https://github.com/tensorflow/tensorflow/archive/v1.9.0.zip -O tf-190.zip
unzip tf-190.zip && rm tf-190.zip
wget https://github.com/tensorflow/serving/archive/1.9.0.zip -O tf-serving-190.zip
unzip tf-serving-190.zip && rm tf-serving-190.zip
mv serving-1.9.0/tensorflow_serving tensorflow-1.9.0
@alexeygrigorev
alexeygrigorev / cloudwatch_logs_capturer.py
Created July 18, 2018 12:55
Capturing stdout and sending it to CloudWatch Logs
import sys
import time
import threading
from io import StringIO
from multiprocessing import Process
from queue import Queue
import boto3
from botocore.exceptions import ClientError
@alexeygrigorev
alexeygrigorev / mp_capture.py
Created July 16, 2018 08:35
Python stdout sharing between child & parent processes
import sys
import time
from io import StringIO
import subprocess
from multiprocessing import Process, Pipe
from threading import Thread
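One way to share a child's stdout with its parent is sketched below. It is not the gist's own code (the preview shows only the imports); it assumes the child is started with `subprocess.Popen` and that a reader thread drains the pipe line by line. The helper name `run_and_capture` is hypothetical.

```python
import subprocess
import sys
from threading import Thread


def run_and_capture(code):
    """Run a Python snippet in a child process and collect its stdout lines.

    Hypothetical helper for illustration; `code` is passed to `python -c`.
    """
    proc = subprocess.Popen(
        [sys.executable, "-c", code],
        stdout=subprocess.PIPE,
        text=True,
    )
    lines = []

    def pump():
        # Read until the child closes its end of the pipe.
        for line in proc.stdout:
            lines.append(line.rstrip("\n"))

    reader = Thread(target=pump)
    reader.start()
    proc.wait()      # wait for the child to exit
    reader.join()    # make sure every line has been read
    proc.stdout.close()
    return proc.returncode, lines


if __name__ == "__main__":
    code, out = run_and_capture("print('one'); print('two')")
    print(code, out)
```

Reading in a separate thread keeps the parent free to do other work while the child is still running; for short-lived children, `subprocess.run(..., capture_output=True)` would be simpler.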
@alexeygrigorev
alexeygrigorev / BeanToRecordConverter.java
Created January 17, 2018 13:45
Use reflection to write arbitrary Java beans to Parquet with Avro
package avro;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.reflect.ReflectData;
import java.util.ArrayList;
import java.util.List;
@alexeygrigorev
alexeygrigorev / bi-kmeans.py
Last active November 16, 2017 20:24
Bisecting K-Means
import heapq
import numpy as np
from sklearn.cluster import KMeans, MiniBatchKMeans
def sklearn_bisecting_kmeans_lineage(X, k, verbose=0):
    N, _ = X.shape
    # np.int was removed in NumPy 1.24; the builtin int is the equivalent dtype
    labels = np.zeros(N, dtype=int)
    lineage = np.zeros((k, N), dtype=int)
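The preview stops before the splitting loop, so the following is only a sketch of the general bisecting k-means idea, not the gist's implementation. It assumes one-dimensional data, uses a plain 2-means split with deterministic starting centers instead of scikit-learn's `KMeans`, and always bisects the largest remaining cluster, tracked with `heapq`. The names `two_means_1d` and `bisecting_kmeans_1d` are hypothetical.

```python
import heapq


def two_means_1d(values, iters=20):
    """Split floats into two groups with 2-means, starting from min and max."""
    lo, hi = min(values), max(values)
    left, right = list(values), []
    for _ in range(iters):
        left = [v for v in values if abs(v - lo) <= abs(v - hi)]
        right = [v for v in values if abs(v - lo) > abs(v - hi)]
        if not right:  # all points identical, nothing to split
            break
        new_lo = sum(left) / len(left)
        new_hi = sum(right) / len(right)
        if (new_lo, new_hi) == (lo, hi):  # centers stopped moving
            break
        lo, hi = new_lo, new_hi
    return left, right


def bisecting_kmeans_1d(values, k):
    """Repeatedly bisect the largest cluster until there are k clusters."""
    # Max-heap on cluster size (negated); the counter breaks ties.
    heap = [(-len(values), 0, list(values))]
    counter = 1
    while len(heap) < k:
        _, _, biggest = heapq.heappop(heap)
        left, right = two_means_1d(biggest)
        if not right:  # largest cluster cannot be split, so stop early
            heapq.heappush(heap, (-len(biggest), counter, biggest))
            break
        for part in (left, right):
            heapq.heappush(heap, (-len(part), counter, part))
            counter += 1
    return sorted(sorted(cluster) for _, _, cluster in heap)


if __name__ == "__main__":
    data = [1.0, 1.2, 0.8, 5.0, 5.2, 5.1, 9.0, 9.1]
    print(bisecting_kmeans_1d(data, 3))
    # [[0.8, 1.0, 1.2], [5.0, 5.1, 5.2], [9.0, 9.1]]
```

Choosing the largest cluster is one common splitting criterion; picking the cluster with the highest within-cluster sum of squares is another.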
@alexeygrigorev
alexeygrigorev / CountVectorizer.java
Last active February 12, 2017 21:50
Count Vectorizer
import java.io.Serializable;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import com.google.common.collect.HashMultiset;
import com.google.common.collect.Multiset;
import com.google.common.collect.Multiset.Entry;
import com.google.common.collect.Multisets;
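For readers unfamiliar with the technique, a count vectorizer can be sketched in a few lines. The example below is in Python rather than Java and is not a port of the gist: it assumes documents arrive already tokenized, and the function names and the `min_df` parameter (minimum document frequency) are chosen for illustration only.

```python
from collections import Counter


def fit_vocabulary(docs, min_df=1):
    """Map each term seen in at least `min_df` documents to a column index."""
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per document
    terms = sorted(t for t, n in doc_freq.items() if n >= min_df)
    return {term: index for index, term in enumerate(terms)}


def transform(docs, vocab):
    """Turn each tokenized document into a dense row of term counts."""
    rows = []
    for tokens in docs:
        counts = Counter(t for t in tokens if t in vocab)
        row = [0] * len(vocab)
        for term, n in counts.items():
            row[vocab[term]] = n
        rows.append(row)
    return rows


if __name__ == "__main__":
    docs = [["a", "b", "a"], ["b", "c"]]
    vocab = fit_vocabulary(docs)
    print(vocab)                   # {'a': 0, 'b': 1, 'c': 2}
    print(transform(docs, vocab))  # [[2, 1, 0], [0, 1, 1]]
```

Dense rows keep the sketch short; a real implementation would normally emit sparse vectors, since most entries are zero.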
@alexeygrigorev
alexeygrigorev / vimeo-download.py
Created September 17, 2016 09:09
Downloading segmented video from Vimeo
import requests
import base64
from tqdm import tqdm
master_json_url = 'https://178skyfiregce-a.akamaihd.net/exp=1474107106~acl=%2F142089577%2F%2A~hmac=0d9becc441fc5385462d53bf59cf019c0184690862f49b414e9a2f1c5bafbe0d/142089577/video/426274424,426274425,426274423,426274422/master.json?base64_init=1'
base_url = master_json_url[:master_json_url.rfind('/', 0, -26) + 1]
resp = requests.get(master_json_url)
content = resp.json()
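The `base_url` slice in the preview is terse, so here is a small illustration of what it computes. The trailing `/master.json?base64_init=1` is 26 characters long, so `rfind('/', 0, -26)` searches only the part of the URL before that suffix and finds the slash in front of the comma-separated ID list. The URL below is made up for the example and the wrapper name `base_url_of` is hypothetical.

```python
def base_url_of(master_json_url):
    """Same slice as in the preview, wrapped in a function for illustration."""
    # Search for the last '/' while ignoring the final 26 characters,
    # i.e. the '/master.json?base64_init=1' suffix.
    cut = master_json_url.rfind('/', 0, -26)
    return master_json_url[:cut + 1]


if __name__ == "__main__":
    # Made-up URL with the same shape as the one in the preview.
    example = ('https://cdn.example.invalid/token/12345/video/'
               '1,2,3/master.json?base64_init=1')
    print(base_url_of(example))
    # https://cdn.example.invalid/token/12345/video/
```

The slice depends on the suffix having exactly that length; a URL with different query parameters would need a different offset.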
@alexeygrigorev
alexeygrigorev / bnp-variable-relations.dot
Last active March 27, 2016 11:02
Visualization of linear relationships between numeric variables for the BNP Paribas Kaggle competition
strict digraph G {
nodesep=1;
center=true; margin=1;
splines=true;
sep=1;
node [height="0.33", width="0.33", fixedsize=true];
edge [len=1.5];
v1 -> v130
v1 -> v131