Dmitry Persiyanov persiyanov

persiyanov / masked_matmul.py
Created February 8, 2019 12:41
pytorch masked matmul with sparse mask
import torch
import torch.autograd
class MaskedSpMatmul(torch.autograd.Function):
    CHUNK_SIZE = 10000
    @staticmethod
    def forward(ctx, a, b, mask):
        """
persiyanov / results.md
Last active June 22, 2018 12:25
Word2Vec benchmark without _job_producer with CythonLineSentence
----- MODEL "cython-linesentence-word2vec-window-05-workers-01-size-300" RESULTS -----
       * Vocab time: 126.159779072 sec.
       * Total epoch time: 1181.82512498 sec.
       * Processing speed: 144372.118509 words/sec
       * Avg CPU loads: 0.14, 0.35, 5.27, 94.53, 0.09, 0.23, 0.01, 0.02, 0.02, 0.02, 0.02, 0.01, 0.02, 0.02, 0.33, 0.02
       * Sum CPU load: 101.11282
----- MODEL "cython-linesentence-word2vec-window-05-workers-04-size-300" RESULTS -----
       * Vocab time: 126.206352949 sec.
       * Total epoch time: 305.442888975 sec.
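The two total epoch times listed above are enough to estimate how well training scales from one worker to four. This is a small worked calculation using only those two reported numbers; the variable names are chosen for this example.

```python
# Total epoch times copied from the two result blocks above (seconds).
epoch_time_1_worker = 1181.82512498
epoch_time_4_workers = 305.442888975

# Speedup relative to the single-worker run.
speedup = epoch_time_1_worker / epoch_time_4_workers
# Fraction of ideal linear scaling achieved with 4 workers.
efficiency = speedup / 4

print("speedup: %.2fx, parallel efficiency: %.0f%%" % (speedup, 100 * efficiency))
```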
persiyanov / python_memmap.py
Created March 20, 2018 20:23
python_memmap.py
#!/usr/bin/env python
# -*- coding: utf-8 -*-
#
# Copyright (C) 2010 Radim Rehurek <radimrehurek@seznam.cz>
# Licensed under the GNU LGPL v2.1 - http://www.gnu.org/licenses/lgpl.html
"""Corpus in the Matrix Market format.
This code uses python's struct library to read/write binary data
persiyanov / gan.ipynb
Created January 4, 2017 14:34
MNIST GAN
persiyanov / gan.ipynb
Last active December 30, 2016 16:00
Original GAN on MNIST
persiyanov / loadsavelasagne.py
Created November 29, 2016 12:07
load/save weights in lasagne network
# Optionally, you could now dump the network weights to a file like this:
np.savez('model.npz', *lasagne.layers.get_all_param_values(network))
#
# And load them again later on like this:
with np.load('model.npz') as f:
    param_values = [f['arr_%d' % i] for i in range(len(f.files))]
lasagne.layers.set_all_param_values(network, param_values)
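The `arr_%d` keys used when loading come from how `np.savez` names arrays passed positionally. The following sketch shows the same save/load round trip with plain NumPy, assuming a hand-made list of arrays as a stand-in for `lasagne.layers.get_all_param_values(network)` and an in-memory buffer instead of `model.npz`.

```python
import io

import numpy as np

# Stand-in for lasagne.layers.get_all_param_values(network): a list of arrays.
param_values = [np.ones((2, 3)), np.zeros(3)]

# In-memory file used here instead of 'model.npz' (an assumption for the example).
buffer = io.BytesIO()
# Arrays passed positionally are stored under the keys arr_0, arr_1, ...
np.savez(buffer, *param_values)
buffer.seek(0)

with np.load(buffer) as f:
    # Rebuild the list in its original order from the numbered keys.
    restored = [f['arr_%d' % i] for i in range(len(f.files))]
```

Indexing by `arr_%d` rather than iterating over `f.files` makes the order of the restored list explicit, which matters because `set_all_param_values` expects the values in the same order they were extracted.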
persiyanov / frozenlake.py
Last active January 2, 2019 12:34
FrozenLake 8x8 Policy Iteration
import gym
import numpy as np
DISCOUNT = 1.0
STEP_REWARD = 0.0
LOSE_REWARD = 0.0
WIN_REWARD = 1.0
def avg_reward(env, s, a):
    avg_reward = 0
persiyanov / BagOfWordsModel.py
Last active July 12, 2022 16:00
Bag of Words model with ability to save in UCI format (useful for using in BigARTM)
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
import logging
class BagOfWordsModel(object):
    OUT_FOLDER = 'out'
    def __init__(self, id_document_dict, max_features=None, max_df=1.0):
        """Builds bow model.
persiyanov / howto.md
Last active October 21, 2021 15:35
How to get an Amazon EC2 instance and do machine learning on it, with a Jupyter 4.0.6 server and Python 2.7.

Goal

The goal is to move computation to a more powerful machine. We will set up Anaconda 4.0.0 and XGBoost 0.4 (which is tricky to install).

Preliminaries

Let's start

AWS Console and launching EC2 Instance.