Alexander Ignatyev aligusnet

@aligusnet
aligusnet / ncdc.sh
Last active December 11, 2021 19:56
Download a weather dataset from the National Climatic Data Center (NCDC, http://www.ncdc.noaa.gov/) and prepare it for the examples in the book "Hadoop: The Definitive Guide" by Tom White (http://www.amazon.com/Hadoop-Definitive-Guide-Tom-White/dp/1449311520). Usage: ./ncdc.sh 1901 1930 # download weather datasets for the period from 1901 to 1930.
#!/usr/bin/env bash
# global parameters
g_tmp_folder="ncdc_tmp";
g_output_folder="ncdc_data";
g_remote_host="ftp.ncdc.noaa.gov";
g_remote_path="pub/data/noaa";
@aligusnet
aligusnet / logical_cpp.cpp
Last active May 31, 2020 10:27
Logical C++
// alignas (C++11, C++14)
class alignas(32) Foo { /* ... */ };
// the address &foo is aligned to a 32-byte boundary
Foo foo{};
// in C++11/C++14 it is not guaranteed that pFoo is aligned to a 32-byte boundary
Foo* pFoo = new Foo();
@aligusnet
aligusnet / Enable-Python.ps1
Created February 18, 2019 22:06
The Enable-Python script switches between different versions of Python installed on a Windows machine
# copy the code to $PROFILE.CurrentUserAllHosts
# and restart PowerShell.
# Now you can switch between different versions of Python using EnablePython <Version>
# or short aliases: py36, py37, py38
Function Get-OriginalPath {
    if (-not $env:ORIGINAL_PATH) {
        $env:ORIGINAL_PATH = $env:PATH
    }
    return $env:ORIGINAL_PATH
@aligusnet
aligusnet / svd.py
Last active June 30, 2018 11:57
SVD using Python
"""
Principal component analysis (PCA)
>>> np.random.seed(111)
>>> m = np.array([[3, 2], [2, 6]])
>>> frobenius_norm(m)
7.280109889280518
>>> power_iter(m, np.ones((2,1)))
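
The preview cuts the doctest off here, so the gist's own `frobenius_norm` and `power_iter` are not visible. The following is a minimal pure-Python sketch of what such helpers could look like, assuming `power_iter` estimates the dominant eigenpair by repeated multiplication and normalization. The signatures, the `iterations` parameter, the `mat_vec` helper, and the return value are assumptions for illustration, not the author's code.

```python
import math


def frobenius_norm(m):
    """Square root of the sum of squared entries of a matrix (list of rows)."""
    return math.sqrt(sum(x * x for row in m for x in row))


def mat_vec(m, v):
    """Multiply matrix m (list of rows) by vector v (hypothetical helper)."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def power_iter(m, v, iterations=100):
    """Estimate the dominant eigenvalue and unit eigenvector of m.

    Assumes m has a single eigenvalue of largest magnitude and that the
    starting vector v is not orthogonal to its eigenvector.
    """
    for _ in range(iterations):
        w = mat_vec(m, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient of the unit vector v gives the eigenvalue estimate.
    eigenvalue = sum(a * b for a, b in zip(v, mat_vec(m, v)))
    return eigenvalue, v
```

For the matrix in the doctest, `[[3, 2], [2, 6]]`, the eigenvalues are 7 and 2, so a sketch like this should settle on an eigenvalue close to 7 with an eigenvector proportional to `(1, 2)`.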
@aligusnet
aligusnet / multi_threading.cpp
Created April 24, 2017 13:16
C++ multi-threading crash course.
#include <iostream>
#include <future>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <queue>
// One Way Channel a.k.a. Message Queue
template <typename T>
@aligusnet
aligusnet / processOutputMulti.hs
Created January 21, 2017 11:37
Process outputs for Multiclass Classification.
import qualified Data.Vector.Storable as V
import qualified Numeric.LinearAlgebra as LA
-- | Process outputs for Multiclass Classification.
-- Takes number of labels and output vector y.
-- Returns matrix of binary outputs (One-vs-All Classification).
-- It is assumed that labels are integers starting at 0.
processOutputMulti :: Int -> Vector -> Matrix
processOutputMulti numLabels y = LA.fromColumns $ map f [0 .. numLabels-1]
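
The helper `f` is not shown in the preview. As a rough illustration of the One-vs-All encoding the comment describes, here is a small Python sketch that assumes column `k` of the result holds 1 wherever the label equals `k` and 0 elsewhere. The name `process_output_multi` and the list-of-rows representation are hypothetical, chosen only to mirror the Haskell signature.

```python
def process_output_multi(num_labels, y):
    """Return a len(y) x num_labels matrix of binary outputs (one-vs-all).

    Row i corresponds to sample i; column k is 1 if y[i] == k, else 0.
    Labels are assumed to be integers starting at 0.
    """
    return [[1 if label == k else 0 for k in range(num_labels)] for label in y]


if __name__ == "__main__":
    for row in process_output_multi(3, [0, 2, 1, 2]):
        print(row)
```

Each column can then serve as the target vector for one binary classifier.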
@aligusnet
aligusnet / img_lib.sh
Last active December 27, 2015 22:59
The script creates a web library of images
#!/usr/bin/env bash
IFS="$(printf '\n\t')" # split only on newline and tab so pathnames containing spaces stay intact
# function real_path (path)
function real_path () {
    [[ $1 = /* ]] && echo "$1" || echo "$PWD/${1#./}"
}
# function build_href (caption, url)
@aligusnet
aligusnet / intersect.py
Created October 6, 2013 06:40
Finding the points of intersection of two circles.
#!/usr/bin/env python
# Finding the points of intersection of two circles.
import math
class Point:
    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y
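
The rest of the script is not shown in the preview. One common way to compute the intersection points is sketched below, using plain coordinate tuples rather than the `Point` class above. The function name `circle_intersections` and its argument order are assumptions, and this is not necessarily the approach the gist takes.

```python
import math


def circle_intersections(x0, y0, r0, x1, y1, r1):
    """Return the intersection points of two circles as a list of (x, y).

    The list is empty when the circles are separate, nested, or concentric,
    holds one point when they touch, and two points otherwise.
    """
    dx, dy = x1 - x0, y1 - y0
    d = math.hypot(dx, dy)  # distance between the centers
    if d == 0 or d > r0 + r1 or d < abs(r0 - r1):
        return []
    # Distance from the first center to the chord through both points.
    a = (r0 * r0 - r1 * r1 + d * d) / (2 * d)
    # Half the chord length; clamp to guard against rounding below zero.
    h = math.sqrt(max(r0 * r0 - a * a, 0.0))
    # Foot of the chord on the line joining the centers.
    xm, ym = x0 + a * dx / d, y0 + a * dy / d
    p1 = (xm + h * dy / d, ym - h * dx / d)
    p2 = (xm - h * dy / d, ym + h * dx / d)
    return [p1] if h == 0 else [p1, p2]
```

For two circles of radius 5 centered at (0, 0) and (8, 0), the chord lies on the line x = 4 and the two points are 3 units above and below the x-axis.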
@aligusnet
aligusnet / hadoop_pipes.sh
Created September 8, 2013 13:24
Build Hadoop pipes for OS X 10.8. If the pipes linker fails with "Cannot find libssl.so", try replacing LIBS="-lssl $LIBS" with LIBS="-lcrypto $LIBS" on lines #4744 and #4808 of configure. Don't forget to add -lcrypto if you build an app with -lhadooppipes.
#!/usr/bin/env bash
export PLATFORM=darwin-amd64-64
# build and install utils
cd $HADOOP_INSTALL/src/c++/utils
chmod +x configure install-sh
./configure --prefix=$HADOOP_INSTALL/c++/$PLATFORM
make install