Darjan Salaj dsalaj

dsalaj /
Created June 26, 2020 08:48
Cheatsheet for pyspark
# filter with strings
# [Row(age=2, name='Alice')]
# order with null values at the end
# [Row(name='Tom'), Row(name='Alice'), Row(name=None)]
# filter by null
dsalaj /
Created March 23, 2020 18:05
Different ways of splitting tensorflow dataset
def split_dataset(ds, version=1):
    if version == 1:
        # Shards 0-2 form the training set; shard 3 is held out for validation.
        train_ds = ds.dataset.shard(num_shards=4, index=0)
        # concatenate() returns a new dataset, so the result must be reassigned.
        train_ds = train_ds.concatenate(ds.dataset.shard(num_shards=4, index=1))
        train_ds = train_ds.concatenate(ds.dataset.shard(num_shards=4, index=2))
        valid_ds = ds.dataset.shard(num_shards=4, index=3)
        return train_ds, valid_ds
    elif version == 2:
        def is_val(x, y):
dsalaj /
Created February 21, 2020 09:15
Example of slurm job script. Start with: sbatch
#SBATCH --job-name=GSC # Job name
#SBATCH --mail-type=END,FAIL # Mail events (NONE, BEGIN, END, FAIL, ALL)
#SBATCH # Where to send mail
#SBATCH --output=slurm_out_%j.log # Standard output and error log
#SBATCH --nodes=1
#SBATCH --exclusive
#SBATCH --partition=IGIcrunchers
conda activate venv2
dsalaj /
Created February 7, 2020 09:47
Example of usage with parametrized generator
import tensorflow as tf
x_train = [i for i in range(0, 20, 2)] # even
x_val = [i for i in range(1, 20, 2)] # odd
y_train = [i**2 for i in x_train] # squared
y_val = [i**2 for i in x_val]
def gen_data_epoch(test=False): # parametrized generator
    train_data = x_val if test else x_train
    label_data = y_val if test else y_train
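The preview above is cut off after the two lines that select inputs and labels by the `test` flag. A minimal sketch of how such a parametrized generator might be completed, in plain Python with no TensorFlow dependency. The loop body and the `(x, y)` pair format are assumptions, not the remaining lines of the original gist.

```python
x_train = list(range(0, 20, 2))    # even inputs
x_val = list(range(1, 20, 2))      # odd inputs
y_train = [i**2 for i in x_train]  # squared labels
y_val = [i**2 for i in x_val]

def gen_data_epoch(test=False):
    """Yield (input, label) pairs for one epoch; `test` selects the validation split."""
    train_data = x_val if test else x_train
    label_data = y_val if test else y_train
    # Assumed body: pair each input with its label, one sample at a time.
    for x, y in zip(train_data, label_data):
        yield x, y

train_pairs = list(gen_data_epoch())
val_pairs = list(gen_data_epoch(test=True))
```

When feeding a generator like this into TensorFlow, the flag can be forwarded through the `args` parameter of `tf.data.Dataset.from_generator`; whether the original gist does so is not visible in the preview.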
dsalaj /
Created November 30, 2019 21:39

Keybase proof

I hereby claim:

  • I am dsalaj on github.
  • I am dsalaj ( on keybase.
  • I have a public key ASBBDtsHlfUOlqCUF48dL0qkNY-lWwrLC2dbOWrHjNMYrwo

To claim this, I am signing this object:

dsalaj /
Created November 13, 2019 08:06
Calculate mean of list of arrays with different lengths (useful for plotting progress of incomplete simulation runs)
import numpy as np
x = [1, 2, 3.5, 4]
y = [1, 2, 3, 3, 4, 5, 3]
z = [7, 8]
arrs = [x, y, z]
def tolerant_mean(arrs):
    # arrs = [x, y, z]
    lens = [len(i) for i in arrs]
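The preview stops after computing the lengths. One way to finish the idea, sketched here under the assumption that missing tail values should simply be ignored at each position: pad every sequence with NaN and average with `np.nanmean`. This is an illustrative variant and may differ from the approach the original gist takes.

```python
import numpy as np

def tolerant_mean(arrs):
    """Column-wise mean over sequences of unequal length, ignoring missing tail values."""
    lens = [len(a) for a in arrs]
    # Pad every sequence with NaN up to the longest length.
    padded = np.full((len(arrs), max(lens)), np.nan)
    for row, a in enumerate(arrs):
        padded[row, :len(a)] = a
    # nanmean averages only the entries that exist at each position.
    return np.nanmean(padded, axis=0)

x = [1, 2, 3.5, 4]
y = [1, 2, 3, 3, 4, 5, 3]
z = [7, 8]
mean = tolerant_mean([x, y, z])
```

The result has the length of the longest run, so it can be plotted directly even while shorter runs are still in progress.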
dsalaj /
Created September 16, 2019 08:39
Setup python jupyter notebooks for editing over SSH
# Steps for setting up python jupyter notebook for editing over SSH
# this is not a runnable script as different commands need to be executed on different machines
ssh username@remotepc123
# make sure jupyter is installed
pip install jupyter
# start jupyter on the specified port in no-browser mode
jupyter notebook --no-browser --port=8080
# copy the url with token that looks something like this:
dsalaj /
Last active December 9, 2019 08:26
Crossing Threshold Encoding of pixel values to spikes
def find_onset_offset(y, threshold):
    """
    Given the input signal `y` with samples,
    find the indices where `y` increases and decreases through the value `threshold`.
    Return stacked binary arrays of shape `y` indicating onset and offset threshold crossings.
    `y` must be a 1-D numpy array.
    """
    if threshold == 1:
        equal = y == threshold
        transition_touch = np.where(equal)[0]
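The docstring above describes marking where a 1-D signal rises and falls through a threshold. A simplified sketch of that idea follows. The function name `crossing_sketch` is hypothetical, treating a sample equal to the threshold as "above" is an assumption, and the original's special case for `threshold == 1` is not reproduced.

```python
import numpy as np

def crossing_sketch(y, threshold):
    """Return a (2, len(y)) binary array: row 0 marks onsets, row 1 marks offsets."""
    y = np.asarray(y)
    above = y >= threshold
    onset = np.zeros(y.shape, dtype=int)
    offset = np.zeros(y.shape, dtype=int)
    # An onset is a sample at or above threshold whose predecessor was below it.
    onset[1:] = above[1:] & ~above[:-1]
    # An offset is a sample below threshold whose predecessor was at or above it.
    offset[1:] = ~above[1:] & above[:-1]
    return np.stack([onset, offset])

signal = np.array([0.0, 0.2, 0.6, 0.9, 0.4, 0.1, 0.7])
spikes = crossing_sketch(signal, threshold=0.5)
```

Each row can then be read as a spike train: a 1 in the first row is an upward crossing, a 1 in the second row a downward crossing.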
# # First create and activate conda python3 environment:
# conda create -n video python=3.6
# conda activate video
# # Then install the requirements:
# conda install ffmpeg
# conda install tensorflow-gpu==1.13.1
# pip install tensorflow_datasets
# # The code below would still produce an error because of the missing file ("ucf101_labels.txt")
# # So manually download the "ucf101_labels.txt" and put it in place:
# cd "/home/$USER/anaconda3/envs/video/lib/python3.6/site-packages/tensorflow_datasets/video/"
dsalaj /
Last active February 1, 2019 09:33
shell commands used to extract and downsample video to numpy array
# install Anaconda to control the environment:
chmod +x
# answer to the installation prompts
# activate environment and install the required libraries
conda create -n vid2frame
conda activate vid2frame
conda install opencv scipy