Elad Nachmias eladn

  • CS Faculty @ Technion
@eladn
eladn / subsets_batch_sampler.py
Created October 18, 2023 10:15
PyTorch batch sampler for non-balanced subsets
__author__ = "Elad Nachmias"
__email__ = "eladnah@gmail.com"
__date__ = "2023-10-18"
import dataclasses
import enum
import itertools
import warnings
from collections import defaultdict, Counter
from typing import Optional, Dict, List, Sequence, Iterable, Union, Set, Tuple
eladn / deterministic_structured_hash.py
Last active April 20, 2024 11:59
Python util for calculating a deterministic hash for a structured object
__author__ = "Elad Nachmias"
__email__ = "eladnah@gmail.com"
__date__ = "2023-10-18"
import base64
import functools
import hashlib
from typing import Any, Set, Optional, NamedTuple, Dict
import dataclasses
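The preview above shows only the imports, so the gist's actual approach is not visible here. As a rough sketch of one way to get a deterministic hash for a structured object, the code below first builds an order-independent canonical string and then hashes it with SHA-256. This sidesteps Python's built-in `hash()`, whose value for strings changes between interpreter runs by default. The names `canonical_repr` and `structured_hash` are hypothetical and are not taken from the gist.

```python
import base64
import dataclasses
import hashlib
from typing import Any


def canonical_repr(obj: Any) -> str:
    """Hypothetical helper: build an order-independent textual form of `obj`."""
    if dataclasses.is_dataclass(obj) and not isinstance(obj, type):
        # Dataclass instance: class name plus its fields handled as a dict.
        fields = {f.name: getattr(obj, f.name) for f in dataclasses.fields(obj)}
        return f"{type(obj).__name__}{canonical_repr(fields)}"
    if isinstance(obj, dict):
        # Sort entries so that insertion order does not affect the result.
        items = sorted((canonical_repr(k), canonical_repr(v)) for k, v in obj.items())
        return "{" + ",".join(f"{k}:{v}" for k, v in items) + "}"
    if isinstance(obj, (set, frozenset)):
        # Sets have no stable iteration order, so sort the element forms.
        return "set(" + ",".join(sorted(canonical_repr(x) for x in obj)) + ")"
    if isinstance(obj, (list, tuple)):
        # Sequences keep their order; the type name separates list from tuple.
        return type(obj).__name__ + "(" + ",".join(canonical_repr(x) for x in obj) + ")"
    if obj is None or isinstance(obj, (bool, int, float, str, bytes)):
        return f"{type(obj).__name__}:{obj!r}"
    raise TypeError(f"unsupported type: {type(obj).__name__}")


def structured_hash(obj: Any) -> str:
    """Hypothetical entry point: URL-safe base64 of the SHA-256 digest."""
    digest = hashlib.sha256(canonical_repr(obj).encode("utf-8")).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii")
```

With this sketch, `structured_hash({"a": 1, "b": [1, 2]})` and `structured_hash({"b": [1, 2], "a": 1})` give the same string, while `[1, 2]` and `(1, 2)` hash differently because the type name is part of the canonical form.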
eladn / timeout.py
Last active October 18, 2023 10:20
Python util for efficiently executing a function with a timeout limit
__author__ = "Elad Nachmias"
__email__ = "eladnah@gmail.com"
__date__ = "2023-10-18"
import ctypes
import math
import signal
import threading
import time
import traceback
eladn / find_permutations.cpp
Last active October 10, 2022 13:56
Algorithm for iterating over all permutations of a given string, implemented in C++. It maps a permutation index (in [0..n!-1]) to the corresponding permutation using the factorial number system, which allows iterating over all permutations with a single flat loop (no recursion).
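#include <cstdint>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>
using namespace std;
void permutations(string str) {
    // factorials[i] = i! for i in [0..|str|]
    // (std::vector instead of a variable-length array, which is not standard C++)
    vector<uint64_t> factorials(str.length() + 1);
    factorials[0] = 1;
    for(size_t i = 1; i <= str.length(); i++) {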
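The C++ preview is cut off before the index-to-permutation step. The idea named in the description can be sketched in Python as follows: the digits of the index in the factorial number system select, one position at a time, which of the remaining characters to take. The names `nth_permutation` and `permutations` are illustrative, and the sketch is not a translation of the gist's code.

```python
from math import factorial
from typing import Iterator


def nth_permutation(items: str, index: int) -> str:
    """Return the permutation of `items` at position `index` in [0, n!-1]."""
    n = len(items)
    if not 0 <= index < factorial(n):
        raise ValueError("index out of range")
    pool = list(items)  # characters not yet placed
    result = []
    for position in range(n, 0, -1):
        # The next factorial-base digit says which remaining character to take.
        digit, index = divmod(index, factorial(position - 1))
        result.append(pool.pop(digit))
    return "".join(result)


def permutations(items: str) -> Iterator[str]:
    """Iterate over all permutations with a single flat loop over the index."""
    for index in range(factorial(len(items))):
        yield nth_permutation(items, index)
```

For a sorted input such as `"abc"`, this ordering is lexicographic: index 0 gives `"abc"` and index 5 gives `"cba"`.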
eladn / pretify_time.py
Created November 9, 2020 12:23
Python prettify time auxiliary function
import math
__all__ = ['format_time', 'find_pretty_time_fmt', 'prettify_time']
possible_time_fmts = ('secs', 'mins', 'hours', 'days', 'months', 'years', 'thousand years', 'million years')
time_fmts_divisors = {
    'secs': 1, 'mins': 60, 'hours': 60*60, 'days': 60*60*24, 'months': 60*60*24*30,
    'years': 60*60*24*365, 'thousand years': 60*60*24*365*1_000,
    'million years': 60*60*24*365*1_000_000}
assert set(time_fmts_divisors.keys()) == set(possible_time_fmts)
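The functions listed in `__all__` are not visible in this preview. As a rough illustration of how a divisor table like the one above might be used, the sketch below picks the largest unit whose divisor does not exceed the duration and formats the scaled value. `TIME_UNIT_DIVISORS`, `pick_unit` and `pretty_time` are hypothetical names; the gist's own functions may behave differently.

```python
# Unit table copied from the gist preview; divisors are in ascending order.
TIME_UNIT_DIVISORS = {
    'secs': 1, 'mins': 60, 'hours': 60*60, 'days': 60*60*24, 'months': 60*60*24*30,
    'years': 60*60*24*365, 'thousand years': 60*60*24*365*1_000,
    'million years': 60*60*24*365*1_000_000}


def pick_unit(seconds: float) -> str:
    """Hypothetical helper: largest unit whose divisor does not exceed `seconds`."""
    best = 'secs'  # fallback for durations under one second
    for unit, divisor in TIME_UNIT_DIVISORS.items():
        if seconds >= divisor:
            best = unit
    return best


def pretty_time(seconds: float) -> str:
    """Hypothetical formatter: scale `seconds` to the chosen unit, two decimals."""
    unit = pick_unit(seconds)
    return f"{seconds / TIME_UNIT_DIVISORS[unit]:.2f} {unit}"
```

For example, `pretty_time(90)` returns `'1.50 mins'` and `pretty_time(86400)` returns `'1.00 days'`.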
eladn / scatter_attention.py
Created October 19, 2020 19:24
PyTorch scatter attention module (using torch_scatter)
import torch
import torch.nn as nn
from typing import Optional, Tuple
from torch_scatter import scatter_sum, scatter_softmax
__all__ = ['ScatterAttention']
class ScatterAttention(nn.Module):
eladn / weave_tensors.py
Last active October 5, 2020 21:23
PyTorch auxiliary functions to (un)weave tensors
import torch
from typing import List, Tuple, Union
__all__ = ['weave_tensors', 'unweave_tensor']
def weave_tensors(
        tensors: Union[List[torch.Tensor], Tuple[torch.Tensor, ...]], dim: int = 0):
    assert len(tensors) > 0
eladn / async_notify_run.py
Created August 6, 2020 11:57
Python async wrapper for NotifyRun package
import multiprocessing as mp
__all__ = ['AsyncNotifyChannel']
class CloseNotifyProcessAction:
    pass
eladn / pytorch_chunked_random_access_dataset.py
Last active November 29, 2020 16:58
PyTorch scalable, shuffle-friendly Dataset. Supports efficient random access to any example by its index. The dataset is automatically stored in chunks to support large datasets. Includes a dataset writer.
import os
import io
import torch
import dbm
import itertools
import numpy as np
from warnings import warn
from typing import Optional, Mapping, ByteString
from torch.utils.data.dataset import Dataset
eladn / trie.py
Created August 1, 2020 10:42
Generic Trie data structure implemented in Python (with additional aux methods and tests)
import functools
import dataclasses
from typing import TypeVar, Generic, List, Tuple, Dict, Iterator
__all__ = ['TrieNode']
SequenceElementType = TypeVar('SequenceElementType')
SequenceType = Tuple[SequenceElementType, ...]
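The gist's `TrieNode` class itself is not visible in this preview. As a minimal sketch of the data structure named in the description, the code below implements a generic trie over sequences of hashable elements, with insertion, a membership test and prefix enumeration. The class `SimpleTrieNode` and its methods are hypothetical and do not reproduce the gist's API.

```python
from typing import Dict, Generic, Hashable, Iterator, Optional, Sequence, Tuple, TypeVar

T = TypeVar('T', bound=Hashable)


class SimpleTrieNode(Generic[T]):
    """Hypothetical minimal trie node; not the gist's TrieNode."""

    def __init__(self) -> None:
        self.children: Dict[T, 'SimpleTrieNode[T]'] = {}
        self.is_terminal = False  # True if an inserted sequence ends here

    def insert(self, sequence: Sequence[T]) -> None:
        node = self
        for element in sequence:
            child = node.children.get(element)
            if child is None:
                child = SimpleTrieNode()
                node.children[element] = child
            node = child
        node.is_terminal = True

    def _find(self, prefix: Sequence[T]) -> Optional['SimpleTrieNode[T]']:
        # Walk down the trie; return None as soon as the path does not exist.
        node: Optional['SimpleTrieNode[T]'] = self
        for element in prefix:
            node = node.children.get(element)
            if node is None:
                return None
        return node

    def __contains__(self, sequence: Sequence[T]) -> bool:
        node = self._find(sequence)
        return node is not None and node.is_terminal

    def iter_with_prefix(self, prefix: Sequence[T]) -> Iterator[Tuple[T, ...]]:
        """Yield every inserted sequence that starts with `prefix` (any order)."""
        start = self._find(prefix)
        if start is None:
            return
        stack = [(tuple(prefix), start)]
        while stack:
            path, node = stack.pop()
            if node.is_terminal:
                yield path
            for element, child in node.children.items():
                stack.append((path + (element,), child))
```

After inserting `"car"`, `"cat"` and `"dog"`, the test `"car" in trie` is true, `"ca" in trie` is false because no inserted sequence ends there, and `iter_with_prefix("ca")` yields the tuples for `car` and `cat`.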