Jaivarsan greed2411

@greed2411
greed2411 / profile_malloc.py
Last active November 7, 2023 06:26
finding memory allocations (and memory leak) in python source code, how many times, size in bytes with tracemalloc
"""
Python script to find the memory allocations
i. over a module/function that is running
ii. between the loops
It is a script so that it can be torn apart for your own use-cases.
Based on this nice article on trying to find memory leaks in python:
https://www.fugue.co/blog/diagnosing-and-fixing-memory-leaks-in-python.html
But the code is non-existent and isn't reproducible as of 2023.
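A minimal sketch of the kind of measurement this gist describes, using only the standard-library `tracemalloc` module. The names `build_payload` and `measure_allocations` are hypothetical and chosen for illustration; this is not the gist's own code, which is truncated in this listing.

```python
import tracemalloc


def build_payload(n):
    # Hypothetical workload: n separate 1000-byte objects kept alive by the list.
    return [bytes(1000) for _ in range(n)]


def measure_allocations(func, *args):
    """Run func(*args) and return (result, per-line allocation diffs)."""
    tracemalloc.start()
    before = tracemalloc.take_snapshot()
    result = func(*args)  # keep a reference so the memory is still live
    after = tracemalloc.take_snapshot()
    tracemalloc.stop()
    # Each StatisticDiff carries size_diff (bytes) and count_diff (blocks),
    # grouped here by source file and line number.
    return result, after.compare_to(before, "lineno")


if __name__ == "__main__":
    _, stats = measure_allocations(build_payload, 1000)
    for stat in stats[:5]:
        print(stat)
```

To look for growth between loop iterations, the same idea would apply with one snapshot taken per iteration and each compared against the previous one.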
@greed2411
greed2411 / circular_buffer.cpp
Created July 29, 2021 14:30 — forked from edwintcloud/circular_buffer.cpp
Circular Buffer in C++
//===================================================================
// File: circular_buffer.cpp
//
// Desc: A Circular Buffer implementation in C++.
//
// Copyright © 2019 Edwin Cloud. All rights reserved.
//
//===================================================================
//-------------------------------------------------------------------
@greed2411
greed2411 / letters.txt
Created July 7, 2021 10:59
Split, Iterate & Sort on Tamil Strings using Python3
@greed2411
greed2411 / random_sorted_ints_generator.py
Created July 5, 2021 09:05
Want to generate sorted random numbers optimally? Here you go.
"""
Want to generate sorted random numbers optimally? Here you go.
Traditionally, if you wanted a list of sorted random numbers,
you would generate `n` random numbers and then sort them, which in the
worst case has:
Time Complexity : O(n log n) # for generating & sorting.
Space Complexity: O(n) # for storing all of them in array/set.
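For comparison, here is a hedged sketch of the generate-then-sort baseline described above, next to one well-known streaming alternative that yields uniform samples already in sorted (descending) order. The streaming function is an assumption for illustration and is not necessarily the technique this gist uses, since its body is truncated here; both function names are hypothetical.

```python
import random


def sorted_baseline(n, rng=random):
    """Generate-then-sort: O(n log n) time, O(n) space."""
    return sorted(rng.random() for _ in range(n))


def sorted_uniforms_desc(n, rng=random):
    """Yield n uniform(0, 1) samples in non-increasing order, one at a time.

    Uses O(1) extra space: only the most recent value is kept.
    """
    current = 1.0
    for remaining in range(n, 0, -1):
        # The maximum of `remaining` uniforms on [0, current] is distributed
        # as current * U ** (1 / remaining) for a fresh uniform U.
        current *= rng.random() ** (1.0 / remaining)
        yield current


if __name__ == "__main__":
    print(list(sorted_uniforms_desc(5)))
```

The generator never stores more than one value, so a consumer can process the stream without holding all `n` numbers in memory.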
@greed2411
greed2411 / random_wo_replacement_generator.py
Last active July 4, 2021 20:21
Python Bloom Filter based random-int-without-replacement-generator-given-a-range.
"""
This gist comes out of frustration that I couldn't have a
random-int-without-replacement-generator-given-a-range.
random.sample, and np.random.choice(replace=False) both fail on really large numbers.
Python crashes saying OOM & segfaults.
The problem is that they are optimized for smaller `n` (<5 million), with
linear/sub-linear time and linear storage (set, pool-tracking list).
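A rough sketch of the idea named in the gist title: track already-emitted values in a fixed-size Bloom filter rather than a set, so memory does not grow with the range. Everything here (`BloomFilter`, `unique_random_ints`, the default sizes, the SHA-256 double-hashing scheme) is an assumption for illustration and not the gist's actual code. A Bloom filter has false positives but no false negatives, so some never-emitted values may be skipped, but duplicates are not produced.

```python
import hashlib
import random


class BloomFilter:
    """Fixed-size bit array with k derived hash positions per value."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, value):
        # Double hashing: derive k positions from two 64-bit halves of SHA-256.
        digest = hashlib.sha256(str(value).encode()).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, value):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(value)
        )


def unique_random_ints(low, high, count, num_bits=1 << 20, num_hashes=4, rng=random):
    """Yield `count` distinct ints from [low, high) using bounded memory.

    Assumes count is much smaller than both (high - low) and num_bits;
    otherwise the filter saturates and the loop may spin for a long time.
    """
    seen = BloomFilter(num_bits, num_hashes)
    produced = 0
    while produced < count:
        candidate = rng.randrange(low, high)
        if candidate in seen:  # seen before, or a false positive: skip it
            continue
        seen.add(candidate)
        produced += 1
        yield candidate
```

The trade-off is that the output is no longer a perfectly uniform sample without replacement, because false positives exclude a few values that were never drawn.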
@greed2411
greed2411 / main.py
Last active June 23, 2021 05:16
using asyncio.Queue (from producer & consumer pov) as placeholder to golang-like channels in Python.
"""
Trying to find a way to handle channels-like situations (producer->consumer) in Python
using asyncio.Queue.
borrowed code actually from:
https://web.archive.org/web/20210507031446/https://asyncio.readthedocs.io/en/latest/producer_consumer.html
"""
import asyncio
import random
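Since the preview stops at the imports, here is a minimal sketch of the producer/consumer pattern the docstring describes, with a bounded `asyncio.Queue` standing in for a Go-style buffered channel. The `None` sentinel used to signal "channel closed", and the function names, are assumptions for illustration rather than the gist's code.

```python
import asyncio


async def producer(queue, items):
    for item in items:
        await queue.put(item)  # blocks while the queue is full, like a buffered channel
    await queue.put(None)  # sentinel: tells the consumer no more items are coming


async def consumer(queue, results):
    while True:
        item = await queue.get()
        if item is None:  # sentinel received, stop consuming
            break
        results.append(item * 2)


async def main():
    queue = asyncio.Queue(maxsize=2)  # small buffer to exercise back-pressure
    results = []
    await asyncio.gather(producer(queue, range(5)), consumer(queue, results))
    return results


if __name__ == "__main__":
    print(asyncio.run(main()))
```

With a single producer and a single consumer, one sentinel is enough; with several consumers, each would need its own sentinel or a different shutdown signal.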
@greed2411
greed2411 / shap_tree_explainer_xgboost.ipynb
Created January 8, 2020 16:14
shap_tree_explainer_xgboost
@greed2411
greed2411 / Install NVIDIA Driver and CUDA.md
Created November 24, 2018 00:42 — forked from chaiyujin/Install NVIDIA Driver and CUDA.md
Install NVIDIA CUDA 9.0 on Ubuntu 16.04.4 LTS
@greed2411
greed2411 / distributed-shortest-path-note.md
Created November 20, 2018 08:55 — forked from srirambaskaran/distributed-shortest-path-note.md
A note on implementing community detection using Apache Spark + GraphX

A note on implementing community detection using Apache Spark + GraphX

Girvan Newman Algorithm

This is one of the earliest methods of community detection. It is simple to understand and can be distributed across a cluster for faster processing. The key assumption is that the graph is undirected and unweighted, though it is not hard to extend the method to directed graphs and weighted edges.

The algorithm is fairly straightforward. It defines a new measure called edge betweenness centrality, which drives a divisive hierarchical clustering algorithm that finds communities. The stopping criterion uses a popular metric called modularity, which quantifies how cohesive the communities are during the clustering process.
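The loop described above can be sketched on a single machine in plain Python. This is an illustrative, non-distributed sketch and not the Spark + GraphX implementation this note is about; the function names and the adjacency-dict graph format are assumptions, and the graph is assumed to be undirected, unweighted, and free of self-loops.

```python
from collections import deque


def edge_betweenness(adj):
    """Brandes-style edge betweenness for {node: set(neighbours)}."""
    score = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        dist, order, queue = {s: 0}, [], deque([s])
        sigma = {v: 0 for v in adj}  # number of shortest paths from s
        sigma[s] = 1
        preds = {v: [] for v in adj}
        while queue:  # BFS from s
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):  # accumulate dependencies back towards s
            for v in preds[w]:
                c = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[frozenset((v, w))] += c
                delta[v] += c
    # Every unordered pair was counted from both endpoints.
    return {e: b / 2.0 for e, b in score.items()}


def components(adj):
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        seen.add(s)
        comp, stack = set(), [s]
        while stack:
            v = stack.pop()
            comp.add(v)
            for w in adj[v] - seen:
                seen.add(w)
                stack.append(w)
        comps.append(comp)
    return comps


def modularity(adj, communities):
    """Q = sum_c [ L_c / m - (d_c / 2m)^2 ], measured on the original graph."""
    m = sum(len(nbrs) for nbrs in adj.values()) / 2
    q = 0.0
    for comm in communities:
        internal = sum(1 for u in comm for v in adj[u] if v in comm) / 2
        degree = sum(len(adj[u]) for u in comm)
        q += internal / m - (degree / (2 * m)) ** 2
    return q


def girvan_newman(adj):
    """Remove the highest-betweenness edge repeatedly; keep the best split."""
    work = {u: set(vs) for u, vs in adj.items()}
    best = components(work)
    best_q = modularity(adj, best)
    while any(work.values()):
        scores = edge_betweenness(work)
        u, v = tuple(max(scores, key=scores.get))
        work[u].discard(v)
        work[v].discard(u)
        comps = components(work)
        q = modularity(adj, comps)
        if q > best_q:
            best, best_q = comps, q
    return best, best_q
```

For example, on two triangles joined by a single bridge edge, the bridge carries every shortest path between the two sides, so it is removed first and the two triangles come out as the communities. Betweenness is recomputed after every removal, which is the main source of the algorithm's high time complexity.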

Side note: A bit of searching revealed no distributed implementation of this algorithm (mainly because it's slow and better algorithms are available?). So this note paves the way to using this naive algorithm in spite of its high time complexity.