@maheshakya
maheshakya / SimplePrefixTree.py
Created April 8, 2014 12:31
Implements a simple prefix tree, with a function that accepts an array of values and returns the indices of that array that match a query.
# a: array of hashes
# h: number of hash bits to be considered
import numpy as np
class Node():
    """
    Defines a node in the prefix tree
    """
    def __init__(self, value):
        self.children = list()
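The preview above stops after the constructor, so the lookup itself is not shown. A minimal sketch of how such a tree could return matching indices, assuming the hashes are strings of '0' and '1' and that `h` is the number of leading bits to index. The names `build_tree` and `query`, the `indices` attribute, and the use of a dict for children (the gist uses a list) are hypothetical and not taken from the gist.

```python
class Node:
    """A node in a binary prefix tree (illustrative layout, not the gist's)."""

    def __init__(self, value=None):
        self.value = value
        self.children = {}   # maps a bit character ('0' or '1') to a child Node
        self.indices = []    # indices of every hash whose path passes through this node


def build_tree(hashes, h):
    """Insert the first h bits of each hash string into the tree."""
    root = Node()
    for index, bits in enumerate(hashes):
        node = root
        node.indices.append(index)
        for bit in bits[:h]:
            node = node.children.setdefault(bit, Node(bit))
            node.indices.append(index)
    return root


def query(root, prefix):
    """Return the indices of all hashes that start with prefix."""
    node = root
    for bit in prefix:
        node = node.children.get(bit)
        if node is None:
            return []        # no stored hash shares this prefix
    return list(node.indices)


hashes = ["0101", "0110", "1100", "0100"]   # made-up example data
root = build_tree(hashes, 3)
matches = query(root, "01")
```

Because only the first `h` bits are stored, a query longer than `h` bits finds no node and returns an empty list in this sketch.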
import numpy as np
# Re-implementation of bisect functions of bisect module to suit the application
def bisect_left(a, x):
    lo = 0
    hi = len(a)
    while lo < hi:
        mid = (lo+hi)//2
        if a[mid] < x:
            lo = mid + 1
from scipy import sparse as sp
import pandas as pd
import numpy as np
from scipy.sparse.linalg import svds
import pickle
# Loads data from the MovieLens data matrix.
# After extracting the compressed file, you will get a ratings.dat file.
# The MovieLens site (http://grouplens.org/datasets/movielens/) has all the information you need to read it.
# Here I have used the 10M data set.
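The loading code is cut off in this preview. As a rough illustration of reading ratings.dat, the sketch below assumes the `UserID::MovieID::Rating::Timestamp` line layout described in the MovieLens README. The function name `parse_ratings` and the two sample rows are made up for the example, and the sketch uses only the standard library rather than the pandas and scipy calls the gist imports.

```python
import io


def parse_ratings(lines):
    """Yield (user_id, movie_id, rating) tuples from '::'-separated lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue                      # skip blank lines
        user_id, movie_id, rating, _timestamp = line.split("::")
        yield int(user_id), int(movie_id), float(rating)


# Two invented rows standing in for the real ratings.dat file.
sample = io.StringIO("1::122::5::838985046\n2::185::4.5::838983525\n")
ratings = list(parse_ratings(sample))
```

With a real file, `open("ratings.dat")` would take the place of the `io.StringIO` object. The resulting triples are the kind of (row, column, value) data from which a sparse user-by-movie matrix can be built.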
@maheshakya
maheshakya / GetValues.java
Created May 7, 2014 11:41
This code snippet shows how to retrieve the values of checklist items, states, lifecycle names, and resource paths in the WSO2 Governance Registry.
import org.wso2.carbon.governance.api.exception.GovernanceException;
import org.wso2.carbon.registry.core.Resource;
import java.util.Enumeration;
class SomeClass {
    private final String REGISTRY_LC_NAME = "registry.LC.name";
    private final String REGISTRY_LIFECYCLE = "registry.lifecycle.";
    private final String REGISTRY_CUSTOM_LIFECYCLE_CHECKLIST = "registry.custom_lifecycle.checklist.option.";
    private final String REGISTRY_CUSTOM_LIFECYCLE_VOTE = "registry.custom_lifecycle.votes.option.";
@maheshakya
maheshakya / LSH_forest_hack.py
Last active November 26, 2015 10:52
This is a rough implementation of LSH forest using sorted arrays and binary search for queries. (Still incomplete)
import numpy as np
from sklearn.metrics import euclidean_distances
# Re-implementation of bisect functions of bisect module to suit the application
def bisect_left(a, x):
    lo = 0
    hi = len(a)
    while lo < hi:
        mid = (lo+hi)//2
        if a[mid] < x:
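The preview ends partway through `bisect_left`, and the gist notes that its version is adapted to the application, so the author's exact code is unknown. The sketch below uses the standard left-bisection algorithm instead and shows one way a sorted array of bit strings could answer a prefix query with two binary searches. The helper `prefix_range` and the sample hashes are hypothetical.

```python
def bisect_left(a, x):
    """Standard left bisection: first position where x could be inserted in sorted a."""
    lo, hi = 0, len(a)
    while lo < hi:
        mid = (lo + hi) // 2
        if a[mid] < x:
            lo = mid + 1
        else:
            hi = mid
    return lo


def prefix_range(sorted_hashes, prefix):
    """Return (start, stop) so that sorted_hashes[start:stop] all begin with prefix.

    Assumes every hash is a string made only of '0' and '1'.
    """
    start = bisect_left(sorted_hashes, prefix)
    # Every string beginning with prefix sorts below prefix + "2",
    # because '2' is greater than both '0' and '1'.
    stop = bisect_left(sorted_hashes, prefix + "2")
    return start, stop


sorted_hashes = sorted(["0101", "0110", "1100", "0100", "1001"])  # made-up data
start, stop = prefix_range(sorted_hashes, "01")
candidates = sorted_hashes[start:stop]
```

In an LSH forest, a query would typically shorten the prefix step by step until enough candidates fall inside the range; that loop is left out here.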
"""
Dependencies: Python 2.7 or higher, numpy, scikit-learn
"""
import pandas as pd
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn import metrics
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
@maheshakya
maheshakya / compare_ANN.py
Last active March 1, 2017 09:30
Comparison of indexing, query time and accuracy among FLANN, ANNOY and LSH Forest
import time
import numpy as np
from sklearn.datasets.samples_generator import make_blobs
from sklearn.neighbors import LSHForest
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import normalize
from annoy import AnnoyIndex
from pyflann import FLANN
n_iter = 100
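The preview does not show how accuracy is measured. One common choice, assumed here rather than taken from the gist, is recall against an exact brute-force search: the fraction of true nearest neighbours that the approximate index returns. The helpers `exact_neighbors` and `recall` and the sample points are hypothetical, and the sketch is pure Python instead of the numpy and scikit-learn calls the gist imports.

```python
def exact_neighbors(points, query, k):
    """Brute-force k nearest neighbours of query by squared Euclidean distance."""
    def squared_distance(p):
        return sum((a - b) ** 2 for a, b in zip(p, query))

    order = sorted(range(len(points)), key=lambda i: squared_distance(points[i]))
    return order[:k]


def recall(approx_ids, exact_ids):
    """Fraction of the true neighbours that the approximate method also returned."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


points = [(0, 0), (1, 0), (0, 2), (5, 5)]    # made-up data
truth = exact_neighbors(points, (0, 0), 2)
score = recall([0, 2], truth)                # pretend an ANN index returned ids 0 and 2
```

Averaging this score over many queries, alongside wall-clock timings of index construction and querying, gives the three quantities named in the gist description.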
@maheshakya
maheshakya / dummy_data_sest.csv
Created May 28, 2015 03:56
Spark Linear regression test
6 148 72 35 0 336 627 50 1
1 85 66 29 0 266 351 31 0
8 183 64 0 0 233 672 32 1
1 89 66 23 94 281 167 21 0
0 137 40 35 168 431 2288 33 1
5 116 74 0 0 256 201 30 0
3 78 50 32 88 310 248 26 1
10 115 0 0 0 353 134 29 0
2 197 70 45 543 305 158 53 1
8 125 96 0 0 0 232 54 1
@maheshakya
maheshakya / compare_ANN_v2.py
Last active November 26, 2015 10:52
Comparison of indexing, query time and accuracy among FLANN, ANNOY and LSH Forest
import time
import numpy as np
from sklearn.datasets.samples_generator import make_blobs
from sklearn.neighbors import NearestNeighbors
from sklearn.neighbors import LSHForest
from annoy import AnnoyIndex
from pyflann import FLANN
n_iter = 5100
n_neighbors = 10