Raphael Mitsch (rmitsch)

rmitsch / google_dataproc_init_action.sh (created March 25, 2019 13:44)
Custom Google Dataproc initialization action.
# Dataproc initialization actions run unattended as root; install non-interactively.
apt-get update
apt-get install -y mysql-server
rmitsch / distance_matrix_umap_bug.py (created July 17, 2018 13:10)
Example distance matrix reproducing a UMAP bug.
high_dim_data = [
    [0.0, 3.49753522, 3.02660794, 2.84250465, 3.56685389, 2.52528988,
     2.55323161, 2.40778388, 2.96434891, 2.54889821, 2.77240632, 3.4098677,
     3.37132917, 3.88512522, 3.4479446, 3.07118824, 2.72022264, 3.0514949,
     3.55990871, 2.49992525, 1.28789316, 3.64068884, 2.43015883, 3.49005344,
     3.28969545, 4.90884015, 3.35593523, 3.94372099, 2.94004997, 2.60276568,
     3.61007435, 3.02692458, 3.00512055, 3.41895422, 3.0188693, 2.81568659,
     3.37344153, 3.79362238, 3.87271647, 2.6809203, 1.8798772, 4.29976882,
     2.70462412, 3.88317653, 2.75354735, 3.38018601, 2.72213461, 2.5788172,
     2.61540864, 2.62093943, 3.97909614, 2.92664878, 2.69721918, 2.71499576,
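The preview above cuts off mid-matrix. As a hedged sketch of the setup such a bug report implies, the following builds a dense, symmetric precomputed distance matrix of the shape UMAP expects for `metric="precomputed"`; the UMAP call itself is only indicated in a comment, since umap-learn is assumed installed separately and the gist's exact data is truncated:

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# Toy high-dimensional data standing in for the truncated matrix above.
rng = np.random.default_rng(0)
points = rng.normal(size=(10, 54))

# Pairwise Euclidean distances as a dense, symmetric matrix with a zero diagonal.
dist_matrix = squareform(pdist(points, metric="euclidean"))

# With umap-learn installed, the matrix would then be passed as:
# import umap
# embedding = umap.UMAP(metric="precomputed").fit_transform(dist_matrix)
```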
from pyspark import SparkConf
from pyspark import SparkContext
import numpy.random as rnd
import numpy as np
import os
# os.environ["SPARK_HOME"] = "/usr/local/Cellar/apache-spark/1.5.1/"
os.environ["PYSPARK_PYTHON"] = "/home/raphael/Development/datamining/py3env/bin/python3"
import sys
import numpy
import matplotlib.pyplot as plt
def apply_hough_transformation(data, num_dims, alpha_values):
    """
    Maps data points to parameter functions.
    :param data:
    :param num_dims:
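The gist body is truncated after the docstring, so the exact semantics are unknown. A minimal sketch of the classical point-to-curve Hough mapping, assuming `alpha_values` are angles and each 2D point (x, y) is mapped to the curve d(α) = x·cos(α) + y·sin(α) (the function name and vectorized layout here are illustrative, not the gist's):

```python
import numpy as np

def hough_map_points(points, alpha_values):
    """Map 2D points to Hough parameter space: each point (x, y) becomes the
    curve d(alpha) = x * cos(alpha) + y * sin(alpha), sampled at alpha_values."""
    points = np.asarray(points, dtype=float)        # shape (n_points, 2)
    alphas = np.asarray(alpha_values, dtype=float)  # shape (n_alphas,)
    x, y = points[:, 0:1], points[:, 1:2]
    return x * np.cos(alphas) + y * np.sin(alphas)  # shape (n_points, n_alphas)

curves = hough_map_points([[1.0, 0.0], [0.0, 1.0]], np.linspace(0.0, np.pi, 5))
```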
import sys
import numpy as np
import matplotlib.pyplot as plt
from numpy.linalg import eigh
import math
# Read CSV data from a file.
def read_data(file):
    f = open(file, 'r')
    data = []
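The reader above is cut off before any parsing happens. A hedged, complete version using the standard `csv` module, assuming purely numeric rows (the original's parsing details and delimiter are unknown):

```python
import csv

def read_csv_data(path):
    """Read a CSV file into a list of rows, converting each field to float."""
    with open(path, newline="") as f:
        return [[float(field) for field in row] for row in csv.reader(f) if row]
```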
rmitsch / 3_1_PreDeCon.py (last active November 7, 2017 12:05)
3_1_PreDeCon
"""
Exercise 3-1 for course Data Mining at University of Vienna, winter semester 2017.
Note that this code implementing some definitions for PreDeCon serves demonstration purposes and is not optimized for
performance.
"""
import numpy
import math
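Only the imports of the PreDeCon gist survive in this preview. As a hedged illustration of one definition the code likely implements, here is the subspace preference vector: attributes whose variance inside a point's ε-neighborhood stays below a threshold δ receive a high weight κ. Function name, parameter names, and defaults are illustrative, not taken from the gist:

```python
import numpy as np

def subspace_preference_vector(data, point_idx, eps, delta, kappa=100.0):
    """PreDeCon-style preference weights: attributes whose variance inside the
    eps-neighborhood of a point is at most delta get the high weight kappa."""
    data = np.asarray(data, dtype=float)
    p = data[point_idx]
    dists = np.linalg.norm(data - p, axis=1)
    neighborhood = data[dists <= eps]
    # Variance of each attribute around p, taken over the eps-neighborhood.
    variances = np.mean((neighborhood - p) ** 2, axis=0)
    return np.where(variances <= delta, kappa, 1.0)
```

With a near-constant first attribute and a spread-out second one, only the first attribute gets the preference weight κ.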
rmitsch / 2_3_feature_selection.py (last active October 18, 2017 13:37)
Implementation of exercise 2-3 for VU Data Mining at the University of Vienna.
import numpy
import scipy
import scipy.io.arff
import sklearn.feature_selection as skfs
from sklearn.cluster import KMeans
from sklearn.metrics.cluster import normalized_mutual_info_score
def arff_to_ndarray(path_to_arff_file):
    """