@karhunenloeve
karhunenloeve / neural_losses.py
Last active March 19, 2022 17:16
Loss functions for neural networks using tensorflow.
import warnings
import tensorflow as tf
import config as cfg
from tensorflow.keras import backend as k
from gtda.homology import VietorisRipsPersistence
from scipy.stats import wasserstein_distance
def dice_loss(y_true, y_pred):
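The preview stops at the function signature, so the author's body is not visible here. For orientation only, the following is a minimal sketch of how a Dice loss is commonly defined: one minus the Dice coefficient, which is twice the overlap divided by the total mass of both inputs. The name `dice_loss_sketch` and the `smooth` parameter are assumptions for illustration, and the sketch uses plain Python lists rather than TensorFlow tensors.

```python
def dice_loss_sketch(y_true, y_pred, smooth=1.0):
    """Return 1 - Dice coefficient for two equally long sequences of values in [0, 1].

    `smooth` is a hypothetical stabilising constant that avoids division by zero
    when both inputs are empty masks; it is not taken from the original gist.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # Soft intersection: sum of element-wise products.
    intersection = sum(t * p for t, p in zip(y_true, y_pred))
    # Total mass of both inputs.
    total = sum(y_true) + sum(y_pred)
    dice = (2.0 * intersection + smooth) / (total + smooth)
    return 1.0 - dice
```

A perfect prediction gives a loss of zero, while disjoint masks push the loss towards one. In a TensorFlow version the sums would presumably become tensor reductions.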
karhunenloeve / ml_tda.md
Last active March 19, 2022 17:16
Short summary of TDA techniques in ML.

Topological Data Analysis in Machine Learning

This post focuses on the use of topological methods in machine learning. Both fast online recognition and unsupervised learning rest almost exclusively on optimizing the parameters of a mapping between input and target data. In Literature on Topological Data Analysis we gave an extensive overview of the broad TDA literature. Here our aim is to work out how TDA is used in this very general setting. Reviewing the categorized literature, we find that TDA appears in the machine learning landscape in three ways.

Applications of Persistent Homology on Persistent Data

Exploratory data analysis using the Mapper algorithm provides a dimension reduction technique based on theories of TDA and offers a low-dimensional, visually interpretable nerve of a si

karhunenloeve / neural_network.tex
Created March 19, 2022 17:15
TeX code for neural network illustration.
\newcommand{\inputs}{5}
\newcommand{\hiddens}{3}
\newcommand{\outputs}{5}
\begin{tikzpicture}
\foreach \i in {1,...,\inputs}
{
\node[circle,
minimum size = 6mm,
fill=Apricot] (Input-\i) at (0,-\i) {};
karhunenloeve / neural_autoencoder.py
Last active June 8, 2022 14:32
Code for a general class of autoencoders using tensorflow.
import config as cfg
import tensorflow as tf
from tensorflow.keras import Model
from tensorflow.keras.preprocessing import image_dataset_from_directory
from tensorflow.keras.layers import Input, BatchNormalization, Conv2D, Conv2DTranspose, Dense, Add
from collections import OrderedDict
BatchNormalization_settings = cfg.BatchNormalization_settings
Conv_2D_settings = cfg.Conv_2D_settings
karhunenloeve / schema_inference.md
Created March 19, 2022 17:19
Literature and tools on schema inference for databases.

Literature and Tools: Inference of Database Schemas

The problem of extracting a schema from semistructured data is of central importance for the control and monitoring of power plants and their components. The Power Plant Identification System (KKS) provides uniform and systematic identification of systems, facilities and equipment in electricity and heat supply. However, the recorded sensor data are not always consistently assigned to the existing KKS identifiers. Furthermore, the KKS is subject to change despite a high degree of standardisation. This creates a data integration problem, which is even of international significance, because power plants located abroad by no means all use the same KKS identification system. In the worst case, these signals cannot be assigned to the known identifiers via their labels. Based on the structure of the data, engineers currently investigate manually which identifier it is, whic
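To make the idea of extracting a schema from semistructured records concrete, here is a minimal sketch that scans a list of dictionaries and reports, per field, which value types occur and whether the field is missing from some records. The function name `infer_schema` and the example field names are hypothetical; this illustrates the general technique and is not a method described in the text above.

```python
from collections import defaultdict


def infer_schema(records):
    """Summarise field names, observed value types and optionality.

    `records` is a list of dictionaries (for example, parsed sensor readings).
    """
    seen_types = defaultdict(set)
    counts = defaultdict(int)
    for record in records:
        for key, value in record.items():
            seen_types[key].add(type(value).__name__)
            counts[key] += 1
    total = len(records)
    return {
        key: {
            "types": sorted(types),
            # A field is optional if at least one record lacks it.
            "optional": counts[key] < total,
        }
        for key, types in seen_types.items()
    }


# Hypothetical sensor records with inconsistent structure.
example = [
    {"label": "sensor_a", "value": 1.5},
    {"label": "sensor_b", "value": 3, "unit": "bar"},
]
schema = infer_schema(example)
```

Fields with more than one observed type, or fields marked optional, are the places where records disagree and where a manual assignment step would probably be needed.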

karhunenloeve / salt_pepper.py
Created March 19, 2022 17:20
Salt and pepper noise layer for neural networks using tensorflow.
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import backend as K
class SaltAndPepper(keras.layers.Layer):
    def __init__(self, units=32, input_dim=32):
        # Initialise the parent Layer; keras.layers.Linear does not exist.
        super().__init__()
        w_init = tf.random_normal_initializer()
        self.w = tf.Variable(
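The preview is cut off before the noise logic appears. As a rough illustration of what salt and pepper noise means, the sketch below replaces each value with either the maximum ("salt") or the minimum ("pepper") with a given probability. The function name `salt_and_pepper` and the parameters `ratio`, `low`, `high` and `seed` are assumptions for illustration; the sketch works on plain Python lists and is not the author's layer.

```python
import random


def salt_and_pepper(values, ratio=0.1, low=0.0, high=1.0, seed=None):
    """Return a copy of `values` where roughly `ratio` of the entries are corrupted.

    Each corrupted entry becomes `high` (salt) or `low` (pepper) with equal chance.
    """
    rng = random.Random(seed)
    noisy = list(values)
    for i in range(len(noisy)):
        if rng.random() < ratio:
            noisy[i] = high if rng.random() < 0.5 else low
    return noisy
```

In a Keras layer, the same idea would typically be applied with random masks on tensors and only during training.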
karhunenloeve / persistent_homology.md
Created March 19, 2022 17:21
Short survey on persistent homology.

A Brief Survey on Persistent Homology

The indexed family of growing simplicial complexes is known as a filtration. It has been shown that persistent homology is, for a d-dimensional simplicial complex, the standard homology of a certain graded module over a polynomial ring [1]. The same analysis showed that a simple description of persistent homology groups over arbitrary fields exists. From this, an algorithm for computing persistent homology in any dimension and over arbitrary fields could be derived [1, §4.2]. The Čech complex is one of the most important simplicial complexes because it is homotopy equivalent to the union of the open balls that cover a topological space; intuitively speaking, it reflects the topology of a triangulable topological space. An efficient algorithm for the Čech complex is given with a runtime of O(n^(d+3)) [2], where n is the number of points and d is the dimension of the simplicial complex. In principle, d depends on the general pos
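To make the notion of a filtration concrete, here is a minimal sketch that computes only the 0-dimensional persistence pairs (connected components) of a Vietoris-Rips filtration using union-find. It is neither the general algorithm of [1] nor the Čech construction of [2]. The function name `zero_dim_persistence` is hypothetical, and the sketch assumes an edge enters the filtration at a scale equal to the distance between its endpoints.

```python
from itertools import combinations
from math import dist


def zero_dim_persistence(points):
    """Return (birth, death) pairs for connected components of a Vietoris-Rips filtration.

    All points are born at scale 0; a component dies at the length of the edge
    that merges it into another component. One component never dies.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Process edges in order of increasing length, i.e. in filtration order.
    edges = sorted(
        (dist(points[i], points[j]), i, j) for i, j in combinations(range(n), 2)
    )
    pairs = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            pairs.append((0.0, length))
    if n > 0:
        pairs.append((0.0, float("inf")))
    return pairs


pairs = zero_dim_persistence([(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)])
```

For the three collinear points in the usage line, the two nearby points merge at scale 1, the distant point joins at scale 4, and the last component persists indefinitely.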

karhunenloeve / ubuntu1804_cuda110_cudnn804_tf24.sh
Created March 19, 2022 17:23
Set up Ubuntu 18.04 with CUDA 11.0, cuDNN 8.0.4 and TensorFlow 2.4.
#!/usr/bin/env bash
# Get Anaconda 3 for Linux x86_64 architecture.
wget https://repo.anaconda.com/archive/Anaconda3-2020.11-Linux-x86_64.sh
bash Anaconda3-2020.11-Linux-x86_64.sh
# Create Anaconda environment, call it `tf` and activate it.
conda create --name tf --yes python=3.8
conda activate tf
karhunenloeve / geom_inf.md
Created March 19, 2022 17:24
A Book Recommendation on TDA.

A Book Recommendation on TDA

Topological data analysis is already established as a computer science discipline; it applies results from homological algebra to the filtration of a point set. The intersection of computer science, mathematics and statistics gives rise to many techniques that can be applied extremely broadly and across all scientific fields. Hence this short note: the book Geometric and Topological Inference by Jean-Daniel Boissonnat, Frédéric Chazal and Mariette Yvinec provides a complete overview of the current state of the art and presents persistent homology algorithmically, together with the required data structures. Furthermore, the book is an elegant introduction to computational topology. It is a suitable entry point, especially for students with a computer science background who are not trained in abstract mathematics.

Now that the NeurIPS 2020 workshop [Topological Data Analysis and Beyond]

karhunenloeve / ecg_generator.py
Last active April 13, 2022 11:11
Automatically generate ECG data with associated backgrounds.
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import config as cfg
import neurokit2 as nk
import random
import os
import glob
import uuid