Francois Vanderseypen Orbifold

@Orbifold
Orbifold / TF_linear_estimator.py
Created August 23, 2018 04:59
Using TensorFlow LinearRegression estimator.
#!/usr/bin/env python3
# This demonstrates the usage of input_fn with numpy data
# and estimators.
import tensorflow as tf
tf.enable_eager_execution()
assert tf.executing_eagerly()
import tensorflow.contrib.eager as tfe
# too much info otherwise
@Orbifold
Orbifold / H2OForecasting.r
Created July 23, 2018 04:59
Time series forecasting with H2O.
install.packages("timetk")
install.packages("tidyquant")
library(h2o) # Awesome ML Library
library(timetk) # Toolkit for working with time series in R
library(tidyquant) # Loads tidyverse, financial pkgs, used to get data
beer_sales_tbl <- tq_get("S4248SM144NCEN", get = "economic.data", from = "2010-01-01", to = "2017-10-27")
beer_sales_tbl %>%
ggplot(aes(date, price)) +
# Train Region
@Orbifold
Orbifold / bb84.py
Created July 21, 2018 05:21
Fun and straightforward implementation of the BB84 quantum key distribution protocol.
from numpy import matrix
from math import pow, sqrt
from random import randint
import sys, argparse
class qubit():
    def __init__(self,initial_state):
        if initial_state:
            self.__state = matrix([[0],[1]])
@Orbifold
Orbifold / rsa.py
Created July 20, 2018 17:11
RSA mechanics with pycryptodome
#========================================
# create public and private keys
#========================================
from Crypto.PublicKey import RSA
key = RSA.generate(2048)
private_key = key.exportKey()
with open("./private.pem", "wb") as f:
@Orbifold
Orbifold / tda.py
Created July 15, 2018 04:31
t-Distributed Stochastic Neighbor Embedding (t-SNE) is a (prize-winning) technique for dimensionality reduction that is particularly well suited for the visualization of high-dimensional datasets. The technique can be implemented via Barnes-Hut approximations, allowing it to be applied on large real-world datasets.
import numpy as np
from collections import defaultdict
import json
import itertools
from sklearn import cluster, preprocessing, manifold
from datetime import datetime
class KeplerMapper(object):
    def __init__(self, cluster_algorithm=cluster.DBSCAN(eps=0.5,min_samples=3), nr_cubes=10,
                 overlap_perc=0.1, scaler=preprocessing.MinMaxScaler(), reducer=None, color_function="distance_origin",
@Orbifold
Orbifold / FeatureValuation.ipynb
Last active July 14, 2018 18:35
Using a random forest to valuate features
@Orbifold
Orbifold / entailment_train.py
Last active June 21, 2018 05:08
Textual entailment training using TensorFlow.
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import urllib
import sys
import os
import zipfile
glove_vectors_file = "glove.6B.50d.txt"
@Orbifold
Orbifold / tfjs_cosine.html
Created June 12, 2018 09:03
TensorFlow.js learning of the cosine function with realtime loss plot and resulting approximation.
<!DOCTYPE html>
<html lang="en" xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta charset="utf-8" />
<title></title>
<script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.2.1/jquery.min.js"></script>
<script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/lodash.js/4.17.4/lodash.min.js"></script>
<script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/Faker/3.1.0/faker.min.js"></script>
<script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/4.0.0-beta/js/bootstrap.min.js"></script>

Intro

Imbalanced data typically refers to classification problems in which the classes are not represented equally. For example, you may have a 2-class (binary) classification problem with 100 instances (rows). A total of 80 instances are labeled Class-1 and the remaining 20 are labeled Class-2. This is an imbalanced dataset, and the ratio of Class-1 to Class-2 instances is 80:20 or, more concisely, 4:1. Class imbalance can occur in two-class as well as multi-class classification problems. Most techniques can be used on either.
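
As a quick check of the 80:20 example, the class ratio can be computed directly from a label column. This is only a minimal sketch: the `labels` list is made-up data matching the numbers in the example, and the variable names are illustrative.

```python
from collections import Counter
from math import gcd

# Hypothetical labels matching the example: 80 rows of Class-1, 20 rows of Class-2.
labels = ["Class-1"] * 80 + ["Class-2"] * 20

# Count how many instances fall into each class.
counts = Counter(labels)
majority, minority = counts["Class-1"], counts["Class-2"]

# Reduce the raw counts to the simplest whole-number ratio.
divisor = gcd(majority, minority)
ratio = (majority // divisor, minority // divisor)

print(dict(counts))
print(f"Class-1 to Class-2 ratio: {ratio[0]}:{ratio[1]}")
```

The same counting step works for multi-class labels. Only the ratio reduction would need to be extended to more than two counts.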

Most classification datasets do not have an exactly equal number of instances in each class, but a small difference often does not matter.

There are problems where a class imbalance is not just common, it is expected. For example, datasets that characterize fraudulent transactions are imbalanced. The vast majority of the transactions will be in the “Not-Fraud” class and a very small minority will be

@Orbifold
Orbifold / SemanticsWithPython.ipynb
Created March 13, 2018 08:23
An intro to using RDFLib and triples in Python