lewtun / datasets-wikiann.ipynb
Last active December 5, 2020 19:02
Snippets to produce dummy data for WikiANN in HF datasets
lewtun / Makefile
Created August 31, 2020 15:24
Anaconda with make
.PHONY: install
#################################################################################
# GLOBALS #
#################################################################################
SHELL=/bin/bash
CONDA_ACTIVATE=source $$(conda info --base)/etc/profile.d/conda.sh ; conda activate ; conda activate
#################################################################################
# COMMANDS #
lewtun / Makefile
Last active August 13, 2020 20:17
Makefile for nbdev with linting and code formatting
# Copyright 2016 drivendata
# Copyright 2019 fast.ai
# Copyright 2020 Lewis Tunstall
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
# http://www.apache.org/licenses/LICENSE-2.0
sample
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.utils.validation import check_array
class TemplateTransformer(BaseEstimator, TransformerMixin):
    """An example transformer that returns the element-wise square root."""
    def __init__(self, demo_param='demo'):
        self.demo_param = demo_param
    def fit(self, X, y=None):
        """A reference implementation of a fitting function for a transformer."""
        # validate the input and record the number of features seen during fit
        X = check_array(X, accept_sparse=True)
        self.n_features_ = X.shape[1]
def cv_model(X, y, features, n_fold=5, random_state=45245, params=None):
    """Evaluate a score by cross validation.
    Parameters
    ----------
    X : pandas.DataFrame
        The data to fit.
    y : pandas.DataFrame or pandas.Series
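The preview of `cv_model` is cut off before its body, so the gist's actual implementation is not visible here. As a rough illustration of the idea named in its docstring, the following is a minimal standard-library sketch of k-fold cross validation. The helper names `k_fold_indices` and `cross_val_mae`, the mean-predictor "model", and the mean-absolute-error score are all assumptions made for this example; only the `n_fold` and `random_state` defaults are borrowed from the signature above.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, n_fold=5, random_state=45245):
    """Shuffle row indices and split them into n_fold nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(random_state).shuffle(indices)
    # take every n_fold-th shuffled index, starting at a different offset per fold
    return [indices[i::n_fold] for i in range(n_fold)]


def cross_val_mae(y, n_fold=5, random_state=45245):
    """Score a hypothetical stand-in model (predict the training mean) per fold.

    Returns one mean absolute error per held-out fold.
    """
    folds = k_fold_indices(len(y), n_fold, random_state)
    scores = []
    for k, test_idx in enumerate(folds):
        # train on every fold except the k-th, which is held out for scoring
        train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
        prediction = mean(y[i] for i in train_idx)
        scores.append(mean(abs(y[i] - prediction) for i in test_idx))
    return scores


# one score per fold for a toy target of 20 values
scores = cross_val_mae([float(v) for v in range(20)])
```

In practice a function with the signature shown above would more likely delegate the splitting to a library and fit a real estimator on the selected `features`; the sketch only shows the fold-by-fold structure that any such function follows.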
import pandas as pd
# define topological features to track
homology_dimensions = [0, 1, 2]
# calculate persistence diagram
persistence_diagram = ...
# convert NumPy array of triples to DataFrame
persistence_table = pd.DataFrame(
import gtda.diagrams as diagrams
# calculate persistence diagram
persistence_diagram = ...
# define type of amplitude to calculate
amplitude = diagrams.Amplitude(metric="wasserstein")
# calculate amplitude of diagram
persistence_diagram_amplitude = amplitude.fit_transform(persistence_diagram)
import gtda.homology as hl
# represent data as a matrix of pairwise distances
distance_matrix = ...
# define topological features to track
homology_dimensions = [0, 1, 2]
# define simplicial complex to construct
persistence = hl.VietorisRipsPersistence(