@ines
ines / streamlit_prodigy.py
Created October 3, 2019 20:37
Streamlit + Prodigy
"""
Example of a Streamlit app for an interactive Prodigy dataset viewer that also lets you
run simple training experiments for NER and text classification.
Requires the Prodigy annotation tool to be installed: https://prodi.gy
See here for details on Streamlit: https://streamlit.io.
"""
import streamlit as st
from prodigy.components.db import connect
from prodigy.models.ner import EntityRecognizer, merge_spans, guess_batch_size
@ines
ines / Install
Last active September 21, 2023 17:14
Streamlit + spaCy
pip install streamlit
pip install spacy
python -m spacy download en_core_web_sm
python -m spacy download en_core_web_md
python -m spacy download de_core_news_sm
@harding
harding / qc-upgrade-path.md
Created July 23, 2018 11:44
Description of Tim Ruffing's upgrade path to post-quantum in presence of QC attackers

Background: future fast Quantum Computers (QCs) are hypothesized to be much faster than classical computers (i.e. the computers we use now) at solving various forms of the Discrete Log Problem (DLP). Bitcoin relies on the DLP for what's called a trapdoor function: a function that's easy to compute one way (deriving a public key from a private key) but hard to compute the other way (recovering the original private key from a public key). Fast QCs would break that trapdoor, hypothetically allowing the operator of the QC to steal the bitcoins of anyone whose public key is publicly known.
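The asymmetry described above can be illustrated with a toy sketch. This is not how Bitcoin derives keys (Bitcoin uses an elliptic-curve group); the sketch assumes plain modular exponentiation as a stand-in, and the modulus `P`, base `G`, and both function names are hypothetical choices made only for illustration.

```python
from typing import Optional

# Toy parameters, assumed for this sketch only. Far too small to be secure.
P = 2_147_483_647  # prime modulus (2**31 - 1)
G = 7              # base of the exponentiation


def public_from_private(private_key: int) -> int:
    """Easy direction: a single modular exponentiation."""
    return pow(G, private_key, P)


def private_from_public(public_key: int, limit: int) -> Optional[int]:
    """Hard direction: try exponents one by one (a brute-force discrete log)."""
    value = 1  # G**0 mod P
    for candidate in range(limit):
        if value == public_key:
            return candidate
        value = (value * G) % P
    return None  # no exponent below `limit` matched


private_key = 123_456
public_key = public_from_private(private_key)

# The forward step is one call; the reverse step needs one loop iteration
# per candidate exponent, which becomes infeasible at real key sizes.
recovered = private_from_public(public_key, limit=200_000)
```

With real key sizes the search space is astronomically large for a classical computer, which is what makes the function a trapdoor; the concern in the text is that a fast QC could shortcut the reverse direction.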

@vingkan
vingkan / activity.md
Last active July 5, 2021 17:53
Ethical CS: Quantitative Input Influence Activity

Algorithmic Audit: QII

A big moving company gets so many applications that it has started using an automated algorithm to decide who to hire. You have been called in as an independent consultant to determine if the hiring algorithm is biased against women. The algorithm is proprietary so you cannot access its source code. Instead, you will learn how to perform an algorithmic audit to measure potential biases.

In this activity, you will edit the influence.py module.
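As a rough sketch of what a black-box audit can look like, the snippet below estimates how much one attribute influences decisions by shuffling that attribute across applicants and counting how often the decision flips. Everything here is an assumption made for illustration: `black_box`, the three toy attributes, and the `influence` function are hypothetical and are not the contents of influence.py or the activity's actual applicant format.

```python
import random


def black_box(applicant):
    """Hypothetical stand-in for the proprietary algorithm (we only call it)."""
    gender, years_experience, can_lift = applicant
    return can_lift == "yes" and int(years_experience) >= 2


def influence(model, applicants, index, rng):
    """Fraction of decisions that change when attribute `index` is shuffled."""
    shuffled_values = [a[index] for a in applicants]
    rng.shuffle(shuffled_values)
    changed = 0
    for applicant, new_value in zip(applicants, shuffled_values):
        intervened = list(applicant)      # copy so the original stays intact
        intervened[index] = new_value     # replace only the audited attribute
        if model(intervened) != model(applicant):
            changed += 1
    return changed / len(applicants)


# Toy applicants: [gender, years_experience, can_lift], all strings.
applicants = [
    ["female", "3", "yes"],
    ["male", "1", "yes"],
    ["female", "5", "no"],
    ["male", "4", "yes"],
    ["female", "0", "no"],
    ["male", "2", "no"],
]

rng = random.Random(0)  # fixed seed so repeated runs agree
gender_influence = influence(black_box, applicants, 0, rng)
lifting_influence = influence(black_box, applicants, 2, rng)
```

Because this toy `black_box` never reads the gender attribute, shuffling it cannot change any decision; a real audit would compare such scores across attributes to see which ones the algorithm actually depends on.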

Applicant Data

Each applicant's data is stored as a list with five elements. Each element is a string representing a different attribute:

@apehex
apehex / fetch_kindle.js
Last active July 7, 2020 14:47 — forked from yangchenyun/fetch_kindle.js
Get back my books from Kindle
/*
* @fileoverview Program to free the content in kindle books as plain HTML.
*
* This is largely based on reverse engineering kindle cloud app
* (https://read.amazon.com) to read book data from webSQL.
*
* Access to kindle library is required to download this book.
*/
// The Kindle Compression Module copied from http://read.amazon.com application

Generating Flame Graphs for Apache Spark

Flame graphs are a nifty debugging tool for determining where CPU time is being spent. Using the Java Flight Recorder, you can generate them for Java processes without adding significant runtime overhead.

When are flame graphs useful?

Shivaram Venkataraman and I have found these flame recordings to be useful for diagnosing coarse-grained performance problems. We started using them at the suggestion of Josh Rosen, who quickly made one for the Spark scheduler when we were talking to him about why the scheduler caps out at a throughput of a few thousand tasks per second. Josh generated a graph similar to the one below, which illustrates that a significant amount of time is spent in serialization (if you click in the top right hand corner and search for "serialize", you can see that 78.6% of the sampled CPU time was spent in serialization). We used this insight to spee

@karpathy
karpathy / pg-pong.py
Created May 30, 2016 22:50
Training a Neural Network ATARI Pong agent with Policy Gradients from raw pixels
""" Trains an agent with (stochastic) Policy Gradients on Pong. Uses OpenAI Gym. """
import numpy as np
import pickle  # Python 3: cPickle was merged into the standard pickle module
import gym
# hyperparameters
H = 200 # number of hidden layer neurons
batch_size = 10 # every how many episodes to do a param update?
learning_rate = 1e-4
gamma = 0.99 # discount factor for reward
@yangchenyun
yangchenyun / fetch_kindle.js
Last active February 19, 2023 10:10
Get back my books from Kindle
/*
* @fileoverview Program to free the content in kindle books as plain HTML.
*
* This is largely based on reverse engineering kindle cloud app
* (https://read.amazon.com) to read book data from webSQL.
*
* Access to kindle library is required to download this book.
*/
// The Kindle Compression Module copied from http://read.amazon.com application
@alexpearce
alexpearce / sqlite3_timing.py
Created October 27, 2014 10:44
Timing operations on an SQLite database.
import timeit
import os
DB_PATH = 'database.db'
# Number of runs to generate
NRUNS = int(1e6)
insert_setup = """import sqlite3
con = sqlite3.connect('{0}')
con.execute('CREATE TABLE runs (run INTEGER PRIMARY KEY)')