Skip to content

Instantly share code, notes, and snippets.

View j-min's full-sized avatar

Jaemin Cho j-min

View GitHub Profile
@yoavg
yoavg / stochastic-critique.md
Last active November 9, 2023 04:32
A criticism of Stochastic Parrots

A criticism of "On the Dangers of Stochastic Parrots: Can Languae Models be Too Big"

Yoav Goldberg, Jan 23, 2021.

The FAccT paper "On the Dangers of Stochastic Parrots: Can Languae Models be Too Big" by Bender, Gebru, McMillan-Major and Shmitchell has been the center of a controversary recently. The final version is now out, and, owing a lot to this controversary, would undoubtly become very widely read. I read an earlier draft of the paper, and I think that the new and updated final version is much improved in many ways: kudos for the authors for this upgrade. I also agree with and endorse most of the content. This is important stuff, you should read it.

However, I do find some aspects of the paper (and the resulting discourse around it and around technology) to be problematic. These weren't clear to me when initially reading the first draft several months ago, but they became very clear to me now. These points are for the most part

@thomwolf
thomwolf / top-k-top-p.py
Last active September 10, 2024 05:44
Sample the next token from a probability distribution using top-k and/or nucleus (top-p) sampling
def top_k_top_p_filtering(logits, top_k=0, top_p=0.0, filter_value=-float('Inf')):
""" Filter a distribution of logits using top-k and/or nucleus (top-p) filtering
Args:
logits: logits distribution shape (vocabulary size)
top_k >0: keep only top k tokens with highest probability (top-k filtering).
top_p >0.0: keep the top tokens with cumulative probability >= top_p (nucleus filtering).
Nucleus filtering is described in Holtzman et al. (http://arxiv.org/abs/1904.09751)
"""
assert logits.dim() == 1 # batch size 1 for now - could be updated for more but the code would be less clear
top_k = min(top_k, logits.size(-1)) # Safety check
@InnovArul
InnovArul / tied_linear.py
Last active April 11, 2024 11:01
tied linear layer experiment
import torch, torch.nn as nn, torch.nn.functional as F
import numpy as np
import torch.optim as optim
# tied autoencoder using off the shelf nn modules
class TiedAutoEncoderOffTheShelf(nn.Module):
def __init__(self, inp, out, weight):
super().__init__()
self.encoder = nn.Linear(inp, out, bias=False)
self.decoder = nn.Linear(out, inp, bias=False)
@vadimkantorov
vadimkantorov / compact_bilinear_pooling.py
Last active September 22, 2021 07:51
Compact Bilinear Pooling in PyTorch using the new FFT support
# References:
# [1] Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding, Fukui et al., https://arxiv.org/abs/1606.01847
# [2] Compact Bilinear Pooling, Gao et al., https://arxiv.org/abs/1511.06062
# [3] Fast and Scalable Polynomial Kernels via Explicit Feature Maps, Pham and Pagh, https://chbrown.github.io/kdd-2013-usb/kdd/p239.pdf
# [4] Fastfood — Approximating Kernel Expansions in Loglinear Time, Le et al., https://arxiv.org/abs/1408.3060
# [5] Original implementation in Caffe: https://github.com/gy20073/compact_bilinear_pooling
# TODO: migrate to use of new native complex64 types
# TODO: change strided x coo matmul to torch.matmul(): M[sparse_coo] @ M[strided] -> M[strided]
@VikingPenguinYT
VikingPenguinYT / dropout_bayesian_approximation_tensorflow.py
Last active August 6, 2024 16:09
Implementing Dropout as a Bayesian Approximation in TensorFlow
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
from tensorflow.contrib.distributions import Bernoulli
class VariationalDense:
"""Variational Dense Layer Class"""
def __init__(self, n_in, n_out, model_prob, model_lam):
self.model_prob = model_prob

A Tour of PyTorch Internals (Part I)

The fundamental unit in PyTorch is the Tensor. This post will serve as an overview for how we implement Tensors in PyTorch, such that the user can interact with it from the Python shell. In particular, we want to answer four main questions:

  1. How does PyTorch extend the Python interpreter to define a Tensor type that can be manipulated from Python code?
  2. How does PyTorch wrap the C libraries that actually define the Tensor's properties and methods?
  3. How does PyTorch cwrap work to generate code for Tensor methods?
  4. How does PyTorch's build system take all of these components to compile and generate a workable application?

Extending the Python Interpreter

PyTorch defines a new package torch. In this post we will consider the ._C module. This module is known as an "extension module" - a Python module written in C. Such modules allow us to define new built-in object types (e.g. the Tensor) and to call C/C++ functions.

@ririw
ririw / RWA.py
Created April 16, 2017 10:22
Recurrent Weighted Average RNN in pytorch
# An implementation of "Machine Learning on Sequential Data Using a Recurrent Weighted Average" using pytorch
# https://arxiv.org/pdf/1703.01253.pdf
#
#
# This is a RNN (recurrent neural network) type that uses a weighted average of values seen in the past, rather
# than a separate running state.
#
# Check the test code at the bottom for an example of usage, where you can compare it's performance
# against LSTM and GRU, at a classification task from the paper. It handily beats both the LSTM and
# GRU :)
@volkancirik
volkancirik / treernn.py
Last active October 9, 2018 13:01
Pytorch TreeRNN
"""
TreeLSTM[1] implementation in Pytorch
Based on dynet benchmarks :
https://github.com/neulab/dynet-benchmark/blob/master/dynet-py/treenn.py
https://github.com/neulab/dynet-benchmark/blob/master/chainer/treenn.py
Other References:
https://github.com/pytorch/examples/tree/master/word_language_model
https://github.com/pfnet/chainer/blob/29c67fe1f2140fa8637201505b4c5e8556fad809/chainer/functions/activation/slstm.py
https://github.com/stanfordnlp/treelstm
@teamdandelion
teamdandelion / labels_1024.tsv
Last active February 6, 2024 08:33
TensorBoard: TF Dev Summit Tutorial
We can make this file beautiful and searchable if this error is corrected: No tabs found in this TSV file in line 0.
7
2
1
0
4
1
4
9
5
9
@spitis
spitis / bnlstm.py
Created February 2, 2017 03:05
Batch normalized LSTM Cell for Tensorflow
"""adapted from https://github.com/OlavHN/bnlstm to store separate population statistics per state"""
import tensorflow as tf, numpy as np
RNNCell = tf.nn.rnn_cell.RNNCell
class BNLSTMCell(RNNCell):
'''Batch normalized LSTM as described in arxiv.org/abs/1603.09025'''
def __init__(self, num_units, is_training_tensor, max_bn_steps, initial_scale=0.1, activation=tf.tanh, decay=0.95):
"""
* max bn steps is the maximum number of steps for which to store separate population stats
"""