
(Bill) Yuchen Lin (@yuchenlin) 🎯 Focusing
sentence_ppl_calculator.py
from transformers import GPT2Tokenizer, GPT2LMHeadModel
import torch
from torch.nn import CrossEntropyLoss
from tqdm import trange

max_length = 24   # maximum token length per sentence
batch_size = 200  # number of sentences scored per forward pass
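The preview cuts off before the scoring loop, but the quantity the script computes, sentence perplexity, is simply the exponential of the mean per-token negative log-likelihood. A minimal pure-Python sketch (the per-token log-probabilities below are made-up stand-ins, not real GPT-2 outputs):

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp of the mean negative log-likelihood per token."""
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# hypothetical per-token log-probabilities for a 4-token sentence
logps = [math.log(0.25)] * 4
print(perplexity(logps))  # a uniform 1/4 chance per token gives perplexity 4.0
```

In the real script, each log-probability comes from GPT-2's cross-entropy loss over a batch of sentences.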
yuchenlin / gpt_sent_prob.py
Last active Apr 24, 2020
Compute sentence probability using GPT-2 with huggingface transformers
import torch
from transformers import OpenAIGPTTokenizer, OpenAIGPTLMHeadModel
from transformers import GPT2Tokenizer, GPT2LMHeadModel
import numpy as np
from scipy.special import softmax

def model_init(model_string, cuda):
    # load the matching tokenizer/model pair for a GPT-2 or OpenAI GPT checkpoint
    if model_string.startswith("gpt2"):
        tokenizer = GPT2Tokenizer.from_pretrained(model_string)
        model = GPT2LMHeadModel.from_pretrained(model_string)
    else:
        tokenizer = OpenAIGPTTokenizer.from_pretrained(model_string)
        model = OpenAIGPTLMHeadModel.from_pretrained(model_string)
    model.eval()
    if cuda:
        model.to('cuda')
    return tokenizer, model
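Downstream, the sentence probability is the product over positions of the softmax probability the model assigns to the token that actually came next. A model-free sketch of that accumulation step, over a made-up 3-token vocabulary and hypothetical logits:

```python
import math

def sent_logprob(step_logits, token_ids):
    """Sum the log-softmax probability of each actual next token.

    step_logits[t]: the (hypothetical) logits emitted at step t
    token_ids[t]:   the token that actually came next
    """
    total = 0.0
    for logits, tok in zip(step_logits, token_ids):
        z = sum(math.exp(x) for x in logits)   # softmax normalizer
        total += logits[tok] - math.log(z)     # log P(tok | prefix)
    return total

# two steps over a 3-token vocabulary, made-up logits
logits = [[2.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
prob = math.exp(sent_logprob(logits, [0, 1]))  # P(sentence) = product of steps
```

The gist does the same thing with the model's real logits and `scipy.special.softmax`.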
yuchenlin / masked_word_prediction_bert.py
Last active Feb 17, 2020
A simple example script for predicting masked words in a sentence using BERT.
import torch
from transformers import BertTokenizer, BertModel, BertForMaskedLM
import logging

logging.basicConfig(level=logging.INFO)  # optional: log model download/loading progress

# load the pre-trained tokenizer (vocabulary) and masked-language-model head
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertForMaskedLM.from_pretrained('bert-base-uncased')
model.eval()  # inference mode: disables dropout
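The rest of the gist tokenizes a sentence containing a [MASK] token and asks BERT for the most likely filler. The selection step itself is just an argmax over the vocabulary logits at the masked position; a model-free sketch with a made-up vocabulary and logits:

```python
# hypothetical 4-word vocabulary and logits at the [MASK] position
vocab = ['dog', 'cat', 'pizza', 'car']
mask_logits = [0.5, 2.1, -0.3, 0.0]

# the predicted word is the vocabulary entry with the highest logit
predicted = vocab[max(range(len(vocab)), key=lambda i: mask_logits[i])]
print(predicted)  # cat
```

With the real model, `mask_logits` would be the row of the output tensor at the [MASK] index, and `vocab` the tokenizer's vocabulary.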
chuanconggao / prefixspan.py
Last active Jun 2, 2020
The original minimal 15-line implementation of PrefixSpan. Full library at https://github.com/chuanconggao/PrefixSpan-py.
from collections import defaultdict

# db (the sequence database), minsup, and results are module-level globals
def frequent_rec(patt, mdb):
    results.append((len(mdb), patt))
    occurs = defaultdict(list)
    for (i, startpos) in mdb:
        seq = db[i]
        for j in range(startpos + 1, len(seq)):
            l = occurs[seq[j]]
            if len(l) == 0 or l[-1][0] != i:
                l.append((i, j))  # record only the first occurrence per sequence
    for (c, newmdb) in occurs.items():
        if len(newmdb) >= minsup:
            frequent_rec(patt + [c], newmdb)
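A self-contained run of PrefixSpan over a tiny example database (the db and minsup values here are illustrative; the full library linked above provides the complete API):

```python
from collections import defaultdict

db = [
    [0, 1, 2, 3, 4],
    [1, 1, 1, 3, 4],
    [2, 1, 2, 2, 0],
    [1, 1, 1, 2, 2],
]
minsup = 2   # a pattern must occur in at least 2 sequences
results = []

def frequent_rec(patt, mdb):
    results.append((len(mdb), patt))
    occurs = defaultdict(list)
    for (i, startpos) in mdb:
        seq = db[i]
        for j in range(startpos + 1, len(seq)):
            l = occurs[seq[j]]
            if len(l) == 0 or l[-1][0] != i:
                l.append((i, j))  # first occurrence of seq[j] in sequence i
    for (c, newmdb) in occurs.items():
        if len(newmdb) >= minsup:
            frequent_rec(patt + [c], newmdb)

# start from the empty pattern, matching every sequence before position 0
frequent_rec([], [(i, -1) for i in range(len(db))])
print((4, [1]) in results)  # True: symbol 1 appears in all 4 sequences
```

Each entry of `results` pairs a support count with a frequent sequential pattern.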
peterjc123 / build.ps1
Last active Nov 12, 2018
Setup script for Windows PyTorch
# Prerequisites
# 1. MSVC 2017 C++ Build Tools
# 2. CMake 3.0 or newer
# 3. 64-bit Windows
# 4. 64-bit Anaconda / Miniconda
# Prerequisites for CUDA
# 1. CUDA 8.0 or newer
# 2. NVTX (shipped in the CUDA installer as "Visual Studio Integration"; if it fails to
#    install, extract the CUDA installer exe and run the NVTX installer found under
#    CUDAVisualStudioIntegration)
install-gcc-5.4.0.sh
#!/bin/bash
# This script installs GCC 5.4.0.
# To use it, navigate to your home directory and type:
#   sh install-gcc-5.4.0.sh

# download and extract the gcc 5.4.0 release
wget https://github.com/gcc-mirror/gcc/archive/gcc-5_4_0-release.tar.gz
tar xzf gcc-5_4_0-release.tar.gz
cd gcc-5_4_0-release
Tushar-N / pad_packed_demo.py
Last active May 18, 2020
How to use pad_packed_sequence in pytorch<1.1.0
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

seqs = ['gigantic_string', 'tiny_str', 'medium_str']

# make <pad> idx 0
vocab = ['<pad>'] + sorted(set(''.join(seqs)))

# make model
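Before packing, each sequence must be mapped to vocabulary indices and padded to the length of the longest one (with `<pad>` at index 0, matching the vocab above). A model-free sketch of that preparation step:

```python
seqs = ['gigantic_string', 'tiny_str', 'medium_str']
vocab = ['<pad>'] + sorted(set(''.join(seqs)))  # <pad> gets index 0

# map each character to its vocabulary index
vectorized = [[vocab.index(ch) for ch in s] for s in seqs]

# pad every sequence to the longest length with the <pad> index
max_len = max(len(v) for v in vectorized)
padded = [v + [0] * (max_len - len(v)) for v in vectorized]
lengths = [len(v) for v in vectorized]  # true lengths, needed by pack_padded_sequence
print(lengths)  # [15, 8, 10]
```

`pack_padded_sequence` then takes the padded tensor plus these true lengths so the RNN skips the padding positions; `pad_packed_sequence` inverts the operation.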
WeiTang114 / nvv.sh
Created Mar 13, 2017
Show username after each process in nvidia-smi.
#!/bin/bash
# Show username after each process in nvidia-smi
# like:
# ...
# +------------------------------------------------------+
# | Processes: GPU Memory |
# | GPU PID Type Process name Usage |
# |======================================================|
# | 0 150752 C python 830MiB | User: user1
# | 1 2185 C /usr/bin/python 1090MiB | User: user2
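The script's core steps are extracting each PID from an nvidia-smi process row and looking up its owner (which nvv.sh does with `ps -o user= -p $pid`). A sketch of the parsing half over a fake process line, with a hypothetical PID-to-user table standing in for the ps lookup:

```python
import re

# a fake nvidia-smi process line (nvv.sh parses real ones); the PID is the 2nd number
line = "|    0    150752      C   python                           830MiB |"
pid = int(re.findall(r'\d+', line)[1])

# nvv.sh resolves the owner with `ps -o user= -p $pid`; stand-in mapping here
pid_to_user = {150752: 'user1', 2185: 'user2'}  # hypothetical
print(f"{line} User: {pid_to_user[pid]}")
```

Note the fragile assumption that the PID is always the second integer on the line; the real script relies on nvidia-smi's fixed column layout in the same way.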
nathanielove / how-to-setup-shadowsocks-on-your-ubuntu-server.md
Created Nov 1, 2016
How to setup Shadowsocks on your Ubuntu server

How to setup Shadowsocks on your Ubuntu server

Your school or company network may block access to a few specific websites. To get around this, I'd highly recommend Shadowsocks: it is the easiest proxy tool I've ever found, and it's free (provided, of course, that you have your own server running).

First, SSH to your server and make sure you have Python and pip installed. If you have Python but not pip, install it with the following command:

$ sudo apt-get install python3-pip
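The preview ends at the pip step. After installing the shadowsocks package, the server is typically driven by a small JSON config file; the values below are placeholders you must change (port, password, and cipher are your choice, per the project's documentation):

```json
{
    "server": "0.0.0.0",
    "server_port": 8388,
    "password": "your-password-here",
    "method": "aes-256-cfb"
}
```

With such a file saved as, say, /etc/shadowsocks.json, the server is usually started with `ssserver -c /etc/shadowsocks.json -d start`.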
tmdavid / word_embedding_vis.py
Last active Sep 20, 2019
Visualize word embeddings using t-SNE.
"""
Visualize word embeddings, using tsne.
First computes cosine distance of the 100 closests words, and then shows a clustering graph
of the first 11 closest words (the first one is always the word)
IT REQUIRES GLOVE MODEL.txt
line 31: glove_file = '../TBIR/glove.840B.300d.txt' MODIFY with the appropiate path
To Use it, you can just type: python word_embedding_vis.py <list of words space separated>
e.g: python word_embedding_vis.py cake word embedding music
"""