Skip to content

Instantly share code, notes, and snippets.

View qinhongwei's full-sized avatar

Hongwei Qin qinhongwei

View GitHub Profile
@luighifeodrippe
luighifeodrippe / script.js
Last active February 22, 2024 19:35 — forked from gd3kr/script.js
Download a JSON List of twitter bookmarks
/* Enhancements to the Twitter Scraping Script:
This update to the script introduces a more robust mechanism for extracting detailed interaction data from tweets as they are scraped from Twitter. Previously, the script focused on collecting basic content such as the tweet's text. Now, it has been augmented to include a comprehensive extraction of interaction metrics, including replies, reposts, likes, bookmarks, and views, for each tweet.
Key Changes:
1. Improved Data Extraction:
- The script now searches through all elements within a tweet that have an `aria-label` attribute, filtering for labels that contain key interaction terms (replies, reposts, likes, bookmarks, views). This ensures that only relevant `aria-labels` are considered for data extraction.
2. Flexible Interaction Data Parsing:
@gd3kr
gd3kr / script.js
Created February 15, 2024 06:30
Download a JSON List of twitter bookmarks
/*
the twitter api is stupid. it is stupid and bad and expensive. hence, this.
Literally just paste this in the JS console on the bookmarks tab and the script will automatically scroll to the bottom of your bookmarks and keep a track of them as it goes.
When finished, it downloads a JSON file containing the raw text content of every bookmark.
for now it stores just the text inside the tweet itself, but if you're reading this why don't you go ahead and try to also store other information (author, tweetLink, pictures, everything). come on. do it. please?
*/
@sanchit-gandhi
sanchit-gandhi / whisper_jax_endpoint.py
Last active February 18, 2024 02:54
The Whisper JAX demo can be used as an endpoint through the Gradio Client library. The transcription API takes as input the audio file you want to transcribe, as well as optional arguments such as the task (transcribe or translate) and whether to return timestamps.
from gradio_client import Client
API_URL = "https://sanchit-gandhi-whisper-jax.hf.space/"
# set up the Gradio client
client = Client(API_URL)
def transcribe_audio(audio_path, task="transcribe", return_timestamps=False):

Reinforcement Learning for Language Models

Yoav Goldberg, April 2023.

Why RL?

With the release of the ChatGPT model and followup large language models (LLMs), there was a lot of discussion of the importance of "RLHF training", that is, "reinforcement learning from human feedback". I was puzzled for a while as to why RL (Reinforcement Learning) is better than learning from demonstrations (a.k.a supervised learning) for training language models. Shouldn't learning from demonstrations (or, in language model terminology "instruction fine tuning", learning to immitate human written answers) be sufficient? I came up with a theoretical argument that was somewhat convincing. But I came to realize there is an additional argumment which not only supports the case of RL training, but also requires it, in particular for models like ChatGPT. This additional argument is spelled out in (the first half of) a talk by John Schulman from OpenAI. This post pretty much

@rasbt
rasbt / whisper-audio-to-text.py
Created January 26, 2023 17:46
Transcribes audio files using OpenAI Whisper
# Setup:
# conda create -n whisper python=3.9
# conda activate whisper
# https://github.com/openai/whisper
# pip install git+https://github.com/openai/whisper.git
# Usage:
# python whisper-audio-to-text.py --audio_dir my_files --out_dir texts
import argparse
const SECRET_KEY = ENTER YOUR SECRET KEY HERE;
const MAX_TOKENS = 200;
// For more cool AI snippets and demos, follow me on Twitter: https://twitter.com/_abi_
/**
* Completes your prompt with GPT-3
*
* @param {string} prompt Prompt
* @param {number} temperature (Optional) Temperature. 1 is super creative while 0 is very exact and precise. Defaults to 0.4.
@doctorpangloss
doctorpangloss / install_caffe.sh
Last active September 22, 2021 11:45
Installing Caffe on Mac 10.11.5 and later in the 10.11 series, and 10.12.1 and later in the 10.12 series
#!/bin/sh
# Install brew
/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
# Apple hides old versions of stuff at https://developer.apple.com/download/more/
# Install the latest XCode (8.0).
# We used to install the XCode Command Line Tools 7.3 here, but that would just upset the most recent versions of brew.
# So we're going to install all our brew dependencies first, and then downgrade the tools. You can switch back after
# you have installed caffe.
# Install CUDA toolkit 8.0 release candidate
# Register and download from https://developer.nvidia.com/cuda-release-candidate-download
@bearpaw
bearpaw / Caffe + Ubuntu 12.04 64bit + CUDA 6.5 配置说明.md
Last active March 12, 2020 01:24
Caffe + Ubuntu 12.04 / 14.04 64bit + CUDA 6.5 / 7.0 配置说明

Caffe + Ubuntu 12.04 64bit + CUDA 6.5 配置说明

本步骤能实现用Intel核芯显卡来进行显示, 用NVIDIA GPU进行计算。

1. 安装开发所需的依赖包

安装开发所需要的一些基本包

sudo apt-get install build-essential
sudo apt-get install vim cmake git
sudo apt-get install libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libboost-all-dev libhdf5-serial-dev
@UniIsland
UniIsland / SimpleHTTPServerWithUpload.py
Created August 14, 2012 04:01
Simple Python Http Server with Upload
#!/usr/bin/env python
"""Simple HTTP Server With Upload.
This module builds on BaseHTTPServer by implementing the standard GET
and HEAD requests in a fairly straightforward manner.
"""
@MohamedAlaa
MohamedAlaa / tmux-cheatsheet.markdown
Last active June 21, 2024 01:45
tmux shortcuts & cheatsheet

tmux shortcuts & cheatsheet

start new:

tmux

start new with session name:

tmux new -s myname