Jérémie Lumbroso jlumbroso

@muhark
muhark / hf_model_downloader.md
Last active June 20, 2024 05:37
Offline HuggingFace Models on HPC

Downloading HuggingFace Models

This gist shares a little workflow and script for a task that most people using university HPCs for NLP research will need to do: downloading and storing HuggingFace models for use on compute nodes.

What this workflow is for:

  • Context: you want to use HuggingFace models on Della (or other HPC clusters).
  • Problem 1: you cannot call AutoModel.from_pretrained('model/name') at run time because compute nodes are not connected to the internet.
  • Problem 2: running AutoModel.from_pretrained() on the head node is impractical because the model is too large to be loaded.
  • Problem 3: you do not want to save the model weights to the default ~/.cache/ because you only get 10 GB of storage on /home.
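One way to address the three problems above can be sketched as follows. This is a minimal sketch, not necessarily the script the gist itself provides. It assumes the `huggingface_hub` package is installed on the head node; the scratch path and the helper names (`local_model_dir`, `download_model`, `offline_env`) are hypothetical. `snapshot_download` fetches the repository files without loading the model into memory, and the `HF_HUB_OFFLINE` and `TRANSFORMERS_OFFLINE` environment variables tell the libraries not to attempt network access.

```python
from pathlib import Path

# Hypothetical location with more quota than /home; adjust for your cluster.
MODEL_ROOT = Path("/scratch/your_netid/hf_models")


def local_model_dir(repo_id: str, root: Path = MODEL_ROOT) -> Path:
    """Map a Hub repo id such as 'org/name' to a directory under root."""
    return root / repo_id.replace("/", "__")


def download_model(repo_id: str, root: Path = MODEL_ROOT) -> Path:
    """Run on the head node (which has internet access).

    Only the files are fetched; the model is never loaded into memory.
    """
    # Imported lazily so this module can be imported without the package.
    from huggingface_hub import snapshot_download

    target = local_model_dir(repo_id, root)
    snapshot_download(repo_id=repo_id, local_dir=str(target))
    return target


def offline_env() -> dict:
    """Environment variables that disable network access on compute nodes."""
    return {"HF_HUB_OFFLINE": "1", "TRANSFORMERS_OFFLINE": "1"}
```

On a compute node, one would apply `os.environ.update(offline_env())` before importing `transformers`, then call `AutoModel.from_pretrained(str(local_model_dir('model/name')))` so that the weights are read from the local directory.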
@jflam
jflam / app.py
Created January 16, 2023 02:55
Citations needed
# To run you'll need some secrets:
# 1. SERPAPI_API_KEY secret in env var - get from https://serpapi.com/
# 2. OPENAI_API_KEY secret in env var - get from https://openai.com
import streamlit as st
import json, os
from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI
from serpapi import GoogleSearch
@josephlou5
josephlou5 / get_file_history.py
Last active December 1, 2022 02:23
Gets the history of a specific file in all the commits of a repository
"""
get_file_history.py
Gets the history of a specific file in all the commits of a repo.
GitPython: https://gitpython.readthedocs.io/en/stable/index.html
"""
# ==============================================================================
import json
@camtheman256
camtheman256 / export_when2meet.js
Created February 28, 2021 22:16
Export when2meet data from JS console
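function exportData() {
  // Map each person's ID to their display name.
  const peopleMap = {};
  for (let i = 0; i < PeopleIDs.length; i++) {
    peopleMap[PeopleIDs[i]] = PeopleNames[i];
  }
  // For each time slot, list the names of the people available.
  const nameAtSlot = AvailableAtSlot.map(e => e.map(i => peopleMap[i]));
  // Pair each slot's time with the names available at that slot.
  const timedNames = TimeOfSlot.map((e, i) => [e, nameAtSlot[i]]);
  return JSON.stringify(timedNames);
}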
@akash-ch2812
akash-ch2812 / Marking_ROI.py
Last active January 17, 2024 06:11
Python code for marking regions of interest in an image for OCR
# Install OpenCV (imported as cv2):
# pip install opencv-python
# Install Pillow (imported as PIL):
# pip install Pillow
import cv2
from PIL import Image
def mark_region(image_path):
@fedarko
fedarko / gh_url_to_raw_gh_url.py
Created October 2, 2019 22:10
Convert a github file URL to a raw.githubusercontent.com URL (that can be directly accessed for things like view.qiime2.org or wget)
# your link goes here
link = "https://github.com/knightlab-analyses/qurro-mackerel-analysis/blob/master/AnalysisOutput/qurro-plot.qzv"
# note: this will break if a repo/organization or subfolder is named "blob" -- would be ideal to use a fancy regex
# to be more precise here
print(link.replace("github.com", "raw.githubusercontent.com").replace("/blob/", "/"))
# example output link:
# https://raw.githubusercontent.com/knightlab-analyses/qurro-mackerel-analysis/master/AnalysisOutput/qurro-plot.qzv
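The comment in the snippet notes that the plain string replacement breaks when a repository, organization, or subfolder is named "blob". A regex-based variant along those lines might look like the sketch below. The function name `to_raw_url` and the pattern are illustrative assumptions, not part of the original gist; the pattern only rewrites the `/blob/` segment that directly follows the owner and repository names.

```python
import re

# Matches https://github.com/<owner>/<repo>/blob/<ref and path>
_BLOB_URL = re.compile(
    r"^https://github\.com/(?P<owner>[^/]+)/(?P<repo>[^/]+)/blob/(?P<rest>.+)$"
)


def to_raw_url(link: str) -> str:
    """Convert a GitHub file URL to its raw.githubusercontent.com form."""
    match = _BLOB_URL.match(link)
    if match is None:
        raise ValueError(f"not a GitHub blob URL: {link}")
    return "https://raw.githubusercontent.com/{owner}/{repo}/{rest}".format(
        **match.groupdict()
    )


link = "https://github.com/knightlab-analyses/qurro-mackerel-analysis/blob/master/AnalysisOutput/qurro-plot.qzv"
print(to_raw_url(link))
```

Because the owner and repository are captured before the literal `/blob/`, a later folder named "blob" inside the path is left untouched.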
@jlumbroso
jlumbroso / SimpleColorLogging.py
Created April 21, 2019 22:49
Short snippet showing how to have colored terminal logging output in Python.
import os
import logging
class _Color:
    PURPLE = '\033[95m'
    CYAN = '\033[96m'
    DARKCYAN = '\033[36m'
    BLUE = '\033[94m'
    GREEN = '\033[92m'
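The preview above is cut off before any logging setup appears, so the following is only a sketch of one common way to get colored log output with ANSI escape codes of this kind. It is not the gist's own implementation: the names `ColorFormatter`, `LEVEL_COLORS`, and `make_logger` are hypothetical, and the level-to-color mapping is an arbitrary choice.

```python
import logging

RESET = "\033[0m"  # ANSI code that restores the default terminal color

# Assumed mapping from log level to ANSI color code.
LEVEL_COLORS = {
    logging.DEBUG: "\033[96m",     # cyan
    logging.INFO: "\033[92m",      # green
    logging.WARNING: "\033[93m",   # yellow
    logging.ERROR: "\033[91m",     # red
    logging.CRITICAL: "\033[95m",  # purple
}


class ColorFormatter(logging.Formatter):
    """Wrap each formatted record in the color assigned to its level."""

    def format(self, record):
        message = super().format(record)
        color = LEVEL_COLORS.get(record.levelno, "")
        return f"{color}{message}{RESET}" if color else message


def make_logger(name="demo"):
    """Return a logger that writes colored messages to stderr."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid adding duplicate handlers on repeat calls
        handler = logging.StreamHandler()
        handler.setFormatter(ColorFormatter("%(levelname)s: %(message)s"))
        logger.addHandler(handler)
    logger.setLevel(logging.DEBUG)
    return logger
```

Calling `make_logger().warning("disk almost full")` would then print the message in yellow on terminals that interpret ANSI escape codes.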
@gstorer
gstorer / PDF_extract_images.py
Created August 1, 2018 10:15
Extract images from a PDF file using Python, Pillow (PIL) and PyPDF2
# coding=utf-8
from __future__ import print_function
"""
The MIT License (MIT)
Copyright (c) 2018 Louis Abraham <louis.abraham@yahoo.fr>
Copyright ©2016 Ronan Paixão
Copyright (c) 2018 Gerald Storer
@thackerronak
thackerronak / AESHelper.java
Last active June 16, 2024 18:56 — forked from armanso/AES.java
AES encryption/decryption in crypto-js way, use KDF for generating IV and Key, use CBC with PKCS7Padding for Cipher
import com.sun.jersey.core.util.Base64;
import java.io.UnsupportedEncodingException;
import java.security.InvalidAlgorithmParameterException;
import java.security.InvalidKeyException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;
import java.util.Arrays;
import java.util.Random;
import javax.crypto.BadPaddingException;
Process for setting up GitHub Pages with a Namecheap domain.
1. Go to namecheap.com, then select and buy a domain name.
2. Log in to Namecheap, open the username drop-down, and select Dashboard.
3. Go to Domain List.
4. Click the Manage button.
5. Click the Advanced DNS tab.
6. Click Add Record and add three records:
Type: A Record | Host: @ | Value: 192.30.252.153 | TTL: Automatic