Luca Soldaini soldni
# must install smashed[remote] via `pip install smashed[remote]`
from smashed.utils.io_utils import upload_on_success
from transformers import Trainer, TrainingArguments

# Option 1: use upload_on_success as context manager
def run_training():
    with upload_on_success("s3://ai2-s2-research/<your handle>/<exp name>") as local_path:
        # example if using huggingface trainer
        trainer = Trainer(args=TrainingArguments(output_dir=local_path))
        trainer.train()
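The upload-on-success pattern above can be illustrated with a small standard-library sketch. This is not how `smashed` implements `upload_on_success`; the names `upload_on_success_sketch` and `fake_upload` are hypothetical, and the "upload" is a plain callback standing in for a copy to remote storage. The point is only the control flow: the callback runs if the `with` body finishes without an exception, and the temporary directory is removed either way.

```python
import contextlib
import shutil
import tempfile
from pathlib import Path
from typing import Callable, Iterator, List


@contextlib.contextmanager
def upload_on_success_sketch(upload: Callable[[Path], None]) -> Iterator[Path]:
    """Yield a temporary directory; call `upload` only if the body succeeds.

    Hypothetical illustration, not the smashed implementation.
    """
    local_path = Path(tempfile.mkdtemp())
    try:
        yield local_path
        # Reached only when the `with` body raised no exception.
        upload(local_path)
    finally:
        # Always remove the temporary directory, on success or failure.
        shutil.rmtree(local_path, ignore_errors=True)


uploaded: List[str] = []


def fake_upload(path: Path) -> None:
    # Hypothetical stand-in for copying files to remote storage.
    uploaded.extend(sorted(p.name for p in path.iterdir()))


# Successful run: the callback sees the files written in the body.
with upload_on_success_sketch(fake_upload) as local_path:
    (local_path / "model.bin").write_text("weights")

# Failed run: the exception propagates and nothing is uploaded.
failed_uploads: List[Path] = []
try:
    with upload_on_success_sketch(failed_uploads.append) as local_path:
        raise RuntimeError("training crashed")
except RuntimeError:
    pass
```

Placing the callback after the `yield` inside the `try` keeps a crashed run from publishing partial outputs, which is the main reason to prefer this pattern over an unconditional upload at the end of a script.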
soldni / notion.js
Last active January 8, 2024 01:45
A Scriptable widget that displays ToDos from a Notion database on iOS.
// Variables used by Scriptable.
// These must be at the very top of the file. Do not edit.
// icon-color: light-gray; icon-glyph: copy;
// follow instructions here https://developers.notion.com/docs/create-a-notion-integration
// for how to configure an integration, get the bearer token, and authorize the integration
// to access a Notion database.
const NOTION_DB_LINK = "https://www.notion.so/[YOUR USERNAME]/[LINK TO DATABASE]"
const BEARER_TOKEN = "Bearer secret_*******************************************"
soldni / blingfire-osx-arm64.sh
Created October 28, 2022 06:08
Compile & install blingfire on a Mac with arm64 (Apple Silicon) processor.
#! /usr/bin/env bash
# get script directory
SOURCE="${BASH_SOURCE[0]}"
while [ -h "$SOURCE" ]; do # resolve $SOURCE until the file is no longer a symlink
    SCRIPT_DIR="$( cd -P "$( dirname "$SOURCE" )" >/dev/null 2>&1 && pwd )"
    SOURCE="$(readlink "$SOURCE")"
    # if $SOURCE was a relative symlink, we need to resolve it
    # relative to the path where the symlink file was located
    [[ $SOURCE != /* ]] && SOURCE="$SCRIPT_DIR/$SOURCE"
done
import subprocess
import sys
subprocess.check_call([
    sys.executable,
    "-m",
    "pip",
    "install",
    "spacy",
    "blingfire",
soldni / env.sh
Created September 29, 2022 04:42
Create an x86-64 (osx-64) conda environment
CONDA_SUBDIR=osx-64 conda create -n <env name> python=<python version>
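import numpy as np
from functools import lru_cache


@lru_cache()
def get_tmax(t: str) -> np.ndarray:
    # largest value representable by integer dtype `t`, as a 0-d array of that dtype
    return np.array(np.iinfo(t).max, dtype=t)


@lru_cache()
def get_tmin(t: str) -> np.ndarray:
    # smallest value representable by integer dtype `t`, as a 0-d array of that dtype
    return np.array(np.iinfo(t).min, dtype=t)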
import csv
import json
import pymysql.cursors
import itertools
import tqdm
import re
# Connect to the database
connection = pymysql.connect(host='localhost',
                             user='ubuntu',
import os
import openai
openai.organization = "YOUR ORG ID HERE"
openai.api_key = "YOUR API KEY HERE"
prompt = '''
Q: Is a housecat a cat or a dino?
A: A housecat is a cat.
Top right corner: Black in | blackinai.github.io
"Black in AI is a place for sharing ideas, fostering collaborations and discussing initiatives
to increase the presence of Black people in the field of Artificial Intelligence. While
artificial intelligence (AI) has the potential to solve an incredible spectrum of problems and
challenges in our lives, our work, and our world, there is a widening disconnect between the
people who are introducing and deploying AI-based solutions and those who set policies for when
and how these solutions are used. We envision a thriving, end-to-end ecosystem sustainably
allocating Black talent to the development of AI through engaging with students, researchers,
and entrepreneurs."
#!/usr/bin/env python
# coding: utf-8
import os
# Configuration settings here
TERRIER_DESTINATION = '/home/ubuntu/wikipedia/pyterrier/ruwiki'
PROCESSED_DATA_PATH = '/home/ubuntu/wikipedia/extracted/ruwiki'
#NUM_RESULTS = 100
NUM_RESULTS = int(os.environ['NUM_RESULTS'])
LANGUAGE_JOIN_CHAR = ' '
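As written, `int(os.environ['NUM_RESULTS'])` raises `KeyError` when the variable is unset. A minimal sketch of a more forgiving reader follows, assuming the commented-out value of 100 is an acceptable default; `read_num_results` and `DEFAULT_NUM_RESULTS` are hypothetical names, not part of the original script.

```python
import os
from typing import Mapping

# Assumption: the commented-out `NUM_RESULTS = 100` is a sensible fallback.
DEFAULT_NUM_RESULTS = 100


def read_num_results(environ: Mapping[str, str] = os.environ) -> int:
    """Read NUM_RESULTS from the environment, falling back to the default."""
    raw = environ.get("NUM_RESULTS")
    if raw is None:
        return DEFAULT_NUM_RESULTS
    try:
        return int(raw)
    except ValueError:
        # Report the offending value instead of a bare int() error.
        raise ValueError(f"NUM_RESULTS must be an integer, got {raw!r}") from None


NUM_RESULTS = read_num_results()
```

Passing the mapping as a parameter makes the function easy to exercise without touching the real environment, for example `read_num_results({"NUM_RESULTS": "25"})`.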