@Mistobaan
Mistobaan / merge_peft.py
Created February 1, 2024 05:12 — forked from mlabonne/merge_peft.py
Merge base model and peft adapter and push it to HF hub
# Example usage:
# python merge_peft.py --base_model=meta-llama/Llama-2-7b-hf --peft_model=./qlora-out --hub_id=alpaca-qlora
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch
import argparse
def get_args():
@Mistobaan
Mistobaan / deploy_dolly_v2.ipynb
Created April 29, 2023 18:08 — forked from timesler/deploy_dolly_v2.ipynb
Deploy Dolly v2.0 to SageMaker
from graphviz import Digraph
import torch
from torch.autograd import Variable, Function
def iter_graph(root, callback):
    queue = [root]
    seen = set()
    while queue:
        fn = queue.pop()
        if fn in seen:
@Mistobaan
Mistobaan / decomposepromptsimple.ipynb
Created August 13, 2020 07:05 — forked from brockmanmatt/decomposepromptsimple.ipynb
DecomposePromptSimple.ipynb
@Mistobaan
Mistobaan / introtologprobs.ipynb
Created August 13, 2020 07:03 — forked from brockmanmatt/introtologprobs.ipynb
introToLogProbs.ipynb
@Mistobaan
Mistobaan / defaultapinotebook.ipynb
Created August 13, 2020 07:01 — forked from brockmanmatt/defaultapinotebook.ipynb
defaultapinotebook.ipynb
@Mistobaan
Mistobaan / Install
Created June 8, 2020 00:48 — forked from ines/Install
Streamlit + spaCy
pip install streamlit
pip install spacy
python -m spacy download en_core_web_sm
python -m spacy download en_core_web_md
python -m spacy download de_core_news_sm

05/05/2018

2018: Speech2Vec: A Sequence-to-Sequence Framework for Learning Word Embeddings from Speech

Projects audio segments, each containing one spoken word, into a high-dimensional embedding space, just like Word2Vec. Uses "Forced Alignment" to split the audio into words (which requires text). Pads the audio segments with zeros, computes MFCCs, and feeds them into an encoder-decoder which uses RMSE. They also add noise to the signal and make the network denoise it. Uses 500 hours of LibriSpeech audio. Not sure how it can be incorporated into ASR or TTS systems. The audio file has to be paired with text; otherwise Speech2Vec cannot split it into words using the "Forced Alignment" method. It is used to query whether a spoken word is similar to an existing word in the corpus.
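
A rough sketch of the zero-padding and noise-injection steps mentioned above, for my own reference. This is not the paper's code: the function names (`pad_segments`, `add_noise`, `rmse`), the toy sample values, and the Gaussian noise model are all assumptions, and the MFCC and encoder-decoder stages are left out entirely.

```python
import random


def pad_segments(segments, pad_value=0.0):
    """Right-pad every word segment with zeros up to the longest segment's length."""
    max_len = max(len(s) for s in segments)
    return [list(s) + [pad_value] * (max_len - len(s)) for s in segments]


def add_noise(segment, scale=0.01, seed=None):
    """Return a noisy copy of a segment (Gaussian noise is an assumption here)."""
    rng = random.Random(seed)
    return [x + rng.gauss(0.0, scale) for x in segment]


def rmse(a, b):
    """Root-mean-square error between two equal-length sequences."""
    return (sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)) ** 0.5


# Toy "word segments" of different lengths (hypothetical values).
segments = [[0.1, 0.2, 0.3], [0.5], [0.4, 0.4]]
padded = pad_segments(segments)

# A denoising setup would feed the noisy copy in and score against the clean one.
noisy = [add_noise(s, seed=0) for s in padded]
errors = [rmse(n, p) for n, p in zip(noisy, padded)]
```

In a denoising setup of this kind, the network would see `noisy` as input and be scored against `padded`, so `errors` here only shows how far the corrupted input starts from the clean target.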

2016: Neural Machine Translation of Rare Words with Subword Units (BPE)

BPE is a data compression technique that repeatedly merges the most frequent pair of bytes into a single new symbol. It works well with named entities, loanwords, and morphologically complex words. Handles OOVs and rare words well. You can