George Dittmar (GeorgeDittmar) · Portland
💭 Working on something probably
@GeorgeDittmar
GeorgeDittmar / MarkovChain.py
Last active July 13, 2019 05:18
Markov Chain Text generator
import random
import string

class MarkovModel:
    def __init__(self):
        self.model = None

    def learn(self, tokens, n=2):
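The gist preview cuts off at `learn`. A minimal sketch of how such a Markov chain text generator might be completed — the `learn` body and the `generate` method are assumptions, not the author's code:

```python
import random
from collections import defaultdict

class MarkovModel:
    def __init__(self):
        self.model = None

    def learn(self, tokens, n=2):
        # map each (n-1)-gram of tokens to the tokens observed after it
        self.n = n
        self.model = defaultdict(list)
        for i in range(len(tokens) - n + 1):
            gram = tuple(tokens[i:i + n - 1])
            self.model[gram].append(tokens[i + n - 1])

    def generate(self, length=10, seed=None):
        # hypothetical sampling method: walk the chain from a random start state
        rng = random.Random(seed)
        state = rng.choice(list(self.model.keys()))
        out = list(state)
        while len(out) < length:
            followers = self.model.get(tuple(out[-(self.n - 1):]))
            if not followers:
                break  # dead end: this state was never followed by anything
            out.append(rng.choice(followers))
        return " ".join(out)
```

With `n=2` the model reduces to a bigram chain: each single token maps to the list of tokens that ever followed it, and repeated followers are kept as duplicates so sampling is frequency-weighted.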
@GeorgeDittmar
GeorgeDittmar / MarkovModelSpark.py
Last active July 13, 2019 05:30
Spark based MarkovChain
from pyspark.ml.feature import NGram
import PreProcess
import random

class MarkovModelSpark:
    def __init__(self, spark_session, n=2):
        self.spark_session = spark_session
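Spark's `NGram` transformer turns a token array into space-joined sliding n-grams. The same operation in plain Python, for intuition about what the Spark version computes (the pyspark dependency is left out; this helper is illustrative, not from the gist):

```python
def ngrams(tokens, n=2):
    # sliding window of n consecutive tokens, space-joined
    # (matching the output format of pyspark.ml.feature.NGram)
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
```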
@GeorgeDittmar
GeorgeDittmar / setup_dataset.py
Created December 19, 2020 05:57
Code to generate the training and eval scripts
"""
Now load the data line by line
"""
from sklearn.model_selection import train_test_split
with open('<path to text file>', 'r') as data:
dataset = ["<|title|>" + x.strip() for x in data.readlines()]
train, eval = train_test_split(dataset, train_size=.9, random_state=2020)
print("training size:" + len(train))
# setup imports to use the model
from transformers import TFGPT2LMHeadModel
from transformers import GPT2Tokenizer
model = TFGPT2LMHeadModel.from_pretrained("<path to model directory>", from_pt=True)
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
input_ids = tokenizer.encode("Some text to encode", return_tensors='tf')
@GeorgeDittmar
GeorgeDittmar / generate_text.py
Created December 24, 2020 09:27
huggingface text generation call
generated_text_samples = model.generate(
    input_ids,
    max_length=150,
    num_return_sequences=5,
    no_repeat_ngram_size=2,
    repetition_penalty=1.5,
    top_p=0.92,
    temperature=.85,
    do_sample=True,
    top_k=125,
)
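The `generate` call combines three sampling controls: temperature scaling, top-k truncation, and nucleus (top-p) filtering. A toy sketch of how those filters interact on a single step's logits — pure Python, no transformers, and the function itself is illustrative rather than the library's implementation:

```python
import math
import random

def sample_filtered(logits, top_k=3, top_p=0.9, temperature=0.85, seed=None):
    # temperature-scale the logits, then softmax into probabilities
    scaled = [l / temperature for l in logits]
    mx = max(scaled)
    probs = [math.exp(s - mx) for s in scaled]
    total = sum(probs)
    probs = [p / total for p in probs]

    # top-k: keep only the k most probable token indices
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]

    # top-p (nucleus): keep the smallest prefix whose cumulative mass reaches top_p
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break

    # renormalise the survivors and sample one index
    rng = random.Random(seed)
    kept_probs = [probs[i] for i in kept]
    total = sum(kept_probs)
    return rng.choices(kept, weights=[p / total for p in kept_probs])[0]
```

Lower temperature sharpens the distribution (making top-p keep fewer tokens), while `top_k=125` in the gist acts as a hard ceiling on the candidate pool before the nucleus cut.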
@GeorgeDittmar
GeorgeDittmar / lm-huggingface-finetune-gpt-2.ipynb
Created January 1, 2021 00:33
LM Huggingface finetune GPT-2.ipynb
@GeorgeDittmar
GeorgeDittmar / ui.py
Created February 17, 2024 17:49
Gist of the gradio UI component in the local-llm setup
import gradio as gr
from operator import itemgetter

# langchain imports
from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain.memory import ConversationBufferMemory
from langchain.schema.runnable import RunnableLambda, RunnablePassthrough
from langchain.llms import HuggingFaceTextGenInference
@GeorgeDittmar
GeorgeDittmar / docker-compose.yml
Created March 2, 2024 22:53
local-llm docker compose
version: '3'
services:
  tgi:
    image: ghcr.io/huggingface/text-generation-inference:latest
    container_name: tgi
    ports:
      - 8080:80
    volumes:
      - ${LOCAL_MODEL_CACHE_DIR}:/model_cache
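The compose file mounts the model cache but the preview stops before any service configuration. A hypothetical extension of the `tgi` service — the `MODEL_ID` value and the healthcheck details are assumptions, not part of the original gist:

```yaml
    # hypothetical additions, not in the original gist
    environment:
      - MODEL_ID=/model_cache/<model directory>   # point TGI at the mounted model
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:80/health"]
      interval: 30s
```

Mapping `8080:80` means the UI from `ui.py` would talk to the inference server at `http://localhost:8080` on the host.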