Sehun Heo Se-Hun

@Se-Hun
Se-Hun / huggingface_tokenizers_usage.md
Created July 29, 2022 12:42 — forked from lovit/huggingface_tokenizers_usage.md
Hugging Face tokenizers usage
import tokenizers
tokenizers.__version__
Se-Hun / huggingface_konlpy.md
Created July 29, 2022 12:42 — forked from lovit/huggingface_konlpy.md
huggingface + KoNLPy

Huggingface

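  • Provides a variety of NLP packages; three of them are especially useful for training language models:

| package | note |
| --- | --- |
| transformers | Provides Transformer-based (masked) language model algorithms and pretrained models. |
| tokenizers | Provides functionality for training and using tokenizers that work with transformers. Distributed as a package separate from transformers. |
| nlp | Provides datasets and evaluation metrics. |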
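
Since the three packages above are distributed separately, it can help to check which of them are installed, and at which version, before following the examples. The sketch below is one possible way to do that using only the standard library. The helper name `installed_versions` is hypothetical, and the code assumes the distribution names match the package names in the table.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing.

    `installed_versions` is a hypothetical helper written for illustration;
    it is not part of any Hugging Face package.
    """
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment.
            found[name] = None
    return found


if __name__ == "__main__":
    # Assumption: distribution names equal the package names in the table.
    for name, ver in installed_versions(["transformers", "tokenizers", "nlp"]).items():
        print(f"{name}: {ver or 'not installed'}")
```

Unlike reading `tokenizers.__version__` directly, this approach does not import the package, so a missing package shows up as `None` instead of an `ImportError`.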
Se-Hun / main.py
Created March 25, 2024 06:33 — forked from jvelezmagic/main.py
QA Chatbot streaming with source documents example using FastAPI, LangChain Expression Language, OpenAI, and Chroma.
"""QA Chatbot streaming using FastAPI, LangChain Expression Language, OpenAI, and Chroma.
Features
--------
- Persistent Chat Memory:
    Stores chat history in a local file.
- Persistent Vector Store:
    Stores document embeddings in a local vector store.
- Standalone Question Generation:
    Rephrases follow-up questions to standalone questions in their original language.