🤗
On boarding

boseop kim & nick.coco seopbo

  • kakaobrain
@lovit
lovit / huggingface_tokenizers_usage.md
Created August 27, 2020 22:28
Hugging Face tokenizers usage
import tokenizers
tokenizers.__version__
@lovit
lovit / huggingface_konlpy.md
Last active January 8, 2024 20:43
huggingface + KoNLPy

Huggingface

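  • Hugging Face provides a variety of NLP packages; three of them are especially useful for training language models:

| package | note |
| --- | --- |
| transformers | Provides Transformer-based (masked) language model algorithms and pretrained models |
| tokenizers | Provides tools to train and use tokenizers that work with transformers; distributed as a package separate from transformers |
| nlp | Provides datasets and evaluation metrics |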
@danijar
danijar / blog_tensorflow_variable_sequence_classification.py
Last active December 31, 2021 10:04
TensorFlow Variable-Length Sequence Classification
# Working example for my blog post at:
# http://danijar.com/variable-sequence-lengths-in-tensorflow/
import functools
import sets
import tensorflow as tf
from tensorflow.models.rnn import rnn_cell
from tensorflow.models.rnn import rnn
def lazy_property(function):
@yrevar
yrevar / imagenet1000_clsidx_to_labels.txt
Last active July 3, 2024 18:29
text: imagenet 1000 class idx to human readable labels (Fox, E., & Guestrin, C. (n.d.). Coursera Machine Learning Specialization.)
{0: 'tench, Tinca tinca',
1: 'goldfish, Carassius auratus',
2: 'great white shark, white shark, man-eater, man-eating shark, Carcharodon carcharias',
3: 'tiger shark, Galeocerdo cuvieri',
4: 'hammerhead, hammerhead shark',
5: 'electric ray, crampfish, numbfish, torpedo',
6: 'stingray',
7: 'cock',
8: 'hen',
9: 'ostrich, Struthio camelus',
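
The preview above is cut off, but the entries shown are written as a Python dict literal mapping class index to a comma-separated label string. As a minimal sketch, assuming the full file keeps that shape, the text could be parsed with `ast.literal_eval` and queried by index. The helper names `load_labels` and `primary_name` are hypothetical, and only the first three entries from the preview are used here.

```python
import ast

# First three entries copied from the preview above.
# Assumption: the full file is one dict literal in this same shape.
raw_text = """{0: 'tench, Tinca tinca',
1: 'goldfish, Carassius auratus',
2: 'great white shark, white shark, man-eater, man-eating shark, Carcharodon carcharias'}"""


def load_labels(text):
    """Parse a dict-literal string into {class_index: label_string}."""
    labels = ast.literal_eval(text)  # evaluates literals only, never arbitrary code
    if not isinstance(labels, dict):
        raise ValueError("expected a dict literal mapping index to label")
    return labels


def primary_name(labels, idx):
    """Return the first comma-separated synonym for a class index."""
    return labels[idx].split(",")[0].strip()


labels = load_labels(raw_text)
print(primary_name(labels, 1))
```

`ast.literal_eval` is used instead of `eval` so that text downloaded from a gist is never executed as code.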