CeShine Lee ceshine

ceshine / spacy_sentencizer.ipynb
Created Aug 14, 2019
Customizing Spacy's Statistical Sentence Segmenter with Custom Rules
ceshine / convert.py
Created Jul 17, 2019
A simple script to convert markdown image specification to hugo shortcode (result is automatically copied to clipboard)
from subprocess import Popen, PIPE


def image_markdown_conversion():
    try:
        while True:
            text = input("Input:").strip()
            brackets, split_point = 1, 0
            description = ""
            assert len(text) > 5, "wrong format!"
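The preview cuts off before the conversion logic. As a rough sketch of the idea only: the function below, `markdown_image_to_shortcode` (a hypothetical name), parses a Markdown image of the form `![description](url)` by counting brackets, so that descriptions containing nested brackets survive, and emits Hugo's built-in `figure` shortcode. Which shortcode and parameters the original script produces is not visible here, so `figure` with `src` and `caption` is an assumption, and the clipboard step is left out.

```python
def markdown_image_to_shortcode(text: str) -> str:
    """Convert '![description](url)' to a Hugo figure shortcode (assumed target format)."""
    text = text.strip()
    if not text.startswith("![") or not text.endswith(")"):
        raise ValueError("expected '![description](url)'")
    # Walk the description, tracking bracket depth so nested [...] stay intact.
    depth, split_point = 1, -1
    for i in range(2, len(text)):
        if text[i] == "[":
            depth += 1
        elif text[i] == "]":
            depth -= 1
            if depth == 0:
                split_point = i
                break
    # The closing bracket of the description must be followed by "(".
    if split_point == -1 or text[split_point + 1:split_point + 2] != "(":
        raise ValueError("expected '![description](url)'")
    description = text[2:split_point]
    url = text[split_point + 2:-1]
    return '{{< figure src="%s" caption="%s" >}}' % (url, description)


print(markdown_image_to_shortcode("![A [nested] caption](images/plot.png)"))
# prints: {{< figure src="images/plot.png" caption="A [nested] caption" >}}
```

This sketch does not escape double quotes inside the description; a real script would need to handle them before writing the shortcode.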
ceshine / detector.py
Last active Apr 22, 2020
A Simple CJK Language Detector
import re


def cjk_detect(texts):
    # korean
    if re.search("[\uac00-\ud7a3]", texts):
        return "ko"
    # japanese
    if re.search("[\u3040-\u30ff]", texts):
        return "ja"
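The preview stops after the Japanese check. A minimal sketch of how such a detector might be rounded out follows; `cjk_detect_sketch` is a hypothetical name, and the Chinese branch (the CJK Unified Ideographs range) and the `None` fallback are assumptions, not something shown in the preview. The order of the checks matters: Japanese text usually mixes kana with kanji, so the kana test has to run before any test on Han characters.

```python
import re


def cjk_detect_sketch(text: str):
    """Return 'ko', 'ja', 'zh', or None when no CJK characters are found."""
    # Hangul syllables (same range as the preview).
    if re.search("[\uac00-\ud7a3]", text):
        return "ko"
    # Hiragana and katakana (same range as the preview). Checked before Han
    # characters because Japanese text typically contains both kana and kanji.
    if re.search("[\u3040-\u30ff]", text):
        return "ja"
    # Assumption: Han characters with no kana or Hangul are treated as Chinese.
    if re.search("[\u4e00-\u9fff]", text):
        return "zh"
    return None


print(cjk_detect_sketch("日本語のテキスト"))  # ja
print(cjk_detect_sketch("你好"))  # zh
```

A Japanese string written entirely in kanji would be labelled `zh` by this heuristic, which is a known limit of range-based detection.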
ceshine / zip_and_base64encode.py
Created Nov 19, 2018
Useful script for importing your own packages into Kaggle Kernels
from zipfile import ZipFile
import zipfile
from pathlib import Path
import base64
import sys
import io


def write_folder(zfile: ZipFile, dir_path: Path, prefix: str = ""):
    assert dir_path.is_dir()
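Only the signature of `write_folder` is visible in the preview. The sketch below shows one way the general technique could work, assuming the goal is to turn a local package folder into a single base64 string that can be pasted into a kernel and unpacked there. The body of `write_folder` and the helpers `encode_folder` and `decode_to_folder` are illustrative guesses, not the original script.

```python
import base64
import io
from pathlib import Path
from zipfile import ZIP_DEFLATED, ZipFile


def write_folder(zfile: ZipFile, dir_path: Path, prefix: str = "") -> None:
    """Recursively add the contents of dir_path to zfile under the archive path prefix."""
    assert dir_path.is_dir()
    for entry in sorted(dir_path.iterdir()):
        arcname = f"{prefix}{entry.name}"
        if entry.is_dir():
            write_folder(zfile, entry, prefix=f"{arcname}/")
        else:
            zfile.write(entry, arcname=arcname)


def encode_folder(dir_path: Path) -> str:
    """Zip a folder in memory and return the archive as a base64 string (hypothetical helper)."""
    buffer = io.BytesIO()
    with ZipFile(buffer, "w", compression=ZIP_DEFLATED) as zfile:
        write_folder(zfile, dir_path, prefix=f"{dir_path.name}/")
    return base64.b64encode(buffer.getvalue()).decode("ascii")


def decode_to_folder(encoded: str, target_dir: Path) -> None:
    """Inverse step, meant to run inside the kernel: decode and unpack (hypothetical helper)."""
    with ZipFile(io.BytesIO(base64.b64decode(encoded))) as zfile:
        zfile.extractall(target_dir)
```

Building the archive in a `BytesIO` buffer avoids writing a temporary zip file to disk; the encoded string can then be embedded in a notebook cell and passed to `decode_to_folder` before importing the package.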
ceshine / extract.py
Last active Oct 29, 2018
Scripts to scrape and extract data from the Tourism Bureau of Taiwan
# WARNING: this script has been outdated since the last update of the Tourism Bureau website.
from pathlib import Path
import pandas as pd
SCHEMAS = [
    (201201, "schema/residence-2012-01.csv"),
    (201101, "schema/residence-2011-01.csv")
]
DATA_FILE_PATTERN = "raw_data/{year}-{month}.xls"
ceshine / plotting.R
Created Oct 8, 2018
Plotting Script for the TPU blog post
library(ggplot2)
library(ggthemes)
dat <- data.frame(
    name = c("CPU", "GPU", "TPU"),
    time = c(3 * 3600 + 6 * 60 + 4, 3 * 60 + 16, 1 * 60 + 42)
)
dat$log_time = log(dat$time)
ggplot(data=dat, aes(x=name, y=log_time)) +
ceshine / keras-fashion-mnist-tpu.ipynb
ceshine / keras-fashion-mnist-gpu.ipynb
ceshine / keras-fashion-mnist-cpu.ipynb