@naotokui
naotokui / ja_sentence_tokenize.py
Created May 9, 2017 01:28
Japanese sentence tokenizer: splits Japanese text into sentences (simple version)
import re
import nltk
sent_detector = nltk.RegexpTokenizer(u'[^ !?。]*[!?。.\n]')
sents = sent_detector.tokenize(u" 原子番号92のウランより重い元素は全て人工的に合成され、118番まで発見の報告がある。\
113番については、理研と米露の共同チームがそれぞれ「発見した」と報告し、国際純正・応用化学連合と国際純粋・応用物理学連合の合同作業部会が審査していた。両学会は「データの確実性が高い」ことを理由に、理研の発見を認定し、31日に森田さんに通知した。未確定だった115番と117番、118番の新元素は米露チームの発見を認めた。森田さんは「周期表に名前が残ることは感慨深い。大勢の共同研究者にまずは感謝したい」と述べた。 \n")
for s in sents:
    # Print each sentence followed by its length in characters
    print(s, len(s))
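The same splitting can be sketched with only the standard `re` module, assuming the tokenizer above is used in its default (non-gap) mode, where it behaves like `findall` on the pattern. The helper name `split_sentences` and the short sample string are illustrative and not part of the original gist.

```python
import re

# Same pattern as above: a run of characters that are not a space or a
# sentence-ending mark, followed by one terminator (!, ?, 。, ., or newline).
SENTENCE_PATTERN = re.compile(u'[^ !?。]*[!?。.\n]')


def split_sentences(text):
    """Return the sentences found in `text`, each including its terminator."""
    return SENTENCE_PATTERN.findall(text)


# Hypothetical sample text, chosen only to exercise each terminator.
sample = u"今日は晴れです。明日は雨でしょうか?はい!"
for sentence in split_sentences(sample):
    print(sentence, len(sentence))
```

Note that trailing text with no terminator is not returned by this pattern, so input should end with a sentence mark or a newline, as the sample in the gist does.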
naotokui / conditional_vae_keras.ipynb
Created June 29, 2017 01:01
Conditional VAE in Keras
naotokui / vscode_audio.py
Last active August 22, 2022 21:46
play audio on Visual Studio Code
# from vscode_audio import *
# Audio(audio_numpy_array, sr=SR)
import IPython.display
import numpy as np
import json
def Audio(audio: np.ndarray, sr: int):
    """
    Use instead of IPython.display.Audio as a workaround for VS Code.
CMItemCount numSamplesInBuffer = CMSampleBufferGetNumSamples(buffer);
AudioBufferList audioBufferList;
CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(
buffer,
NULL,
&audioBufferList,
sizeof(audioBufferList),
NULL,
naotokui / midi_playback.py
Last active April 21, 2022 11:27
Play MIDI file in Python
# see: https://www.daniweb.com/programming/software-development/code/216976/play-a-midi-music-file-using-pygame
# sudo pip install pygame
# on ubuntu
# sudo apt-get install python-pygame
import pygame
def play_music(music_file):
naotokui / emoji_regex.py
Created May 19, 2017 04:21
Find Unicode emoji with a Python regex
import re
emoji_pattern = re.compile(
    # Each alternative is a UTF-16 surrogate pair: high surrogate + low-surrogate range
    u"(\ud83d[\ude00-\ude4f])|"  # emoticons
    u"(\ud83c[\udf00-\udfff])|"  # symbols & pictographs (1 of 2)
    u"(\ud83d[\udc00-\uddff])|"  # symbols & pictographs (2 of 2)
    u"(\ud83d[\ude80-\udeff])|"  # transport & map symbols
    u"(\ud83c[\udde0-\uddff])"  # flags (iOS)
    "+", flags=re.UNICODE)
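The pattern above spells each emoji as a UTF-16 surrogate pair, which is how narrow Python 2 builds store characters outside the Basic Multilingual Plane. On Python 3, strings are sequences of code points, so the same ranges can be written directly. The following is a minimal sketch under that assumption; the names `EMOJI_PATTERN`, `find_emoji`, and `strip_emoji` are illustrative and do not appear in the original gist.

```python
import re

# Code point ranges equivalent to the surrogate-pair ranges listed above.
EMOJI_PATTERN = re.compile(
    u"["
    u"\U0001F600-\U0001F64F"  # emoticons
    u"\U0001F300-\U0001F5FF"  # symbols & pictographs
    u"\U0001F680-\U0001F6FF"  # transport & map symbols
    u"\U0001F1E0-\U0001F1FF"  # flags (regional indicator symbols)
    u"]+"
)


def find_emoji(text):
    """Return each run of consecutive emoji found in `text`."""
    return EMOJI_PATTERN.findall(text)


def strip_emoji(text):
    """Return `text` with every matched emoji removed."""
    return EMOJI_PATTERN.sub(u"", text)
```

Because the character class is followed by `+`, adjacent emoji are returned as a single run; drop the `+` to get one match per code point instead. These four ranges cover only the blocks named in the gist, not every emoji defined by Unicode.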
naotokui / melody_extraction.ipynb
Created May 8, 2017 14:12
extract melody from audio
naotokui / conv_autoencoder_keras.ipynb
Created January 10, 2017 04:17
Convolutional Autoencoder in Keras
naotokui / GAN-and-trainable.py
Last active October 14, 2021 19:46
How model.trainable = False works in keras (GAN model)
# coding: utf8
## based on this article: http://qiita.com/mokemokechicken/items/937a82cfdc31e9a6ca12
import numpy as np
from keras.models import Sequential
from keras.engine.topology import Input, Container
from keras.engine.training import Model
from keras.layers.core import Dense