Yunchao He candlewill

  • Beijing, China
@candlewill
candlewill / issues
Created August 10, 2023 09:58
500+ Must Have Words for TOEFL and IELTS
This is the issue page for the Anki deck "500+ Must Have Words for TOEFL and IELTS".
Please feel free to leave your comments here.
@candlewill
candlewill / tf.data.md
Last active July 21, 2020 09:07
Some tips for using tf.data dataset

Some tips for using tf.data dataset

Pipelines:

  1. Create a source dataset from input data
  2. Apply dataset transformations
  3. Iterate over the dataset
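
The three steps above can be sketched roughly as follows. This is a minimal illustration, assuming TensorFlow 2.x with eager execution; the toy `features` and `labels` values and the doubling transformation are made up for the example and are not from the original notes.

```python
import tensorflow as tf

# Hypothetical toy data, used only for illustration.
features = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
labels = [0, 1, 0, 1]

# 1. Create a source dataset from in-memory input data.
dataset = tf.data.Dataset.from_tensor_slices((features, labels))

# 2. Apply dataset transformations: scale each feature vector, then batch.
dataset = dataset.map(lambda x, y: (x * 2.0, y)).batch(2)

# 3. Iterate over the dataset (in TF 2.x a Dataset is a Python iterable).
batches = [(x.numpy().tolist(), y.numpy().tolist()) for x, y in dataset]
for batch_features, batch_labels in batches:
    print(batch_features, batch_labels)
```

No shuffling is applied here so that the iteration order stays deterministic; a real training pipeline would usually add `shuffle` before `batch`.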

Methods for creating a dataset:

@candlewill
candlewill / speech_synthesis_interview_question.md
Last active March 16, 2020 10:59
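Speech Synthesis (End-to-End) Interview Questions

Speech Synthesis (End-to-End) Interview Questions

  1. Speech synthesis methods include parametric synthesis, concatenative synthesis, and end-to-end synthesis. Briefly introduce each method (one or two sentences is enough) and analyze the strengths and weaknesses of each.

  2. The front-end module of a speech synthesis system typically performs text analysis and extracts features related to pronunciation. Describe which submodules the front end needs to include, and name some commonly used algorithms for each submodule.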

@candlewill
candlewill / random_copy.py
Created July 24, 2019 07:03
Randomly select N files with a specified suffix and copy them to the target folder
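# encoding: utf-8
import glob
import os
import random
import sys

Usage = "Randomly select N files with the specified suffix and copy them to the target folder\n" \
        ".py <source dir> <target dir> <suffix> <num files>"
# os.system("export ...") only changes a throwaway subshell; set the variables
# on this process instead so that any child processes inherit them.
os.environ["PYTHONIOENCODING"] = "utf8"
os.environ["LC_ALL"] = "en_US.UTF-8"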
@candlewill
candlewill / jd_get_coupon.py
Created May 31, 2019 02:37
Scheduled coupon grabbing on JD.com
# -*- coding: utf-8 -*-
"""JD_get_coupon.ipynb
Automatically generated by Colaboratory.
Original file is located at
https://colab.research.google.com/drive/1m0QbL4UDq8mLtqSAY1A9nSTVSDxwVs17
"""
import requests
@candlewill
candlewill / redmi 4x cpuinfo.md
Created September 5, 2018 13:21
CPU info of Redmi 4X
Processor       : AArch64 Processor rev 4 (aarch64)
processor       : 0
BogoMIPS        : 38.40
Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32
CPU implementer : 0x41
CPU architecture: 8
CPU variant     : 0x0
CPU part        : 0xd03
CPU revision    : 4
import argparse
from collections import OrderedDict
import librosa
import numpy as np
import os
class Segment:
    def __init__(self, start, end):
        self.start = start
@candlewill
candlewill / kaldi.md
Created February 3, 2018 07:25
Kaldi Tutorial

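Kaldi nnet3 Tutorial: Data Types in nnet3

Introduction

nnet3 aims to support more general network architectures, so that complex networks (LSTMs, RNNs) can be built from simple configuration files. Like nnet2, nnet3 supports multi-machine, multi-GPU training.

Data Types in nnet3

Objectives and Background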

# -*- coding: utf-8 -*-
"""Example Google style docstrings.
This module demonstrates documentation as specified by the `Google Python
Style Guide`_. Docstrings may extend over multiple lines. Sections are created
with a section header and a colon followed by a block of indented text.
Example:
    Examples can be given using either the ``Example`` or ``Examples``
    sections. Sections support any reStructuredText formatting, including

A Survey of Deep Learning for Speech Synthesis

This article surveys recent deep learning methods for speech synthesis.

WaveNet

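While autoregressive generative models were being widely applied to images and text, WaveNet [4] attempted to bring these ideas to speech. Following the image generation approach of PixelRNN (van den Oord et al., 2016), WaveNet generates the next sample conditioned on the previous samples. The model that predicts the next sample is a CNN. To synthesize the voice of a specified speaker and the speech for a specified text, global and local conditioning are introduced to control the synthesized content. To enlarge the receptive field, WaveNet uses dilated convolutions, so that the span covered by the filters grows exponentially with depth.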
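
To make the receptive-field argument concrete, here is a small pure-Python sketch of a causal dilated convolution and of how the receptive field grows when the dilation doubles at every layer. The filter width of 2 and the dilation schedule 1, 2, 4, ..., 512 are assumptions chosen for illustration (the text above does not state them), and the function names are hypothetical.

```python
def receptive_field(kernel_size, dilations):
    """Number of input samples that influence one output sample
    after stacking one causal convolution layer per dilation."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def causal_dilated_conv1d(signal, weights, dilation):
    """output[t] = sum_k weights[k] * signal[t - k * dilation].

    Positions before the start of the signal are treated as zeros,
    so the output at time t never depends on future samples.
    """
    output = []
    for t in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * signal[idx]
        output.append(acc)
    return output


# Assumed schedule: dilation doubles at each of 10 layers (1, 2, 4, ..., 512).
dilations = [2 ** i for i in range(10)]
rf = receptive_field(kernel_size=2, dilations=dilations)
print(rf)

# One layer with dilation 2 on a toy signal.
print(causal_dilated_conv1d([1.0, 2.0, 3.0, 4.0], [1.0, 1.0], dilation=2))
```

With ten layers the receptive field already covers on the order of a thousand samples, whereas ten undilated layers of the same width would cover only eleven.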

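WaveNet has several problems: 1) it predicts one sample at a time, so generation is too slow; 2) when it is used for TTS, the choice of initial samples becomes important; 3) it depends on a text front end, so errors in front-end analysis directly degrade synthesis quality.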