Skyward skywardpixel

## default.custom.yaml
patch:
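  # Menu
  menu:
    page_size: 8  # number of candidates per page
    # alternative_select_labels: [ ①, ②, ③, ④, ⑤, ⑥, ⑦, ⑧, ⑨, ⑩ ]  # custom candidate labels
    # alternative_select_keys: ASDFGHJKL  # if the input codes occupy the number keys, set separate selection keys
    # For settings such as ascii_mode, inline, no_inline, and vim_mode, see /Library/Input Methods/Squirrel.app/Contents/SharedSupport/squirrel.yaml
  # Chinese/Western switching
  #
  # [good_old_caps_lock] CapsLock switches to uppercase or toggles between Chinese and English.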

## wikiextract.sh
#!/bin/bash

# Adapted from XLM's get-data-wiki.sh script.

# Install wikiextractor from pip and opencc before running this.

# Quote the paths so a working directory containing spaces does not break mkdir.
DATA_ROOT="$PWD"
WIKI_PATH="$DATA_ROOT/wiki"

mkdir -p "$WIKI_PATH/bz2"

## keybase.md

Created December 4, 2019 01:12

Keybase proof

I hereby claim:

I am kaichengyan on github.
I am kyleyan (https://keybase.io/kyleyan) on keybase.
I have a public key ASDX0euT_5ykkf66QyUcABIKFtByxq1h9dINTL7Mw-uVMgo

To claim this, I am signing this object:

  
## text_data_preprocessing.py
# Your instructions are to implement the following. First, imagine splitting the dataset into N chunks, where N is the batch_size and each chunk is a contiguous part of the data. For each batch, you should return one sequence from each of the chunks. The batches should also be sequential; an example is described below.
# The data is 20 characters long [1, 2, 3, ..., 20]. The batch size is 2 and the sequence length is 4.
# The 1st batch should consist of (data = [[1, 2, 3, 4]; [11, 12, 13, 14]], labels = [[2, 3, 4, 5]; [12, 13, 14, 15]])
# The 2nd batch should consist of (data = [[5, 6, 7, 8]; [15, 16, 17, 18]], labels = [[6, 7, 8, 9]; [16, 17, 18, 19]])
# The 3rd batch should consist of (data = [[9]; [19]], labels = [[10]; [20]])
# There is no 4th batch.

from math import ceil
batch_size = 2
sequence_length = 4
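
The batching scheme described in the comments above could be sketched as follows. This is a minimal illustration, not the original file's implementation: the function name `make_batches` is hypothetical, plain Python lists stand in for whatever tensor type the assignment uses, and the sketch assumes that any trailing elements left over when the data length is not divisible by `batch_size` are simply dropped.

```python
from math import ceil


def make_batches(data, batch_size, sequence_length):
    """Hypothetical helper: split data into contiguous chunks and yield
    sequential (inputs, labels) batches, one sequence per chunk."""
    # Assumption: leftover elements that do not fill a whole chunk are dropped.
    chunk_len = len(data) // batch_size
    chunks = [data[i * chunk_len:(i + 1) * chunk_len] for i in range(batch_size)]

    # The last element of each chunk can only serve as a label, so each chunk
    # offers chunk_len - 1 input positions.
    num_batches = ceil((chunk_len - 1) / sequence_length)

    batches = []
    for b in range(num_batches):
        start = b * sequence_length
        # The final batch may be shorter than sequence_length.
        end = min(start + sequence_length, chunk_len - 1)
        inputs = [chunk[start:end] for chunk in chunks]
        # Labels are the inputs shifted one position to the right.
        labels = [chunk[start + 1:end + 1] for chunk in chunks]
        batches.append((inputs, labels))
    return batches


if __name__ == "__main__":
    for inputs, labels in make_batches(list(range(1, 21)), 2, 4):
        print("data =", inputs, "labels =", labels)
```

With the 20-element example, each chunk holds 10 elements and therefore 9 usable input positions, which is why `ceil(9 / 4)` gives three batches and the last one contains a single element per chunk.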