- Note:
- Install CUDA 9.0, not 9.1
- The package is already downloaded, in UBUNTU/home/junfeng
- Remove Old Version
"""
The quantile monitor monitors the input and output, as well as simple transforms of them.
It logs the quantile values needed.
Paper and paper-reading links:
- Small-scale proxies for large-scale Transformer training instabilities
- Mitchell Wortsman et al.
- https://arxiv.org/abs/2309.14322
- notion link: https://www.notion.so/nyonic/Small-scale-proxies-for-large-scale-Transformer-training-instabilities-95f7d37711f34d8ebae4f505bc160830 # noqa
"""
"""This script converts a DeepSpeed checkpoint from one format to another.
It requires specifying an input_folder and a target_folder before starting the
conversion. To determine the target folder, first run the script without checkpointing
using the target cluster.
The conversion process involves the following steps:
1. Building a linked matrix on the input DeepSpeed checkpoint to establish mappings
   between tensor slices.
2. Merging the slice files based on the linked matrix.
from collections import Counter


def tf(sentence_list, min_cnt=1, max_cnt=None):
    doc_num = 0
    word_list = []
    # Flatten all token sequences into one list and count the documents.
    for sequence in sentence_list:
        word_list += sequence
        doc_num += 1
    word_count = Counter()
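The preview above stops before the counting step. A minimal, self-contained sketch of term-frequency counting with count bounds is shown below. It is not the author's implementation: the function name `count_terms` is hypothetical, and treating `min_cnt` and `max_cnt` as inclusive lower and upper bounds on a word's total count is an assumption.

```python
from collections import Counter


# Hypothetical sketch: the name and the inclusive-bound semantics are assumptions.
def count_terms(sentence_list, min_cnt=1, max_cnt=None):
    """Count tokens across all sequences and keep those within the count bounds."""
    word_count = Counter()
    for sequence in sentence_list:
        # Each sequence is expected to be a list of tokens.
        word_count.update(sequence)
    return {
        word: cnt
        for word, cnt in word_count.items()
        if cnt >= min_cnt and (max_cnt is None or cnt <= max_cnt)
    }


counts = count_terms([["a", "b", "a"], ["b", "c", "a"]], min_cnt=2)
print(counts)
```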
import random
import numpy as np
import re


def make_batches(size, batch_size):
    """
    :param size: the size of dataset
    :param batch_size: the size of batch