Last active
September 20, 2017 21:06
-
-
Save maddyblue/7187003ad87dc5a4417c929cca203dcd to your computer and use it in GitHub Desktop.
load csv progress notes
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
For local: | |
First phase is all of total progress. Initial, simple, implementation is to count the number of files (N), each file, after being converted to KVs, counts for 1/N of this phase's progress. Follow up work to get the file size from the export storage. Then progress for this phase is number of bytes processed / total number of bytes (among all files). | |
For distributed: | |
Sampling phase is 1/3 of total progress. Same deal as local with counting files and then file sizes for progress, also keeping track of number of KVs produced. | |
Second phase is 1/3 of total progress. As KVs are consumed and written to RocksDB, progress is total number of written KVs / total number of KVs (among all distsql processors). | |
Third phase is 1/3 of total progress. Each SST written to storage is 1/N of this phase. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment