Last active
August 19, 2019 13:03
Data structure for the MS Transferor document
# OPTION A:
{"wf_A": {"timestamp": 0,
          "primary": ["list of transfer ids"],
          "secondary": ["list of transfer ids"]},
 "wf_B": {"timestamp": 0,
          "primary": [],
          "secondary": []}
}
# OPTION B:
{"wf_A": {"timestamp": 0,
          "primary": {"dset_1": ["list of transfer ids"]},
          "secondary": {"PU_dset_1": ["list of transfer ids"]}},
 "wf_B": {"timestamp": 0,
          "primary": {"dset_1": ["list of transfer ids"],
                      "parent_dset_1": ["list of transfer ids"]},
          "secondary": {"PU_dset_1": ["list of transfer ids"],
                        "PU_dset_2": ["list of transfer ids"]}},
 "wf_C": {"timestamp": 0,
          "primary": {},
          "secondary": {}}
}
# OPTION C (the chosen one!) - it assumes we store all the transfer information within the same Couch document:
{"wf_A": [{"timestamp": 0, "dataset": "/a/b/c", "dataType": "primary", "transferIDs": [1,2,3]},
          {"timestamp": 0, "dataset": "/a/b/c", "dataType": "secondary", "transferIDs": [4]}],
 "wf_B": [{"timestamp": 0, "dataset": "/a/b/c", "dataType": "primary", "transferIDs": [1,2,3]},
          {"timestamp": 0, "dataset": "/a/b/c", "dataType": "parent", "transferIDs": [4,5,6]}],
 "wf_C": []
}
# OPTION D - it assumes a new document is created for every request:
{"workflowName": "blah",
 "lastUpdate": 0,  # just as timestamp above
 "transfers": [{"dataset": "/a/b/c", "dataType": "primary", "transferIDs": [1,2,3], "campaignName": "blah2017", "completion": [0.0]},
               {"dataset": "/a/b/c", "dataType": "secondary", "transferIDs": [4], "campaignName": "blah2018", "completion": [0.0]},
               {"dataset": "/a/b/c", "dataType": "parent", "transferIDs": [4,5,6], "campaignName": "blah2017", "completion": [0.0]}]
}
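As a rough illustration of Option D, the sketch below builds one document per workflow in Python. The helper name `make_transfer_doc` and the shape of its input are assumptions made for this example, not part of any existing code; the field names come from the structure above, and the integer timestamp is just one possible choice.

```python
import time


def make_transfer_doc(workflow_name, transfers):
    """Build one Option D style document for a single workflow.

    Hypothetical helper: `transfers` is assumed to be a list of dicts
    carrying dataset, dataType, transferIDs and campaignName.
    """
    return {
        "workflowName": workflow_name,
        # Assumption: timestamp stored as whole seconds since the epoch.
        "lastUpdate": int(time.time()),
        "transfers": [
            {
                "dataset": t["dataset"],
                "dataType": t["dataType"],
                "transferIDs": list(t["transferIDs"]),
                "campaignName": t["campaignName"],
                # Starts at 0.0, as in the Option D example.
                "completion": [0.0],
            }
            for t in transfers
        ],
    }


doc = make_transfer_doc(
    "blah",
    [
        {"dataset": "/a/b/c", "dataType": "primary",
         "transferIDs": [1, 2, 3], "campaignName": "blah2017"},
        {"dataset": "/a/b/c", "dataType": "secondary",
         "transferIDs": [4], "campaignName": "blah2018"},
    ],
)
```

Because each document describes exactly one workflow, no top-level dictionary keyed by workflow name is needed.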
@vkuznet Valentin, I created Option D for the case where we want to store a new document for each workflow. I believe that's going to be our best option TBH.
Alan, your option D is almost identical to my original proposal (the difference is that I proposed one record per dataset, while you group them by workflow). It is a good compromise: it represents a single entity (in this case a workflow) and we do not need to compose a single gigantic dictionary.
Ok, let's hope nothing else changes. Let's proceed with option D then, one record/document per workflow.
@vkuznet Valentin, I added a completion parameter to option D, so that we can persist the transfer completion every time it gets calculated.
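One way the completion field could be updated is sketched below. This is only an assumption about the intended behaviour: it treats `completion` as a history list and appends each newly calculated value. The function name `record_completion` is hypothetical. The record is matched on both dataset and dataType, since the same dataset can appear under more than one data type in the Option D example.

```python
import time


def record_completion(doc, dataset, data_type, completion):
    """Append a newly calculated completion value to the matching transfer.

    Hypothetical helper; returns True if a matching record was found.
    """
    for transfer in doc["transfers"]:
        if transfer["dataset"] == dataset and transfer["dataType"] == data_type:
            transfer["completion"].append(completion)
            # Assumption: lastUpdate is refreshed on every change.
            doc["lastUpdate"] = int(time.time())
            return True
    return False


example_doc = {
    "workflowName": "blah",
    "lastUpdate": 0,
    "transfers": [
        {"dataset": "/a/b/c", "dataType": "primary", "transferIDs": [1, 2, 3],
         "campaignName": "blah2017", "completion": [0.0]},
        {"dataset": "/a/b/c", "dataType": "secondary", "transferIDs": [4],
         "campaignName": "blah2018", "completion": [0.0]},
    ],
}

found = record_completion(example_doc, "/a/b/c", "secondary", 42.5)
missing = record_completion(example_doc, "/x/y/z", "primary", 10.0)
```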
In general the timestamp should be a float, since we'll use time.time() and it returns a float. But of course we can cast it to int. Everything else is correct.
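The two timestamp choices can be compared with a small sketch. The variable names are illustrative only; the only library calls used are `time.time()` and the standard `json` module.

```python
import json
import time

now = time.time()         # float: seconds since the epoch, with a fractional part
now_as_int = int(now)     # int: truncates the fractional part

float_doc = {"lastUpdate": now}
int_doc = {"lastUpdate": now_as_int}

# Both forms serialize to JSON and read back with their original type.
float_back = json.loads(json.dumps(float_doc))["lastUpdate"]
int_back = json.loads(json.dumps(int_doc))["lastUpdate"]
```

Casting to int drops sub-second precision, which is probably acceptable if the field is only used to track when a record was last touched.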