@diamondap
Created November 16, 2016 20:20
DPN sync process for APTrust
I sync replication requests from remote nodes to my own node. Any requests in which I'm the to_node go into my processing queue. For each replication request in the queue, I do this:
1. Copy the bag from the remote node, via rsync/ssh.
2. Calculate the sha256 digest of the bag's tag manifest.
3. Send that fixity value back to the ingest node. If I get back a record in which StoreRequested == false, I delete the bag from my staging area and consider the job done.
4. If StoreRequested == true, I validate the bag by making sure all required files and tags are present, and all checksums in the manifest-sha256.txt match. If the bag is invalid, I cancel the transfer on the remote node with a cancel reason indicating that the bag did not pass validation. I delete the bag from staging, and am done.
5. If the bag is valid, I copy it to long-term storage and delete it from my staging area.
6. I update the transfer record on the ingest node to say Stored = true.
My own node does not know the bag is stored until the next time I sync from the remote node.
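The six steps above can be sketched roughly as follows. This is a minimal illustration in Python, not the actual APTrust code: the `Replication` class and the `local` and `remote` objects with their method names (`copy_bag`, `tag_manifest_digest`, `send_fixity`, `validate_bag`, `cancel`, `store_bag`, `mark_stored`, `delete_from_staging`) are all hypothetical stand-ins for whatever the real node uses.

```python
from dataclasses import dataclass


@dataclass
class Replication:
    """Hypothetical, trimmed-down replication transfer record."""
    replication_id: str
    store_requested: bool = False
    stored: bool = False


def process_replication(repl, remote, local):
    """Run one queued replication request; return a short outcome label.

    `remote` talks to the ingest (from) node, `local` handles staging and
    long-term storage. Both are assumed interfaces, injected for clarity.
    """
    bag_path = local.copy_bag(repl)                  # 1. rsync/ssh copy
    digest = local.tag_manifest_digest(bag_path)     # 2. sha256 of tag manifest
    updated = remote.send_fixity(repl, digest)       # 3. report fixity

    if not updated.store_requested:
        # The ingest node did not ask us to store the bag.
        local.delete_from_staging(bag_path)
        return "not_requested"

    if not local.validate_bag(bag_path):             # 4. full (expensive) validation
        remote.cancel(updated, "bag did not pass validation")
        local.delete_from_staging(bag_path)
        return "cancelled"

    local.store_bag(bag_path)                        # 5. copy to long-term storage
    local.delete_from_staging(bag_path)
    remote.mark_stored(updated)                      # 6. Stored = true on ingest node
    return "stored"
```

Every path deletes the bag from staging exactly once, and the expensive validation only runs after the ingest node has returned a record with the store flag set.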
@diamondap
Author

By the way, I do validation AFTER calculating the tag manifest fixity because validating a 250GB bag is really expensive, and I don't want to even start that work if I got a bad bag.
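To illustrate why the fixity step is cheap compared with validation: the tag manifest is a small text file, so hashing it takes about the same time regardless of bag size. A minimal Python sketch, assuming the tag manifest has already been extracted to a local path (the function name and the path handling are illustrative, not the actual implementation):

```python
import hashlib


def tag_manifest_digest(path, chunk_size=1 << 20):
    """Return the hex sha256 digest of the file at `path`.

    Reads in chunks so memory use stays flat even if the file is large.
    """
    sha = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            sha.update(chunk)
    return sha.hexdigest()
```

Full validation, by contrast, has to read and hash every payload file listed in manifest-sha256.txt.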

@smutniak

It is expected that the replication transfer request initially has its store_requested set to false. store_requested is set to true by the from_node upon a successful fixity response from the to_node. This represents a state similar to status=confirmed in v1, correct? When you send fixity back, the only field the to_node changes in the PUT to the from_node is fixity_value, correct?

@diamondap
Author

Yes - store_requested starts out false, and is set to true only when the from_node gets a valid fixity from the to_node. When the to_node sends the fixity value, it updates fixity_value and updated_at.
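As a rough sketch of that exchange from the from_node's side, in Python: the field names store_requested, fixity_value and updated_at come from the discussion above, while the dict-based record, the `apply_fixity` function, and the idea that "valid" means an exact match against the from_node's own digest are assumptions made for illustration. What the from_node does on a mismatch is not covered here.

```python
from datetime import datetime, timezone


def apply_fixity(record, reported_fixity, expected_fixity):
    """Return a copy of `record` updated with the to_node's fixity report.

    store_requested becomes True only when the reported value matches the
    digest the from_node expects (assumed definition of a "valid" fixity).
    """
    updated = dict(record)  # leave the caller's record untouched
    updated["fixity_value"] = reported_fixity
    updated["updated_at"] = datetime.now(timezone.utc).isoformat()
    updated["store_requested"] = reported_fixity == expected_fixity
    return updated
```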
