jmchilton/gist:9788629

## gistfile1.txt
Open Model Questions:

Key:
  HDCA - HistoryDatasetCollectionAssociation (connects history and dataset collection)
  DC - DatasetCollection
   -or-
  HDC - HistoryDatasetCollection - combine concepts of HDCA and DC into one.

  DCE - DatasetCollectionElement (connects a dataset instance or some kind of collection to
          some kind of collection; was called DatasetAssociationDatasetCollectionAssociation
          in the original card. I had preferred DatasetInstanceDatasetCollectionAssociation in
          the first pass, but this new name makes more sense in the context of nested collections).

  H(LD)DA - An HDA or LDDA - a dataset instance belonging to a collection.


Potentially the most... brute force thing to do here is just this:

HDCA - DC - DCE - HDCA - DC - DCE - H(LD)DA
                            - DCE - H(LD)DA
          - DCE - HDCA - DC - DCE - H(LD)DA
                            - DCE - H(LD)DA

I am not sure it has any advantage over this though...

HDC - DCE - HDC - DCE - H(LD)DA
                - DCE - H(LD)DA
    - DCE - HDC - DCE - H(LD)DA
                - DCE - H(LD)DA

The problem with either of these is that cloning an HDC(A) from one history to another requires copying the entire collection structure, say 10,000 records in the extreme. I have no problem doing this for 200, but 10,000 makes me nervous. One could imagine not copying the intermediate HDCAs and sharing them instead, but then the original owner of the collection could mess with the cloner's history - assuming things like name, deleted, visible can be modified on an HDCA.
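
To put a rough number on the copying concern, here is a minimal Python sketch of the combined HDC layout. The class names follow the key above, but the fields and the helpers `deep_clone` and `list_of_pairs` are hypothetical illustrations, not Galaxy code. It only counts how many HDC and DCE records a full clone would create.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Dataset:
    """Stand-in for an H(LD)DA; assumed to be shared, not copied, in this sketch."""
    name: str


@dataclass
class HDC:
    """Hypothetical combined HDCA + DC record with mutable attributes."""
    name: str
    elements: List["DCE"] = field(default_factory=list)


@dataclass
class DCE:
    """Element linking a parent collection to a dataset or a nested collection."""
    child: Union[Dataset, HDC]


def deep_clone(hdc: HDC, counter: List[int]) -> HDC:
    """Copy every HDC and DCE record; counter[0] tallies the records created."""
    counter[0] += 1  # one new HDC record
    copy = HDC(name=hdc.name)
    for element in hdc.elements:
        counter[0] += 1  # one new DCE record
        child = element.child
        if isinstance(child, HDC):
            child = deep_clone(child, counter)
        copy.elements.append(DCE(child=child))
    return copy


def list_of_pairs(n_pairs: int) -> HDC:
    """Build a list collection holding n_pairs paired collections."""
    pairs = []
    for i in range(n_pairs):
        pair = HDC(name=f"pair{i}", elements=[
            DCE(Dataset(f"forward{i}")), DCE(Dataset(f"reverse{i}"))])
        pairs.append(DCE(pair))
    return HDC(name="list:paired", elements=pairs)


counter = [0]
clone = deep_clone(list_of_pairs(2500), counter)
print(counter[0])  # 10001 = 1 list HDC + 2500 DCEs + 2500 pair HDCs + 5000 DCEs
```

Under these assumptions a list of 2,500 pairs is already enough to reach the 10,000 record range for a single clone.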

One can imagine this alternative... where DC and DCE are immutable. Here one can clone an HDCA by just cloning the top-level thing. The problem is that it is less intuitive how to handle the middle stuff when, say, mapping over a collection of pairs, if the nested pairs don't have HIDs or names, etc.

HDCA - DC - DCE - DC - DCE - H(LD)DA
                     - DCE - H(LD)DA
          - DCE - DC - DCE - H(LD)DA
                     - DCE - H(LD)DA

Have been going with this last one, will continue to try I guess. All options seem rotten to me unfortunately.
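
A minimal Python sketch of this last layout, assuming DC and DCE are immutable and only the HDCA carries mutable, per-history attributes. The field names (`collection_type`, `identifier`, `history_id`) and the helper `clone_to_history` are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Dataset:
    """Stand-in for an H(LD)DA."""
    name: str


@dataclass(frozen=True)
class DC:
    """Immutable collection, so it can be shared between histories."""
    collection_type: str
    elements: Tuple["DCE", ...]


@dataclass(frozen=True)
class DCE:
    """Immutable element pointing at a dataset or a nested DC."""
    identifier: str
    child: Union[Dataset, DC]


@dataclass
class HDCA:
    """Mutable association between one history and one DC."""
    history_id: int
    name: str
    collection: DC
    deleted: bool = False
    visible: bool = True


def clone_to_history(hdca: HDCA, history_id: int) -> HDCA:
    """Create one new HDCA record; the DC/DCE tree is shared, not copied."""
    return HDCA(history_id=history_id, name=hdca.name, collection=hdca.collection)


pair = DC("paired", (DCE("forward", Dataset("r1.fq")),
                     DCE("reverse", Dataset("r2.fq"))))
original = HDCA(history_id=1, name="reads",
                collection=DC("list:paired", (DCE("sample1", pair),)))

cloned = clone_to_history(original, history_id=2)
cloned.name = "renamed"
cloned.deleted = True  # does not affect the original owner's HDCA
```

Cloning here is a single new record regardless of collection size. The cost shows up in the middle of the tree: the nested `pair` is a bare DC with no name, deleted flag, or HID of its own, which is exactly the awkward part when mapping over a collection of pairs.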


State of HDCA (or HDC):

 - Do we even need to do the green/yellow/gray box thing for collections, or should they just always be beige (border of tool panel), say, and
     we just calculate the state when we need to on the backend - while extracting workflows, etc.?

If we definitely want a concept of state -

 - Read dynamically each time.
     - Reading state is very expensive?
 - Catch every dataset instance state update and propagate up.
     - Synchronously?
         - Significant slowdown to every dataset state update?
         - More likely to be "correct" than the asynchronous approach below.
     - Asynchronously?
         - Speedy reads and writes?
         - Possibility of incorrect state?
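
For comparison, the synchronous "propagate up" option might look roughly like the Python sketch below. Everything in it is an assumption made for illustration: the class names, the state names, and the `PRECEDENCE` ordering are hypothetical, not Galaxy's actual state rules. Each collection keeps a count of leaf states, and every dataset state change walks up the parent chain.

```python
from collections import Counter
from typing import Optional

# Hypothetical ordering: the first state present among the leaves wins.
PRECEDENCE = ["error", "running", "queued", "new", "ok"]


class CollectionNode:
    """A collection that caches how many leaf datasets are in each state."""

    def __init__(self, parent: Optional["CollectionNode"] = None):
        self.parent = parent
        self.state_counts = Counter()

    @property
    def state(self) -> str:
        # Cheap read: inspects a handful of counters, never the datasets.
        for state in PRECEDENCE:
            if self.state_counts[state] > 0:
                return state
        return "ok"  # assumption: an empty collection counts as ok


class DatasetLeaf:
    """A dataset instance that notifies every ancestor on state change."""

    def __init__(self, parent: CollectionNode, state: str = "new"):
        self.parent = parent
        self.state = None
        self.set_state(state)

    def set_state(self, new_state: str) -> None:
        # Extra write cost: one counter update per level of nesting.
        node = self.parent
        while node is not None:
            if self.state is not None:
                node.state_counts[self.state] -= 1
            node.state_counts[new_state] += 1
            node = node.parent
        self.state = new_state


root = CollectionNode()
pair = CollectionNode(parent=root)
forward = DatasetLeaf(pair)
reverse = DatasetLeaf(pair)
forward.set_state("running")
print(root.state)  # running
```

Reads become cheap, but every dataset state update now pays for a walk up the tree, which is the slowdown the list above worries about.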

Tempted here to read it dynamically, because propagation can be added later on if dynamic reads prove too taxing. (http://c2.com/cgi/wiki?PrematureOptimization)
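
A minimal sketch of the dynamic option in Python, assuming a nested collection can be modeled as nested lists of dataset state strings. The state names and the `PRECEDENCE` ordering are hypothetical placeholders, not Galaxy's actual rules.

```python
from typing import Iterator

# Hypothetical ordering: the first state present among the leaves wins.
PRECEDENCE = ["error", "running", "queued", "new", "ok"]


def leaf_states(collection: list) -> Iterator[str]:
    """Walk a nested collection and yield the state of every dataset in it."""
    for item in collection:
        if isinstance(item, list):
            yield from leaf_states(item)
        else:
            yield item


def collection_state(collection: list) -> str:
    """Compute the collection state on demand; nothing is stored or cached."""
    present = set(leaf_states(collection))
    for state in PRECEDENCE:
        if state in present:
            return state
    return "ok"  # assumption: an empty collection counts as ok


list_of_pairs = [["ok", "ok"], ["ok", "running"]]
print(collection_state(list_of_pairs))  # running
```

Each read touches every leaf, so the cost grows with collection size, but nothing has to change on the dataset update path. That is what keeps the option open to add propagation later.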

 - Other half measures can reduce load as well - updating the state less frequently for these in the UI, for instance.