@bwalsh
Last active August 7, 2019 20:44
AnVIL CCDG Participants & Samples

Participants

The participants were derived from 51 Terra projects assigned to AnVIL.

image

Participant attribute sets were inconsistent across projects: only one field was shared across most of them.

image image
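One way to quantify this inconsistency is to count, for each attribute, the fraction of projects that define it. The following is a minimal sketch only: the project names and field names are hypothetical placeholders, since the actual attribute sets are not listed in these notes.

```python
from collections import Counter

# Hypothetical per-project participant attribute sets (assumption: the real
# project names and field names are not given in this document).
project_attributes = {
    "project_a": {"participant_id", "sex", "age"},
    "project_b": {"participant_id", "gender", "family_id"},
    "project_c": {"participant_id", "sex", "family_id"},
}

def attribute_coverage(attrs_by_project):
    """Return {attribute: fraction of projects that define it}."""
    counts = Counter(a for attrs in attrs_by_project.values() for a in attrs)
    total = len(attrs_by_project)
    return {attr: n / total for attr, n in counts.items()}

def shared_attributes(attrs_by_project, min_fraction=0.9):
    """Return the sorted attributes present in at least min_fraction of projects."""
    coverage = attribute_coverage(attrs_by_project)
    return sorted(a for a, f in coverage.items() if f >= min_fraction)
```

With the placeholder data, `shared_attributes(project_attributes)` returns only `participant_id`; lowering `min_fraction` shows which fields are common enough to be worth harmonizing.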

Mapping

Our initial mapping to a GDC-like graph is as follows:

  • The mapping effort is partly blocked while we decide how to model the Family Relationship edges (see Discussion points below).

image

Discussion points

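  • The Family Relationship edges are intricate. The raw values mix case, spelling, hyphenation, and numbering variants of the same relationship. Value counts across projects: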
Brother or Sister                        767
proband                                  494
mother                                   406
father                                   396
Son or Daughter                          363
Father                                   360
Mother                                   352
Proband                                  346
Nephew or Niece                          312
Sibling                                  218
Affected                                 204
#N/A                                     151
Other 2nd Degree Relative                 94
Brother                                   77
Sister                                    70
Parent                                    60
sibling                                   39
Brother or Sister in Law                  37
Husband or Wife                           33
Other 3rd Degree Relative                 31
Not Related to the Proband                29
Other 4th Degree Relative                 29
Maternal Aunt or Uncle                    26
Cousin                                    26
Other 2nd degree relative                 18
Other 3rd degree relative                 15
Brother or Sister-in-law                  13
Other Not Related to Proband              10
Other second degree relative               9
Paternal Grandparent                       8
Not related to the Proband                 8
Unaffected Sibling                         8
affected sibling                           8
Paternal Aunt or Uncle                     7
Other not related to proband relative      5
Other forth degree relative                5
Other 4th degree relative                  5
affected twin                              4
brother                                    4
Brother or Sister-in-Law                   4
Maternal Grandparent                       3
Other third degree relative                3
Twin brother                               2
Aunt or Uncle                              2
Identical Twin of Proband                  2
mother2/sister1                            1
Mother1                                    1
father2                                    1
First Cousin                               1
Affected2                                  1
DZ twin                                    1
Other fifth degree relative                1
Father1                                    1
Affected1                                  1
Twin sister                                1
brother2                                   1
affected half-sib                          1
sister2                                    1
Son or Daughter-in-law                     1
ProbandBrother or Sister                   1
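Many of the values above differ only in case, hyphenation, ordinal spelling, or a trailing number. A first normalization pass could be sketched as below. This is an illustration rather than an agreed mapping: the `normalize_label` and `merge_counts` helpers and the `ORDINALS` table are hypothetical, and the sample counts are copied from the list above.

```python
import re
from collections import Counter

# Spelled-out ordinals seen in the raw values, including the "forth" misspelling.
ORDINALS = {"second": "2nd", "third": "3rd", "fourth": "4th",
            "forth": "4th", "fifth": "5th"}

def normalize_label(raw):
    """Lower-case, unify hyphens, drop numbering suffixes, unify ordinals."""
    label = raw.strip().lower().replace("-", " ")
    # "mother1" -> "mother"; leaves "2nd" alone because the digit follows a space.
    label = re.sub(r"(?<=[a-z])\d+\b", "", label)
    return " ".join(ORDINALS.get(word, word) for word in label.split())

def merge_counts(raw_counts):
    """Sum counts of raw labels that normalize to the same string."""
    merged = Counter()
    for raw, n in raw_counts.items():
        merged[normalize_label(raw)] += n
    return merged

# A subset of the value counts listed above.
raw_counts = {
    "proband": 494, "Proband": 346,
    "mother": 406, "Mother": 352, "Mother1": 1,
    "Other 4th Degree Relative": 29, "Other forth degree relative": 5,
    "Other 4th degree relative": 5,
    "Brother or Sister in Law": 37, "Brother or Sister-in-law": 13,
    "Brother or Sister-in-Law": 4,
}
merged = merge_counts(raw_counts)
```

This pass only collapses surface variants. Semantic decisions, such as whether "Sibling" and "Brother or Sister" are the same edge, or how to treat "Affected" and "#N/A", which are not relationships, would still need an explicit curated mapping.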

image

image

image

Samples

The properties associated with samples diverged widely and consisted mainly of CRAM summary statistics.

image image

Mapping

  • Added Sample node
  • Added CramFile and CraiFile nodes
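The added nodes could be sketched as plain data classes. The notes name only the node types, so the edge layout (Sample to CramFile to CraiFile) and every field name below are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class CraiFile:
    path: str  # hypothetical field: location of the .crai index

@dataclass
class CramFile:
    path: str  # hypothetical field: location of the .cram file
    index: Optional[CraiFile] = None  # assumed edge CramFile -> CraiFile
    attributes: Dict[str, object] = field(default_factory=dict)

@dataclass
class Sample:
    sample_id: str
    participant_id: str  # assumed edge Sample -> Participant
    cram: Optional[CramFile] = None  # assumed edge Sample -> CramFile
    attributes: Dict[str, object] = field(default_factory=dict)

# Placeholder identifiers and paths, not real data.
sample = Sample(
    sample_id="sample-1",
    participant_id="participant-1",
    cram=CramFile(path="sample-1.cram", index=CraiFile(path="sample-1.cram.crai")),
)
```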

Discussion points

  • Should we move the bulk of these attributes to the CramFile node?
  • Should we reprocess the CRAM files to create an agreed-upon set of attributes?
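If the first option is chosen, the move could be as simple as partitioning each sample record by key. The sketch below assumes a known set of CRAM statistic names; the keys in `CRAM_STAT_KEYS` and the example record are hypothetical, as the actual attribute names are not listed here.

```python
# Hypothetical names of CRAM summary statistics (assumption, not from the data).
CRAM_STAT_KEYS = frozenset({"mean_coverage", "pct_reads_aligned", "contamination"})

def split_sample_attributes(record, cram_keys=CRAM_STAT_KEYS):
    """Split one flat sample record into (sample_attrs, cram_attrs)."""
    sample_attrs = {k: v for k, v in record.items() if k not in cram_keys}
    cram_attrs = {k: v for k, v in record.items() if k in cram_keys}
    return sample_attrs, cram_attrs

# Placeholder record, not real data.
record = {"sample_id": "sample-1", "mean_coverage": 31.2, "contamination": 0.001}
sample_attrs, cram_attrs = split_sample_attributes(record)
```

Keeping the key set explicit makes the split reviewable per project, which matters when attribute names diverge as widely as they do here.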