@DanielEWeeks
Last active January 29, 2020 14:30
Using PLINK to insert missing parents

Adding 'dummy' parents using PLINK

Most programs that handle pedigree data require that each pedigree be graphically connected. For example, because Merlin requires this, you cannot use Mega2 to convert to Merlin format unless you have first adjusted your pedigree structures to be graphically connected. As another example, if a pedigree file contains two siblings but only their mother, then it is not clear from the pedigree file itself whether the siblings are full siblings (sharing a common father) or half siblings (each with a different father). The missing father(s) would need to be inserted into the pedigree file before Mega2 could process the pedigree.

As the Mega2 documentation states:

Mega2 is designed so that it either needs both Father and Mother to be defined or for both to be undefined, you cannot have one defined and the other set to unknown. As the SimWalk2 documentation explains: “To reconstruct the relationships between individuals properly, often people must be included who are dead or otherwise unavailable for study. One rule is important to keep in mind. Either both parents or neither parent of a person must be listed in the pedigree. Those people without parents in the pedigree can be thought of as founders of the pedigree.”
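The "both parents or neither" rule quoted above can be checked mechanically before running Mega2. The following is only a minimal sketch, assuming a whitespace-delimited PED layout (family ID, individual ID, father ID, mother ID, sex, phenotype, genotypes) with "0" meaning an unknown parent. The function name `find_half_defined_parents` is hypothetical and is not part of Mega2 or PLINK.

```python
def find_half_defined_parents(ped_lines):
    """Return (family_id, individual_id) pairs with exactly one parent set to '0'.

    Assumes standard PED columns: family, individual, father, mother, sex, phenotype.
    """
    offenders = []
    for line in ped_lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        fam, ind, father, mother = fields[:4]
        # Exactly one of the two parents is unknown: the rule is violated.
        if (father == "0") != (mother == "0"):
            offenders.append((fam, ind))
    return offenders


# Hypothetical example lines, made up for illustration.
example = [
    "1 3 0 2 1 1 A G T T A A",   # mother listed, father unknown: violates the rule
    "1 4 1 2 2 1 A A T A G G",   # both parents listed: fine
    "1 2 0 0 2 -9 0 0 0 0 0 0",  # founder, neither parent listed: fine
]
print(find_half_defined_parents(example))
```

Any individual this reports would need a dummy parent inserted (or the other parent removed) before the pedigree satisfies the rule.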

Toy example testing how PLINK's 'merge' command works:

Here in the data.* data set we have two people who have been genotyped at three markers.  We want to add in their two dummy parents, who are not genotyped at any marker.  We do this by creating the data2.* files below, where the data2.ped file contains the two dummy parents we wish to add to the pedigree structure.  Note that data2.map lists only the first marker, snp1, and both dummy parents have a missing genotype (0 0) at it.

==> data.map <==
1 snp1 1 1 
1 snp2 2 2 
1 snp3 3 3 


==> data.ped <==
1 3 1 2 1  1  A G  T T  A A
1 4 1 2 2  1  A A  T A  G G


==> data2.map <==
1 snp1 1 1 


==> data2.ped <==
1 1 0 0 1  -9  0 0
1 2 0 0 2  -9  0 0

So this PLINK command 

plink --file data --merge data2.ped data2.map --recode

produces this output, which is exactly what we want:

==> plink.ped <==
1 1 0 0 1 -9 0 0 0 0 0 0
1 2 0 0 2 -9 0 0 0 0 0 0
1 3 1 2 1 1 G A T T A A
1 4 1 2 2 1 A A A T G G


==> plink.map <==
1        snp1        1        1
1        snp2        2        2
1        snp3        3        3

Conclusion: PLINK's 'merge' command can be used to easily insert ungenotyped dummy connecting individuals.  
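For pedigrees larger than the toy example, the data2.ped file could be generated rather than typed by hand. The following is a rough sketch under stated assumptions, not a tested pipeline: it assumes the same whitespace-delimited PED layout as above, that every missing parent is already referenced by ID in the father or mother column, and that a single missing genotype (0 0) at the first marker is enough, as in data2.ped. The function name `dummy_parent_lines` is hypothetical.

```python
def dummy_parent_lines(ped_lines):
    """Build PED lines for parents that are referenced but have no line of their own."""
    present = set()  # (family, individual) pairs that already have a line
    needed = {}      # (family, parent id) -> sex code (1 = male, 2 = female)
    for line in ped_lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        fam, ind, father, mother = fields[:4]
        present.add((fam, ind))
        if father != "0":
            needed.setdefault((fam, father), "1")
        if mother != "0":
            needed.setdefault((fam, mother), "2")
    out = []
    for (fam, pid), sex in sorted(needed.items()):
        if (fam, pid) not in present:
            # Founder with unknown parents, phenotype -9, one missing genotype.
            out.append(f"{fam} {pid} 0 0 {sex} -9 0 0")
    return out


# The two genotyped individuals from data.ped above.
data_ped = [
    "1 3 1 2 1  1  A G  T T  A A",
    "1 4 1 2 2  1  A A  T A  G G",
]
for line in dummy_parent_lines(data_ped):
    print(line)
```

The printed lines would go into data2.ped, with the first line of data.map copied into data2.map. This sketch does not invent IDs for parents coded as 0, so a pedigree with only one parent defined would still need new parent IDs assigned first.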
