Last active
August 29, 2015 13:56
Examples for generating haplotypes with MaCS
Download: https://code.google.com/p/macs/
# simulate:
# 100 individuals (200 haplotypes)
# "genome" is 1 Mb (1e6 bp)
# mutation and recombination rates both set to 0.001
macs 200 1e6 -T -t .001 -r .001 > 200.macs
# peek at the file:
grep SITE: 200.macs | head
SITE: 0 1.93033631e-05 0.033787115 00000010100000000000000000000010010000000000001000010000010000000100000000000000100000001000000000000000000000000000001000000000000000000000000000001000000000001100110000000001000000010000100100010000
SITE: 1 0.000166766399 0.00155699214 00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000100000000000000000000000000000000000000000000000000
SITE: 2 0.00104687764 0.0243700651 00000000000000000000000000000000000000000000100000000000000000000000000000000000000000000001000000000000000000000001000000000001000000000000000000110000000000000000000000000000000000000000000000000000
SITE: 3 0.00129342504 0.0970364607 00000000000000000000000000100000100000000000000000000000000000000000000000000000000000000100000100000000000010001000000000000000000000000000000000000000000100010000000000000000000100001001001000000000
SITE: 4 0.00133399074 0.0973624524 00000000000000000000000000000000000000000000000000000000000000000000000000000000000000010000000000000000000000000000000000000000000000000000000000000000000000000000000100000000000000000000000000000000
SITE: 5 0.00137061099 0.0607467014 00000000000000000000000000000000000000000000000000000000000000000000100000001000000000000000000000000000000000000000000000010000000000000000000000000000000000000000000000000000000000000000000000000000
SITE: 6 0.00146708694 0.0420149854 00000000000000000000000000000000000000000000000000000000000000000000000000000000000000010000000000000000000000000000000000000000000000000000000000000000000000000000000100000000000000000000000000000000
SITE: 7 0.00181525128 0.0151296793 00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000100100000000000000000000000000000000000010000000000000000000000000000000000000000000000000
SITE: 8 0.00221456017 0.163915436 10000000000000000000000000000000000000000000000000000000000000000000110000001000000000000000000000010000000001000000000000010000000010000000000000000100000000000000000000000000000000000000000000001000
SITE: 9 0.00237305591 0.161002917 00000000000000000000010000000000000000000000000010000100000000000000000000000100000000000000000000000100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
# to create 100 genotypes from 200 haplotypes, sum each pair of adjacent alleles in the string of 200 (0+0 = 0, 0+1 or 1+0 = 1, 1+1 = 2).
# this creates a "genotype string" of 100 values for each variant SITE.
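# The pairwise sum described above could be sketched with awk roughly as follows.
# Assumptions: the allele string is the last whitespace-separated field of each
# SITE: line, and haplotypes 1-2, 3-4, ... belong to the same individual.
# The function name site_genotypes is made up for illustration; it is not part of macs.

```shell
# Hypothetical helper: read macs output on stdin, print one genotype string per SITE line.
site_genotypes() {
  awk '$1 == "SITE:" {
    hap = $NF                          # allele string: one 0/1 character per haplotype
    geno = ""
    for (i = 1; i < length(hap); i += 2)
      geno = geno (substr(hap, i, 1) + substr(hap, i + 1, 1))   # sum adjacent alleles
    print $2, geno                     # site index, then one 0/1/2 call per individual
  }'
}

# Tiny made-up line (4 haplotypes -> 2 individuals); prints "0 11"
printf 'SITE: 0 0.1 0.2 0110\n' | site_genotypes
```

# On the real output this would be run as: site_genotypes < 200.macs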
# if you want a genotype string across all SITES for each individual, concatenate that individual's genotypes from every site, in site order.
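# A rough sketch of building one genotype string per individual across all sites.
# Assumptions: same input layout as the SITE: lines shown earlier, adjacent
# haplotypes form one individual, and every SITE: line has the same number of
# haplotypes. The function name individual_genotypes is hypothetical.

```shell
# Hypothetical helper: read macs output on stdin, print "individual_number genotype_string".
individual_genotypes() {
  awk '$1 == "SITE:" {
    hap = $NF
    n = int(length(hap) / 2)           # number of individuals
    for (k = 1; k <= n; k++)           # append this site genotype to individual k
      g[k] = g[k] (substr(hap, 2 * k - 1, 1) + substr(hap, 2 * k, 1))
  }
  END {
    for (k = 1; k <= n; k++) print k, g[k]
  }'
}

# Two made-up sites, 4 haplotypes -> 2 individuals; prints "1 12" then "2 10"
printf 'SITE: 0 0.1 0.2 0110\nSITE: 1 0.3 0.2 1100\n' | individual_genotypes
```

# This keeps every individual's string in memory until the end, so it suits small
# runs such as 200.macs better than a very large simulation.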
# I would start with something small like 1000 individuals (2000 haplotypes) across a 1e9 genome. we can scale from there. This will take a while to run.