@arq5x
Last active August 29, 2015 13:56
Examples for generating haplotypes with MaCS
Download: https://code.google.com/p/macs/
# simulate:
# 100 individuals (200 haplotypes)
# "genome" is 1Mb (1e6)
# mutation and recombination rates both set to 0.001
macs 200 1e6 -T -t .001 -r .001 > 200.macs
# peek at the file:
grep SITE: 200.macs | head
SITE: 0 1.93033631e-05 0.033787115 00000010100000000000000000000010010000000000001000010000010000000100000000000000100000001000000000000000000000000000001000000000000000000000000000001000000000001100110000000001000000010000100100010000
SITE: 1 0.000166766399 0.00155699214 00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000100000000000000000000000000000000000000000000000000
SITE: 2 0.00104687764 0.0243700651 00000000000000000000000000000000000000000000100000000000000000000000000000000000000000000001000000000000000000000001000000000001000000000000000000110000000000000000000000000000000000000000000000000000
SITE: 3 0.00129342504 0.0970364607 00000000000000000000000000100000100000000000000000000000000000000000000000000000000000000100000100000000000010001000000000000000000000000000000000000000000100010000000000000000000100001001001000000000
SITE: 4 0.00133399074 0.0973624524 00000000000000000000000000000000000000000000000000000000000000000000000000000000000000010000000000000000000000000000000000000000000000000000000000000000000000000000000100000000000000000000000000000000
SITE: 5 0.00137061099 0.0607467014 00000000000000000000000000000000000000000000000000000000000000000000100000001000000000000000000000000000000000000000000000010000000000000000000000000000000000000000000000000000000000000000000000000000
SITE: 6 0.00146708694 0.0420149854 00000000000000000000000000000000000000000000000000000000000000000000000000000000000000010000000000000000000000000000000000000000000000000000000000000000000000000000000100000000000000000000000000000000
SITE: 7 0.00181525128 0.0151296793 00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000100100000000000000000000000000000000000010000000000000000000000000000000000000000000000000
SITE: 8 0.00221456017 0.163915436 10000000000000000000000000000000000000000000000000000000000000000000110000001000000000000000000000010000000001000000000000010000000010000000000000000100000000000000000000000000000000000000000000001000
SITE: 9 0.00237305591 0.161002917 00000000000000000000010000000000000000000000000010000100000000000000000000000100000000000000000000000100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
# to create 100 genotypes from 200 haplotypes, sum each pair of adjacent alleles in the string of 200 (1st + 2nd, 3rd + 4th, and so on), giving 0, 1, or 2 per individual.
# this creates a "genotype string" of length 100 for each variant SITE.
# to create a genotype string across all SITEs, concatenate the genotypes derived at each site.
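The summing step described above can be sketched in Python. This is a minimal illustration, not part of MaCS itself: the function names `site_genotypes` and `genotype_strings` are hypothetical, and the code assumes that the allele string is the last whitespace-separated field on each `SITE:` line, that alleles are coded 0/1, and that haplotypes 1 and 2 belong to individual 1, haplotypes 3 and 4 to individual 2, and so on. It also takes one reading of "string together the genotypes": one string per individual, with one character per site in file order.

```python
def site_genotypes(haplotypes: str) -> str:
    """Sum each adjacent pair of 0/1 alleles into one 0/1/2 genotype."""
    if len(haplotypes) % 2 != 0:
        raise ValueError("expected an even number of haplotypes")
    return "".join(
        str(int(haplotypes[i]) + int(haplotypes[i + 1]))
        for i in range(0, len(haplotypes), 2)
    )


def genotype_strings(macs_lines):
    """Return one genotype string per individual, spanning all SITE lines."""
    per_site = []
    for line in macs_lines:
        if not line.startswith("SITE:"):
            continue  # skip header, tree, and summary lines
        # Assumption: the allele string is the last field on the line.
        alleles = line.split()[-1]
        per_site.append(site_genotypes(alleles))
    # Transpose: entry i holds individual i's genotype at every site.
    return ["".join(column) for column in zip(*per_site)]


# Toy input with 6 haplotypes (3 individuals) at 2 sites.
example = [
    "SITE: 0 0.1 0.03 001101",
    "SITE: 1 0.2 0.01 110000",
]
print(genotype_strings(example))  # ['02', '20', '10']
```

To run this on the simulated file, pass an open file handle instead of the toy list, for example `genotype_strings(open("200.macs"))`. For very large simulations the full per-site list may not fit in memory, so a streaming approach would be needed.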
# I would start with something small, like 1000 individuals (2000 haplotypes) across a 1e9 bp genome. We can scale up from there. This will take a while to run.