@cmdcolin
Last active March 8, 2016 07:23
Process a bunch of GWAS data with rsIDs to a BED file
#for i in GIANT*.txt; do LC_ALL=C sort -k1,1 "$i" > "$(basename "$i" .txt).sort.bed"; done
# The POSIX locale is spelled "C" (uppercase); sort and join must use the same byte order
export LC_ALL=C
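# Join the snp144 BED (rsID in column 4) with each GWAS file (rsID in column 1); both inputs must be sorted on that key
parallel "join -1 4 -2 1 -t $'\t' snp144.sort.bed {} > {.}.joined.bed" ::: GIANT*sort.bed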
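# Reorder to chrom, start, end, rsID, joined column 8; FS/OFS are set to tab so no stray spaces end up around the fields
parallel "awk 'BEGIN{FS=OFS=\"\t\"} {print \$2,\$3,\$4,\$1,\$8}' {} > {.}.final.bed" ::: *joined.bed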
# Sort by chromosome, then numerically by start, as tabix requires ({.} already ends in .final)
parallel "sort -k1,1 -k2,2n {} > {.}.sorted.bed" ::: *.final.bed
# Compress and index; && skips tabix if bgzip fails, -p bed selects the BED column layout
parallel "bgzip {} && tabix -p bed {}.gz" ::: *.final.sorted.bed
#f=GIANT_BMI_Speliotes2010_publicrelease_HapMapCeuFreq
#for i in {1..22}; do
#  tabix $f.sort.bed.gz chr$i > $f.chr$i.bed
#  bgzip $f.chr$i.bed
#  tabix $f.chr$i.bed.gz
#done
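
The join and reorder steps are easier to follow on a toy input. The sketch below runs them without parallel, bgzip, or tabix. The file names and values are made up, and the column layout (a 4-column SNP BED plus a GWAS table of rsID, two alleles, frequency, p-value) is an assumption chosen so that the p-value lands in joined column 8; check it against the real files before relying on `$8`.

```shell
# Toy walk-through of the join -> reorder -> sort steps (hypothetical data).
export LC_ALL=C
cd "$(mktemp -d)"
TAB=$(printf '\t')

# SNP BED: chrom, start, end, rsID; already sorted on column 4
printf 'chr1\t100\t101\trs1\nchr2\t50\t51\trs2\nchr1\t20\t21\trs3\n' > toy_snp.sort.bed
# GWAS table: rsID, allele1, allele2, freq, p; already sorted on column 1
printf 'rs1\tA\tG\t0.10\t0.5\nrs3\tC\tT\t0.20\t0.01\n' > toy_gwas.sort.bed

# join prints the key first, then the rest of file 1, then the rest of file 2:
#   rsID chrom start end allele1 allele2 freq p
join -1 4 -2 1 -t "$TAB" toy_snp.sort.bed toy_gwas.sort.bed > toy.joined.bed

# Reorder to BED order (chrom, start, end, name, value), then sort by position
awk 'BEGIN{FS=OFS="\t"} {print $2,$3,$4,$1,$8}' toy.joined.bed \
  | sort -k1,1 -k2,2n > toy.final.sorted.bed

result=$(cat toy.final.sorted.bed)
printf '%s\n' "$result"
# chr1	20	21	rs3	0.01
# chr1	100	101	rs1	0.5
```

rs2 drops out because it has no row in the GWAS table: join keeps only keys present in both files.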