Qingdong Su suqingdong

@suqingdong
suqingdong / merge_snp.py
Last active October 17, 2016 15:09
Merge multiple SNP files annotated by ANNOVAR
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Filename: merge_snp.py
# Date: 2016-09-23
# Author: suqingdong
class MergeSNP:
    '''
    Merge all the samples according to the second column (pos),
suqingdong / getGeneBed.py
Created October 12, 2016 05:22
Get the BED entries of all genes from the original BED file, according to the SNP result file
#!/usr/bin/env python
#======================
# Date: 2016-10-10
# Author: suqingdong
# Description: get the gene list from an ANNOVAR annotation result file, then generate a BED file from the original BED file.
# Usage: python getGeneBed.py <genelist> <originbed> [outbed]
#======================
import re
suqingdong / replaceSNP.py
Last active October 14, 2016 14:36
Replace each sample's genotype
#!/usr/bin/env python
# Extract columns: 'ChROM POS ID REF ALT GeneName' + samples' columns
def safe_open(infile):
    try:
        if infile.endswith('.gz'):
            import gzip
            return gzip.open(infile)
        else:
            return open(infile)
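The preview above cuts off before the `except` clause, so the author's error handling is not shown. A minimal sketch of the same idea, assuming text-mode reading under Python 3 is wanted and leaving any exception to the caller; `open_maybe_gzip` is a hypothetical name, not the author's:

```python
import gzip


def open_maybe_gzip(path):
    """Open a plain or gzip-compressed file for reading as text."""
    if path.endswith('.gz'):
        # 'rt' makes gzip yield str lines rather than bytes (Python 3).
        return gzip.open(path, 'rt')
    return open(path)
```

Both branches return a file object that works as a context manager, so callers can use `with open_maybe_gzip(path) as f:` regardless of compression.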
suqingdong / depthStatByGene.py
Created October 14, 2016 14:40
Depth and coverage statistics for each gene
#!/usr/bin/env python
import re
# One gene may exist on different chromosomes
# One position may belong to different genes
# geneList format:
# {'genename1': {
#     "chr1": [(start1, stop1), (start2, stop2)],
#     "chr2": [(start1, stop1), (start2, stop2)]
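The nested `geneList` layout described in these comments could be built roughly as follows. This is a sketch, not the author's code: it assumes tab-separated BED lines whose fourth column holds the gene name, and `build_gene_list` is a hypothetical helper.

```python
from collections import defaultdict


def build_gene_list(bed_lines):
    """Group (start, stop) intervals by gene name, then by chromosome."""
    gene_list = defaultdict(lambda: defaultdict(list))
    for line in bed_lines:
        # Skip blank lines and comment/header lines.
        if not line.strip() or line.startswith('#'):
            continue
        # Assumed column order: chrom, start, stop, gene name.
        chrom, start, stop, gene = line.rstrip('\n').split('\t')[:4]
        gene_list[gene][chrom].append((int(start), int(stop)))
    return gene_list
```

Keying by gene first and chromosome second keeps a gene's intervals separate per chromosome, and a position covered by two genes simply appears under both.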
suqingdong / convertSNP.py
Created October 17, 2016 15:12
Convert positions and samples from the replaced files (samples as rows, positions as columns)
#!/usr/bin/env python
#=======================================================
# convert pos and samples
# row name is sample name, and column name is position
# value is the number of alt (0,1,2)
#=======================================================
# sampledict: {'sample1':['0','2',...], ...}
def convertSNP(infile, outfile):
    with open(infile) as f:
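The rest of `convertSNP` is not shown in the preview. The conversion the header comments describe (sample per row, position per column) might look like the sketch below. The input layout is an assumption: a tab-separated table whose header row is `POS` followed by sample names, with one row per position. `transpose_snp_table` is a hypothetical name.

```python
def transpose_snp_table(lines):
    """Return (positions, sampledict) from position-per-row input lines."""
    rows = [line.rstrip('\n').split('\t') for line in lines if line.strip()]
    header, records = rows[0], rows[1:]
    samples = header[1:]
    positions = [rec[0] for rec in records]
    # sampledict: {'sample1': ['0', '2', ...], ...}, one value per position,
    # in the same order as `positions`.
    sampledict = {
        sample: [rec[i + 1] for rec in records]
        for i, sample in enumerate(samples)
    }
    return positions, sampledict
```

Writing the output is then a matter of emitting `positions` as the header row and one line per sample.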
suqingdong / mergeSNP.py
Created October 17, 2016 15:13
Merge the replaced files; if a position is missing, mark it '0'
#!/usr/bin/env python
#======================================
# merge file1 and file2 of replaced.xls
# if no pos, mark '0'
# output coverted.xls
#======================================
# sampledict structure: {'sample1': {'pos1':'snp1','pos2':'snp2',... }, ...}
def getSampleDict(infile):
    sampledict = {}
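Given two dictionaries in the `sampledict` structure from the comment above, the merge step with the "if no pos, mark '0'" rule could be sketched like this. How the author actually combines the files is not visible in the preview; `merge_sample_dicts` is a hypothetical helper.

```python
def merge_sample_dicts(dict1, dict2):
    """Merge two {sample: {pos: value}} dicts, filling missing positions with '0'."""
    # Collect every position seen in either input.
    positions = set()
    for sampledict in (dict1, dict2):
        for calls in sampledict.values():
            positions.update(calls)

    # Copy the calls into fresh dicts so the inputs are not modified.
    merged = {}
    for sampledict in (dict1, dict2):
        for sample, calls in sampledict.items():
            merged.setdefault(sample, {}).update(calls)

    # Any position a sample lacks is marked '0'.
    for calls in merged.values():
        for pos in positions:
            calls.setdefault(pos, '0')
    return merged
```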
suqingdong / get_sample_list.py
Created October 17, 2016 15:56
generate sample_list according to info.txt and list.txt
#!/usr/bin/env python
#-*- coding: utf-8 -*-
def get_sample_list(infofile, listfile):
    sampledict = {}
    with open(infofile) as f:
        for line in f:
            sampleid,novoid = line.strip().split('\t')[2:4]
            sampledict[novoid] = sampleid
suqingdong / get_sample_info.py
Created October 17, 2016 16:04
generate sample_info according to info.txt, pn and disease
#!/usr/bin/env python
#-*- coding: utf-8 -*-
def get_sample_info(infofile, pn, disease):
    header = '#B1\n#FamilyID\tSampleID\tSex\tNormal/Patient\tPN'
    if disease:
        header += '\tDisease'
    header += '\n'
suqingdong / add_count_samples.py
Last active February 8, 2017 11:10
Add counts of variant samples to an annotated VCF file
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import sys
def add_count_samples(infile, outfile=None):
    outfile = outfile or infile+'.addCountSamples'
    with open(infile) as f, open(outfile, 'w') as out:
        for line in f:
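The loop body is cut off in the preview, so how the author counts samples is unknown. One plausible per-line count is sketched below. It assumes the standard VCF layout, where sample columns start at index 9 and the genotype is the first colon-separated field; the author's annotated file may differ. `count_variant_samples` and `first_sample_col` are hypothetical names.

```python
def count_variant_samples(vcf_line, first_sample_col=9):
    """Count samples whose genotype contains at least one non-reference allele."""
    fields = vcf_line.rstrip('\n').split('\t')
    count = 0
    for sample in fields[first_sample_col:]:
        genotype = sample.split(':')[0]
        # Treat phased ('|') and unphased ('/') genotypes the same way.
        alleles = genotype.replace('|', '/').split('/')
        # '0' is the reference allele; '.' and '' are missing calls.
        if any(allele not in ('0', '.', '') for allele in alleles):
            count += 1
    return count
```

If the annotation step shifts the sample columns, `first_sample_col` can be adjusted to match.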
suqingdong / get_blue
Last active March 17, 2017 03:23
A simple example of a crawler using requests and BeautifulSoup
#!/usr/bin/env python
# A simple example of a crawler using requests and BeautifulSoup
# Pay attention to encoding
import bs4
import requests
def main(genelist):
    with open(genelist) as f: