Shujia Huang ShujiaHuang

def merge_region(position_region, delta=1):
    """Merge a batch of sorted regions.

    Parameters
    ----------
    ``position_region``: list-like, required
        A 2D array of regions, formatted as [[start1, end1], [start2, end2], ...]
    ``delta``: integer, optional
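The preview above is cut off before the function body. As a rough illustration only, merging sorted regions could look like the sketch below. It assumes the regions are sorted by start position and that `delta` is the largest gap between two neighbouring regions that still get merged; both are assumptions, since the original body is not shown. The name `merge_sorted_regions` is hypothetical.

```python
def merge_sorted_regions(regions, delta=1):
    """Merge sorted [start, end] regions whose gap is at most `delta`.

    Hypothetical helper, not the gist's own implementation.
    Assumes `regions` is already sorted by start position.
    """
    merged = []
    for start, end in regions:
        if merged and start - merged[-1][1] <= delta:
            # Overlapping or close enough: extend the previous region.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            # Gap is larger than delta: start a new region.
            merged.append([start, end])
    return merged


if __name__ == "__main__":
    print(merge_sorted_regions([[1, 3], [4, 6], [10, 12]]))
```

With the default `delta=1`, adjacent regions such as [1, 3] and [4, 6] are joined, while [10, 12] stays separate.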
ShujiaHuang / gatk_bundle_and_WGS_test_data.sh
Last active April 13, 2024 02:16
Common datasets for GATK
#Known datasets: GATK bundle for human b37 reference
#
wget -c ftp://gsapubftp-anonymous@ftp.broadinstitute.org/bundle/b37/dbsnp_138.b37.vcf.gz.md5
wget -c ftp://gsapubftp-anonymous@ftp.broadinstitute.org/bundle/b37/dbsnp_138.b37.vcf.gz
wget -c ftp://gsapubftp-anonymous@ftp.broadinstitute.org/bundle/b37/Mills_and_1000G_gold_standard.indels.b37.vcf.gz
wget -c ftp://gsapubftp-anonymous@ftp.broadinstitute.org/bundle/b37/Mills_and_1000G_gold_standard.indels.b37.vcf.gz.md5
wget -c ftp://gsapubftp-anonymous@ftp.broadinstitute.org/bundle/b37/1000G_phase1.indels.b37.vcf.gz
wget -c ftp://gsapubftp-anonymous@ftp.broadinstitute.org/bundle/b37/1000G_phase1.indels.b37.vcf.gz.md5
wget -c ftp://gsapubftp-anonymous@ftp.broadinstitute.org/bundle/b37/1000G_phase1.snps.high_confidence.b37.vcf.gz
wget -c ftp://gsapubftp-anonymous@ftp.broadinstitute.org/bundle/b37/1000G_phase1.snps.high_confidence.b37.vcf.gz.md5
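Each data file above is downloaded together with a companion `.md5` file, so the download can be checked afterwards. A minimal Python sketch of such a check follows. It assumes the first whitespace-separated token in the `.md5` file is the hex digest, which is a guess about the file layout; the function names are hypothetical.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_md5_file(data_path, md5_path):
    """Compare a file's MD5 with the digest stored in a companion .md5 file.

    Assumption: the digest is the first whitespace-separated token in md5_path.
    """
    with open(md5_path) as handle:
        expected = handle.read().split()[0].lower()
    return md5_of_file(data_path) == expected
```

For example, `matches_md5_file("dbsnp_138.b37.vcf.gz", "dbsnp_138.b37.vcf.gz.md5")` would return `True` only if the downloaded file matches the stored digest.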
root@iZ88i9o3weiZ:~# apt-get update
Err http://mirrors.aliyuncs.com trusty InRelease
Err http://mirrors.aliyuncs.com trusty-security InRelease
Err http://mirrors.aliyuncs.com trusty-updates InRelease
Err http://mirrors.aliyuncs.com trusty-proposed InRelease
Err http://mirrors.aliyuncs.com trusty-backports InRelease
$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 44
model name : Intel(R) Xeon(R) CPU X5650 @ 2.67GHz
stepping : 2
microcode : 20
cpu MHz : 1600.000
cache size : 12288 KB
### Read Fa sequence ###
sub ReadFaSeq {
    my ( $file, $fa ) = @_;
    my ( $refId, $seq );
    open I, $file or die "Cannot open file : $file\n";
    $/ = ">"; <I>; $/ = "\n";
    while ( <I> ) {
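The Perl preview stops at the start of the read loop. For comparison, here is a small Python sketch of reading a FASTA file into a dictionary. It is not a translation of the truncated routine: the function name `read_fasta`, the choice to key on the first word of each header line, and the dictionary return type are all assumptions made for illustration.

```python
def read_fasta(path):
    """Return {sequence_id: sequence} for a FASTA file.

    Hypothetical helper. The sequence id is taken to be the first
    whitespace-separated word after '>' on each header line.
    """
    sequences = {}
    seq_id = None
    chunks = []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # Store the previous record before starting a new one.
                if seq_id is not None:
                    sequences[seq_id] = "".join(chunks)
                header = line[1:].split()
                seq_id = header[0] if header else ""
                chunks = []
            elif seq_id is not None:
                chunks.append(line)
    # Store the final record.
    if seq_id is not None:
        sequences[seq_id] = "".join(chunks)
    return sequences
```

Sequence lines are collected in a list and joined once per record, which avoids repeated string concatenation on long chromosomes.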
ShujiaHuang / uninstall_homebrew.sh
Created December 17, 2015 06:23
How to uninstall Homebrew effectively
#!/bin/sh
# Just copy and paste the lines below (all at once, it won't work line by line!)
# MAKE SURE YOU ARE HAPPY WITH WHAT IT DOES FIRST! THERE IS NO WARRANTY!
# https://gist.github.com/mxcl/1173223
function abort {
echo "$1"
exit 1
}
set -e
#!/bin/bash
# If you adapt this script for your own use, you will need to set these two variables based on your environment.
# SV_DIR is the installation directory for SVToolkit - it must be an exported environment variable.
# SV_TMPDIR is a directory for writing temp files, which may be large if you have a large data set.
#export SV_DIR=`cd .. && pwd`
SV_DIR=/home/siyang/bin/software_pip/svtoolkit
SV_TMPDIR=
runDir=
ShujiaHuang / pure_data.pl
Last active February 26, 2016 01:36
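Determines the window length, locates the windows, and computes the methylation rate within each window; commonly used for canonical methylation analysis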
#!/usr/bin/perl -w
#Author : Shujia Huang
#Date : 2010/11/27
use strict;
use warnings;
my ( $file, $outfile_prefix, @bin_num ) = @ARGV;
my %region2num = ( "1000upstream" => 0, "first-exon" => 1, "intron" => 2,
"mid-exon" => 3, "last-exon" => 4, "1000downstream" => 5 );
ShujiaHuang / PerlIOgzip.pl
Last active August 29, 2015 14:08
Reading and writing gz files in Perl
#!/usr/bin/perl
use warnings;
use strict;
use PerlIO::gzip;
die "perl $0 <.gz file in> <.gz file out>" unless @ARGV==2;
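The script above takes a gzip-compressed input file and a gzip-compressed output file. The same read-and-write pattern can be sketched in Python with the standard `gzip` module. The function name `copy_gzip_lines` is hypothetical, and copying lines unchanged is only a placeholder for whatever processing the full Perl script performs, which is not shown in the preview.

```python
import gzip


def copy_gzip_lines(in_path, out_path):
    """Read a .gz text file line by line and write it to a new .gz file.

    Hypothetical helper. Returns the number of lines copied.
    """
    count = 0
    # "rt" and "wt" open the compressed streams in text mode.
    with gzip.open(in_path, "rt") as src, gzip.open(out_path, "wt") as dst:
        for line in src:
            dst.write(line)
            count += 1
    return count
```

Reading line by line keeps memory use low even when the decompressed file is large.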