Dave Tang davetang

davetang / get_sequence.R
Last active March 23, 2017 15:23
From a data frame with chromosomal coordinates, obtain the sequence, and calculate the dinucleotide frequencies
#I want to fetch sequences from
#my_random_loci and my_refseq_tss
head(my_random_loci,2)
#     chr    start      end strand
# 1 chr18 59415403 59415407      +
# 2 chr22  8535632  8535636      -
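#install if necessary (biocLite has been retired; use BiocManager on R >= 3.5)
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install("BSgenome.Hsapiens.UCSC.hg19")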
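The dinucleotide-frequency step named in the description can be sketched independently of Bioconductor. The snippet below is a minimal Python illustration, not the gist's R code: it assumes the sequence has already been fetched as a plain string, and the function name `dinucleotide_frequencies` is hypothetical.

```python
from collections import Counter
from itertools import product


def dinucleotide_frequencies(seq):
    """Return the relative frequency of each overlapping dinucleotide in seq."""
    seq = seq.upper()
    # Count every overlapping pair of adjacent bases.
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    # Report only the 16 unambiguous dinucleotides; pairs containing N are ignored.
    keys = ["".join(pair) for pair in product("ACGT", repeat=2)]
    total = sum(counts[k] for k in keys)
    return {k: (counts[k] / total if total else 0.0) for k in keys}


# ACGCGT contains the pairs AC, CG, GC, CG, GT.
freqs = dinucleotide_frequencies("ACGCGT")
print(freqs["CG"])
```

Counting overlapping pairs (a window that slides by one base) is an assumption here; a non-overlapping count would step by two instead.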
davetang / intersect_coordinate.R
Last active May 1, 2018 18:43
Given two lists of coordinates, find the ones that overlap/intersect
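#install if necessary (biocLite has been retired; use BiocManager on R >= 3.5)
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install("GenomicRanges")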
#load library
library(GenomicRanges)
#create a GRanges object given an object, my_refseq_loci
head(my_refseq_loci,2)
# refseq_mrna chromosome_name transcript_start transcript_end strand
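For readers who want to see what "overlap" means without GenomicRanges, here is a rough Python sketch of the interval test. It assumes 1-based, inclusive coordinates on the same chromosome and uses a naive all-pairs comparison; the function names and example coordinates are hypothetical, and this is not how GenomicRanges computes overlaps internally.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) intervals share at least one base."""
    # Same chromosome, and neither interval ends before the other starts.
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]


def find_overlaps(queries, subjects):
    """Return every (query, subject) pair that overlaps (quadratic time)."""
    return [(q, s) for q in queries for s in subjects if overlaps(q, s)]


queries = [("chr1", 100, 200), ("chr2", 5, 10)]
subjects = [("chr1", 150, 250), ("chr1", 201, 300)]
hits = find_overlaps(queries, subjects)
print(hits)
```

The all-pairs loop is fine for small lists; for genome-scale data a sorted sweep or an interval tree would be the usual choice.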
davetang / random_forest.R
Last active December 23, 2015 01:49
From two sets of dinucleotide counts, use random forests to create a predictor
#install if necessary
install.packages("randomForest")
#load library
library(randomForest)
#I have two sets of dinucleotide counts stored in
#my_random_loci_seq_di and my_refseq_tss_seq_di
head(my_refseq_tss_seq_di,2)
davetang / transfac_to_tess.pl
Last active December 24, 2015 12:09
Convert the TRANSFAC matrix into a matrix readable by TESS (Transcription Element Search System).
#!/usr/bin/env perl
use strict;
use warnings;
my $usage = "Usage: $0 <matrix.dat>\n";
my $infile = shift or die $usage;
my $accession = '';
my $start = 0;
davetang / psl_to_bed_best_score.pl
Last active April 9, 2021 13:55
A more documented version of my psl_to_bed_best_score.pl script at http://davetang.org/wiki/tiki-index.php?page=Blat.
#!/usr/bin/env perl
use strict;
use warnings;
=head1 NAME
This script converts a psl file into a bed file. Written by Dave Tang.
=head1 SYNOPSIS
davetang / copy_directory.pl
Last active December 28, 2015 09:29
Perl script that takes two directory paths, one old and one new, compares the two, and copies each directory from the old path to the new path if it does not already exist there.
#!/usr/bin/env perl
use strict;
use warnings;
my $usage = "Usage: $0 <old_dir> <new_dir>\n";
my $old = shift or die $usage;
my $new = shift or die $usage;
my %current = ();
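The compare-then-copy behaviour in the description could look roughly like the following Python sketch. It assumes only top-level subdirectories are compared by name, which the truncated preview does not confirm; `copy_missing_dirs` is a hypothetical name.

```python
import os
import shutil
import tempfile


def copy_missing_dirs(old, new):
    """Copy each top-level directory in old that is absent from new."""
    copied = []
    existing = set(os.listdir(new))
    for name in sorted(os.listdir(old)):
        src = os.path.join(old, name)
        # Only directories are considered; plain files are left alone.
        if os.path.isdir(src) and name not in existing:
            shutil.copytree(src, os.path.join(new, name))
            copied.append(name)
    return copied


# Demo on throwaway directories: "a" exists in both, "b" only in old.
with tempfile.TemporaryDirectory() as tmp:
    old_dir = os.path.join(tmp, "old")
    new_dir = os.path.join(tmp, "new")
    os.makedirs(os.path.join(old_dir, "a"))
    os.makedirs(os.path.join(old_dir, "b"))
    os.makedirs(os.path.join(new_dir, "a"))
    copied = copy_missing_dirs(old_dir, new_dir)
    new_has_b = os.path.isdir(os.path.join(new_dir, "b"))
    print(copied)
```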
davetang / random_bed.pl
Created November 19, 2013 04:03
Randomise a BED file.
#!/usr/bin/env perl
use strict;
use warnings;
my $usage = "Usage: $0 <infile.bed>\n";
my $infile = shift or die $usage;
my %bed = ();
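The preview does not show what "randomise" means in this script. Assuming it means shuffling the order of the BED records (rather than generating random coordinates), a minimal Python sketch might be the following; `shuffle_bed_lines` and the `seed` parameter are hypothetical.

```python
import random


def shuffle_bed_lines(lines, seed=None):
    """Return the BED records in random order, leaving the input untouched."""
    rng = random.Random(seed)  # a seed makes the shuffle reproducible
    shuffled = list(lines)
    rng.shuffle(shuffled)
    return shuffled


records = [
    "chr1\t0\t10\n",
    "chr1\t20\t30\n",
    "chr2\t5\t15\n",
    "chr3\t7\t9\n",
]
shuffled = shuffle_bed_lines(records, seed=1)
print("".join(shuffled), end="")
```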
davetang / split_chr.pl
Last active December 28, 2015 20:19
Script that reads a BED stream and writes each line to the output for its chromosome. Do not run this script in parallel.
#!/usr/bin/env perl
use strict;
use warnings;
#hash for filehandles
my %fh = ();
#read from stream
while (<>){
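The filehandle-per-chromosome idea hinted at by the `%fh` hash can be sketched in Python as below. The output naming (`<chrom>.bed` inside `out_dir`) and the function name are assumptions, since the preview cuts off before any file is opened.

```python
import os
import tempfile


def split_by_chromosome(lines, out_dir="."):
    """Write each BED line to <out_dir>/<chrom>.bed, one open handle per chromosome."""
    handles = {}
    try:
        for line in lines:
            if not line.strip():
                continue  # skip blank lines
            chrom = line.split("\t", 1)[0]
            if chrom not in handles:
                # Opening in "w" mode truncates the file, which is one reason
                # concurrent runs on the same directory would clobber each other.
                handles[chrom] = open(os.path.join(out_dir, chrom + ".bed"), "w")
            handles[chrom].write(line)
    finally:
        for fh in handles.values():
            fh.close()
    return sorted(handles)


# Demo in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    stream = ["chr1\t0\t10\n", "chr2\t5\t15\n", "chr1\t20\t30\n"]
    written = split_by_chromosome(stream, tmp)
    with open(os.path.join(tmp, "chr1.bed")) as fh:
        chr1_lines = fh.readlines()
    print(written)
```

To read from standard input as the Perl script does, `sys.stdin` could be passed as `lines`.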
#!/usr/bin/env perl
use strict;
use warnings;
my $usage = "Usage: $0 <bam_flag>\n";
my $flag = shift or die $usage;
die "Please enter a numerical value\n" if $flag =~ /\D+/;
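The preview stops before the flag is interpreted. Presumably the script decodes the bitwise FLAG field; under that assumption, a Python sketch of the decoding follows, using the bit values defined in the SAM specification. The function name and the short labels are hypothetical.

```python
# Bit values from the SAM specification's FLAG field.
SAM_FLAGS = [
    (0x1, "read paired"),
    (0x2, "read mapped in proper pair"),
    (0x4, "read unmapped"),
    (0x8, "mate unmapped"),
    (0x10, "read reverse strand"),
    (0x20, "mate reverse strand"),
    (0x40, "first in pair"),
    (0x80, "second in pair"),
    (0x100, "not primary alignment"),
    (0x200, "read fails quality checks"),
    (0x400, "PCR or optical duplicate"),
    (0x800, "supplementary alignment"),
]


def decode_flag(flag):
    """Return the labels of every bit set in a SAM/BAM flag."""
    if flag < 0:
        raise ValueError("flag must be a non-negative integer")
    return [label for bit, label in SAM_FLAGS if flag & bit]


# 99 = 1 + 2 + 32 + 64
for label in decode_flag(99):
    print(label)
```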
davetang / mandelbrot.pl
Created May 13, 2014 12:41
The Mandelbrot set is a mathematical set of points whose boundary is a distinctive and easily recognizable two-dimensional fractal shape
#!/usr/bin/perl
use warnings;
use strict;
my $BAILOUT=16;
my $MAX_ITERATIONS=1000;
my $begin = time();
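The two constants above suggest the usual escape-time iteration. The Python sketch below shows that technique under two assumptions not visible in the preview: that `BAILOUT` is compared against the squared magnitude of z, and that points are drawn as ASCII characters. The function names and the plotted region are hypothetical.

```python
BAILOUT = 16           # escape threshold, assumed to apply to |z|^2
MAX_ITERATIONS = 1000  # iteration cap before a point is treated as inside the set


def escape_time(c):
    """Return the iteration at which z = z*z + c escapes, or None if it never does."""
    z = 0j
    for i in range(1, MAX_ITERATIONS + 1):
        z = z * z + c
        if z.real * z.real + z.imag * z.imag > BAILOUT:
            return i
    return None


def render(width=60, height=24):
    """Draw the region [-2.0, 0.6] x [-1.2, 1.2] with '*' for points that never escape."""
    rows = []
    for row in range(height):
        y = -1.2 + 2.4 * row / (height - 1)
        line = ""
        for col in range(width):
            x = -2.0 + 2.6 * col / (width - 1)
            line += "*" if escape_time(complex(x, y)) is None else " "
        rows.append(line)
    return "\n".join(rows)


if __name__ == "__main__":
    print(render())
```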