getrange <- function(i, length, range){
  sort(c(
    c(((i - 1) - c(1:range)) %% length + 1),
    i,
    c(((i - 1) + c(1:range)) %% length + 1)
  ))
}
## example: get a vector of indices +/- 10 around i = 5 on the circular integer interval [1,100]
getrange(i = 5, length = 100, range = 10)
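The wraparound arithmetic in getrange can be checked with a short, self-contained sketch. The function body is repeated from the gist so the block runs on its own. The small-interval call and the use of unique() are illustrative assumptions, not part of the original: when 2 * range + 1 exceeds length, the two arms of the window overlap and indices repeat.

```r
# getrange as defined in the gist, repeated so this block is self-contained
getrange <- function(i, length, range) {
  sort(c(
    c(((i - 1) - c(1:range)) %% length + 1),
    i,
    c(((i - 1) + c(1:range)) %% length + 1)
  ))
}

# a window of +/- 10 around i = 5 on [1, 100] wraps past 1 to the top of the interval
idx <- getrange(i = 5, length = 100, range = 10)
print(idx)  # 1 to 15, then 95 to 100

# assumption for illustration: a window wider than the interval repeats indices,
# so unique() may be wanted in that case
small <- getrange(i = 1, length = 5, range = 3)
print(small)
print(unique(small))
```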
require(tibble)
require(dplyr)
require(stats)
# make mock data, genes as columns
control <- tibble(subject = paste0("control", sprintf('%0.2d', 1:20)),
                  gene1 = rnorm(20, 10, 2),
                  gene2 = rnorm(20, 10, 2),
                  gene3 = rnorm(20, 10, 2))
# tested with R v4.2.0 and tidyverse v2.0.0
read_gtf <- function(file) {
  require(tidyverse)
  cnames <- c("seqname","source","feature","start","end","score","strand","frame","attribute")
  # read in raw gtf as tsv and remove comment rows
  messy <- read_tsv(file, col_names = cnames, comment = "#")
  # get the unique attribute types
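The preview is cut off at the attribute step. As a rough illustration of what "get the unique attribute types" can involve, the base R sketch below pulls the keys out of GTF-style attribute strings. The example strings are hypothetical, and this is one possible approach under the assumption that values contain no semicolons; it is not the gist's own code.

```r
# hypothetical attribute strings in GTF style: key "value"; pairs
attribute <- c('gene_id "g1"; transcript_id "t1";',
               'gene_id "g2"; gene_name "abc";')

# split each string on ";" and trim surrounding whitespace
pairs <- lapply(strsplit(attribute, ";", fixed = TRUE), trimws)
# drop any empty pieces left over from the split
pairs <- lapply(pairs, function(p) p[nzchar(p)])

# the key is everything before the first space in each pair
keys <- unique(unlist(lapply(pairs, function(p) sub(" .*$", "", p))))
print(keys)
```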
#!/bin/bash
# FUNCTION
## this script recursively searches a directory for fasta files matching a pattern
## found files are concatenated and sorted by descending sequence length
# INPUT
## first positional parameter is the directory to search
## second is the pattern to match in fasta filenames
## third is the output filename
msa_motifs <- function(msa, motifs) {
  require(tibble)
  require(Biostrings)
  require(dplyr)
  # read in fasta as Biostrings object
  seqs <- Biostrings::readAAStringSet(filepath = msa, format = "fasta")
  maxseqlen <- max(width(seqs))
  # initiate results tibble with one row per alignment position
  results <- tibble::tibble(position = seq_len(maxseqlen))
  # readAAMultipleAlignment requires equal length of sequences
# requires the native pipe from R >= 4.1
## maps each base to a particular color
## colors can be named or hex values
## outputs a format recognized by the forna webapp's custom color editor
## http://rna.tbi.univie.ac.at/forna/forna.html
forna_colors <- function(sequence, colormap) {
  inseq <- gsub(x = strsplit(toupper(as.character(sequence)), split = "")[[1]],
                pattern = "T",
                replacement = "U")
# assumes fixed number of lines in header and footer
# might change with different versions of MEME suite
read_PWMs <- function(file) {
  # read every line (any negative n reads to the end), then drop the 29-line header and 10-line footer
  pset <- head(readLines(con = file, n = -1L)[-c(1:29)], -10)
  pset <- subset(pset, !grepl(pattern = "letter", pset))
  # split the remaining lines into records at blank lines
  i1 <- !nzchar(pset)
  pset <- unname(split(pset[!i1], cumsum(i1)[!i1]))
  names(pset) <- gsub(pattern = " ",
                      replacement = "_",
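The blank-line grouping used in read_PWMs can be seen in isolation with a toy character vector. This is a minimal sketch; the input lines below are made up for illustration and are not real MEME output.

```r
# hypothetical input: two records separated by an empty string
lines <- c("MOTIF one", "0.1 0.9", "0.8 0.2", "", "MOTIF two", "0.5 0.5")

# TRUE wherever a line is empty, i.e. a record separator
is_blank <- !nzchar(lines)

# cumsum(is_blank) increases by one at every separator, so it works as a group id;
# the separators themselves are dropped before splitting
records <- unname(split(lines[!is_blank], cumsum(is_blank)[!is_blank]))

print(records)
```

Each list element then holds the lines of one record, which is why the approach depends on the records being separated by truly empty lines.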
# Given vcf and gff, plot position and allele freq. of SNPs and indels over ranges
# tested with VCFv4.1 and gff version 3
# expects certain vcf and gff attributes, may not work with all files
plot_vcf <- function(vcf, gff, ranges) {
  require(tidyverse) # v2.0.0
  require(gggenes) # v0.5.0
  require(patchwork) # v1.1.2
make_unambiguous <- function(dna) {
  require(tidyverse)
  require(S4Vectors)
  iupac <- tibble(code = c("A", "C", "G", "T",
                           "R", "Y", "S", "W",
                           "K", "M", "B", "D",
                           "H", "V", "N"),
                  base = c("A", "C", "G", "T",
                           "AG", "CT", "GC", "AT",
                           "GT", "AC", "CGT", "AGT",
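For context, expanding an ambiguous sequence amounts to taking the Cartesian product of the bases allowed at each position. The base R sketch below shows one way to do that. It is an assumption about the general technique, not the gist's own implementation (which is truncated here), and the function name expand_iupac is hypothetical.

```r
# standard IUPAC nucleotide codes and the bases each one stands for
iupac <- c(A = "A", C = "C", G = "G", T = "T",
           R = "AG", Y = "CT", S = "CG", W = "AT",
           K = "GT", M = "AC", B = "CGT", D = "AGT",
           H = "ACT", V = "ACG", N = "ACGT")

# hypothetical helper: return every unambiguous sequence matching `dna`
expand_iupac <- function(dna) {
  codes <- strsplit(toupper(dna), split = "")[[1]]
  stopifnot(all(codes %in% names(iupac)))
  # one character vector of allowed bases per position
  choices <- lapply(iupac[codes], function(x) strsplit(x, split = "")[[1]])
  # Cartesian product across positions, one row per combination
  grid <- expand.grid(unname(choices), stringsAsFactors = FALSE)
  # paste the columns of each row together into one sequence
  sort(do.call(paste0, unname(as.list(grid))))
}

print(expand_iupac("ARY"))
```

The number of results is the product of the choices per position, so sequences with many N positions grow quickly (4 per N).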
entrez2dss <-
  function(id_list) {
    require(rentrez) # v1.2.3
    require(Biostrings) # v2.66.0
    require(stringr) # v1.5.0
    # fetch all sequences, trim empty elements
    raw_ <-
      entrez_fetch(db = "nucleotide",