Ji Huang timedreamer

@timedreamer
timedreamer / PWMSearch.py
Created December 4, 2015 16:57 — forked from JudoWill/PWMSearch.py
A Python script that processes a SeqInterval file and a JASPAR matrix file to find all occurrences of the TF binding sites.
"""
PWMSearch
Searches through a fasta sequence for the relevant Position Weight Matrices
from the Jaspar Database.
"""
from __future__ import division
from optparse import OptionParser
timedreamer / at_ecotype_map.R
Last active October 27, 2018 19:20
Short R script to plot the locations of 18 Arabidopsis ecotypes on a map.
# Plot the locations of 18 Arabidopsis ecotypes on a map.
## Author: Ji Huang
## Date: 2018-10-25
# The longitude and latitude info was provided by Chia-Yi in "EighteenAccession"
# in the email. I corrected two locations that were obviously wrong.
library(ggmap)
library(ggrepel)
timedreamer / plotPCA_anyPrinciples.R
Last active January 29, 2022 18:04
Plot a PCA scatterplot for RNA-seq data using any pair of principal components: PC1-PC2, PC3-PC4, or any other pair you wish.
# This function is basically the same as the one in https://github.com/mikelove/DESeq2/blob/master/R/plots.R.
# Inspired by https://www.biostars.org/p/243695/.
# I added two arguments, pp1 and pp2, to let the user choose which principal components to plot. pp1 and pp2 only take integer values.
# Last modified: 2021-12-03
plotPCA_jh = function(pp1=1, pp2=2,
object, intgroup="condition",
ntop=1000, returnData=FALSE) {
timedreamer / read_all_files_folder.R
Last active September 25, 2019 02:24
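Read all files in a folder (and its sub-folders) into a data frame in R.
# Date: 2019-09-24
library(here)       # here() builds paths relative to the project root
library(tidyverse)  # provides str_detect(), set_names(), and %>%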
list_of_files <- list.files(path = here("data", "dap_download_may2016_genes"),
recursive = TRUE, pattern = "\\.txt$", full.names = TRUE)
# I discard the amp-DAP data.
list_of_files <- list_of_files[!str_detect(list_of_files, "amp")]
dap <- list_of_files %>%
set_names(.) %>%
timedreamer / calculate_random_network_AUPR.R
Created April 17, 2020 19:22
Calculate random network AUPR.
# Calculate random network AUPR.
## Author: Ji Huang
## Date: 2020-04-15
library(here)
library(tidyverse)
library(precrec)
timedreamer / precision_to_edge-weight.Rmd
Last active July 24, 2020 02:42
Convert network precision score to the edge weight.
I used the precision curve only to decide the cutoff value. I chose a precision cutoff of 0.2, for which the corresponding normalized ranking is *~0.01*. The meaning of the normalized ranking is explained [here](https://github.com/takayasaito/precrec/issues/12).
To find the **weight** value that corresponds to a precision of 0.2, we solve (rank-1)/(n-1) = 0.01 for the rank. Here, `n=nrow(dfg_ortho_label)`, so the rank is *649*. In row *649* of `dfg_ortho_label`, the `weight` is 1.57. Therefore, we kept the edges with a weight higher than 1.57.
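The step from the normalized ranking to the row number can be written out explicitly. This is only a sketch of the rearrangement; the symbol $x$ for the normalized ranking is introduced here for illustration and does not appear in the original notes:

$$
x = \frac{\text{rank} - 1}{n - 1}
\quad\Longrightarrow\quad
\text{rank} = 1 + x\,(n - 1)
$$

With $x \approx 0.01$ and $n$ taken from `nrow(dfg_ortho_label)`, this gives the rank of *649* quoted above. Because $x$ is only read off the curve approximately, the rank is presumably rounded to a whole row.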
```{r, fig.width=5, fig.height=5, fig.align="center"}
scurve_os_b <- evalmod(scores = dfg_ortho_label$weight, labels = dfg_ortho_label$label,
mode = "basic")
sos_df_b <- fortify(scurve_os_b)
p2 <- ggplot(subset(sos_df_b, curvetype == "precision"), aes(x = x, y = y))+
geom_point(color = "blue", size = 0.4)+ ylim(0:1)
timedreamer / .gitignore
Created May 7, 2020 02:49 — forked from hieblmedia/.gitignore
Gitignore - Exclude all except specific subdirectory
#
# If all files are excluded and you want to include only specific sub-directories,
# the parent path must be matched first.
#
/**
!/.gitignore
###############################
# Un-ignore the affected subdirectory
timedreamer / save_pheatmap_pdf.R
Created January 7, 2021 22:18
Save a pheatmap figure to a PDF file.
# An R function to save a pheatmap figure to a PDF file.
# This was copied from Stack Overflow: https://stackoverflow.com/questions/43051525/how-to-draw-pheatmap-plot-to-screen-and-also-save-to-file
save_pheatmap_pdf <- function(x, filename, width=7, height=7) {
stopifnot(!missing(x))
stopifnot(!missing(filename))
pdf(filename, width=width, height=height)
grid::grid.newpage()
grid::grid.draw(x$gtable)
timedreamer / theme_jh.R
Last active March 17, 2021 14:34
A ggplot2 theme I adapted from theme_bw().
library(ggplot2)
theme_jh <- function (base_size = 11, base_family = "Arial",
base_line_size = base_size/22,
base_rect_size = base_size/22) {
theme_grey(base_size = base_size, base_family = base_family,
base_line_size = base_line_size,
base_rect_size = base_rect_size) +
theme(panel.background = element_rect(fill = "white", colour = NA),
panel.border = element_rect(fill = NA, colour = "grey20"),
panel.grid = element_line(colour = "grey92"),
timedreamer / GENIE3_quick_test.R
Created January 6, 2022 19:11
Test whether GENIE3 gives the same output when different regulators or targets are used. It depends: regulators, no; targets, yes.
# Test in GENIE3:
# (Q1) If I use more genes as regulators, do the same regulator-target edges
# keep the same order? No. The edge order will be different.
# (Q2) If I use more genes as targets, do the same regulator-target edges
# keep the same order? Yes. The same edge will have exactly the same weight.
# Author: Ji Huang
# Date: 2021-01-06