Dmitry Grapov dgrapov

dgrapov / covariate_adjust.R
Created January 6, 2024 08:56
Example of linear model based covariate adjustment
#get linear model residuals
#' @import dplyr
#' @export
dave_lm_adjust<-function(data,formula,test_vars,adjust=TRUE,progress=TRUE){
#progress bar spans the variables iterated over below (not the columns of data)
if (progress == TRUE){ pb <- txtProgressBar(min = 0, max = length(test_vars), style = 3)} else {pb<-NULL}
out <- lapply(seq_along(test_vars), function(i) {
if (progress == TRUE) {
setTxtProgressBar(pb, i)
dgrapov / Orthogonal Signal Correction (OSC) for PLS models OSC-PLS (OPLS)
Last active November 28, 2023 03:28
Orthogonal Signal Correction for PLS models (OPLS)
#see updated code base and some examples in the function "test"
# https://github.com/dgrapov/devium/blob/master/R/Devium%20PLS%20%20and%20OPLS.r
#Orthogonal Signal Correction for PLS models (OPLS)
#adapted from an example in the book "Chemometrics with R" by Ron Wehrens (http://www.springer.com/life+sciences/systems+biology+an+bioinfomatics/book/978-3-642-17840-5)
#this code requires the following packages:
need.packages<-c("pls", # to generate PLS models
"ggplot2" ) # to plot results
dgrapov / example.R
Created February 2, 2018 04:48
Example of a shiny app with data upload and different plot options
#initialize
library(shiny)
library(ggplot2)
library(purrr)
library(dplyr)
#example data
data(iris)
dgrapov / SOM example.R
Last active March 11, 2023 11:21
Self-organizing map (SOM) example in R
#SOM example using wines data set
library(kohonen)
data(wines)
set.seed(7)
#create SOM grid
sommap <- som(scale(wines), grid = somgrid(2, 2, "hexagonal"))
## use hierarchical clustering to cluster the codebook vectors
groups<-3
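The preview stops before the clustering step that the comment above announces. As a rough sketch of that step only (not the author's code), the block below clusters a hypothetical random matrix `codes` that stands in for the four codebook vectors of a 2 x 2 grid; the names `codes`, `hc`, and `unit_cluster` are illustrative assumptions.

```r
# Hypothetical stand-in for the SOM codebook: 4 map units x 3 variables.
# In a real analysis these rows would come from the fitted SOM object.
set.seed(7)
codes <- matrix(rnorm(4 * 3), nrow = 4,
                dimnames = list(paste0("unit", 1:4), paste0("var", 1:3)))

groups <- 3

# Hierarchical clustering on Euclidean distances between codebook vectors
hc <- hclust(dist(codes))

# Cut the dendrogram into the requested number of groups;
# the result is a named integer vector with one cluster label per map unit
unit_cluster <- cutree(hc, k = groups)
print(unit_cluster)
```

With a fitted `kohonen` model, the codebook vectors would replace `codes`; how they are stored in the fitted object depends on the package version, so check the object's structure before adapting this.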
dgrapov / example.R
Last active February 21, 2023 15:27
Convert adjacency (or other) matrix to edge list
library(reshape2)
gen.mat.to.edge.list<-function(mat,symmetric=TRUE,diagonal=FALSE,text=FALSE){
#create edge list from matrix
# if symmetric, duplicate (lower triangle) entries are removed
mat<-as.matrix(mat)
id<-is.na(mat) # used to allow missing
mat[id]<-"nna"
if(symmetric){mat[lower.tri(mat)]<-"na"} # use to allow missing values
if(!diagonal){diag(mat)<-"na"}
dgrapov / global.R
Last active September 30, 2022 01:58
Plotting demo using ggplot2. Check it out at http://spark.rstudio.com/dgrapov/1Dplots/.
#initialize
library(datasets)
library(ggplot2)
#helper function (convert vector to named list)
namel<-function (vec){
tmp<-as.list(vec)
names(tmp)<-as.character(unlist(vec))
tmp
}
#check for and/or install dependencies
need<-c("RCurl","ggplot2","gridExtra","reshape2")
for(pkg in need){
#require() attaches the package when it is available, so install and attach only when it is missing
if(!require(pkg, character.only = TRUE)){ install.packages(pkg); library(pkg, character.only = TRUE) }
}
if(require(pcaMethods)==FALSE){
need<-c('Rcpp', 'rJava',
'Matrix', 'cluster', 'foreign', 'lattice', 'mgcv', 'survival')
for(i in 1:length(need)){
dgrapov / plotly_select_DT.R
Last active September 10, 2020 01:25
ggplot2 to plotly to shiny to box/lasso select to DT
#plotly box or lasso select linked to
# DT data table
# using Wage data
# the out group: is sex:Male, region:Middle Atlantic +
library(ggplot2)
library(plotly)
library(dplyr)
library(ISLR)
dgrapov / dplyr_tutorial.Rmd
Created April 28, 2015 23:21
hands on with dplyr
---
title: "Hands-on with dplyr"
author: "Dmitry Grapov"
output:
  html_document:
    keep_md: yes
---
## Introduction
DATA SCIENCE EXERCISE
The following challenge requires the beer reviews data set, beer_reviews.csv, which can be downloaded from https://data.world/socialmediadata/beeradvocate . Note that you can create a free temporary account to download the .csv file.
Questions to answer using this data:
Which brewery produces the strongest beers by ABV%?
If you had to pick 3 beers to recommend using only this data, which would you pick?
Which of the factors (aroma, taste, appearance, palate) are most important in determining the overall quality of a beer?
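
One possible way to approach the first question with dplyr is sketched below. It runs on a small made-up data frame rather than the real file, and the column names `brewery_name` and `beer_abv` are assumptions that should be checked against the downloaded beer_reviews.csv.

```r
library(dplyr)

# Hypothetical toy data standing in for beer_reviews.csv.
# The column names brewery_name and beer_abv are assumptions.
reviews <- data.frame(
  brewery_name = c("A", "A", "B", "B", "C"),
  beer_abv     = c(5.0, 7.0, 9.5, 10.5, 4.2)
)

# Mean ABV per brewery, strongest first
strongest <- reviews %>%
  group_by(brewery_name) %>%
  summarise(mean_abv = mean(beer_abv, na.rm = TRUE),
            n_reviews = n()) %>%
  arrange(desc(mean_abv))

print(strongest)
```

On real data it is probably worth filtering on `n_reviews` before ranking, since a brewery with a single very strong beer would otherwise dominate the top of the list.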
Additional math/coding question unrelated to the data: