@hakyim
Last active January 21, 2021 23:46
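## Relatively fast simple linear regression (one predictor plus intercept)
fastlm = function(xx,yy)
{
  ## compute betahat (regression slope) and its p-value from an F-test
  ## for now it does not take covariates
  df1 = 2  ## number of parameters in the full model (intercept + slope)
  df0 = 1  ## number of parameters in the null model (intercept only)
  ## keep only the pairs where both values are observed
  ind = !is.na(xx) & !is.na(yy)
  xx = xx[ind]
  yy = yy[ind]
  n = sum(ind)
  ## center both variables so the intercept drops out of the fit
  xbar = mean(xx)
  ybar = mean(yy)
  xx = xx - xbar
  yy = yy - ybar
  SXX = sum( xx^2 )
  SYY = sum( yy^2 )
  SXY = sum( xx * yy )
  betahat = SXY / SXX
  RSS1 = sum( ( yy - xx * betahat )^2 )  ## residual sum of squares, full model
  RSS0 = SYY                             ## residual sum of squares, null model
  fstat = ( ( RSS0 - RSS1 ) / ( df1 - df0 ) ) / ( RSS1 / ( n - df1 ) )
  ## take the upper tail directly: 1 - pf(...) rounds to 0 for very large fstat
  pval = pf(fstat, df1 = ( df1 - df0 ), df2 = ( n - df1 ), lower.tail = FALSE)
  res = list(betahat = betahat, pval = pval)
  return(res)
}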
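One way to sanity-check the closed-form computation above is to compare it with R's built-in `lm()`. The sketch below is not part of the original gist: `lm_reference` is a hypothetical helper name, and the simulated data are illustrative only. It relies on the fact that with a single predictor the slope t-test reported by `summary()` and the F-test used in `fastlm()` give the same p-value.

```r
## Hypothetical helper (not in the original gist): the same two quantities
## that fastlm() returns, obtained from R's built-in lm().
lm_reference = function(xx, yy)
{
  ## with the default na.action (na.omit), lm() drops incomplete pairs
  fit = lm(yy ~ xx)
  coefs = summary(fit)$coefficients
  ## single predictor: the slope t-test p-value equals the F-test p-value
  list(betahat = unname(coefs["xx", "Estimate"]),
       pval = unname(coefs["xx", "Pr(>|t|)"]))
}

## Simulated data with a few missing values (illustrative only).
set.seed(1)
x = rnorm(200)
y = 0.3 * x + rnorm(200)
x[c(5, 50)] = NA
y[c(7, 70)] = NA

ref = lm_reference(x, y)

## Compare against fastlm() when it has been sourced into the session.
if (exists("fastlm")) {
  fast = fastlm(x, y)
  stopifnot(isTRUE(all.equal(fast$betahat, ref$betahat)),
            isTRUE(all.equal(fast$pval, ref$pval)))
}
```

If the two disagree, the usual suspects are missing-value handling or a predictor with zero variance, for which `SXX` is 0 and the slope is undefined.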
## ------------------------------
## August 6, 2012
## Hae Kyung Im
## haky@uchicago.edu
##
## Department of Health Studies
## Biostatistics Laboratory
## University of Chicago
## Please cite:
## Eric R Gamazon, R. Stephanie Huang, Eileen Dolan, Nancy Cox, and Hae Kyung Im, (2012)
## Integrative Genomics: Quantifying significance of phenotype-genotype
## relationships from multiple sources of high-throughput data
## Frontiers in Genetics - under review