dmbates/rrssf.md

## rrssf.md

      
    Raw
  

              rrssf.md
            
          
    Background

In revising our 1988 Wiley book Nonlinear Regression Analysis and Its
Applications I have taken advantage of advances in computing to
evaluate projections of residual sum-of-squares contours for
3-parameter models.  By evaluating the residual sum-of-squares on a
3-dimensional grid I can create the projections onto the 2-dimensional
faces and plot contours.  I also hope to use 3-D graphics to plot the
2-dimensional boundary.  Any pointers on how to go about this would be
appreciated.  I presume I would need to generate a mesh for the surface
from the grid values and then use something like OpenGL for the
rendering.
I want to get more ambitious and evaluate other objective functions
that would involve creating and decomposing the gradient matrix of the
residual evaluation so I want to squeeze every ounce of performance
from the evaluation of the grid.
An example

As an example consider the asymptotic regression model implemented in
the SSasymp self-starting model function in R as
A + (R0-A)*exp(-exp(lrc) * t) where A is the horizontal asymptote
as the covariate t goes to infinity, R0 is the response at time
zero and lrc is the logarithm of the rate constant.  The Rumford
data set
> t(Rumford)
       1   2   3   4   5   6   7   8   9  10  11    12  13
time   4   5   7  12  14  16  20  24  28  31  34  37.5  41
temp 126 125 123 120 119 118 116 115 114 113 112 111.0 110

is an example of the type of data for which this model would be
suitable.  The least squares estimates of the parameters in the model
are
> coef(rm1 <- nls(temp ~ SSasymp(time, A, R0, lrc), Rumford))
         A         R0        lrc 
106.195070 129.111127  -3.195303 

R implementation

A pure R implementation of the grid evaluation could be
rrssf <- cmpfun(with(Rumford,
                     function(x) sum((temp-(x[1]+(x[2]-x[1]) *
                                      exp(-exp(x[3])*time)))^2)))
gr <- array(apply(expand.grid(A=Avals, R0=Rvals, lrc=lvals), 1,
              rrssf), c(length(Avals), length(Rvals), length(lvals)))

This implementation has several problems including generating the entire grid when all that is necessary is to iterate over the elements of Avals, Rvals and lvals and recalculating vectors like exp(-exp(x[3]*time) many, many times.
The ranges of parameter values to form the grid are calculated in R as
> rm1pr <- profile(rm1, alpha=pf(3*qf(0.99,3,10),1,10,low=FALSE))
> library(MASS)
> (ci <- confint(rm1pr, level=pf(3*qf(0.99,3,10),1,10)))
          0.1%      99.9%
A    98.102879 109.520821
R0  127.261533 131.194425
lrc  -3.783368  -2.805822

Evaluating a grid with 100 values in each dimension takes about 25-30 seconds on my computer.
> Avals <- seq(ci[1,1], ci[1,2], length.out=100)
> Rvals <- seq(ci[2,1], ci[2,2], length.out=100)
> lvals <- seq(ci[3,1], ci[3,2], length.out=100)
> system.time(gr <- array(apply(expand.grid(A=Avals, R0=Rvals, lrc=lvals), 1,
                               rrssf), c(length(Avals), length(Rvals), length(lvals))))
   user  system elapsed 
26.529   0.084  26.674 

Julia implementation

Apparently I have misunderstood creating a closure in Julia because I tried to do it as
function gridfgen(covariate::Vector{Float64}, response::Vector{Float64})
    szr = size(response)
    if size(covariate) != szr error("Vector sizes must match") end
    function gridrss(A::Vector{Float64}, R0::Vector{Float64}, lrc::Vector{Float64})
        szA  = size(A)
        szR  = size(R0)
        szl  = size(lrc)
        gr = Array(Float64, (szA, szR, szl))
        for l in 1:szl
            nrc  = -exp(lrc[l])
            expf = [exp(nrc * x) for x in covariate]
            for k in 1:szR
                RR = R0[k]
                rr = [response[i] - RR * (1. - expf[i]) for i in 1:szr]
                for j in 1:szA
                    aa = A[j]
                    sm = 0.
                    for i in 1:szr
                        sm += (rr[i] - aa * expf[i])^2
                    end
                    gr[j,k,l] = sm
                end
            end
        end
    end
end

but evaluating that function on the time and temp vectors from the Rumford data set does not produce a function.
julia> Rrss = gridfgen(Rtime, float64(Rtemp))

julia> typeof(Rrss)
Nothing

Suggestions are welcome.
With help from Patrick O'Leary and Stefan Karpinski I now have a version that works
function gridfgen(covariate::Vector{Float64}, response::Vector{Float64})
    szr  = size(response,1)
    cov  = copy(covariate)
    resp = copy(response)
    if size(covariate,1) != szr error("Vector sizes must match") end
    function(A::Vector{Float64}, R0::Vector{Float64}, lrc::Vector{Float64})
        szA  = size(A,1)
        szR  = size(R0,1)
        szl  = size(lrc,1)
        gr = Array(Float64, (szA, szR, szl))
        for l in 1:szl
            nrc  = -exp(lrc[l])
            expf = [exp(nrc * x) for x in cov]
            for k in 1:szR
                RR = R0[k]
                rr = [resp[i] - RR * expf[i] for i in 1:szr]
                for j in 1:szA
                    aa = A[j]
                    sm = 0.
                    for i in 1:szr
                        sm += (rr[i] - aa * (1. - expf[i]))^2
                    end
                    gr[j,k,l] = sm
                end
            end
        end
        gr
    end
end

It turns out that this is pretty fast.
julia> Rrssf = gridfgen(Rtime, float64(Rtemp))
#<function>

julia> @elapsed gr = Rrssf(Avals, Rvals, lvals)
0.250669002532959

to evaluate over the same range of values as in the R code.