## dsm2-syntax.txt
Proposal for new DSM syntax

The current syntax for dsm.fit is based on fairly old code. I think it makes sense (especially as we add more functionality) to change the way models are specified. The basic idea is to make things look more like mgcv::gam().

Currently things like the family and knots are specified using lists of lists. This involves a lot of typing and doesn't make much sense if you're used to fitting GAMs with mgcv. It becomes even more of a pain if we want to start using gamm() for mixed models (I do, anyway...).

Proposal:

dsm(formula, ddf.obj, segment.data, observation.data, engine=c("gam","gamm","glm"), convert.units=1, family=quasipoisson(link="log"), group=FALSE, ...)

Any options to be passed to gam() can be put in the ... and are passed straight through: no messing about parsing lists and strings and doing evals or any of that nonsense. The family argument is exactly as in mgcv; the value in the signature above just gives a reasonable default.
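The pass-through could look roughly like the sketch below. It is only an illustration, not the proposed implementation: dsm_sketch is a hypothetical name, the data frame is made up, and only the "glm" engine is wired up so that the example runs with base R alone.

```r
# Hypothetical sketch: forward everything in ... to the fitting function.
dsm_sketch <- function(formula, data, engine = c("gam", "gamm", "glm"),
                       family = quasipoisson(link = "log"), ...) {
  engine <- match.arg(engine)   # picks "gam" when engine is not supplied
  if (engine != "glm") {
    stop("only the glm engine is wired up in this sketch")
  }
  # no parsing of lists or strings: extra arguments go straight through
  glm(formula, family = family, data = data, ...)
}

# made-up segment data, purely for illustration
seg <- data.frame(N = c(0, 2, 1, 4, 3, 6), depth = c(5, 10, 15, 20, 25, 30))

# control= is an ordinary glm() argument, supplied here via ...
fit <- dsm_sketch(N ~ depth, data = seg, engine = "glm",
                  control = glm.control(maxit = 50))
```

A real version would dispatch to gam() or gamm() in the same way, with the same ... handed on untouched.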

The previous response argument can be put into the formula, which I think makes much more sense. So a valid formula would be something like:

abundance ~ s(x,y) + s(depth)       or        density ~ s(x,y) + s(depth)

or perhaps as a shorthand

N ~ s(x,y) + s(depth)               or        D ~ s(x,y) + s(depth)

dsm then parses the formula, constructs the response variable you asked for and fits the model.
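As a rough sketch of the parsing step, the left-hand side of the formula can be pulled out and matched against the four names above. The helper name parse_response is hypothetical, and how the response variable is then built from the data is deliberately left out, since the proposal does not spell it out.

```r
# Hypothetical helper: work out which response the user asked for.
parse_response <- function(formula) {
  if (length(formula) != 3) {
    stop("formula must have a response on the left-hand side")
  }
  resp <- deparse(formula[[2]])   # left-hand side as a string
  switch(resp,
         abundance = , N = "abundance",   # N is shorthand for abundance
         density = , D = "density",       # D is shorthand for density
         stop("response must be one of abundance, N, density, D"))
}

parse_response(N ~ s(x, y) + s(depth))
parse_response(density ~ s(x, y) + s(depth))
```

The smooth terms are never evaluated here, so this runs without mgcv loaded.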

The other result is that we drop support for supplying just a set of detection probabilities. As I said previously, this doesn't work anyway, so nothing is lost except an expectation that it would work.

As for the returned object, I think that returning the raw gam object, adding the ddf object to that list and giving the whole thing an extra class ("dsm") would let us use the gam methods we want without the current fuss.
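A minimal sketch of that idea follows. To keep it runnable with base R, a glm() fit stands in for the gam object and a plain list stands in for the ddf object; both stand-ins and the data are assumptions made for illustration only.

```r
# made-up data and a glm() stand-in for the gam fit
seg <- data.frame(N = c(0, 2, 1, 4, 3, 6), depth = c(5, 10, 15, 20, 25, 30))
fit <- glm(N ~ depth, family = quasipoisson(link = "log"), data = seg)

# placeholder for a fitted detection function object
ddf.obj <- list(note = "placeholder for the ddf object")

fit$ddf <- ddf.obj                   # store the ddf object in the fit's list
class(fit) <- c("dsm", class(fit))   # "dsm" goes first, original classes are kept

class(fit)
coef(fit)   # existing methods still dispatch on the original classes
```

Because "dsm" is prepended rather than substituted, any method we do not define for "dsm" falls through to the one for the underlying model class.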

The group option is a shortcut that sets size=1 in the observation data, giving an estimate of group abundance rather than individual abundance (at the request of Len Thomas).
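In code, the shortcut amounts to something like the following. The helper name apply_group, the object column and the observation values are hypothetical; only the size column and the group argument come from the proposal.

```r
# Hypothetical helper: with group = TRUE every observation counts as one group.
apply_group <- function(observation.data, group = FALSE) {
  if (group) {
    observation.data$size <- 1
  }
  observation.data
}

obs <- data.frame(object = 1:3, size = c(2, 5, 1))  # made-up observations
apply_group(obs, group = TRUE)$size
```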