@davidckatz
davidckatz / preferred.rotation
Last active September 13, 2019 15:51
preferred.rotation: orient a three-dimensional shape object so that major axes align with Cartesian planes
INSTRUCTIONS (see file below for R script)
This function aligns a specimen to anatomical position, or to a similar orientation chosen by the user, with the midline landmarks flush to the XY-plane. Once computed, the rotation matrix from the aligned specimen (e.g., the consensus configuration) can be used to rotate each of the observations. After all specimens are aligned, differences along the z-axis can be interpreted as changes in width, differences along the y-axis as changes in height, and differences along the x-axis as changes in length. This is particularly useful when shape variables are fit in linear models, because it makes the coefficients more interpretable. It is also useful when generating figures, as it ensures a consistent orientation.
An example of the 3D output is provided in the first Comment to this page.
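The idea of computing one rotation from a reference shape and reusing it for every observation can be sketched as follows. This is not the gist's script: the helper names midline_rotation_sketch and apply_rotation_sketch are hypothetical, the sketch assumes the midline landmarks occupy rows 1 to mids.end, and it only places the best-fit midline plane on the XY-plane (it does not make the anterior reference line horizontal).

```r
# Hypothetical helper: find a rotation that puts the best-fit plane through
# the midline landmarks onto the XY-plane (z = 0).
midline_rotation_sketch <- function(specimen, mids.end) {
  specimen <- as.matrix(specimen)
  stopifnot(ncol(specimen) == 3, mids.end >= 3, mids.end <= nrow(specimen))
  # Assumption: midline landmarks are the first mids.end rows.
  mids <- specimen[seq_len(mids.end), , drop = FALSE]
  center <- colMeans(mids)
  # Right singular vectors are the principal axes of the midline landmarks;
  # the third one is the direction of least spread, i.e. the plane normal.
  V <- svd(sweep(mids, 2, center))$v
  # Force a proper rotation (determinant +1) so shapes are not mirrored.
  if (det(V) < 0) V[, 3] <- -V[, 3]
  list(rotation = V, center = center)
}

# Hypothetical helper: apply a previously computed rotation to any specimen.
apply_rotation_sketch <- function(specimen, fit) {
  sweep(as.matrix(specimen), 2, fit$center) %*% fit$rotation
}

# Toy data: four midline landmarks on the tilted plane z = x, two lateral ones.
spec <- rbind(c(0, 0, 0), c(1, 0, 1), c(0, 1, 0), c(1, 1, 1),
              c(0.5, 0.5, 2), c(0.5, 0.5, -1))
fit <- midline_rotation_sketch(spec, mids.end = 4)
rot <- apply_rotation_sketch(spec, fit)
```

In practice the fit would be computed once on the consensus configuration and then passed to apply_rotation_sketch for each observation, so that all specimens share the same axes.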
INPUTS
1. specimen: Landmarks*dimensions matrix of shape to be rotated.
2. mids.end: Last midline landmark in specimen.
3. anterior: Anteriormost landmark point of reference for line that should be made horizonta
davidckatz / structured.dmvn
Created September 10, 2016 22:12
structured.dmvn: compute multivariate normal densities for samples with group structure (e.g., individuals within species)
INSTRUCTIONS (see file below for script)
This is a slight modification of the multivariate normal density function, dmvnorm, from R's mvtnorm package. The difference is that structured.dmvn allows means to vary by group (e.g., species or population), whereas dmvnorm evaluates all observations relative to a single grand mean. Varying means are necessary when calculating log-likelihoods of observations in structured samples.
INPUTS
1. x: matrix of observations (indivs * traits)
2. mean: matrix of means (indivs * traits)
3. sigma: covariance matrix (traits * traits)
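A minimal sketch of a density with row-specific means, using the three inputs listed above, might look like this. The function name structured_dmvn_sketch and the log argument are assumptions for illustration; the gist's own script may differ in details.

```r
# Hypothetical sketch: multivariate normal log-density where every row of x
# has its own mean vector (the matching row of `mean`).
structured_dmvn_sketch <- function(x, mean, sigma, log = TRUE) {
  x <- as.matrix(x)
  mean <- as.matrix(mean)
  stopifnot(all(dim(x) == dim(mean)),
            nrow(sigma) == ncol(x), ncol(sigma) == ncol(x))
  p <- ncol(x)
  dev <- x - mean                      # deviations from each row's own mean
  U <- chol(sigma)                     # sigma = t(U) %*% U, U upper triangular
  # Solve t(U) z = t(dev); squared column norms are Mahalanobis distances.
  z <- backsolve(U, t(dev), transpose = TRUE)
  maha <- colSums(z^2)
  logdet <- 2 * sum(log(diag(U)))      # log determinant of sigma
  logdens <- -0.5 * (p * log(2 * pi) + logdet + maha)
  if (log) logdens else exp(logdens)
}

# Toy example: three individuals in two groups, each scored against its
# own group mean.
x <- rbind(c(1.0, 2.0), c(1.2, 1.9), c(4.0, 5.1))
group <- c("a", "a", "b")
group_means <- rbind(a = c(1.1, 2.0), b = c(4.0, 5.0))
mean_mat <- group_means[group, , drop = FALSE]   # indivs * traits
sigma <- matrix(c(1, 0.3, 0.3, 1), 2, 2)
loglik <- sum(structured_dmvn_sketch(x, mean_mat, sigma))
```

Expanding the group means to an indivs * traits matrix keeps the function itself unaware of the grouping, which matches the input layout described above.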
VALUES (returns)
davidckatz / impute.pls2B
Last active September 13, 2019 15:52
Impute missing 3D landmarks using partial least squares regression
INSTRUCTIONS (see the file below for the R script)
This function imputes missing landmark data in one specimen based on the locations of the other, recorded landmarks. The predictive relationship is estimated using two-block partial least squares (PLS) analysis on a set of complete-case observations. The complete-case data are divided into two blocks: one includes all landmarks that were recorded in the incomplete specimen, the other all landmarks that were not. PLS is used to identify axes that maximize covariance between the two blocks, and these relationships are used to predict the missing landmarks in the incomplete case.
INPUTS
1. n*(p*k) matrix of observations in which row 1 is the case with missing data and the remaining rows are complete cases
2. Optionally, a vector of group identities (group.id) for all specimens, with length = nrow(X).
VALUES (returns)
1. Vector of landmark coordinates rescaled by estimated centroid size.
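The block structure described above can be illustrated with a small sketch. It is one possible way to turn paired PLS axes into a prediction (a separate least-squares slope for each pair of scores), not necessarily the method in the gist's script. The name impute_pls_sketch is hypothetical, and the sketch omits the group.id option and the centroid-size rescaling mentioned under VALUES.

```r
# Hypothetical sketch of two-block PLS imputation. Row 1 of X holds the
# incomplete case (NA in the missing coordinates); other rows are complete.
impute_pls_sketch <- function(X) {
  X <- as.matrix(X)
  miss <- is.na(X[1, ])
  stopifnot(any(miss), !all(miss), !anyNA(X[-1, , drop = FALSE]))
  complete <- X[-1, , drop = FALSE]
  A <- complete[, !miss, drop = FALSE]   # block 1: coordinates that were recorded
  B <- complete[, miss, drop = FALSE]    # block 2: coordinates that were not
  a_mean <- colMeans(A)
  b_mean <- colMeans(B)
  Ac <- sweep(A, 2, a_mean)
  Bc <- sweep(B, 2, b_mean)
  # SVD of the cross-block product gives paired axes of maximal covariance.
  s <- svd(crossprod(Ac, Bc))
  keep <- s$d > max(s$d) * 1e-8          # drop axes with no covariance
  U <- s$u[, keep, drop = FALSE]
  V <- s$v[, keep, drop = FALSE]
  As <- Ac %*% U                         # block 1 scores
  Bs <- Bc %*% V                         # block 2 scores
  # Assumption: predict each block 2 score from its paired block 1 score.
  slope <- colSums(As * Bs) / colSums(As^2)
  new_scores <- (X[1, !miss] - a_mean) %*% U
  pred <- (new_scores * slope) %*% t(V) + b_mean
  out <- X[1, ]
  out[miss] <- as.vector(pred)
  out
}

# Toy data: the two missing coordinates are exact multiples of the recorded one.
x <- c(1, 2, 3, 4, 5)
X <- rbind(c(10, NA, NA), cbind(x, 2 * x, -x))
filled <- impute_pls_sketch(X)
```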
davidckatz / distance.overland
Last active September 13, 2019 15:52
distance.overland: compute a matrix of overland, pairwise geographic distances between locations
INSTRUCTIONS (see file below for R script)
This function computes approximate overland geographic distances between all sample pairs. Where travel is between regions (more or less synonymous with intercontinental movement), the function incorporates waypoints, so that estimated distances do not unreasonably pass over large bodies of water. The function may be useful any time simple, pairwise distances between groups (or, more generally, locations) are required. For recent humans, the geographic distance estimates may be used as proxies for genetic distances, as the two are in reasonably close correspondence (Ramachandran et al., 2005). Distances are in kilometers. The Earth's curvature is accounted for using the haversine equation (Sinnott, 1984).
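A minimal sketch of the haversine calculation, and of summing legs through a waypoint, is given below. The function names haversine_km and path_km, the mean Earth radius of 6371 km, and the example coordinates (including a waypoint near Cairo) are assumptions for illustration; the gist's actual waypoints and region handling are not shown in this excerpt.

```r
# Great-circle distance in kilometers via the haversine formula.
# Assumption: spherical Earth with mean radius 6371 km.
haversine_km <- function(lat1, lon1, lat2, lon2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  # pmin guards against rounding slightly above 1 for antipodal points.
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# Hypothetical helper: length of a route given as a matrix with columns
# (latitude, longitude) and rows ordered origin, waypoints, destination.
path_km <- function(path) {
  path <- as.matrix(path)
  n <- nrow(path)
  stopifnot(n >= 2, ncol(path) == 2)
  sum(haversine_km(path[-n, 1], path[-n, 2], path[-1, 1], path[-1, 2]))
}

# Illustrative coordinates only (approximate, in decimal degrees).
nairobi <- c(-1.29, 36.82)
paris   <- c(48.86, 2.35)
cairo   <- c(30.0, 31.0)     # assumed waypoint between Africa and Eurasia

direct    <- path_km(rbind(nairobi, paris))
via_cairo <- path_km(rbind(nairobi, cairo, paris))
```

Routing through a waypoint can only lengthen the path relative to the direct great-circle distance, which is the intended effect when the direct route would cross open water.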
INPUTS
1. geoinfo: object of class data.frame, of dimension groups*4. The columns must be arranged in the following order: first column, latitude; second column, longitude; third column, group name or location identifier; fourth column, region (see NOTES