S4 is one of R's formal systems for object-oriented programming. You define an S4 class based on the data you want it to hold, and you can then create many instances (objects) of that class that hold different data but all adhere to the same specification and can take advantage of the same functions.
The concepts of class and method are crucial to understanding object-oriented programming:
- A class defines the behavior of objects in two ways:
  - It describes their attributes
  - It describes their relationships to other classes
- Methods are functions that behave differently depending on the class of their input. For example, the `plot` method functions differently depending on whether you tell it to plot a raster, a time series, etc.
S4 in particular is a system that features a few formalities compared to other class systems in R:
- Formal class definitions describing representation and inheritance for each class
- Special helper functions that define generics and methods
- Multiple dispatch: the method that gets called can depend on the classes of several arguments, not just the first one
S4 classes are often overkill for what R programmers need to do, but they are good for "more complicated systems of interrelated objects".
The core components of an S4 implementation are:
- classes
- generic functions
- methods
And those things are "glued" together by method dispatch (of the "simple" and the "multiple" varieties).
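As a rough illustration of how a generic, its methods, and multiple dispatch fit together, here is a minimal sketch. The classes `Cat` and `Dog` and the generic `meet` are made up for this example and do not come from any package.

```r
library(methods)

# Hypothetical classes, used only for illustration
setClass("Cat", slots = c(name = "character"))
setClass("Dog", slots = c(name = "character"))

# The generic reserves the name; both x and y take part in dispatch
setGeneric("meet", function(x, y) standardGeneric("meet"))

# One method per combination of argument classes
setMethod("meet", signature(x = "Cat", y = "Dog"),
  function(x, y) paste(x@name, "hisses at", y@name))
setMethod("meet", signature(x = "Dog", y = "Cat"),
  function(x, y) paste(x@name, "chases", y@name))

# "ANY" matches every class, so this method acts as a fallback
setMethod("meet", signature(x = "ANY", y = "ANY"),
  function(x, y) "nothing happens")

tom <- new("Cat", name = "Tom")
rex <- new("Dog", name = "Rex")

res_cat_dog <- meet(tom, rex)  # dispatches on (Cat, Dog)
res_dog_cat <- meet(rex, tom)  # dispatches on (Dog, Cat)
res_other   <- meet(1, "a")    # falls back to (ANY, ANY)
```

With "simple" dispatch only the class of one argument would matter; here swapping the two arguments selects a different method.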
- Create and define an S4 class using `setClass()`
  - This ensures consistency by requiring metadata for all parts of the new class
  - Subsequent assignments to class slots are checked for type agreement
- Create objects of an S4 class using the constructor function `new()`, or the generator function that `setClass()` returns, and store the result with ordinary `<-` assignment
- Inspect class structure with `str()`
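The steps above can be sketched roughly as follows. The `Person` class mirrors the one used later in these notes; the age value is made up for the example.

```r
library(methods)

# Define the class: every slot gets a name and a type
setClass("Person", slots = c(name = "character", age = "numeric"))

# Create an object with new() and keep it with ordinary assignment
hadley <- new("Person", name = "Hadley", age = 31)

# Inspect the structure: prints the class name and each slot
str(hadley)

# Assigning a value of the wrong type to a slot raises an error
bad_assign <- tryCatch({
  hadley@age <- "thirty-one"
  "no error"
}, error = function(e) "error")
```

Because the failed assignment stops before the object is modified, `hadley@age` still holds the original numeric value afterwards.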
A class has three key properties:
- name: a string that identifies the class
- contains: a character vector of classes that your S4 class inherits from. The concept of inheritance is a little unclear to me...
  - To inherit S4 classes from S3 classes you must first register the S3 class with `setOldClass()`
  - I don't understand why you don't need to use `contains` on your basic types like `numeric` and `character`...
- representation: deprecated; use the `slots` argument instead
- slots: a named character vector (or list) giving the slot names and their classes. For example, `slots = c(name = 'character', age = 'numeric')`
  - Use `getSlots()` to return a description of all slots of a class
  - Use the `slot(object, 'slotname')` function to access a slot of an S4 object
  - Alternatively, use `@` in place of `$` to access S4 object slots, like so:
    # First create the classes:
    setClass('Person', slots = c(name = 'character', age = 'numeric'))
    setClass('Employee', contains = 'Person', slots = c(boss = 'Person'))
    # Then create an object, filling in some of the slots:
    hadley <- new('Person', name = 'Hadley')
    # Slots that were not supplied hold an empty prototype, here numeric(0):
    hadley@age
- But it's considered a poor idea to use `@` in place of a properly written accessor
  - This speaks to the need to write proper accessors: it is on the programmer to write methods for every possible user need
  - The developer shouldn't expect the user to directly access slots
  - The user shouldn't expect to have to access slots directly
  - Thus when I write an S4 class I need to have read and write accessor methods for each slot, and the write accessor should also do validity checking
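A read/write accessor pair along these lines might look like the sketch below. The generic names `age` and `age<-` and the specific check on the value are assumptions chosen for illustration, not an established convention from any package.

```r
library(methods)

setClass("Person", slots = c(name = "character", age = "numeric"))

# Read accessor ("getter")
setGeneric("age", function(obj) standardGeneric("age"))
setMethod("age", "Person", function(obj) obj@age)

# Write accessor ("setter"): replacement functions are named `name<-`
# and must take the new value in an argument called `value`
setGeneric("age<-", function(obj, value) standardGeneric("age<-"))
setMethod("age<-", "Person", function(obj, value) {
  # Hypothetical validity rule: a single non-negative number
  if (!is.numeric(value) || length(value) != 1 || is.na(value) || value < 0) {
    stop("age must be a single non-negative number")
  }
  obj@age <- value
  obj  # a replacement function returns the modified object
})

hadley <- new("Person", name = "Hadley", age = 31)
age(hadley) <- 32
current_age <- age(hadley)

# The setter refuses an invalid value, leaving the object unchanged
rejected <- tryCatch({
  age(hadley) <- -5
  FALSE
}, error = function(e) TRUE)
```

Users of the class then only ever call `age(x)` and `age(x) <- value`, so the slot layout can change later without breaking their code.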
- Use `showMethods('functionname')` or `showMethods(classes = 'classname')` to find out what methods are defined for a function or class, respectively
- Define a new method for use with your new class:
  - First reserve the name for your method using `setGeneric`
  - Then use `setMethod` like so:
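    # Reserve the name, then attach a method for the Person class
    setGeneric('age_squared', function(the_person) standardGeneric('age_squared'))
    setMethod('age_squared', signature(the_person = 'Person'),
      function(the_person) {
        # Return the squared age (only declared slots can be assigned to)
        the_person@age * the_person@age
      }
    )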
- Or if you want to use an existing generic with your new class you can just use the `setMethod` step
- "Getter" and "setter" methods may resemble each other:

    width(ir) <- width(ir) - 5

- A "constructor" function can have the same name as the class and set all the slots from its function arguments. This makes it easy to document the creation of new objects. It also explains why doing something like `?lme4::lmer` returns several different suggestions corresponding to the function and the corresponding class(es). So really, when you think about it, a call to `lmer()` that I think of as creating a model fit is in effect an S4 class constructor.
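A constructor of this kind can be sketched as an ordinary function that wraps `new()`. The default for `age` and the coercion with `as.numeric()` are choices made for this example only.

```r
library(methods)

setClass("Person", slots = c(name = "character", age = "numeric"))

# Constructor function with the same name as the class.
# It gives users a documented, argument-checked way to build objects
# without calling new() themselves.
Person <- function(name, age = NA_real_) {
  new("Person", name = name, age = as.numeric(age))
}

hadley <- Person("Hadley", age = 31)
anon   <- Person("Unknown")  # age falls back to NA
```

Since `Person` is a plain function, its help page can document the arguments in the usual way, which is what makes this pattern convenient for package authors.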
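Use the `setValidity()` function to define validity checks, and define an `initialize()` method to check for that validity when creating objects of the class using `new()`. Note that the default `initialize()` method already calls `validObject()` when slots are supplied to `new()`.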
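A minimal sketch of a validity function follows. The rules it enforces (a length-one name and a single non-negative age) are assumptions for the example; a real class would encode its own constraints.

```r
library(methods)

setClass("Person", slots = c(name = "character", age = "numeric"))

# A validity function returns TRUE, or a character vector describing the problems
setValidity("Person", function(object) {
  problems <- character()
  if (length(object@name) != 1) {
    problems <- c(problems, "name must have length 1")
  }
  if (length(object@age) != 1 || is.na(object@age) || object@age < 0) {
    problems <- c(problems, "age must be a single non-negative number")
  }
  if (length(problems) == 0) TRUE else problems
})

# new() runs the validity function when slots are supplied
ok <- new("Person", name = "Hadley", age = 31)
creation_failed <- tryCatch({
  new("Person", name = "Hadley", age = -1)
  FALSE
}, error = function(e) TRUE)

# Direct slot assignment only checks the type, so re-check explicitly
ok@age <- -1
recheck_failed <- tryCatch({
  validObject(ok)
  FALSE
}, error = function(e) TRUE)
```

The last step shows why write accessors should call `validObject()` or do their own checks: assigning through `@` alone lets an invalid value slip in as long as its type matches.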
- OO Field Guide in Hadley's Advanced R ebook
- Bioconductor's resources for learning S4
- Cyclismo.org S4 class tutorial
- Documenting S4 classes with roxygen2
- Bioconductor S4 Overview slides: a good source on how S4 classes work, covering how to define constructors, coercion methods, etc.
- How S4 methods work