@agstudy
Created June 15, 2015 13:33
---
title: "Data/Structure Validation"
author: "agstudy"
---
This post answers a [Stack Overflow question](http://stackoverflow.com/questions/30844363/data-structure-validation-for-r#comment49735091_30844363) about creating a well-typed data structure in R.
I think that in R the only way to define a typed data structure is to use an `S4 class`. I should note that even S4 classes are *not strongly typed*, since you can declare a slot as `list`.
I create an S4 class **TypedData**:
* I define 4 slots and set the type of each one. S4 then checks for us that any object we create respects this typing.
* Then I add a validity function to check the slots against some conditions. Here, for example, age and weight should be positive values.
* You can also give some slots default values.
```{r}
# Create the TypedData class
TypedData <- setClass(
  # Set the name for the class
  "TypedData",
  # Define the slots and their types
  slots = c(
    date   = "Date",
    age    = "numeric",
    weight = "numeric",
    job    = "character"
  ),
  # Set the default values for the slots
  prototype = list(
    job = NA_character_
  ),
  # Check that the data is consistent: return TRUE or an error message
  validity = function(object) {
    if (length(object@age) == 0 || any(object@age <= 0))
      return("Age should be > 0")
    if (length(object@weight) == 0 || any(object@weight <= 0))
      return("Weight should be > 0")
    TRUE
  }
)
```
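To see the two kinds of checks at work, here is a minimal sketch. It assumes a reduced, hypothetical stand-in class `TypedDemo` (only `age` and `job`) and a hypothetical helper `error_of()`, neither of which is part of the original answer; they only keep the chunk runnable on its own. It also illustrates that assigning to a slot checks the type but does not re-run the validity function, so `validObject()` has to be called explicitly.

```r
library(methods)

# Hypothetical, reduced stand-in for TypedData so this chunk runs on its own.
setClass(
  "TypedDemo",
  slots = c(age = "numeric", job = "character"),
  prototype = list(job = NA_character_),
  validity = function(object) {
    if (length(object@age) == 0 || any(object@age <= 0))
      return("Age should be > 0")
    TRUE
  }
)

# Hypothetical helper: evaluate an expression and return its error message,
# or NA if it succeeds.
error_of <- function(expr) {
  tryCatch({ expr; NA_character_ }, error = function(e) conditionMessage(e))
}

ok        <- error_of(new("TypedDemo", age = c(20, 30)))  # valid object
bad_type  <- error_of(new("TypedDemo", age = "twenty"))   # wrong slot type
bad_value <- error_of(new("TypedDemo", age = c(20, -1)))  # validity fails

# Slot assignment checks the type only; the validity function is not re-run
# until validObject() is called.
p <- new("TypedDemo", age = 20)
p@age <- -5
after_assign <- error_of(validObject(p))

print(c(ok = ok, bad_type = bad_type, bad_value = bad_value,
        after_assign = after_assign))
```

The first call returns `NA` (no error), while the other three return the message raised by the S4 machinery.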
Implement an S3 `as.data.frame` method to convert the S4 object to a `data.frame`:
```{r}
as.data.frame.TypedData <- function(x, row.names = NULL, optional = FALSE, ...) {
  n <- length(x@date)
  value <- setNames(
    lapply(slotNames(x), function(y) {
      col <- slot(x, y)
      # Pad slots shorter than `date` (e.g. the default `job`) with NA,
      # keeping the slot's own type
      if (length(col) < n)
        col <- col[seq_len(n)]
      col
    }),
    slotNames(x)
  )
  attr(value, "row.names") <- as.character(seq_len(n))
  class(value) <- "data.frame"
  value
}
```
Create some data.
I use the vectorized form of the S4 class to create an example: each slot holds a whole column rather than a single value.
It is better for performance to manipulate columns.
```{r}
pers <- new("TypedData",
            date   = seq(as.Date("2015/1/1"), as.Date("2015/3/1"), by = "months"),
            age    = c(20, 30, 50),
            weight = c(80, 50, 64))
pers
```
Now we convert the S4 object to a data.frame.
We check that the result is well typed.
```{r}
as.data.frame(pers)
str(as.data.frame(pers))
```
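Instead of reading the `str()` output by eye, the typing can also be checked programmatically. The sketch below assumes a hypothetical two-slot stand-in class `TypedMini` and a reduced conversion method, both introduced here only so that the chunk runs on its own; it compares the class of each column with the class declared for the matching slot.

```r
library(methods)

# Hypothetical two-slot stand-in for TypedData so this chunk runs on its own.
setClass("TypedMini",
         slots = c(date = "Date", job = "character"),
         prototype = list(job = NA_character_))

# Reduced conversion method in the same spirit as as.data.frame.TypedData:
# one column per slot, short slots padded with NA of their own type.
as.data.frame.TypedMini <- function(x, row.names = NULL, optional = FALSE, ...) {
  n <- length(x@date)
  cols <- lapply(slotNames(x), function(s) slot(x, s)[seq_len(n)])
  structure(setNames(cols, slotNames(x)),
            row.names = as.character(seq_len(n)),
            class = "data.frame")
}

p  <- new("TypedMini", date = as.Date(c("2015-01-01", "2015-02-01")))
df <- as.data.frame(p)

# Classes declared in the class definition vs. classes found in the columns.
declared  <- getSlots("TypedMini")
actual    <- vapply(df, function(col) class(col)[1], character(1))
all_match <- all(actual == declared)

print(rbind(declared, actual))
```

If `all_match` is `TRUE`, every column kept the type declared for its slot, including the `job` column that was filled from its default value.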