Created
September 9, 2012 22:33
Using R in Insurance GIRO 2012
Using R in Insurance
==============================
```{r results='asis', echo=FALSE, message=FALSE}
library(ChainLadder)
library(googleVis)
data(RAA) # example data set of the ChainLadder package
RAA1 <- RAA # work on a copy so that RAA itself keeps its triangle class
class(RAA1) <- "matrix" # change the class from triangle to matrix
df <- as.data.frame(t(RAA1)) # coerce triangle into a data.frame
names(df) <- 2002 : 2011
df$dev <- 1:10
LC <- gvisLineChart(df, "dev", options=list(gvis.editor="Edit me!",
                    title="Incurred claims",
                    hAxis='{title:"Development year"}',
                    width=600, height=350))
print(LC, 'chart')
```
Markus Gesmann, GIRO Brussels
19 September 2012
# Hello - About me
<ul>
<li>Name: [Markus Gesmann](https://plus.google.com/118201313972528070577/posts)</li>
<li> Profession: Mathematician, working as an analyst at [Lloyd's](http://www.lloyds.com)</li>
<li>Maintainer and co-author of two R packages:
<ul>
<li>[ChainLadder](http://code.google.com/p/chainladder/): Statistical methods for the calculation of outstanding claims reserves in general insurance</li>
<li> [googleVis](http://code.google.com/p/google-motion-charts-with-r/): Interface between R and the Google Visualisation API</li>
</ul>
</li>
<li> Blogger: [mages' blog](http://lamages.blogspot.com)</li>
</ul>
# Agenda
<ul>
<li>My motivation for using R</li>
<li>Brief history of R and its way into insurance</li>
<li>Three R examples</li>
<li>R for actuaries: Where to start?</li>
<li>How I created this presentation with R</li>
<li>Conclusions and discussion</li>
</ul>
# Why I started using R
<ul>
<li> I started in insurance in 2003, fresh out of university</li>
<li> I had used a variety of software, tools and languages already</li>
<li> But I had no experience with spreadsheets</li>
<li> I was surprised how people worked in insurance</li>
<li> I was really surprised what colleagues did with spreadsheets</li>
<li> I looked for alternative data analysis tools</li>
<li> All the cool kids were talking about R</li>
<li> I wanted to be cool as well</li>
</ul>
<a href="http://lamages.blogspot.com">
<img src="http://2.bp.blogspot.com/-NAMnviVPFnw/Tw_OJRa7kII/AAAAAAAAAIo/ERYyvM2dV84/s1600/photo-1.JPG" alt="mages blog" /></a>
# What is R?
<img src="http://www.r-project.org/hpgraphic.png" alt="R project" />
<ul>
<li> <a href="http://www.r-project.org">R</a> is a free software environment for statistical computing and graphics </li>
<li> It compiles and runs on a wide variety of UNIX platforms, Windows and OS X</li>
<li> <a href="http://www.burns-stat.com/pages/Present/infernoishR_annotated.pdf">Notes on the history of R by Pat Burns</a></li>
</ul>
# R made its way from academia to industry
<ul>
<li> R started at universities in the <a href="http://www.stat.auckland.ac.nz/~ihaka/downloads/R-paper.pdf">1990s</a></li>
<li> It is still largely maintained by academics</li>
<li> R is widely used for research publications</li>
<li> <a href="http://lamages.blogspot.co.uk/2011/10/r-related-books-traditional-vs-online.html">Over 100 textbooks using R have been published already</a></li>
<li> Today many graduates leave university with some R knowledge </li>
</ul>
# R made its way from academia to industry
<ul>
<li> Many commercial applications support R:
<ul>
<li><a href="http://www.oracle.com/us/corporate/features/features-oracle-r-enterprise-498732.html">Oracle</a>, <a href="http://help.sap.com/hana/hana_dev_r_emb_en.pdf">SAP Hana</a>, <a href="http://support.sas.com/rnd/app/studio/Rinterface2.html">SAS</a>, <a href="http://www-01.ibm.com/software/analytics/spss/products/statistics/developer/">IBM SPSS</a>, <a href="http://spotfire.tibco.com/~/media/content-center/datasheets/r-splus.ashx">Spotfire</a>, <a href="http://www.statconn.com">MS Office</a>, <a href="http://community.qlikview.com/docs/DOC-2975">QlikView</a>, etc.
</li>
</ul></li>
<li> Commercial support is available from third parties, e.g.
<ul>
<li><a href="http://www.mango-solutions.com">Mango Solutions</a>, <a href="http://www.revolutionanalytics.com">Revolution Analytics</a>, <a href="http://www.trinostics.com">Trinostics LLC</a>, <a href="http://www.burns-stat.com">Burns Statistics</a></li>
<li>Or contact me</li>
</ul></li>
<li> R is established in many disciplines outside academia, e.g. <a href="http://cran.r-project.org/web/views/ClinicalTrials.html">pharma</a> and <a href="http://cran.r-project.org/web/views/Finance.html">finance</a></li>
<li> The insurance industry is adopting R as well, e.g. <a href="https://docs.google.com/open?id=0By35Mtg9R9_RZTk5NzM1NGItYmUzMi00MmQ2LTk0MWYtYTY4YzZiNjg0ODc2">Lloyd's</a></li>
</ul>
# Why R in insurance?
<ul>
<li> Why Excel, SAS, SQL, SPSS, Minitab, ...? </li>
<li> [John D. Cook](http://www.johndcook.com/blog/): Why and how people use R
<div align="center">
<video controls="" height="240" poster="http://ch9files.blob.core.windows.net/ch9/3f13/009c002a-e4ef-4aeb-95de-e7ecda173f13/LangNEXT2012JohnCookRLanguage_Custom.jpg" width="320"><source src="http://ak.channel9.msdn.com/ch9/3f13/009c002a-e4ef-4aeb-95de-e7ecda173f13/LangNEXT2012JohnCookRLanguage_mid.mp4" type="video/mp4"></source><source src="http://ak.channel9.msdn.com/ch9/3f13/009c002a-e4ef-4aeb-95de-e7ecda173f13/LangNEXT2012JohnCookRLanguage.webm" type="video/webm"></source></video></div></li>
<li>Because it gets the job done!</li>
</ul>
# But why use a computing language?
<img src="https://lh5.googleusercontent.com/-0mb8ktQfQjE/TwU72nysQaI/AAAAAAAAI1g/EVOnQXhQsuM/s762/geeks-vs-nongeeks-repetitive-tasks.png" alt="Geeks vs Non-Geeks" />
By <a href="https://plus.google.com/102451193315916178828/posts/MGxauXypb1Y">Bruno Oliveira</a>
# Typical use cases for R in insurance
<ul>
<li> Data transformation</li>
<li> Data analysis</li>
<li> Statistical modelling </li>
<li> Prototyping / ad-hoc work </li>
<li> End user computing </li>
<li> Background statistical engine for applications, e.g. pricing spreadsheet</li>
<li> Reporting and reproducible analysis, e.g. MI, Solvency II documentation </li>
<li> Learning statistical and actuarial skills </li>
</ul>
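As an illustration of the data transformation use case, here is a minimal base-R sketch that turns a table of claim payments into a cumulative development triangle. The `claims` records and their column names are made up for the example; real data would of course look messier.

```{r}
## Hypothetical payment records: one row per origin year and development year
claims <- data.frame(origin = c(2010, 2010, 2010, 2011, 2011, 2012),
                     dev    = c(1, 2, 3, 1, 2, 1),
                     paid   = c(100, 50, 15, 110, 50, 120))

## Incremental triangle: sum payments by origin and development year;
## combinations that have not been observed yet come out as NA
inc <- tapply(claims$paid, list(origin = claims$origin, dev = claims$dev), sum)

## Cumulative triangle: running total along each origin year
cum <- t(apply(inc, 1, cumsum))
cum
```

The NA cells in the lower right are the future payments that a reserving method has to estimate.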
# Three R examples
<ul>
<li>Reserving: Mack chain-ladder</li>
<li>Automated reporting: Create PowerPoint slide with R output</li>
<li>Extracting data from a web page: Display earthquakes of the last 30 days</li>
</ul>
# Reserving: Mack chain-ladder
```{r}
library(ChainLadder)
RAA ## Example triangle
```
# Reserving: Mack chain-ladder
```{r results='asis', echo=FALSE, message=FALSE}
RAA2 <- RAA # example data set of the ChainLadder package
class(RAA2) <- "matrix" # change the class from triangle to matrix
df <- as.data.frame(t(RAA2)) # coerce triangle into a data.frame
names(df) <- 2002 : 2011
df$dev <- 1:10
LC <- gvisLineChart(df, "dev", options=list(gvis.editor="Edit me!",
                    title="Incurred claims",
                    hAxis='{title:"Development year"}',
                    width=800, height=500))
print(LC, 'chart')
```
Chart created with <a href="http://code.google.com/p/google-motion-charts-with-r/">googleVis</a>
# Reserving: Mack chain-ladder
```{r}
M <- MackChainLadder(RAA, est.sigma="Mack")
M
```
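Mack's method keeps the classical chain-ladder point estimates and adds standard errors around them. To show where those point estimates come from, here is a minimal base-R sketch of the volume-weighted chain-ladder calculation on a made-up 3x3 cumulative triangle. The figures are illustrative and not taken from RAA, and the sketch does not attempt the standard-error part of `MackChainLadder`.

```{r}
## Made-up cumulative claims triangle (rows: origin years, columns: development years)
tri <- matrix(c(100, 150, 165,
                110, 160,  NA,
                120,  NA,  NA),
              nrow = 3, byrow = TRUE,
              dimnames = list(origin = 1:3, dev = 1:3))
n <- ncol(tri)

## Volume-weighted development factors: ratio of column sums, using only
## the origin years that have been observed in both development years
f <- sapply(seq_len(n - 1), function(k) {
  rows <- seq_len(n - k)
  sum(tri[rows, k + 1]) / sum(tri[rows, k])
})

## Fill the lower right of the triangle by rolling the factors forward
full <- tri
for (k in seq_len(n - 1)) {
  missing <- is.na(full[, k + 1])
  full[missing, k + 1] <- full[missing, k] * f[k]
}

## Reserve = projected ultimate minus latest observed value per origin year
ultimate <- full[, n]
latest   <- sapply(seq_len(n), function(i) tri[i, n - i + 1])
reserve  <- ultimate - latest

round(f, 3)
round(reserve, 1)
```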
# Reserving: Mack chain-ladder
```{r eval=FALSE}
plot(M, lattice=TRUE)
```
<img src="http://1.bp.blogspot.com/-0dYQffHdShk/Tr7mZibyQGI/AAAAAAAAAEo/hjb7XhrO_Mc/s1600/MackLattice.png" alt="Mack chain ladder output" />
# Automated reporting: Create PowerPoint slide with R output
```{r eval=FALSE, tidy=FALSE}
## Save the plot as a Windows metafile (Windows only)
myfile <- tempfile()
win.metafile(file=myfile)
plot(M, lattice=TRUE)
dev.off()
## Load MS Office interface
library(rcom)
## Drive PowerPoint from R via COM, using the same object model as VBA
ppt <- comCreateObject("Powerpoint.Application")
comSetProperty(ppt, "Visible", TRUE)
myPresColl <- comGetProperty(ppt, "Presentations")
myPres <- comInvoke(myPresColl, "Add")
mySlides <- comGetProperty(myPres, "Slides")
## Add slide 1 with layout 12 (ppLayoutBlank)
mySlide <- comInvoke(mySlides, "Add", 1, 12)
myShapes <- comGetProperty(mySlide, "Shapes")
## AddPicture(FileName, LinkToFile, SaveWithDocument, Left, Top)
myPicture <- comInvoke(myShapes, "AddPicture",
                       myfile, 0, 1, 100, 10)
```
# Extracting data from a web page
```{r results='asis', tidy=FALSE}
library(XML)
library(googleVis)
## Source data directly from the web
url <- "http://www.iris.edu/seismon/last30.html"
eq <- readHTMLTable(readLines(url),
                    colClasses=c("factor", rep("numeric", 4), "factor"), which=2)
## Format location data as "latitude:longitude"
eq$loc <- paste(eq$LAT, eq$LON, sep=":")
```
<div align="center">
```{r results='asis' , echo=FALSE, message=FALSE}
tbl <- gvisTable(eq, options=list(width=800, height=250))
print(tbl, 'chart')
```
</div>
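The `loc` column built above uses the `latitude:longitude` string format that `gvisGeoChart` accepts for marker locations. A minimal sketch of that formatting step on a made-up data frame follows; the three rows and the `format_loc` helper are hypothetical and not taken from the IRIS page. The range check is there to catch swapped or badly parsed coordinates before they reach the chart.

```{r}
## Hypothetical sample rows; real values would come from the scraped table
quakes <- data.frame(LAT = c(35.7, -33.4, 61.2),
                     LON = c(139.7, -70.6, -149.9),
                     MAG = c(5.1, 4.8, 6.0))

## Build "lat:long" strings after checking that the coordinates are plausible
format_loc <- function(lat, lon) {
  stopifnot(is.numeric(lat), is.numeric(lon),
            all(abs(lat) <= 90), all(abs(lon) <= 180))
  paste(lat, lon, sep = ":")
}

quakes$loc <- format_loc(quakes$LAT, quakes$LON)
quakes
```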
# Display earthquake information of the last 30 days
```{r tidy=FALSE}
library(googleVis)
## Create a geo chart with the Google Chart API
G <- gvisGeoChart(eq, "loc", "DEPTH km", "MAG",
                  options=list(displayMode="Markers",
                  colorAxis="{colors:['purple', 'red', 'orange', 'grey']}",
                  backgroundColor="lightblue"), chartid="EQ")
```
```{r eval=FALSE}
plot(G)
```
<div align="center">
```{r results='asis' , echo=FALSE, message=FALSE}
print(G, 'chart')
```
</div>
# Getting started with R in actuarial work
<ul>
<li><a href="http://toolkit.pbwiki.com/f/R%20Examples%20for%20Actuaries%20v0.1-1.pdf">Introduction to R for Actuaries</a> by Nigel de Silva</li>
<li><a href="http://www1.fee.uva.nl/ke/act/people/kaas/ModernART.htm">Modern actuarial risk theory using R</a> by Kaas, Goovaerts, Dhaene and Denuit</li>
<li><a href="http://www.slideshare.net/dataspora/an-interactive-introduction-to-r-programming-language-for-statistics">An Interactive Introduction To R</a> by Michael Driscoll and Dan Murphy</li>
<li><a href="http://www.favir.net/">Formatted Actuarial Vignettes in R</a> by Ben Escoto</li>
<li><a href="http://www.actuaries.org.uk/system/files/documents/pdf/actuarial-toolkit.pdf">An Actuarial Toolkit</a> presented at GIRO convention 2006 in Vienna</li>
<li><a href="http://lamages.blogspot.co.uk/2011/09/r-and-insurance.html">Using R at Lloyd's</a> poster at UseR! conference 2011 in Warwick</li>
<li><a href="http://www.r-bloggers.com">R-Bloggers</a></li>
</ul>
# R packages for actuaries on CRAN
<ul>
<li><a href="http://cran.r-project.org/web/packages/actuar/index.html">actuar:</a> Loss distributions modelling, risk theory (including ruin theory), simulation of compound hierarchical models and credibility theory</li>
<li><a href="http://cran.r-project.org/web/packages/ChainLadder/index.html">ChainLadder:</a> Reserving methods in R</li>
<li><a href="http://cran.r-project.org/web/packages/copula/index.html">copula:</a> Multivariate Dependence with Copulas</li>
<li><a href="http://cran.r-project.org/web/packages/cplm/index.html">cplm:</a> Monte Carlo EM algorithms and Bayesian methods for fitting Tweedie compound Poisson linear models</li>
<li><a href="http://cran.r-project.org/web/packages/evir/index.html">evir:</a> Extreme Values in R</li>
<li><a href="http://cran.r-project.org/web/packages/fitdistrplus/index.html">fitdistrplus:</a> Help to fit of a parametric distribution to non-censored or censored data</li>
<li><a href="http://cran.r-project.org/web/packages/lifecontingencies/index.html">lifecontingencies:</a> Package to perform actuarial evaluation of life contingencies</li>
<li><a href="http://cran.r-project.org/web/packages/lossDev/index.html">lossDev:</a> A Bayesian time series loss development model</li>
<li><a href="http://cran.r-project.org/web/packages/mondate/index.html">mondate:</a> R package to keep track of dates in terms of months</li>
</ul>
# Meet the R experts
<table>
<tr>
<td>
<ul>
<li><a href="https://stat.ethz.ch/mailman/listinfo/r-sig-insurance">R special interest group insurance email list</a> </li>
<li> <a href="http://www.londonr.org">London R user group</a></li>
<li> <a href="https://www.rmetrics.org">R/Rmetrics Meielisalp Workshop on Computational Finance and Financial Engineering</a></li>
<li> <a href="http://www.rinfinance.com">R in Finance, Chicago</a></li>
<li> <a href="http://www3.uclm.es/congresos/useR-2013/">UseR! 2013, University of Castilla-La Mancha, Spain</a> </li>
</ul>
</td><td>
<img height="350px" src="http://photos1.meetupstatic.com/photos/event/7/a/highres_133320122.jpeg" alt="London R photo" />
London R user group meeting
</td>
</tr>
</table>
# How I created this presentation with RStudio, knitr, pandoc and slidy
<ul>
<li> [knitr](http://yihui.name/knitr/) is a package by [Yihui Xie](http://yihui.name/) that brings literate programming to a new level
<ul>
<li>It lets you create content really quickly, without worrying too much about layout and R formatting</li>
</ul></li>
<li> [RStudio](http://rstudio.org) has integrated knitr into its IDE, so Rmd files can be knitted into markdown at the push of a button</li>
<li> Markdown output can be converted into several other file formats, such as html, with [pandoc](http://johnmacfarlane.net/pandoc/)</li>
<li> [slidy](http://www.w3.org/Talks/Tools/Slidy2/Overview.html) is one of the options to create interactive html slides with pandoc</li>
<li> For more details see my recent [blog post](http://lamages.blogspot.co.uk/2012/05/interactive-reports-in-r-with-knitr-and.html) and the [source code of this talk](https://gist.github.com/3687713).</li>
</ul>
<code><pre>
Rscript -e "library(knitr); knit('Using_R_in_Insurance_GIRO_2012.Rmd')"
pandoc -s -S -i -t slidy --mathjax Using_R_in_Insurance_GIRO_2012.md \
  -o Using_R_in_Insurance_GIRO_2012.html
</pre></code>
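The same two steps can also be driven from a single R session. This is a sketch under assumptions: `pandoc_args` is a hypothetical helper that only rebuilds the pandoc flags shown above, and the commented calls assume that knitr is installed and pandoc is on the PATH.

```{r}
## Hypothetical helper: pandoc arguments for a slidy deck, same flags as above
pandoc_args <- function(md_file) {
  html_file <- sub("\\.md$", ".html", md_file)
  c("-s", "-S", "-i", "-t", "slidy", "--mathjax", md_file, "-o", html_file)
}

pandoc_args("Using_R_in_Insurance_GIRO_2012.md")

## Not run here: knit the Rmd file, then hand the markdown output to pandoc
## md <- knitr::knit("Using_R_in_Insurance_GIRO_2012.Rmd")
## system2("pandoc", pandoc_args(md))
```

Keeping the flags in one function means the command line and any scripted build stay in step.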
# Conclusions
<ul>
<li> R comes with lots of functions for actuarial work</li>
<li> It provides an ideal framework for end user computing</li>
<li> The momentum behind R has grown significantly over the last 5 years</li>
<li> Today many graduates know R: be open to their ideas</li>
<li> Many other software products have developed R interfaces </li>
<li> New business models have evolved and will evolve</li>
</ul>
# If you liked this presentation ...
... you may also like:
<ul>
<li><a href="http://lamages.blogspot.co.uk/2012/09/interactive-web-graphs-with-r-overview.html">Interactive web graphs with R - Overview and googleVis tutorial</a>, Royal Statistical Society Conference, 2012</li>
<li><a href="http://www.rinfinance.com/agenda/2012/talk/MarkusGesmann.pdf">Overview of Lloyd's using R and googleVis</a>, R in Finance, 2012</li>
<li><a href="http://chainladder.googlecode.com/files/ChainLadder_Markus_20010Nov10.pdf">ChainLadder at the Predictive Modelling Seminar</a>, Institute of Actuaries, 2010 </li>
<li><a href="http://code.google.com/p/chainladder/downloads/detail?name=R_and_MS_Office-MG-20100504.pdf">How to integrate R into MS Office</a>, LondonR, 2010 </li>
<li><a href="http://lamages.blogspot.co.uk/2011/12/fitting-distributions-with-r.html">Fitting distributions with R</a></li>
<li><a href="http://lamages.blogspot.co.uk/2012/01/say-it-in-r-with-by-apply-and-friends.html">Say it in R with "by", "apply" and friends</a> </li>
</ul>
# Questions?
<ul>
<li> Idea: R in Insurance Workshop - Interest? </li>
<li>Contact: markus dot gesmann at gmail dot com</li>
</ul>