---
title: "geom_col vs geom_bar"
author: "Martin Monkman"
date: "2020/04/19"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
---
title: "Transform Data"
subtitle: "hands-on examples, with answers"
output: html_notebook
---
<!-- This file by Charlotte Wickham (with some modifications by Martin Monkman) is licensed under a Creative Commons Attribution 4.0 International License, adapted from the original work at https://github.com/rstudio/master-the-tidyverse by RStudio and https://github.com/cwickham/data-science-in-tidyverse-solutions. -->
```{r setup}
library(tidyverse)
```
### ---
#
# from @expersso
set.seed(894) # number of regular season NHL goals Wayne Gretzky scored
x <- replicate(10000, sum(sample(0:1, 20, TRUE, c(0.945, 0.055))))
table(ifelse(x == 0, "Team A win", ifelse(x == 1, "Draw", "Team B win"))) / 100
#
# for details see
# http://bayesball.blogspot.ca/2013/06/annotating-select-points-on-x-y-plot.html
#
# load the ggplot2 and grid packages
library(ggplot2)
library(grid)
# read data (note csv files are renamed)
tbl1 = read.csv("FanGraphs_Leaderboard_h.csv")
tbl2 = read.csv("FanGraphs_Leaderboard_d.csv")
# quick example of mutate (in the dplyr R package) to create a dummy variable
# packages (from the tidyverse)
library(tibble)
library(dplyr)
# a little tibble with an ID number and a gender variable (5 Female, 3 Male, 2 Not Stated)
mydata <- tibble(id = 1:10, gender = c("F", "F", "F", "F", "F",
                                       "M", "M", "M",
                                       "NS", "NS"))
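The preview above stops before the `mutate()` call itself. A minimal sketch of what such a dummy variable could look like follows; the column name `female` and the choice to code "F" as 1 and everything else as 0 are assumptions for illustration, not necessarily what the original file does.

```r
library(tibble)
library(dplyr)

# same small tibble as above: 5 Female, 3 Male, 2 Not Stated
mydata <- tibble(id = 1:10,
                 gender = c(rep("F", 5), rep("M", 3), rep("NS", 2)))

# hypothetical dummy variable `female`: 1 when gender is "F", 0 otherwise
mydata_dummy <- mydata %>%
  mutate(female = if_else(gender == "F", 1L, 0L))

mydata_dummy
```

`if_else()` is used here rather than base `ifelse()` because it checks that both outcomes have the same type; either would work for this case.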
library(tidyverse)
# as.tibble() is deprecated; build the tibble directly with a named column
datatab <- tibble(value = 1:10)
# modulo division
datatab$value %% 2
# since we have alternating even and odd values in the "value" variable
datatab %>%
  mutate(valueplus = ifelse((value %% 2) == 0, "even", "odd"))
# Problem:
# - row 2 of data file has non-data title that repeats every two columns
# - column 1 / row 1 header label is fine
# - the header in every even-numbered column applies to the next odd-numbered column (e.g. 2 applies to 3, 4 to 5, etc.)
# - the header in those odd-numbered columns (3, 5, 7, etc.) is read initially as an NA
# Solution:
# - read column names only
# - hard code even and odd suffix
# - copy header value in those even columns to odd columns
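The solution steps listed above can be sketched in base R roughly as follows. The inline sample data, the header labels, and the `_n` / `_rank` suffixes are hypothetical stand-ins for the real file, which is not shown here; the sketch also assumes an odd number of columns, so every even-numbered column has an odd-numbered neighbour to its right.

```r
# hypothetical file contents: header row, repeating non-data title row, then data
raw_lines <- c(
  "team,Hits,,Runs,",
  ",Batting summary,,Batting summary,",
  "A,100,2,50,1",
  "B,120,1,40,2"
)

# read column names only (row 1); blank header cells come in as NA
header <- read.csv(text = raw_lines[1], header = FALSE,
                   colClasses = "character", na.strings = "")
header <- unlist(header[1, ], use.names = FALSE)

# even-numbered columns carry the label; the next odd-numbered column shares it
even <- seq(2, length(header), by = 2)
odd  <- even + 1

# copy the header value from each even column to the following odd column,
# then hard code a suffix for each (assumed suffixes)
header[odd]  <- paste0(header[even], "_rank")
header[even] <- paste0(header[even], "_n")

# read the data rows, skipping the header row and the title row
dat <- read.csv(text = raw_lines[-(1:2)], header = FALSE)
names(dat) <- header

dat
```

Assigning the names after reading, rather than passing them through `col.names`, sidesteps any automatic name mangling if the real labels are not syntactically valid R names.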
# load the package and data set "Teams"
install.packages("Lahman")
library("Lahman")
data(Teams)
#
#
# CREATE LEAGUE SUMMARY TABLES
# ============================
#
# select a sub-set of teams from 1901 [the establishment of the American League] forward to 2012
#
library(Lahman)
data(Master)
#
# `debut` variable; create new version `debutDate`
Master$debutDate <- (as.Date(Master$debut, "%m/%d/%Y"))
Master$debutDate[is.na(Master$debutDate)] <-
  as.Date(Master$debut[is.na(Master$debutDate)])
#
# `finalGame` variable; create new version `finalGameDate`