@mcdlee
Created December 13, 2013 02:42
Demo in NUK (2013-12-12)
========================================================
```{r}
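# Read the survey data; keep text columns such as Group as factors
# (R >= 4.0 no longer does this by default)
data <- read.csv("fake.csv", stringsAsFactors = TRUE)
data
# attach() lets the columns (Group, S3.1, ...) be referenced by name below
attach(data)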
```
as.factor
-------------------------------------------------------
```{r}
summary(S3.8)
summary(as.factor(S3.8))
plot(S3.8)
plot(Group, S3.8)
plot(as.factor(S3.8))
plot(Group, as.factor(S3.8))
```
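To make the difference concrete, here is a minimal sketch on a made-up score vector (`score` is hypothetical and not taken from `fake.csv`). `summary()` treats a numeric vector as a continuous measurement, while the factor version is tabulated level by level.

```r
# Hypothetical symptom scores (illustration only, not from fake.csv)
score <- c(0, 0, 1, 2, 1, 0)

# Numeric vector: summary() reports quantiles and the mean
num_summary <- summary(score)

# Factor: summary() counts the observations in each level
fac_summary <- summary(as.factor(score))

print(num_summary)
print(fac_summary)
```

The same distinction drives `plot()`: a numeric vector is drawn as a scatter plot against its index, a factor as a bar chart of counts.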
Recoding a categorical variable
------------------------------------------------------
* Define subjects with the symptom as positive and those without it as negative
```{r}
S3.8 <- as.factor(ifelse(data$S3.8 == 0, "negative", "positive"))
summary(S3.8)
plot(S3.8)
```
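* This works well, but I want to show every symptom in a single plot and compare the groups. I plan to use ggplot2, which requires:
1. Collecting all the data in one data.frame
2. Using ddply to produce a summary table
3. Using melt to convert the wide table into a long table (see Graphic cookbook)
### Collecting all the data in one data.frame
* Strategy 1: use a for loop to build one factor vector per symptom, then combine the vectors into a new data.frame
* But I could not give the vector produced in each iteration a different name
* Strategy 2: use a for loop to overwrite the original data.frame (dirty, but it works)
* A chance to practice writing a function along the way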
```{r}
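# Recode column x of `data`: 0 becomes "negative", any other score "positive"
merge_grade <- function(x) {
  A <- as.factor(ifelse(data[[x]] == 0, "negative", "positive"))
  return(A)
}
# Columns 3 to 13 hold the symptom scores S3.1 to S3.11
# (named symptom_cols to avoid shadowing base R's list())
symptom_cols <- 3:13
for (x in symptom_cols) {
  data[x] <- merge_grade(x)
}
data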
```
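Strategy 1 can also work without naming each vector by hand: `lapply()` returns a list that keeps the original column names, and that list can be turned into a new data frame. The following is only a sketch on a small made-up data frame (`toy`, `to_status`, and `recoded` are hypothetical names), not the code used in the demo.

```r
# Hypothetical stand-in for fake.csv: two groups, two symptom score columns
toy <- data.frame(
  Group = c("A", "A", "B", "B"),
  S3.1  = c(0, 1, 0, 0),
  S3.2  = c(0, 2, 0, 0)
)

# Recode one score vector: 0 -> "negative", anything else -> "positive".
# Explicit levels keep both categories even when one of them never occurs.
to_status <- function(x) {
  factor(ifelse(x == 0, "negative", "positive"),
         levels = c("negative", "positive"))
}

symptom_cols <- c("S3.1", "S3.2")

# lapply() keeps the column names, so no per-iteration naming is needed
recoded <- data.frame(Group = toy$Group, lapply(toy[symptom_cols], to_status))

print(recoded)
str(recoded)
```

Unlike the overwrite loop, this leaves `toy` untouched, so the original scores remain available for later checks.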
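### Using ddply to produce a summary table
* In a spreadsheet you could probably do this the way the figure below shows, but it gets awkward once grouping is involved.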
![countif](figure1.png)
* Other strategies I considered at first:
1. as.data.frame(prop.table(table(Group, S3.1))), then stitching the results together
2. subset(data, Group == "A")
```{r}
library(plyr)
# Per group, the share of "positive" cases for each symptom:
# the mean of a logical vector is the proportion of TRUE values
wide_table <- ddply(data, .(Group), summarise,
  "Chest pain"      = mean(S3.1 == "positive"),
  "Palpitation"     = mean(S3.2 == "positive"),
  "Abdominal pain"  = mean(S3.3 == "positive"),
  "Dyspnea"         = mean(S3.4 == "positive"),
  "Diarrhea"        = mean(S3.5 == "positive"),
  "Nausea/vomiting" = mean(S3.6 == "positive"),
  "Headache"        = mean(S3.7 == "positive"),
  "Dizziness"       = mean(S3.8 == "positive"),
  "Neck soreness"   = mean(S3.9 == "positive"),
  "Weakness"        = mean(S3.10 == "positive"),
  "Flush"           = mean(S3.11 == "positive")
)
wide_table
```
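For comparison, the same kind of per-group proportion can be computed in base R, because the mean of a logical vector is the share of `TRUE` values. This sketch uses a made-up data frame and `aggregate()` instead of `ddply`; `toy`, `prop_positive`, and `wide` are hypothetical names.

```r
# Hypothetical recoded data: 4 subjects in group A, 2 in group B
toy <- data.frame(
  Group = c("A", "A", "A", "A", "B", "B"),
  S3.1  = c("positive", "negative", "negative", "negative", "negative", "negative"),
  S3.8  = c("positive", "positive", "negative", "negative", "positive", "negative")
)

# Share of "positive" entries in a vector
prop_positive <- function(x) mean(x == "positive")

# Apply prop_positive to every symptom column, separately for each group
wide <- aggregate(toy[c("S3.1", "S3.8")],
                  by = list(Group = toy$Group),
                  FUN = prop_positive)

print(wide)
```

The column names stay as `S3.1` and `S3.8` here; renaming them to symptom labels would be a separate step.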
### Using melt to convert the wide table into a long table
* Because in the ggplot2 world, one row = one observation
```{r}
library(reshape2)
long_table <- melt(wide_table, id.vars="Group", variable.name="Symptoms", value.name="Ratio")
long_table
```
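To show what the wide-to-long conversion does to the rows, here is a base R sketch that builds the long table by hand from a tiny made-up wide table. It is an illustration of the reshaping idea, not a replacement for `melt`; `wide`, `measure_cols`, and `long` are hypothetical names and the ratios are invented.

```r
# Hypothetical wide table: one row per group, one column per symptom
wide <- data.frame(
  Group        = c("A", "B"),
  `Chest pain` = c(0.25, 0.00),
  Dizziness    = c(0.50, 0.50),
  check.names  = FALSE
)

# Every column except the id column holds a measurement
measure_cols <- setdiff(names(wide), "Group")

# Long table: one row per (Group, Symptoms) pair
long <- data.frame(
  Group    = rep(wide$Group, times = length(measure_cols)),
  Symptoms = factor(rep(measure_cols, each = nrow(wide)), levels = measure_cols),
  Ratio    = unlist(wide[measure_cols], use.names = FALSE)
)

print(long)
```

Two groups times two symptoms gives four rows, each of which is one observation that ggplot2 can map to a bar.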
### Plotting
```{r}
library(ggplot2)
# stat = "identity" plots the precomputed Ratio values instead of counting rows
ggplot(long_table, aes(x = Symptoms, fill = Group, y = Ratio)) +
  geom_bar(stat = "identity", position = "dodge") +
  geom_text(aes(label = round(Ratio * 100, 2)), vjust = -0.8, color = "black",
            position = position_dodge(0.9), size = 3) +
  xlab("Symptoms") +
  theme(axis.text.x = element_text(angle = 30, hjust = 1, vjust = 1))
# Same plot, with the symptoms ordered from the highest to the lowest ratio
ggplot(long_table, aes(x = reorder(Symptoms, -Ratio), fill = Group, y = Ratio)) +
  geom_bar(stat = "identity", position = "dodge") +
  geom_text(aes(label = round(Ratio * 100, 2)), vjust = -0.8, color = "black",
            position = position_dodge(0.9), size = 3) +
  xlab("Symptoms") +
  theme(axis.text.x = element_text(angle = 30, hjust = 1, vjust = 1))
```
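The second plot relies on `reorder()` to sort the x axis. As a small sketch of what that call does, the example below applies it to a made-up long table (`long` and `ordered_symptoms` are hypothetical names, and the ratios are invented). By default `reorder()` sorts the factor levels by the mean of the second argument within each level, so negating `Ratio` puts the most frequent symptom first.

```r
# Hypothetical long table with two symptoms and two groups
long <- data.frame(
  Symptoms = factor(c("Chest pain", "Chest pain", "Dizziness", "Dizziness")),
  Group    = c("A", "B", "A", "B"),
  Ratio    = c(0.25, 0.00, 0.50, 0.50)
)

# Levels are sorted by mean(-Ratio) per symptom, in increasing order
ordered_symptoms <- reorder(long$Symptoms, -long$Ratio)

print(levels(ordered_symptoms))
```

Because the ordering uses the mean across groups, a symptom that is common in only one group can still end up in the middle of the axis.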
No Group S3-1 S3-2 S3-3 S3-4 S3-5 S3-6 S3-7 S3-8 S3-9 S3-10 S3-11
1 A 1 0 0 0 0 0 0 0 0 0 0
2 A 0 0 0 0 0 0 0 0 0 0 0
3 A 0 0 1 0 0 0 0 1 0 0 0
4 A 0 0 0 0 0 0 0 0 0 0 0
5 A 0 0 0 0 0 0 0 0 0 0 0
6 A 0 0 0 2 0 0 0 2 0 0 0
7 A 0 0 0 0 0 0 0 0 0 0 0
8 A 0 0 0 0 0 0 0 1 0 0 0
9 A 0 0 0 0 0 0 0 1 0 0 0
10 A 0 0 2 0 0 0 1 1 0 0 0
11 A 0 0 0 0 0 0 0 0 0 0 0
12 A 0 0 0 0 0 0 0 0 0 0 0
13 A 0 0 1 0 0 0 0 1 0 0 0
14 A 0 0 0 1 0 0 0 1 0 0 0
15 A 0 0 0 0 0 0 0 2 0 0 0
16 A 0 1 1 0 0 0 1 0 0 0 0
17 A 0 0 0 0 0 0 0 0 0 0 0
18 A 0 0 0 0 0 1 0 0 0 0 0
19 A 0 0 0 0 0 0 0 0 0 0 0
20 A 0 0 0 0 0 0 0 0 0 0 0
21 A 0 0 0 0 0 0 0 1 0 0 0
22 A 0 0 0 0 0 0 0 0 0 0 0
23 A 0 0 0 0 0 0 0 0 0 0 0
24 A 0 0 0 0 0 0 0 1 0 0 0
25 A 0 0 1 0 0 0 0 0 0 0 0
26 A 0 0 0 0 0 0 0 0 0 0 0
27 A 0 0 0 0 0 0 0 0 0 0 0
28 A 0 0 0 0 0 0 0 0 0 0 0
29 A 2 0 0 0 0 0 0 0 0 0 1
30 A 0 0 0 0 0 0 0 0 0 0 1
31 B 0 0 0 0 0 0 0 0 0 0 0
32 B 0 0 0 0 0 0 0 1 0 0 0
33 B 0 0 0 0 0 0 0 0 0 0 0
34 B 0 0 0 0 0 0 0 0 0 0 0
35 B 0 0 0 0 0 0 0 1 0 0 0
36 B 0 0 0 0 0 0 0 0 0 0 0
37 B 0 0 0 0 0 0 0 0 0 0 0
38 B 0 0 0 0 0 0 0 0 0 0 0
39 B 0 0 0 0 0 0 0 1 0 0 0
40 B 0 0 0 0 0 0 0 0 0 0 0
41 B 0 0 0 0 0 0 0 2 0 0 0
42 B 0 0 0 0 0 0 0 1 0 0 0
43 B 0 0 0 0 0 0 0 0 0 0 0
44 B 0 0 0 0 0 0 0 0 0 0 0
45 B 0 0 0 0 0 0 0 0 0 0 0
46 B 0 0 0 0 0 0 0 0 0 0 0
47 B 0 0 0 0 0 0 0 0 0 0 0
48 B 1 0 0 0 0 0 0 1 0 0 0
49 B 0 0 0 0 0 0 0 0 0 0 0
50 B 0 0 0 0 0 0 0 0 0 0 0
51 B 1 0 0 0 0 0 0 0 0 0 0
52 B 0 0 0 0 0 0 0 1 0 0 1