@kongscn
Last active August 29, 2015 14:03
rmd demo
---
title: "Rmd Demo: Chinese Enabled"
author: "Shel Kong"
date: "Wednesday, July 02, 2014"
hello: "world"
output:
  html_document:
    highlight: pygments
    number_sections: yes
    theme: flatly
    keep_md: true
---
Source code of this doc (gist): <https://gist.github.com/kongscn/34cdfc0585c820be3e43>

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see <http://rmarkdown.rstudio.com>.

R Markdown and R work well with multiple languages, Chinese included.
If you are running a **Unix-like** OS,
just use UTF-8 encoding everywhere and you're done; no special setup is needed.

If you are running **Windows**, well, I tried and tried
but still got a lot of weird problems. Calling `Sys.setlocale`
can fix the output and a few other issues, but the problem goes deeper than that.
My suggestion is to throw it away and get a Mac or Linux, or to give up on multiple languages.
If there's a better workaround, please let me know in my
[gist](https://gist.github.com/kongscn/34cdfc0585c820be3e43).

Settings such as the `Sys.setlocale` call can be run without appearing in the output by placing them in a chunk with the option `include=FALSE`.
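
To make the encoding advice concrete, here is a minimal sketch for checking how the current R session handles Chinese text. The variable names are illustrative and not part of the original document; the locale name `"chinese-simplified"` is the one used in this document's setup chunk and is only meaningful on Windows.

```r
# "\u6b66\u6c49" is 武汉 written with Unicode escapes, so the string is
# UTF-8 no matter how this file itself was saved.
city <- "\u6b66\u6c49AQI"

Encoding(city)                # declared encoding of the string
nchar(city, type = "chars")   # 5 characters
nchar(city, type = "bytes")   # 9 bytes (3 per Chinese character in UTF-8)

# Does the current session use a UTF-8 locale?
session_is_utf8 <- isTRUE(l10n_info()[["UTF-8"]])

# Windows only: try the locale suggested in this document when the
# session is not UTF-8. This is a workaround, not a guaranteed fix.
if (.Platform$OS.type == "windows" && !session_is_utf8) {
  Sys.setlocale("LC_ALL", "chinese-simplified")
}
```

On a Unix-like system with a UTF-8 locale, `session_is_utf8` should be `TRUE` and the `Sys.setlocale` branch is skipped.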
```{r setup, include=FALSE}
# If your working directory is not the directory of this Rmd doc:
library(knitr)
opts_knit$set(root.dir = "../")
# If you are running Windows and your output is garbled,
# set the locale encoding to cope with languages other than English:
# Sys.setlocale("LC_ALL", "chinese-simplified")
```
# Data import and cleaning (数据导入与整理)
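```{r}
# Read the Wuhan AQI sheet and keep one row per date
aqidf = xlsx::read.xlsx('武汉AQI.xlsx',
                        sheetName = "Sheet1",
                        encoding = 'UTF-8')
aqidf = aqidf[!duplicated(aqidf$日期), ]
# Relabel the quality grades; this assumes 质量等级 is a factor
# whose six existing levels are already in this order
levels(aqidf$质量等级) = c("优", "良", "轻度污染", "中度污染", "重度污染", "严重污染")

# Read the Wuhan weather sheet, de-duplicate by date,
# and drop the separate year/month/day columns
weatherdf = xlsx::read.xlsx('武汉天气.xlsx',
                            sheetName = "Sheet1",
                            encoding = 'UTF-8')
weatherdf = weatherdf[!duplicated(weatherdf$日期), ]
weatherdf = subset(weatherdf, select = -c(年, 月, 日))
summary(aqidf)
```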
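
The cleaning steps in the chunk above can be tried on a small data frame without the Excel files. This is a sketch on hypothetical toy data: the column names `date`, `aqi`, `year` and the English grade labels are made up for illustration. It also shows `factor(..., levels = ...)` as one alternative to assigning `levels()` by position, which fixes the level order without relying on how the existing levels happen to be sorted.

```r
# Hypothetical toy data: the second and third rows share a date.
toy <- data.frame(
  date = as.Date(c("2014-07-01", "2014-07-02", "2014-07-02", "2014-07-03")),
  aqi  = c(45, 120, 130, 80),
  year = 2014
)

# duplicated() flags every repeat after the first occurrence,
# so negating it keeps the first row for each date.
toy <- toy[!duplicated(toy$date), ]

# Drop a helper column by name with subset().
toy <- subset(toy, select = -c(year))

# Set the level order explicitly instead of relabelling by position.
quality <- factor(c("moderate", "good", "unhealthy"),
                  levels = c("good", "moderate", "unhealthy"))
levels(quality)
```

After these steps `toy` has three rows, and the row with `aqi` 130 is the one that was dropped.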
# Plots
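```{r, warning=FALSE}
library(ggplot2)
# Line plot of the daily AQI index over time
# (ggplot() replaces qplot(), which is deprecated in recent ggplot2 releases)
p = ggplot(aqidf, aes(x = 日期, y = AQI指数)) + geom_line()
p
```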
# Stationarity test of the AQI index (AQI指数平稳性检验)
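```{r, warning=FALSE}
# Drop the date and quality-grade columns, then index the rest by date
aqin = subset(aqidf, select = -c(日期, 质量等级))
aqixts = xts::as.xts(aqin, order.by = aqidf$日期, frequency = 1)
aqi_idx = aqixts$AQI指数
# Fit an AR model to the differenced series
ar(diff(as.ts(aqi_idx)), method = 'mle')
# Augmented Dickey-Fuller test with a constant and 12 lags
fUnitRoots::adfTest(aqi_idx, lags = 12, type = "c")
```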
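
For readers who want to see what an augmented Dickey-Fuller test with a constant computes, here is a rough base-R sketch on simulated data. It is not the `fUnitRoots` implementation: `adf_stat` is a hypothetical helper that only returns the t-statistic on the lagged level, and the value -2.86 mentioned in the comments is the approximate large-sample 5% critical value for the constant-only case.

```r
# Hypothetical helper: regress diff(y)[t] on a constant, y[t-1], and
# `lags` lagged differences, then return the t-statistic on y[t-1].
adf_stat <- function(y, lags = 1) {
  stopifnot(lags >= 1)
  y  <- as.numeric(y)
  dy <- diff(y)
  # Column 1 of embed() is dy[t]; the remaining columns are dy[t-1], ...
  m         <- embed(dy, lags + 1)
  dy_t      <- m[, 1]
  dy_lagged <- m[, -1, drop = FALSE]
  # Lagged level aligned with dy_t
  y_lag <- y[(lags + 1):(length(y) - 1)]
  fit <- lm(dy_t ~ y_lag + dy_lagged)
  summary(fit)$coefficients["y_lag", "t value"]
}

set.seed(1)
random_walk <- cumsum(rnorm(500))                             # has a unit root
stationary  <- as.numeric(arima.sim(list(ar = 0.2), n = 500)) # stationary AR(1)

stat_rw <- adf_stat(random_walk, lags = 2)
stat_st <- adf_stat(stationary, lags = 2)

# A statistic well below about -2.86 rejects the unit-root null at 5%.
c(random_walk = stat_rw, stationary = stat_st)
```

The statistic does not follow the usual t distribution under the null, which is why a dedicated function such as `fUnitRoots::adfTest` is used in the chunk above to get a p-value.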
# Comments
Amazing, isn't it? You may wonder why anyone would use
non-ASCII characters in their source code.
Non-English source (and docs too) is definitely a bad idea
when you are going to share your work *worldwide*.
But that is not the usual situation, right?
Mostly you get some data (hopefully in English, but usually in YOUR language),
work with it, do some analysis, write a simple note on the results
(it's not even a *report*), and maybe share it with someone around you.
In this situation, multi-language support is meaningful.
If you instead translate your data first and then work on it,
the process becomes LESS straightforward.
I think it's best to keep things **simple**,
even if getting there is *complex*. (保持简洁很重要,即使方式反而复杂。)
Again, you are welcome to comment on the
[gist](https://gist.github.com/kongscn/34cdfc0585c820be3e43) of this page.