@Athospd
Last active August 29, 2015 13:56
Litigation Rate vs. HDI of U.S. States
require(httr)
require(XML)
require(ggplot2)
# HDI data by U.S. state, from Wikipedia
resp <- GET("http://en.wikipedia.org/wiki/List_of_U.S._states_by_American_Human_Development_Index")
html <- htmlParse(content(resp, as="text"))
htmlTable <- readHTMLTable(html, header = TRUE, stringsAsFactors = FALSE)
# HDI
# Source: Wikipedia (http://en.wikipedia.org/wiki/List_of_U.S._states_by_American_Human_Development_Index)
idh_us <- htmlTable[[1]][-1,c("V3","V4")]
names(idh_us) <- c("estado", "idh")
idh_us$idh <- as.numeric(idh_us$idh)
# Litigation rate (new cases per 100k inhabitants)
# Source:
litig_us <- read.csv2("litig_us.csv", colClasses = c("character", "numeric"))
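# Illustrative sketch, not part of the original analysis: read.csv2() assumes
# ";" as the field separator and "," as the decimal mark, which is the layout
# of litig_us.csv. The two data rows below are copied from that file;
# `exemplo_csv` and `exemplo` are hypothetical names used only for this check.
exemplo_csv <- "estado;litigiosidade\nALABAMA;4500,99\nALASKA;3611,14"
exemplo <- read.csv2(text = exemplo_csv, colClasses = c("character", "numeric"))
str(exemplo)  # litigiosidade should come back as numeric, not character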
# Normalize the state names so the two tables can be merged
idh_us$estado_arrumado <- idh_us$estado
idh_us$estado <- toupper(idh_us$estado_arrumado)
# DISTRICT OF COLUMBIA and PUERTO RICO have no HDI data
litig_us$estado[!(litig_us$estado %in% idh_us$estado)]
# Washington, D.C. has no litigation data
idh_us$estado_arrumado[!(idh_us$estado %in% litig_us$estado)]
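# Illustrative sketch, not part of the original analysis: setdiff() reports
# the keys that will not match in the merge, in either direction. The vectors
# `litig_nomes` and `idh_nomes` are hypothetical stand-ins for
# litig_us$estado and idh_us$estado, built from names mentioned above.
litig_nomes <- c("ALABAMA", "DISTRICT OF COLUMBIA", "PUERTO RICO")
idh_nomes <- toupper(c("Alabama", "Washington, D.C."))
sem_idh <- setdiff(litig_nomes, idh_nomes)    # present only in the litigation data
sem_litig <- setdiff(idh_nomes, litig_nomes)  # present only in the HDI data
print(sem_idh)
print(sem_litig)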
# Merge (full outer join on the upper-case state name)
litig_idh_us <- merge(litig_us,
                      idh_us,
                      by = "estado",
                      all = TRUE)
str(litig_idh_us)
# Plots
p <- ggplot(litig_idh_us, aes(x = idh, y = litigiosidade)) +
  geom_point() +
  stat_smooth(se = FALSE) +
  labs(x = "HDI (2011)", y = "Litigation rate (new cases per 100,000 inhabitants)") +
  theme_bw()
# Scatter plot
p
# With state labels
p + geom_text(aes(label = estado_arrumado), vjust = 0)
# Without Maryland, and with a linear-model (lm) fit line
p %+% subset(litig_idh_us, !estado %in% "MARYLAND") +
  stat_smooth(method = "lm", se = FALSE)
lmFit <- lm(litigiosidade~idh, data=litig_idh_us)
summary(lmFit)
litig_us.csv
estado;litigiosidade
ALABAMA;4500,99
ALASKA;3611,14
ARIZONA;5813,40
ARKANSAS;4058,18
CALIFORNIA;3307,74
COLORADO;7503,26
CONNECTICUT;6312,81
DELAWARE;7504,48
DISTRICT OF COLUMBIA;10183,75
FLORIDA;7810,19
GEORGIA;8879,68
HAWAII;3193,70
IDAHO;5317,00
ILLINOIS;5231,64
INDIANA;7524,38
IOWA;5316,34
KANSAS;6764,27
KENTUCKY;6130,89
LOUISIANA;5908,53
MAINE;3557,26
MARYLAND;18023,70
MASSACHUSETTS;6186,55
MICHIGAN;7446,65
MINNESOTA;3990,11
MISSISSIPPI;2812,42
MISSOURI;5296,88
MONTANA;6721,08
NEBRASKA;7259,39
NEVADA;6144,69
NEW HAMPSHIRE;4079,64
NEW JERSEY;11625,17
NEW MEXICO;4900,02
NEW YORK;8843,01
NORTH CAROLINA;4798,41
NORTH DAKOTA;5282,88
OHIO;6818,18
OKLAHOMA;5250,89
OREGON;5039,34
PENNSYLVANIA;4326,23
PUERTO RICO;4878,20
RHODE ISLAND;5484,26
SOUTH CAROLINA;7661,91
SOUTH DAKOTA;7359,92
TENNESSEE;1111,88
TEXAS;3352,87
UTAH;5237,70
VERMONT;3635,22
VIRGINIA;11306,05
WASHINGTON;3670,73
WEST VIRGINIA;4301,90
WISCONSIN;5151,13
WYOMING;7821,99