@b-mandelbrot
Last active November 2, 2019 13:38
Remove rows with more than 70% missing values
# Load the package for reading ARFF files (install it only if it is missing)
if (!requireNamespace("foreign", quietly = TRUE)) install.packages("foreign")
library("foreign")
# Set the working directory
setwd("D:/House-Votes/")
# Read the dataset
data <- read.arff("house-votes-84.arff")
# Print the number of examples
nrow(data) # 435
# Print the rows to be removed
data[which(rowMeans(is.na(data)) > 0.7), ] # 108, 184, 249
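# Remove the rows. Logical indexing is used instead of -which(...),
# which would return zero rows if no row exceeded the threshold.
data_cleaned <- data[rowMeans(is.na(data)) <= 0.7, ]
# Print the number of rows after removal
nrow(data_cleaned) # 432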
# Save to another ARFF file
write.arff(data_cleaned, "house-votes-84-nas-70.arff")
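# A minimal sketch, not part of the original script: the same filter wrapped
# in a reusable function. The name drop_sparse_rows, its threshold argument,
# and the toy data frame below are hypothetical and only illustrate the idea.
drop_sparse_rows <- function(df, threshold = 0.7) {
  # Fraction of missing values in each row
  na_fraction <- rowMeans(is.na(df))
  # Keep rows at or below the threshold; drop = FALSE preserves the data frame
  df[na_fraction <= threshold, , drop = FALSE]
}
# Toy example: the second row is entirely NA and is dropped, the others are kept
toy <- data.frame(a = c("y", NA, "n"), b = c("n", NA, NA), c = c("y", NA, "y"))
toy_cleaned <- drop_sparse_rows(toy)
nrow(toy_cleaned) # 2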