Skip to content

Instantly share code, notes, and snippets.

@guidocor
Created April 24, 2019 07:30
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save guidocor/26620367f6a4e478957ce06e32fd44b0 to your computer and use it in GitHub Desktop.
Save guidocor/26620367f6a4e478957ce06e32fd44b0 to your computer and use it in GitHub Desktop.
Clusters con la valoración media de los políticos. Encuesta preelectoral 2019
######################################################################
# Cluster Analysis de la valoración
######################################################################
rm(list=ls()); gc()
options(scipen=20)
if (!require("pacman")) install.packages("pacman")
pacman::p_load("tidyverse", "psych", "haven", "reshape2", "qgraph", "Rmisc", "Hmisc", "mclust", "corrplot")
# Sacamos del CIS la encuesta
dataset <- read_sav("3242.sav")
# Este códio ha sido posible gracias el siguiente post:
# https://willhipson.netlify.com/post/latent-profile/latent-profile/
# Nos interesan las preguntas de simpatía, asi que vamos por la que empiezan por P11
df = dataset %>% dplyr::select(starts_with("P11"))
colnames(df) <- c("Abascal", "Casado", "Garzón", "Iglesias", "Rivera", "Sánchez")
df$id = 1:nrow(df)
df = gather(df, key = "Simpatía", value = "valoración", 1:6)
df[df$valoración == 99 | df$valoración == 98 | df$valoración == 97, "valoración"] <- NA
df <- spread(df, key = "Simpatía", value = "valoración")
df <- select(df, -id)
df <- df[complete.cases(df),]
# Obtenemos el BIC de los modelos
BIC <- mclustBIC(df)
BIC %>% summary
plot(BIC)
# Aparecen 9 perfiles, pero por economía cognitiva modelaremos solo 6
mod1 <- Mclust(df, modelNames = "EEV", G = 9, x = BIC)
summary(mod1)
# Creamos el data.frame de los parametros
means <- data.frame(mod1$parameters$mean, stringsAsFactors = FALSE) %>%
rownames_to_column() %>%
melt(id.vars = "rowname", variable.name = "Profile", value.name = "Mean") %>%
mutate(Mean = round(Mean, 2))
# Creamos los porcentaes de cada grupo
n = data.frame(Profile = mod1$classification) %>%
group_by(Profile) %>% count("Profile") %>% ungroup() %>%
mutate(freq = round((freq/sum(freq))*100, 0) )
n$Profile <- paste0("X", n$Profile)
# unimos
means <- left_join(means, n)
# A plotear!
means %>%
ggplot(aes(rowname, Mean, group = Profile, color = rowname)) +
geom_line(size = 1.25, alpha = 0.3, color = "black") +
geom_point(size = 4.25) +
labs(x = NULL, y = "Valoración media") +
theme_bw(base_size = 14) +
facet_wrap(~ Profile) +
theme(axis.text.x = element_text(angle = 45, hjust = 1), legend.position = "none") +
scale_x_discrete(limits = c("Abascal", "Casado", "Rivera", "Sánchez", "Iglesias", "Garzón")) +
geom_text(aes(x = 5, y = 5, label = paste0(freq, "%" )), color = "black") +
ggtitle("Perfiles basados en las valoraciones a los principales políticos \n N = 9544, Fuente: CIS Abril 2019") +
scale_color_manual(values = c("Abascal"= "green4", "Casado" = "dodgerblue",
"Rivera" = "orange", "Sánchez" = "firebrick",
"Iglesias" = "mediumpurple", "Garzón" = "red"))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment