@mjrich
Last active September 6, 2015 18:30
An example script showing how to compute survey weights in base R from demographic variables such as age and gender, and then how to calculate frequencies with those weights applied.
#
#Assumes a data frame "df" with the variables Gender and agecat, plus other variables of interest about radio listenership
#
#Add weights below. Each weight is the population parameter (numerator) divided by the sample proportion computed with prop.table(..) (denominator).
weight_male = .49500764/(prop.table(table(df$Gender))["Male"])
weight_male
weight_female = .50499236/(prop.table(table(df$Gender))["Female"])
weight_female
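#Note: the three age targets below sum to 1.01 as written (presumably rounding); the final rebalancing step rescales the weights to the total N
weight_15to24 = .31/prop.table(table(df$agecat))["15 to 24"]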
weight_15to24
weight_25to34 = .25/prop.table(table(df$agecat))["25 to 34"]
weight_25to34
weight_35plus = .45/prop.table(table(df$agecat))["35+"]
weight_35plus
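#
#Illustrative sketch (not part of the original script): the same ratio on a made-up
#sample, to show what each weight means. "toy_gender" and the 50/50 target are
#hypothetical values chosen only for this example.
toy_gender = c("Male", "Male", "Male", "Female")   #sample is 75% male, 25% female
toy_share = prop.table(table(toy_gender))          #sample proportions
toy_weight_male = 0.5/toy_share["Male"]            #0.5/0.75: below 1, males are over-represented
toy_weight_female = 0.5/toy_share["Female"]        #0.5/0.25: above 1, females are under-represented
toy_weight_male
toy_weight_female
#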
#Multiply weights to achieve total weight for each record in your dataset
df$weight = 1.0
#Multiplication is vectorized, so each group is scaled in one step
df$weight[df$Gender == "Male"] = df$weight[df$Gender == "Male"]*weight_male
df$weight[df$Gender == "Female"] = df$weight[df$Gender == "Female"]*weight_female
df$weight[df$agecat == "15 to 24"] = df$weight[df$agecat == "15 to 24"]*weight_15to24
#the label must match the one used for weight_25to34 above
df$weight[df$agecat == "25 to 34"] = df$weight[df$agecat == "25 to 34"]*weight_25to34
df$weight[df$agecat == "35+"] = df$weight[df$agecat == "35+"]*weight_35plus
first_weight = sum(df$weight)
table(df$weight)
first_weight
#One last rebalancing so that the weights sum to the total N, if needed
df$weight = df$weight*nrow(df)/first_weight
#these two should now be the same
sum(df$weight)
nrow(df)
table(df$weight)
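#
#Illustrative sketch (not part of the original script): a check worth running after
#weighting. Multiplying a gender weight by an age weight reproduces both sets of
#population targets only when gender and age are unrelated in the sample; otherwise
#one margin can drift. An iterative adjustment (often called raking) is one common
#remedy and is not shown here. The data frame "toy" and its targets are made up.
toy = data.frame(
  Gender = c("Male", "Male", "Male", "Female", "Female", "Female"),
  agecat = c("15 to 24", "15 to 24", "35+", "35+", "35+", "35+"))
toy_target_gender = c("Male" = 0.5, "Female" = 0.5)   #hypothetical population shares
toy_target_age = c("15 to 24" = 0.4, "35+" = 0.6)     #hypothetical population shares
toy_g = as.character(toy$Gender)   #as.character guards against factor columns
toy_a = as.character(toy$agecat)
#per-record weight: (target share/sample share) for gender times the same ratio for age
toy$weight = as.numeric(toy_target_gender[toy_g]/prop.table(table(toy_g))[toy_g])*
  as.numeric(toy_target_age[toy_a]/prop.table(table(toy_a))[toy_a])
toy$weight = toy$weight*nrow(toy)/sum(toy$weight)   #rescale to the toy N of 6
#weighted shares, to compare against the targets
prop.table(xtabs(weight ~ Gender, data = toy))   #Female 0.45, Male 0.55: off the 0.50/0.50 target
prop.table(xtabs(weight ~ agecat, data = toy))   #15 to 24 0.40, 35+ 0.60: on target
#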
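#Weighted frequency function: sums df$weight within each category of x
#(x must be a column of df, or a vector of the same length in the same row order)
table_wght <- function(x) {
  z = df$weight
  xtabs(z ~ x)
}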
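#
#Illustrative sketch (not part of the original script): table_wght reads df$weight
#from the global environment, so it only works with a data frame named df. One
#possible variant takes the weights as an argument instead. "table_wght2",
#"toy_answer" and "toy_w" are hypothetical names and values for this example.
table_wght2 <- function(x, w) {
  xtabs(w ~ x)   #sum of w within each category of x
}
toy_answer = c("Yes", "No", "Yes", "No")
toy_w = c(1.5, 0.5, 1.5, 0.5)
table_wght2(toy_answer, toy_w)               #weighted counts: No 1, Yes 3
prop.table(table_wght2(toy_answer, toy_w))   #weighted shares: No 0.25, Yes 0.75
#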
#Radio listenership analysis: unweighted counts first, then weighted counts
table(df$RadioHr2)
table(df$RadioHr4)
table_wght(df$RadioHr2)
table_wght(df$RadioHr4)