@rhhackett
Last active December 27, 2015 12:49
Program to sample randomly from White House visits
# Below is a tool Julie and I (Bob) created to randomly select
# a group of visitors of some specified size from some
# subset of visitors
randomvisit = function(data,lo=5,hi=10){
# lo represents the low value; hi represents the high value.
# So, above we are looking at groups ranging from 5-10 visitors.
u = table(data$UIN)
# u is a table counting how many rows (visitors) each UIN has
# in the subset of visitors that is passed in through "data"
su = u[u <= hi & u >= lo]
nu = names(su)
# su is the subset of table u that contains only
# groups of a certain size, where the size of
# a group is less than or equal to "hi" and
# greater than or equal to "lo" (limits that are defined above).
# nu holds the UINs of the groups kept in su
v = data[data$UIN==sample(nu,1),]
return(v)
}
# here we're pulling the rows of "data" whose UIN matches one picked
# at random. the sample function randomly selects a single (aka ",1")
# UIN from nu, i.e. one group of visitors.
# return(v) hands that selection back to the caller; it only shows up
# on screen if you call randomvisit() without assigning the result.
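# to make the steps inside randomvisit easier to follow, here is a
# small sketch that runs them one at a time on a made-up data frame.
# everything named toy* (and the "visitor" column) is hypothetical
# and not part of our data; only table, names and sample are the
# same calls the function uses.

```r
# toy data: UIN "A" has 2 rows, "B" has 6, "C" has 12 (all made up)
toy = data.frame(UIN = rep(c("A","B","C"), times = c(2,6,12)),
                 visitor = paste0("v", 1:20),
                 stringsAsFactors = FALSE)
toyu = table(toy$UIN)                 # rows per UIN: A=2, B=6, C=12
toysu = toyu[toyu <= 10 & toyu >= 5]  # keep only groups of 5 to 10 rows
toynu = names(toysu)                  # "B" is the only UIN left
toypick = toy[toy$UIN == sample(toynu, 1), ]
nrow(toypick)                         # 6, the size of group "B"
```

# note that if no group falls between lo and hi, nu is empty and
# sample() stops with an error, so the range may need widening.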
x = randomvisit(csvisits)
x[,c(1:4,12,20:22,27,30)]
# x is a random visit selected from a dataset
# we created: "csvisits", which is simply a combination
# of house and senate visitors to potus (csvisits is built
# by the code further down, so run that part first).
# the second line picks the columns we want to see
# from a randomly selected visit. So we just kept interesting
# information like who came, when, and to see whom (we don't
# care who picked them up).
visits$cs = visits$fullname %in% c(snms,cnms)
ucs = table(visits$UIN,visits$cs)
p = ucs[,2]/(ucs[,2]+ucs[,1])
p = p[p>.25]
csvisits = visits[visits$UIN %in% names(p),]
# the first line above creates a new column, visits$cs, that
# is TRUE for a visitor whose full name appears in snms or cnms
# (i.e. someone in either the senate or the house) and FALSE otherwise
# ucs is a tabulation of UIN against that flag: for each UIN,
# how many visitors are not congressional (column 1) and how
# many are (column 2)
# p is the proportion of house and senate visitors
# in each group of visitors.
# we then cut p down to the groups where
# more than a quarter of the visitors are congressional
# finally, csvisits keeps every row of visits whose UIN belongs
# to such a group, so we see all members of the visits where
# more than a quarter of the visitors are congressional.
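# the arithmetic behind p is easier to see on a tiny example, so
# here is a sketch with made-up data. toyvisits, members and the
# names in them are hypothetical; members just stands in for
# c(snms,cnms).

```r
# toy visits: in UIN "A" 1 of 4 visitors is congressional, in "B" 2 of 4
toyvisits = data.frame(UIN = rep(c("A","B"), each = 4),
                       fullname = c("m1","x1","x2","x3","m2","m3","x4","x5"),
                       stringsAsFactors = FALSE)
members = c("m1","m2","m3")                   # stand-in for c(snms,cnms)
toyvisits$cs = toyvisits$fullname %in% members
toyucs = table(toyvisits$UIN, toyvisits$cs)   # columns: FALSE, TRUE
toyp = toyucs[,2]/(toyucs[,2]+toyucs[,1])     # A = 0.25, B = 0.5
toyp = toyp[toyp > .25]                       # only B is left; 0.25 is not > .25
toycs = toyvisits[toyvisits$UIN %in% names(toyp),]
```

# one thing to watch: ucs[,2] assumes the table has both a FALSE and
# a TRUE column. if nobody in visits were congressional there would
# be no second column and that line would fail.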
# as this is my first code annotation, apologies
# for anything that may be unclear....
# feel free to ask me questions. i'll probably cc mark.