@leeper /gender.R
Last active Oct 11, 2017

Gender API Example https://gender-api.com/en/
library("httr")
library("XML")

#' Gets gender by name or email address, optionally by country or IP address.
#'
#' @import httr
#' @import rjson
#' @param name A character string containing a first name, or a character vector containing first names. One must specify name or email.
#' @param email A character string containing an email address with a first name. One must specify name or email.
#' @param country An optional character string containing a two-letter country code, as listed here: https://gender-api.com/en/api-docs
#' @param ip An optional character string containing an IP address, used in place of country.
#' @return A data frame containing the estimated gender, the number of samples that estimate is based upon, an accuracy metric (ranging from 0 to 100), and the length of time the server took to process the request.
#' @export
#' @examples \dontrun{
#' gender('Andrea', country='US')
#' gender('Andrea', country='IT')
#' gender(c('george','thomas','ben'))
#' }
gender <- function(name=NULL, email=NULL, country=NULL, ip=NULL){
    args <- list()
    if(!is.null(name)){
        # duplicates are dropped and at most 100 distinct names are sent per request
        name <- unique(name)
        if(length(name)>100){
            warning("Only first 100 names will be used!")
            name <- head(name, 100)
        }
        # multiple names are sent as one semicolon-separated string
        args$name <- paste(name, collapse=';')
    }
    if(!is.null(email))
        args$email <- email
    if(!is.null(country))
        args$country <- country
    if(!is.null(ip))
        args$ip <- ip
    result <- GET('https://gender-api.com/get', query=args)
    out <- content(result)
    # stop with the message in 'errmsg' if the response text mentions 'errno'
    if(grepl('errno', content(result,as='text')))
        stop(out$errmsg)
    # a single name or an email is converted directly; several names are bound row by row from 'result'
    if((!is.null(name) && length(name)==1) || !is.null(email))
        out <- as.data.frame(out)
    else
        out <- do.call(rbind.data.frame,out$result)
    rownames(out) <- seq_len(nrow(out))
    return(out)
}
#' @import httr
#' @import XML
#' @return A data frame containing the country name and two-letter shortcut for every country available via the API.
#' @export
#' @examples \dontrun{
#' genderCountryCodes()
#' }
genderCountryCodes <- function(){
    # fetch the documentation page as text, then parse it as HTML
    page <- content(GET("https://gender-api.com/en/api-docs"), as="text")
    html <- htmlParse(page, asText=TRUE)
    out <- cbind.data.frame(name = xpathSApply(html, "//div[@class='country clearfix']//div[@class='name']", xmlValue),
                            shortcut = xpathSApply(html, "//div[@class='country clearfix']//div[@class='shortcut']", xmlValue))
    return(out)
}

sebastianbarfort commented Feb 18, 2014

Hi Thomas,
Thanks for the function. However, there might be a problem with the GET request for character vectors of length greater than 1.

Consider the following (from a real example):

    names  <- c("Abdul", "Abir", "Ada", "Adam", "Adam", "Adele", "Adnan", "Adnan", "Adnan", "Adnan", "Adolph")
    gender(name = names, country = "DK")
    # returns a data frame with dim [3,4]

Compare that to:

    get_gender <- function(x){
      args <- list()
      args$name <- x
      args$country <- "DK"
      result <- GET('https://gender-api.com/get', query=args)
      out <- content(result)
      # on an API error, keep only the error message
      if(grepl('errno', content(result,as='text')))
        out <- out$errmsg
      out <- as.data.frame(out)
      return(out)
    }

    gender_df <- lapply(names, get_gender)

    library(plyr)
    gender_df <- ldply(gender_df, data.frame)
    # returns a data frame with correct dim [11, 7]

I think the problem lies in the GET request with a character vector of length > 1. If you look at the GET request https://gender-api.com/get?name=Abdul%3BAbir%3BAda%3BAdam%3BAdele%3BAdnan%3BAdolph&country=DK, it only returns information for 3 of the 11 names in the vector.
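
For reference, the query string in that URL can be reproduced without calling the API. This is a minimal sketch, assuming the `names` vector from above and using base R's `URLencode` as a stand-in for whatever encoding httr applies internally. It shows that duplicates are dropped before the request is built, so only 7 distinct names are actually sent:

```r
names <- c("Abdul", "Abir", "Ada", "Adam", "Adam", "Adele", "Adnan", "Adnan", "Adnan", "Adnan", "Adolph")

# gender() collapses the distinct names into one semicolon-separated string
joined <- paste(head(unique(names), 100), collapse = ";")

# percent-encode reserved characters, so ';' becomes '%3B'
# (URLencode is an assumption here; it stands in for httr's own encoding)
encoded <- URLencode(joined, reserved = TRUE)

length(unique(names))
encoded
```

So the batched call could return at most 7 rows here, one per distinct name, and the 3 rows that come back are still short of that.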

Best, Sebastian

leeper commented Feb 18, 2014

I emailed the developer. This seems like a bug in the API, and your code is a logical workaround. Note, however, the rate limit of 1000 requests per month, which you might quickly use up by querying one name at a time.
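
One way to stretch the quota with the one-name-at-a-time workaround is to query each distinct name only once and then expand the results back to the original vector. Here is a rough sketch, where `lookup_unique` is a hypothetical helper (not part of the gist) and `lookup_fun` stands for any function that takes one name and returns a one-row data frame with consistent columns, such as `get_gender` above. `fake_lookup` is only a stand-in that counts calls instead of hitting the API:

```r
# Hypothetical helper: call lookup_fun once per distinct name,
# then repeat rows so the output lines up with the input vector.
lookup_unique <- function(names, lookup_fun) {
  unique_names <- unique(names)
  results <- lapply(unique_names, lookup_fun)
  combined <- do.call(rbind, results)
  expanded <- combined[match(names, unique_names), , drop = FALSE]
  rownames(expanded) <- NULL
  expanded
}

# Stand-in for a real API call, used here only to count requests
calls <- 0
fake_lookup <- function(x) {
  calls <<- calls + 1
  data.frame(name = x, stringsAsFactors = FALSE)
}

names <- c("Abdul", "Abir", "Ada", "Adam", "Adam", "Adele", "Adnan", "Adnan", "Adnan", "Adnan", "Adolph")
res <- lookup_unique(names, fake_lookup)
nrow(res)  # one row per input name
calls      # one request per distinct name
```

With the 11 names from the example above, that comes to 7 requests rather than 11.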
