@voltek62
Last active June 4, 2020 16:09
Get web traffic data from the SimilarWeb API with R
library(httr)
library(jsonlite)

# https://dataseolabs.com
# Doc: https://www.similarweb.com/corp/developer/
# Create your key here: https://pro.similarweb.com/#/account/api-management
# You can get 3 months of web traffic data for free

# Configuration
myList <- c("cuisineaz.com", "marmiton.org", "odelices.com", "allrecipes.fr")
myKey <- "YOURKEY"
dateStart <- "2018-03"
dateEnd <- "2018-05"

# Create an empty data frame to collect one row per site and month
results <- data.frame(site = character(), date = character(), visits = integer())

for (site in myList) {
  # Query SimilarWeb for the monthly visits of this site
  url <- paste0("https://api.similarweb.com/v1/website/", site, "/total-traffic-and-engagement/visits?api_key=", myKey, "&start_date=", dateStart, "&end_date=", dateEnd, "&main_domain_only=false&granularity=monthly")
  result <- GET(url)
  text <- content(result, as = "text", encoding = "UTF-8")
  json <- fromJSON(text)

  # Add rows only if the API reports success
  # (isTRUE also covers a response with no meta$status field)
  if (isTRUE(grepl("Success", json$meta$status))) {
    tmp <- cbind(site, json$visits)
    results <- rbind(results, tmp)
  }
}

# Delete temporary objects (tmp only exists if at least one query succeeded)
rm(list = intersect(c("json", "result", "tmp"), ls()))

print(results)
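
The success check inside the loop can be pulled into a small function, which makes it possible to try the parsing step without calling the API. This is only a sketch: `parse_visits` is a hypothetical helper that is not part of the gist, and the two sample bodies are made up, shaped the way the script above reads them (`meta$status` plus a `visits` table).

```r
library(jsonlite)

# Hypothetical helper: turn one response body into rows for `results`.
# Returns NULL when the body is not JSON or does not report success.
parse_visits <- function(site, text) {
  json <- tryCatch(fromJSON(text), error = function(e) NULL)
  if (!is.list(json) || !isTRUE(grepl("Success", json$meta$status))) {
    return(NULL)
  }
  data.frame(site = site, json$visits, stringsAsFactors = FALSE)
}

# Made-up sample bodies (assumed shape, illustrative values only)
sample_ok <- '{"meta":{"status":"Success"},"visits":[{"date":"2018-03-01","visits":1000},{"date":"2018-04-01","visits":1200}]}'
sample_err <- '{"meta":{"status":"Error"}}'

rows <- parse_visits("example.com", sample_ok)
print(rows)
print(is.null(parse_visits("example.com", sample_err)))
```

Because `rbind(results, NULL)` leaves `results` unchanged, the loop body could then reduce to `results <- rbind(results, parse_visits(site, text))`.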
@ArthurCa

ArthurCa commented Jun 28, 2018

  1. I used your script and added a URL as a variable: "cnn.com" can be replaced with any website you like.
    your_site <- "yourwebsite.com"
  2. I also tried adapting your script to take several websites as variables, like this:
    your_site1 <- "yourwebsite1.com"
    your_site2 <- "yourwebsite2.com"
    your_site3 <- "yourwebsite3.com"
    your_site4 <- "yourwebsite4.com"

I don't think this is ideal, and we could go much further, for example by storing the results in a data frame and comparing the data across sites.

What I've been doing: https://gist.github.com/ArthurCa/8dd9dd4c07c74d19861477da77adef84

Many thanks, Vincent, for the ideas and scripts.
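
One way to compare sites once the results sit in a data frame is to reshape it so that each site gets its own column. The sketch below uses base R's `reshape`. The site names and visit counts are made up for illustration, and the column layout (`site`, `date`, `visits`) is assumed to match the `results` data frame built by the script.

```r
# Made-up data in the same long layout as `results` (one row per site and month)
results <- data.frame(
  site   = c("a.example", "a.example", "b.example", "b.example"),
  date   = c("2018-03-01", "2018-04-01", "2018-03-01", "2018-04-01"),
  visits = c(100, 120, 90, 80),
  stringsAsFactors = FALSE
)

# One row per date, one visits column per site
wide <- reshape(results, idvar = "date", timevar = "site", direction = "wide")
print(wide)
```

The new columns are named `visits.<site>`, so the monthly figures for two sites can be compared row by row.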

@voltek62
Author

Updated 👍 with

  • easy configuration through dateStart, dateEnd, myKey and myList
  • myList holds the list of websites
  • error handling
  • writing to a dataframe
  • memory optimization

@naikabhilash

The script worked properly a week ago, but now I am getting a 400 status code. Could you please re-run it and check whether something has changed on the SimilarWeb side?
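
When a request starts failing like this, it can help to log the status code together with the start of the response body, since the body may say what the server rejected. The sketch below is an assumption-laden illustration: `describe_response` is a hypothetical helper, and the error body is made up so the example runs offline.

```r
# Hypothetical helper: summarise a response so that a failure is visible
describe_response <- function(status, body) {
  if (status >= 200 && status < 300) {
    return(sprintf("HTTP %d: OK", status))
  }
  # Keep only the start of the body so long error pages stay readable
  sprintf("HTTP %d: %s", status, substr(body, 1, 200))
}

# Offline demonstration with a made-up error body
message(describe_response(400L, '{"meta":{"status":"Error"}}'))
```

Inside the script's loop this could be called as `message(site, " -> ", describe_response(status_code(result), text))`, where `status_code` comes from httr and `result` and `text` are the objects the script already creates.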
