Fetch data from Epicollect5 API using R
library(httr)
library(jsonlite) # if needing json format
cID<-"999" # client ID
secret<- "F00HaHa00G" # client secret
proj.slug<- "YourProjectSlug" # project slug
form.ref<- "YourFormRef" # form reference
branch.ref<- "YourFormRef+BranchExtension" # branch reference
res <- POST("https://five.epicollect.net/api/oauth/token",
body = list(grant_type = "client_credentials",
client_id = cID,
client_secret = secret))
http_status(res)
token <- content(res)$access_token
# url.form<- paste("https://five.epicollect.net/api/export/entries/", proj.slug, "?map_index=0&form_ref=", form.ref, "&format=json", sep= "") ## if using json
url.form<- paste("https://five.epicollect.net/api/export/entries/", proj.slug, "?map_index=0&form_ref=", form.ref, "&format=csv&headers=true", sep= "")
res1<- GET(url.form, add_headers("Authorization" = paste("Bearer", token)))
http_status(res1)
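# ct1<- fromJSON(content(res1, as = "text", encoding = "UTF-8")) ## if using json
ct1<- read.csv(text = content(res1, as = "text", encoding = "UTF-8")) # parse the body already fetched with the token instead of requesting the URL again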
str(ct1)
# url.branch<- paste("https://five.epicollect.net/api/export/entries/", proj.slug, "?map_index=0&branch_ref=", branch.ref, "&format=json&per_page=1000", sep= "") ## if using json; pushing max number of records from default 50 to 1000
url.branch<- paste("https://five.epicollect.net/api/export/entries/", proj.slug, "?map_index=0&branch_ref=", branch.ref, "&format=csv&headers=true", sep= "")
res2<- GET(url.branch, add_headers("Authorization" = paste("Bearer", token)))
http_status(res2)
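ct2<- read.csv(text = content(res2, as = "text", encoding = "UTF-8")) # parse the body already fetched with the token instead of requesting the URL again
# ct2<- fromJSON(content(res2, as = "text", encoding = "UTF-8")) ## if using json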
str(ct2)

@schafnoir schafnoir commented Mar 7, 2020

This is very helpful - thanks for sharing! By chance, do you have code for retrieving original format images from an EpiCollect5 project in addition to form data? I tried the following, but got a '400' error:

testPhoto <- "photoName.jpg"
projectSlug <- "my-private-epi-project"
imageURL <- paste0("https://five.epicollect.net/api/export/media/", projectSlug, "?type=photo?format=entry_original&name=", testPhoto)

httr::GET(imageURL, add_headers("Authorization" = paste("Bearer", token)), write_disk(path = "photoName.jpg", overwrite = TRUE))

Owner Author

@mirko77 mirko77 commented Mar 8, 2020

@schafnoir schafnoir commented Mar 20, 2020

Thank you @mirko77 - I studied the examples you linked above prior to posting my original question, and am still unable to successfully download an image from a private EpiCollect project using the R httr library. Interestingly, using your code above I am able to successfully download data into R after generating a clientID, clientSecret, and authorization token. However, when I try and retrieve an image I receive a 'Status = 400' response when I use the code I pasted above. I must be missing something in the JavaScript and PHP examples you linked because I cannot tell from those examples where I am going wrong. The relevant output from the httr::GET() call is:
Date: [today's date and time]
Status: 400
Content-Type: application/vnd.api+json; charset=utf-8
Size: ### B

I understand it may not be a priority to support these types of questions here, so please let me know if there is a better place to ask.
Cheers!

Owner Author

@mirko77 mirko77 commented Mar 22, 2020

photoName.jpg is not a valid file name for Epicollect5; files are saved in the following format:
02a48b70-6c46-11ea-bd3f-6fc459407ff6_1584885865.jpg

Look at your entries export to get the right filenames

Try to fork this fiddle and use your credentials. Do not forget to uncomment the auth part https://jsfiddle.net/mirko77/y45brprq/
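
For R users, the advice to take the file names from the entries export could be sketched roughly as below. This is only an illustration: the regular expression is inferred from the single example name above (a UUID, an underscore, a timestamp, then `.jpg`), and `find_photo_names()` and the `photo` column are hypothetical names, not part of Epicollect5.

```r
# Pattern inferred from the example file name above: <UUID>_<timestamp>.jpg
# (an assumption based on one example, not an official specification).
ec5_photo_pattern <- paste0(
  "^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}",
  "_[0-9]+\\.jpg$"
)

# Return every distinct value in the data frame `entries` that matches the pattern,
# whichever column it sits in.
find_photo_names <- function(entries) {
  values <- unlist(lapply(entries, as.character), use.names = FALSE)
  keep <- !is.na(values) & grepl(ec5_photo_pattern, values, ignore.case = TRUE)
  unique(values[keep])
}

# Made-up export with a hypothetical photo column.
entries <- data.frame(
  title = c("tree 1", "tree 2"),
  photo = c("02a48b70-6c46-11ea-bd3f-6fc459407ff6_1584885865.jpg", ""),
  stringsAsFactors = FALSE
)
photo_names <- find_photo_names(entries)
```

With a real export, `entries` would be the data frame read from the CSV (for example `ct1` in the script at the top).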

@schafnoir schafnoir commented Mar 22, 2020

Hi @mirko77 - I do have filenames that are the correct format that I have queried using the code you originally posted above. Sorry I was not clear in my comment that I was simplifying for the sake of the example! I also studied the JavaScript prior to my question two days ago, and just now I tried to fork and test the fiddle you linked above based on your suggestion. I saved what I tested in the fork (I think) and left a few comments as to my thought process, but could not get the JS to work; unfortunately, my JavaScript skills are not that great (I use R for most things).

I will try to give a full picture of what I am doing in R:

  1. First, in EpiCollect via a browser, I created an 'app' and obtained a 'Client ID' and 'Client Secret' as you demonstrate helpfully above.
  2. Second, I define my private project slug and export entries from my private project into R via the API using the token method you demonstrate above. This works no problem. I get a token that I name token that is a very long string of random letters, numbers, and symbols.
  3. Third, to use httr::GET() to obtain a photo, I use this code:

projectSlug <- "meier-cider-trees" # Actual name of my private project
treePhoto <- "7ba0da82-7088-4fe1-88cf-2b86d8d9e8b0_1576430696.jpg" # Actual file name from row 1 of my private project
writeFile <- "export_treePhoto.jpg" # Example name of file written to local destination
imageURL <- paste0("https://five.epicollect.net/api/export/media/", projectSlug, "?type=photo?format=entry_original&name=", treePhoto)
httr::GET(imageURL, add_headers("Authorization" = paste("Bearer", token)), write_disk(path = writeFile, overwrite = TRUE))

I then get the following server response:

Response [https://five.epicollect.net/api/export/media/meier-cider-trees?type=photo?format=entry_original&name=7ba0da82-7088-4fe1-88cf-2b86d8d9e8b0_1576430696.jpg]
Date: 2020-03-22 17:26
Status: 400
Content-Type: application/vnd.api+json; charset=utf-8
Size: 144 B
<ON DISK> export_treePhoto.jpg

As you can see, I think my file names are the correct format, and the Status: 400 message indicates a bad request that is either not understandable or is missing required parameters. So far, I have not been able to figure out either a) what is not understandable in the URL I am sending, or b) what parameters are missing from the httr::GET() request.

I hope that I have been clearer, and thank you for your time so far!
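
When a request comes back with a 400 like the one above, the response body usually says more than the status line. A minimal sketch for surfacing it with httr follows; `get_or_explain()` is a hypothetical helper name, and what the server puts in the body is not something this thread confirms.

```r
# Fetch a URL with a bearer token. On an HTTP error, stop with the status code
# and whatever text the server returned in the response body.
get_or_explain <- function(url, token) {
  res <- httr::GET(url, httr::add_headers(Authorization = paste("Bearer", token)))
  if (httr::http_error(res)) {
    body <- httr::content(res, as = "text", encoding = "UTF-8")
    stop("Request failed with status ", httr::status_code(res), ": ", body,
         call. = FALSE)
  }
  res
}
```

Usage would be `res <- get_or_explain(imageURL, token)`. In the call above, `write_disk()` was used, so the 144-byte body shown as `<ON DISK>` was written to export_treePhoto.jpg; opening that file in a text editor should show the same message.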

Owner Author

@mirko77 mirko77 commented Mar 23, 2020

I am not an R expert, but if you can get the image for a public project (just set up a test one), then an error on the private one is just an authentication issue.

@StephPeriquet StephPeriquet commented Nov 7, 2020

@schafnoir I'm doing the exact same thing as you (retrieving media from a private project), and followed the same steps.
Using the exact same code as you did, I got the same error on export.
But I found a typo in your code above for imageURL. You need to replace ?type=photo?format with ?type=photo&format and it works!

Hope it helps!

Best,
Steph
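
One way to avoid this kind of separator slip is to build the query string from a named vector rather than typing `?` and `&` by hand. The sketch below uses the endpoint and parameters quoted in this thread; `media_url()` and `download_photo()` are hypothetical helper names.

```r
# Build the media export URL; "?" and "&" are inserted programmatically.
media_url <- function(project_slug, file_name, format = "entry_original") {
  query <- c(type = "photo", format = format, name = file_name)
  pairs <- paste0(
    names(query), "=",
    vapply(query, utils::URLencode, character(1), reserved = TRUE)
  )
  paste0("https://five.epicollect.net/api/export/media/", project_slug,
         "?", paste(pairs, collapse = "&"))
}

# Download one photo to disk using a bearer token (not run here).
download_photo <- function(project_slug, file_name, token, dest = file_name) {
  httr::GET(
    media_url(project_slug, file_name),
    httr::add_headers(Authorization = paste("Bearer", token)),
    httr::write_disk(path = dest, overwrite = TRUE)
  )
}

imageURL <- media_url("meier-cider-trees",
                      "7ba0da82-7088-4fe1-88cf-2b86d8d9e8b0_1576430696.jpg")
```

`download_photo("meier-cider-trees", treePhoto, token, dest = "export_treePhoto.jpg")` would then mirror the original call with the corrected URL.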

@schafnoir schafnoir commented Nov 8, 2020

@StephPeriquet - I believe you are correct about the misplaced ? in my code, thank you for the eagle eyes! Unfortunately, I am unable to test since I am now receiving a '404' error when I try to query for data entries using code that worked a couple weeks ago, and I need the data entries to get the image file names. Not sure what's going on, but one step forward and a few back, it seems.

Cheers,
Courtney

@StephPeriquet StephPeriquet commented Nov 10, 2020

Hi Courtney,
Sorry to hear that... I download the CSV to my machine and then use it to extract the image names; might that be a solution?

Best,
Steph

@schafnoir schafnoir commented Nov 12, 2020

@mirko77 - I wanted to let you know that in the code chunk above, the following no longer works due to the new per_page=1000 limit imposed on the API:

url.branch<- paste("https://five.epicollect.net/api/export/entries/", proj.slug, "?map_index=0&branch_ref=", branch.ref, "&format=json&per_page=1000000", sep= "")

Cheers,
Courtney

Owner Author

@mirko77 mirko77 commented Nov 12, 2020

per_page=1000000 would crash even a Google server, just get the data in multiple requests using the page parameter.
You can also reduce the number of entries on each request by using the filter_from and filter_to parameters, to get just the new entries.
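
Fetching the data in multiple requests with the page parameter might look roughly like this in R. It is a sketch under assumptions: that a page past the last one comes back with no entries, and that the parsed JSON keeps the rows under `data$entries`. `fetch_all_pages()` and `make_entries_fetcher()` are hypothetical helper names.

```r
# Collect pages until one comes back empty. `fetch_page` is any function that
# takes a page number and returns a data frame of entries for that page.
fetch_all_pages <- function(fetch_page, max_pages = 1000) {
  pages <- list()
  for (page in seq_len(max_pages)) {
    entries <- fetch_page(page)
    if (is.null(entries) || NROW(entries) == 0) break
    pages[[page]] <- entries
  }
  do.call(rbind, pages)
}

# One possible `fetch_page` for the entries export endpoint (not run here).
# Assumption: the rows sit under data$entries in the parsed response.
make_entries_fetcher <- function(proj.slug, form.ref, token, per_page = 1000) {
  function(page) {
    url <- paste0("https://five.epicollect.net/api/export/entries/", proj.slug,
                  "?map_index=0&form_ref=", form.ref,
                  "&format=json&per_page=", per_page, "&page=", page)
    res <- httr::GET(url, httr::add_headers(Authorization = paste("Bearer", token)))
    httr::stop_for_status(res)
    parsed <- jsonlite::fromJSON(httr::content(res, as = "text", encoding = "UTF-8"))
    parsed$data$entries
  }
}

# Offline demonstration with a stub that serves three pages of two rows each.
stub_page <- function(page) {
  if (page <= 3) data.frame(id = (page - 1) * 2 + 1:2) else data.frame(id = integer(0))
}
all_rows <- fetch_all_pages(stub_page)
```

Against the real API the call would be `fetch_all_pages(make_entries_fetcher(proj.slug, form.ref, token))`. Entries with nested fields may need flattening before `rbind()` can combine the pages.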

@schafnoir schafnoir commented Nov 12, 2020

@mirko77 - no argument from me as to the wisdom of paginated requests! The code I quoted immediately above came from your example at the top of the page, and all I was pointing out is that it doesn't work, so you might want to update the example :-)

Owner Author

@mirko77 mirko77 commented Nov 12, 2020

Ah ok, understood.

I did not write the code myself ;)

Owner Author

@mirko77 mirko77 commented Nov 12, 2020

@schafnoir thanks for reporting it, now it should be fixed!
