@oevans
Last active April 16, 2024 18:37
Python script to pull hosted features with attachments into a local file geodatabase. See ReadMe below.
  1. Create a local file geodatabase to hold the data and attachments you want to download from ArcGIS Online (called data.gdb in the script).

  2. Create a feature class (called myLayer in the script), enable attachments, and add GlobalIDs.

  3. Add the following field to the feature class: GlobalID_str, text, length: 50

  4. Create a table called MatchTable (the script uses the same name).

  5. Add the following fields to the MatchTable table:

    • GlobalID_Str, text, length: 50
    • PhotoPath, text, length: 255
  6. Enable "sync" on hosted feature service (http://resources.arcgis.com/en/help/arcgisonline/index.html#//010q000000n0000000)

  7. Open the AGOL_pullFeatures script in a text editor and modify the following:

    • ArcGIS Online username/password
    • REST URL of the feature service to pull from
    • path to and name of the local file geodatabase
    • fields to pull from the hosted feature service (must match the local feature class)
    • name of the local feature class (this will hold the data from the hosted service and the attachments)

import os, urllib, urllib2, datetime, arcpy, json

## ============================================================================== ##
## function to update a field - basically converts longs to dates for date fields ##
## since json has dates as a long (milliseconds since unix epoch) and geodb wants ##
## a proper date, not a long.                                                     ##
## ============================================================================== ##
def updateValue(row, field_to_update, value):
    # "fields" is the module-level list built by arcpy.ListFields further down
    outputfield = next((f for f in fields if f.name == field_to_update), None)  # find the output field
    if outputfield is None or value is None:  # exit if no field found or empty (null) value passed in
        return
    if outputfield.type == 'Date':
        if value > 0:  # filter "zero" dates
            # convert to date - this is local time; to use utc time,
            # change "fromtimestamp" to "utcfromtimestamp"
            value = datetime.datetime.fromtimestamp(value/1000)
            row.setValue(field_to_update, value)
    else:
        row.setValue(field_to_update, value)
    return
## ============================================================================== ##

### Generate Token ###
gtUrl = 'https://www.arcgis.com/sharing/rest/generateToken'
gtValues = {'username' : 'agol_user',
            'password' : 'agol_password',
            'referer' : 'http://www.arcgis.com',
            'f' : 'json' }
gtData = urllib.urlencode(gtValues)
gtRequest = urllib2.Request(gtUrl, gtData)
gtResponse = urllib2.urlopen(gtRequest)
gtJson = json.load(gtResponse)
token = gtJson['token']

### Create Replica ###
### Update service url HERE ###
crUrl = 'http://services.arcgis.com/xyz123456orgid/arcgis/rest/services/myFeatures/FeatureServer/CreateReplica'
crValues = {'f' : 'json',
            'layers' : '0',
            'returnAttachments' : 'true',
            'token' : token }
crData = urllib.urlencode(crValues)
crRequest = urllib2.Request(crUrl, crData)
crResponse = urllib2.urlopen(crRequest)
crJson = json.load(crResponse)
replicaUrl = crJson['URL']
urllib.urlretrieve(replicaUrl, 'myLayer.json')

### Get Attachment ###
cwd = os.getcwd()
with open('myLayer.json') as data_file:
    data = json.load(data_file)

for x in data['layers'][0]['attachments']:
    gaUrl = x['url']
    gaFolder = cwd + '\\photos\\' + x['parentGlobalId']
    if not os.path.exists(gaFolder):
        os.makedirs(gaFolder)
    gaName = x['name']
    gaValues = {'token' : token }
    gaData = urllib.urlencode(gaValues)
    urllib.urlretrieve(url=gaUrl + '/' + gaName, filename=os.path.join(gaFolder, gaName), data=gaData)

### Create Features ###
rows = arcpy.InsertCursor(cwd + '/data.gdb/myLayer')
fields = arcpy.ListFields(cwd + '/data.gdb/myLayer')

for cfX in data['layers'][0]['features']:
    pnt = arcpy.Point()
    pnt.X = cfX['geometry']['x']
    pnt.Y = cfX['geometry']['y']
    row = rows.newRow()
    row.shape = pnt

    ### Set Attribute columns HERE ###
    ## makes use of the "updateValue" function to deal with dates ##
    updateValue(row, 'Field1', cfX['attributes']['Field1'])
    updateValue(row, 'Field2', cfX['attributes']['Field2'])
    updateValue(row, 'Field3', cfX['attributes']['Field3'])
    updateValue(row, 'Field4', cfX['attributes']['Field4'])
    # leave GlobalID out - you cannot edit this field in the destination geodb

    # comment out below fields if you don't have them in your online or destination geodb (editor tracking)
    updateValue(row, 'CreationDate', cfX['attributes']['CreationDate'])
    updateValue(row, 'Creator', cfX['attributes']['Creator'])
    updateValue(row, 'EditDate', cfX['attributes']['EditDate'])
    updateValue(row, 'Editor', cfX['attributes']['Editor'])

    updateValue(row, 'GlobalID_str', cfX['attributes']['GlobalID'])

    rows.insertRow(row)

del row
del rows

### Add Attachments ###
### Create Match Table ###
rows = arcpy.InsertCursor(cwd + '/data.gdb/MatchTable')

for cmtX in data['layers'][0]['attachments']:
    row = rows.newRow()

    row.setValue('GlobalID_Str', cmtX['parentGlobalId'])
    row.setValue('PhotoPath', cwd + '\\photos\\' + cmtX['parentGlobalId'] + '\\' + cmtX['name'])

    rows.insertRow(row)

del row
del rows

### Add Attachments ###
arcpy.AddAttachments_management(cwd + '/data.gdb/myLayer', 'GlobalID_Str', cwd + '/data.gdb/MatchTable', 'GlobalID_Str', 'PhotoPath')
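
The date conversion done by updateValue can be tried on its own, without arcpy. The following is a minimal sketch, not part of the original script: it is written for Python 3 (the script above targets Python 2), and the helper name `epoch_ms_to_datetime` is hypothetical. It adds a timedelta to the Unix epoch instead of calling `fromtimestamp`, so the millisecond part is kept and the result is in UTC rather than local time.

```python
import datetime

# Unix epoch as an aware UTC datetime
EPOCH = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)


def epoch_ms_to_datetime(value):
    """Convert milliseconds since the Unix epoch to an aware UTC datetime.

    Returns None for null or non-positive values, mirroring the
    "zero date" filter in updateValue. (Hypothetical helper.)
    """
    if value is None or value <= 0:
        return None
    return EPOCH + datetime.timedelta(milliseconds=value)


if __name__ == "__main__":
    print(epoch_ms_to_datetime(1381795200000))
```

Whether the destination field should receive a naive or an aware datetime is not covered here; calling `.replace(tzinfo=None)` on the result would give a naive UTC value if one is needed.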
@micahwilli

Is the resultant file a feature class within the local GDB or is it a JSON file?

@DeSpawn

DeSpawn commented Oct 16, 2013

Any chance we could get the ReadMe documentation updated? Step 6 is too vague, and the # comments in the .py are neither descriptive nor clear enough. How to add the fields is also unclear.

@anewberry1983

I'm trying to use this script and have populated all of the values mentioned in the ReadMe but I keep getting this error when I try to run it:

line 39, in <module>
replicaUrl = crJson['URL']
KeyError: 'URL'

I'm not a heavy Python user, so can anyone help me out with what this means or next steps to take?

@rcsellman

I too am trying to use this script but am not having success - getting the exact same error:

replicaUrl = crJson['URL']
KeyError: 'URL'

Have you been able to figure out what this means?

@maphew

maphew commented Jun 17, 2014

It's now possible to grab a feature service, its attachments, and its relationship class all in one go from the server itself as a file-gdb, meaning I don't think this script is needed any more.

Point browser to http://services.arcgis.com/{xxx123456xxx}/arcgis/rest/services/{folder_name}/FeatureServer//createReplica

Set the values as below; unlisted items can keep their defaults. Click the resulting “statusURL”, then “Result Url”, save the zip file wherever you like, and extract a file-gdb with everything intact (except symbology).

Replica Name                  some_meaningful_name
Layers                        0,3 (select by index number)
Return Attachments            TRUE
Return Attachments by Url     TRUE
Create Replica Asynchronously TRUE
Sync                          None
Data Format                   FileGDB

Source: http://forums.arcgis.com/threads/60863-Export-Feature-Service-with-Attachments
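
The same form values could also be sent from a script instead of the browser. The sketch below only builds and URL-encodes the request body (Python 3). The keys `f`, `layers`, `returnAttachments` and `token` appear in the script above; the remaining key names (`replicaName`, `returnAttachmentsDataByUrl`, `async`, `syncModel`, `dataFormat`) are my assumed REST equivalents of the form labels and should be checked against the createReplica documentation before use. The function name is hypothetical.

```python
from urllib.parse import urlencode


def build_replica_params(token, layers="0", replica_name="some_meaningful_name"):
    """Form fields for an asynchronous FileGDB createReplica request.

    Key names other than f/layers/returnAttachments/token are assumptions
    derived from the form labels listed above.
    """
    return {
        "f": "json",
        "replicaName": replica_name,
        "layers": layers,                      # e.g. "0,3", by index number
        "returnAttachments": "true",
        "returnAttachmentsDataByUrl": "true",  # assumed key name
        "async": "true",                       # assumed key name
        "syncModel": "none",                   # assumed key name
        "dataFormat": "filegdb",               # assumed key name
        "token": token,
    }


# Encoded POST body, ready to pass as the data argument of a request
body = urlencode(build_replica_params("TOKEN", layers="0,3")).encode("ascii")
```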

@stuartalexandersanders

This script is great but when I run it on a task scheduler I sometimes get this error.

Traceback (most recent call last):
  File ":_******_*******_.py", line 259, in
    crResponse = urllib2.urlopen(crRequest)
  File "C:\Python27\ArcGIS10.2\lib\urllib2.py", line 127, in urlopen
    return _opener.open(url, data, timeout)
  File "C:\Python27\ArcGIS10.2\lib\urllib2.py", line 410, in open
    response = meth(req, response)
  File "C:\Python27\ArcGIS10.2\lib\urllib2.py", line 523, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Python27\ArcGIS10.2\lib\urllib2.py", line 448, in error
    return self._call_chain(*args)
  File "C:\Python27\ArcGIS10.2\lib\urllib2.py", line 382, in _call_chain
    result = func(*args)
  File "C:\Python27\ArcGIS10.2\lib\urllib2.py", line 531, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
HTTPError: HTTP Error 500: Internal Server Error

I have noticed that it seems to happen when I'm trying to pull in data that has been edited offline. Any help would be greatly appreciated.
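
For a scheduled job, one common workaround for an intermittent HTTP 500 is to retry the request a few times with a growing pause. This is only a sketch (Python 3) and does not address whatever causes the server error; the function name and its parameters are hypothetical.

```python
import time


def call_with_retries(func, attempts=3, delay_seconds=5.0,
                      retry_on=(Exception,), sleep=time.sleep):
    """Call func() until it succeeds or the attempts run out.

    Waits delay_seconds * attempt_number between tries and re-raises the
    last error if every attempt fails. (Hypothetical helper.)
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on as err:
            last_error = err
            if attempt < attempts:
                sleep(delay_seconds * attempt)
    raise last_error
```

In Python 3 the request itself could be wrapped as `call_with_retries(lambda: urllib.request.urlopen(request), retry_on=(urllib.error.HTTPError,))`.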

Thank you

Alex

@scottfierro

Didn't see a response here but for the answer to

replicaUrl = crJson['URL']
KeyError: 'URL'

It's a copy/paste oversight. When you copy the REST service URL into the Python script, it has everything you need up to /FeatureServer/, but you then have to append CreateReplica to the end of it. Alternatively, keep the URL that is already in the script and replace only /xyz123456orgid/ and the service name with your own values:

http://services.arcgis.com/xyz123456orgid/arcgis/rest/services/myFeatures/FeatureServer/CreateReplica
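
A small guard can make this mistake easier to spot. The sketch below (Python 3, hypothetical helper names) appends CreateReplica when it is missing and, instead of failing with a bare KeyError, raises an error that shows what the server actually returned. It makes no assumption about the shape of that response beyond the 'URL' key the script already reads.

```python
def create_replica_url(service_url):
    """Return service_url with /CreateReplica appended if it is missing."""
    base = service_url.rstrip("/")
    if base.lower().endswith("/createreplica"):
        return base
    return base + "/CreateReplica"


def replica_url_from_response(cr_json):
    """Return cr_json['URL'], or raise an error that shows the response."""
    if "URL" not in cr_json:
        raise ValueError("No 'URL' key in response: %r" % (cr_json,))
    return cr_json["URL"]
```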

@bnadler

bnadler commented May 8, 2015

This script does the same, but uses ArcRest and is a little cleaner: http://anothergisblog.blogspot.com/2014/10/using-arcrest-to-export-hosted-feature.html

@GISGURUZ

I have a script that is supposed to extract features from AGOL. The script runs without any errors, but I don't get any output in the geodatabase I created specifically for this purpose, and the attachments are not added to the features either. I noticed that the script drops a .json file in my folder; when I tried to convert this .json file to a shapefile, it threw the error below:

arcpy.JSONToFeatures_conversion("C:\Users\KUNLE\Desktop\New folder (2)\PropertyEnumerationForm.json","Sample")

Runtime error
Traceback (most recent call last):
  File "", line 1, in
  File "c:\program files (x86)\arcgis\desktop10.2\arcpy\arcpy\conversion.py", line 434, in JSONToFeatures
    raise e
ExecuteError: ERROR 001558: Error parsing json file 'C:\Users\KUNLE\Desktop\New folder (2)\PropertyEnumerationForm.json'. Failed to execute (JSONToFeatures).

Below is my script

gtUrl = 'https://www.arcgis.com/sharing/rest/generateToken'
gtValues = {'username' : 'XPARCELS',
'password' : '@3$%678',
'referer' : 'http://www.arcgis.com',
'f' : 'json' }
gtData = urllib.urlencode(gtValues)
gtRequest = urllib2.Request(gtUrl, gtData)
gtResponse = urllib2.urlopen(gtRequest)
gtJson = json.load(gtResponse)
token = gtJson['token']

    ### Create Replica ###
    ### Update service url HERE ###
    crUrl = 'http://services5.arcgis.com/b23aIR8OqGRlfJlE/arcgis/rest/services/PropertyEnumerationForm/FeatureServer/CreateReplica'


    crValues = {'f' : 'json',
    'layers' : '0',
    'returnAttachments' : 'true',
    'token' : token }
    crData = urllib.urlencode(crValues)
    crRequest = urllib2.Request(crUrl, crData)
    crResponse = urllib2.urlopen(crRequest)
    crJson = json.load(crResponse)
    replicaUrl = crJson['URL']
    urllib.urlretrieve(replicaUrl, 'PropertyEnumerationForm.json')

    ### Get Attachment ###
    cwd = os.getcwd()
    with open('PropertyEnumerationForm.json') as data_file:
        data = json.load(data_file)

    for x in data['layers'][0]['attachments']:
        gaUrl = x['url']
        gaFolder = cwd + '\\photos\\' + x['parentGlobalId']
    if not os.path.exists(gaFolder):
        os.makedirs(gaFolder)
        gaName = x['name']
        gaValues = {'token' : token }
        gaData = urllib.urlencode(gaValues)
        urllib.urlretrieve(url=gaUrl + '/' + gaName, filename=os.path.join(gaFolder, gaName),data=gaData)

    ### Create Features ###
    rows = arcpy.InsertCursor(cwd + '/data.gdb/PropertyEnumerationForm')
    fields = arcpy.ListFields(cwd + '/data.gdb/PropertyEnumerationForm')

    for cfX in data['layers'][0]['features']:
        pnt = arcpy.Point()
        pnt.X = cfX['geometry']['x']
        pnt.Y = cfX['geometry']['y']
        row = rows.newRow()
        row.shape = pnt

    ### Set Attribute columns HERE ###
    ## makes use of the "updatevalue function to deal with dates ##
    rows = arcpy.UpdateCursor("C:\Users\KUNLE\Desktop\TaxParcels\data.gdb\PropertyEnumerationForm")
    for row in rows:
        #row.setValue("BUFFER_DISTANCE", row.getValue("ROAD_TYPE") * 100)
        updateValue(row,'UNITNO', cfX['attributes']['UNITNO'])
        updateValue(row,'PROPERTYCLASSCODE', cfX['attributes']['PROPERTYCLASSCODE'])
        updateValue(row,'OWNERNAME1', cfX['attributes']['OWNERNAME1'])
        updateValue(row,'BUILDINGSTRUCTURETYPE', cfX['attributes']['BUILDINGSTRUCTURETYPE'])        
        updateValue(row,'BUILDINGARRANGEMENT', cfX['attributes']['BUILDINGARRANGEMENT'])
        updateValue(row,'BUILDINGGROUPING', cfX['attributes']['BUILDINGGROUPING'])
        updateValue(row,'PHONENUMBER1', cfX['attributes']['PHONENUMBER1'])
        updateValue(row,'EMAILADDRESS', cfX['attributes']['EMAILADDRESS'])
        updateValue(row,'SUBORCONDO', cfX['attributes']['SUBORCONDO'])
        updateValue(row,'NOOFBUILDINGS', cfX['attributes']['NOOFBUILDINGS'])
        updateValue(row,'BUILDINGNUMBER', cfX['attributes']['BUILDINGNUMBER'])
        updateValue(row,'NUMBEROFUNITS', cfX['attributes']['NUMBEROFUNITS'])
        updateValue(row,'BUILDINGUNITNUMBER', cfX['attributes']['BUILDINGUNITNUMBER'])
        updateValue(row,'NUMBEROFFLOORS', cfX['attributes']['NUMBEROFFLOORS'])
        updateValue(row,'FLOORNUMBER', cfX['attributes']['FLOORNUMBER'])
        updateValue(row,'OWNERTYPE', cfX['attributes']['OWNERTYPE'])
        updateValue(row,'BUILDINGSTATUS', cfX['attributes']['BUILDINGSTATUS'])
        updateValue(row,'HasChildren', cfX['attributes']['HasChildren'])
        updateValue(row,'HASWATERPROVIDER', cfX['attributes']['HASWATERPROVIDER'])
        updateValue(row,'HASSEWERPROVIDER', cfX['attributes']['HASSEWERPROVIDER'])
        updateValue(row,'DATECREATED', cfX['attributes']['DATECREATED'])    
        updateValue(row,'CreationDate', cfX['attributes']['CreationDate'])
        updateValue(row,'Creator', cfX['attributes']['Creator'])
        updateValue(row,'EditDate', cfX['attributes']['EditDate'])
        updateValue(row,'Editor', cfX['attributes']['Editor'])
        updateValue(row,'GlobalID_str', cfX['attributes']['GlobalID'])

        rows.insertRow(row)

    del row
    del rows

    ### Add Attachments ###
    ### Create Match Table ###
    rows = arcpy.InsertCursor(cwd + '/data.gdb/MatchTable')

    for cmtX in data['layers'][0]['attachments']:
        row = rows.newRow()

        row.setValue('GlobalID_Str', cmtX['parentGlobalId'])
        row.setValue('PhotoPath', cwd + '\\photos\\' + cmtX['parentGlobalId'] + '\\' + cmtX['name'])

        rows.insertRow(row)

    del row
    del rows

    ### Add Attachments ###
    arcpy.AddAttachments_management(cwd + '/data.gdb/PropertyEnumerationForm', 'GlobalID_Str', cwd + '/data.gdb/MatchTable', 'GlobalID_Str', 'PhotoPath')

Kindly assist me with the above, as I've tried all possible combinations to get the script to run successfully. Waiting for your response. Thanks

@stuartalexandersanders

Owen,

There seems to be a new issue with this script for the updates to ArcGIS Online and the hosted feature services that occurred last week. The script connects to the REST and successfully creates the json file from the CreateReplica function in the REST. I can re-create this process manually in the CreateReplica GUI and pull down the json file.

The issue seems to be that once the json file is written to disk with urllib.urlretrieve(replicaUrl, 'myLayer.json'), the resulting file looks more like an HTML file than a JSON file. When I run the script now, I get an error that the ArcGIS Python json library can't decode the file.

Thank you for any assistance.

Alex

@gEYEzer

gEYEzer commented Mar 31, 2016

@stuartalexandersanders - I got this running by adding '?token=' + token to the retrieve call.
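
In the script above, the retrieve call is `urllib.urlretrieve(replicaUrl, 'myLayer.json')`, so the change presumably amounts to `urllib.urlretrieve(replicaUrl + '?token=' + token, 'myLayer.json')`. That reading is an assumption; the comment does not show the modified line. A slightly more defensive sketch (Python 3, hypothetical helper name) URL-encodes the token and also copes with a URL that already has a query string:

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def with_token(url, token):
    """Return url with a token=<token> query parameter appended."""
    parts = urlsplit(url)
    extra = urlencode({"token": token})
    # Join with "&" only when the URL already carries a query string
    query = parts.query + "&" + extra if parts.query else extra
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, parts.fragment))
```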

@Elamunyon

I am also getting the JSON decoding error. Where does '?token=' + token get added in the script?

@gfausel

gfausel commented Dec 22, 2016

Wondering if someone could help. A consultant built this code for a project and it was running fine until a couple of months ago. The error I am getting is:
Traceback (most recent call last):
  File "C:\Users\gfausel\Desktop\NJTPA_ETL_Desktop.py", line 380, in
    with zipfile.ZipFile(zip, "r") as z:
  File "C:\Python27\ArcGIS10.3\lib\zipfile.py", line 770, in __init__
    self._RealGetContents()
  File "C:\Python27\ArcGIS10.3\lib\zipfile.py", line 811, in _RealGetContents
    raise BadZipfile, "File is not a zip file"
BadZipfile: File is not a zip file.

Here is part of the code where it does the zip:
#Extract the zip file to the temp directory and get the GDB name
with zipfile.ZipFile(zip, "r") as z:
    z.extractall(tempdir)
    ziplist = z.namelist()

    zipname = ziplist[0]
    GDBName = zipname[0:(zipname.find(".gdb")+4)]

Any one getting this error? I do not know python so any help would be much appreciated.
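
One way to narrow this down is to look at what was actually downloaded before handing it to `zipfile.ZipFile`. A possible cause, which is an assumption and not confirmed anywhere in this thread, is that the file on disk is an HTML or JSON error page rather than an archive. The sketch below (Python 3, hypothetical helper name) returns the first bytes of the file when it is not a zip:

```python
import zipfile


def describe_download(path, preview_bytes=200):
    """Return None if path is a zip archive, else the first bytes of the file.

    The preview usually shows whether the server sent an HTML or JSON
    message instead of the expected archive. (Hypothetical helper.)
    """
    if zipfile.is_zipfile(path):
        return None
    with open(path, "rb") as handle:
        return handle.read(preview_bytes)
```

Printing the returned preview just before the `with zipfile.ZipFile(...)` line would show what the server sent.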

@MKellyEsri

If you are looking to sync hosted Feature Layer attachments on disk (so you can view images/documents/etc. outside of ArcGIS), you can check out the Downloading Feature Layer Attachments via the ArcGIS API for Python script on GeoNet.

@geospatialology

geospatialology commented Oct 18, 2017

@MKellyEsri thanks for your script! Should probably post this comment over on Esri Developer where you pushed it, but figured I'd put it here since the comment is related to attachments and I have a question about the JSON also. Any chance you're willing to modify your code to handle the establishment of local attachments to features? Your code is nice to download all the attachments, but I think it's a more common user need to download the attachments and associate them with features that you have downloaded. I think users commonly need to grab their features and associated attachments off of AGOL to do things with them in Desktop GIS and on PC. At least that's a really common need of mine. Also, I'm pretty sure my environment is configured correctly, and I had to comment everything about display and logger out to get your code to work.

@gEYEzer - how did you modify your JSON call? I've been tinkering and failing so far. Trying to avoid completely deconstructing the requests portion of the script to understand the issue...

@mounica144

It's now possible to grab a feature service, its attachments, and its relationship class all in one go from the server itself as a file-gdb, meaning I don't think this script is needed any more.

Point browser to http://services.arcgis.com/{xxx123456xxx}/arcgis/rest/services/{folder_name}/FeatureServer//createReplica

Set the values as below; unlisted items can keep their defaults. Click the resulting “statusURL”, then “Result Url”, save the zip file wherever you like, and extract a file-gdb with everything intact (except symbology).

Replica Name                  some_meaningful_name
Layers                        0,3 (select by index number)
Return Attachments            TRUE
Return Attachments by Url     TRUE
Create Replica Asynchronously TRUE
Sync                          None
Data Format                   FileGDB

Source: http://forums.arcgis.com/threads/60863-Export-Feature-Service-with-Attachments

Hello,

I did a lot of searching on the Internet but did not find proper documentation on replicas. I would like to know what happens to the created replicas. I want to export feature services to a GDB on a daily basis (there are almost 40 feature services).
So can I create a replica every day and then get the GDB?
After creating replicas with the options you mentioned, I'm able to download the GDB, but I cannot see the replicas, so I do not know how to remove them because I can't find a replica ID.
My main concern is that we would be creating replicas every day that we never use except on the day we download the GDBs.
Can you please help me with these questions?

@kathysll

kathysll commented Dec 9, 2022

My notebook does not work in ArcGIS Pro 3.0. Could anyone help me export a hosted feature layer to SDE? I would like to back up the layer in SDE using Python. Thank you very much.
