How to initiate a harvest of a single asset in s3 that has not been previously indexed into the GrayMeta Platform
NOTE: my endpoint is http://localhost:7000. Your endpoint will be different. The commands below also assume $TOKEN holds a valid bearer token.
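Since the endpoint differs per install, it can help to keep it in one shell variable instead of editing every command. This is only a sketch: GM_ENDPOINT and gm_url are names made up for this example, not something the platform provides.

```shell
# Hypothetical names: GM_ENDPOINT and gm_url are illustrative only.
# Default to the endpoint used on this page unless one is already set.
GM_ENDPOINT="${GM_ENDPOINT:-http://localhost:7000}"

# Join the endpoint and an API path, tolerating one trailing slash on the endpoint.
gm_url() {
  printf '%s%s\n' "${GM_ENDPOINT%/}" "$1"
}

# Example (not run here):
#   curl -X GET -H "Authorization: Bearer $TOKEN" "$(gm_url /api/data/containers/enabled)"
```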
You must know the name of the s3 bucket and the s3 key (the path within the bucket) to the asset.
For our example today, I want to harvest the key 5aba86635fc9cbec3a81839abc0fc2a9.jpg inside the hancocktest bucket in s3.
Step 2: list the enabled containers and find the one for your bucket (listing containers is documented separately):
curl -X GET -H "Authorization: Bearer $TOKEN" http://localhost:7000/api/data/containers/enabled
Response:
[
  {
    "HashID": "5a5b26bf52570d1f1db04ab809bacf08",
    "id": "hancocktest",
    "name": "",
    "location_id": "5a9e302c466021cfd8c4d136d12286a0",
    "version": 9,
    "enabled": true,
    "last_harvested": "2018-04-30T15:17:52.868123Z",
    "request_id": "",
    "harvest_success_count": 0,
    "harvest_failure_count": 0,
    "last_harvested_failure": "0001-01-01T00:00:00Z",
    "groups": null
  },
  {
    "HashID": "c69107c85e7a09a2522ae4ac462a67d3",
    "id": "hancock-royaltesting",
    "name": "",
    "location_id": "5a9e302c466021cfd8c4d136d12286a0",
    "version": 3,
    "enabled": true,
    "last_harvested": "2018-03-06T06:33:03.360187Z",
    "request_id": "",
    "harvest_success_count": 0,
    "harvest_failure_count": 0,
    "last_harvested_failure": "0001-01-01T00:00:00Z",
    "groups": null
  }
]
Notice the id element of the first container in the response: it matches our bucket name, hancocktest. There's no guarantee this is an s3 location, though (you could have a container in Azure with the same name), so we need to verify that its location_id refers to an s3 location.
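If you'd rather not pick the location_id out of the response by eye, something like the following could do it. It is a sketch that assumes jq is installed; container_location_id is a name invented for this example. If containers in several locations share the same id, it prints one location_id per line, and each of those still needs checking.

```shell
# Hypothetical helper; assumes jq is installed.
# Reads the JSON array returned by /api/data/containers/enabled on stdin and
# prints the location_id of every container whose id equals $1.
container_location_id() {
  jq -r --arg id "$1" '.[] | select(.id == $id) | .location_id'
}

# Example (not run here):
#   curl -s -X GET -H "Authorization: Bearer $TOKEN" \
#     http://localhost:7000/api/data/containers/enabled | container_location_id hancocktest
```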
Step 3: verify the location is an s3 location (this can be omitted if all of your locations are s3 locations):
curl -X GET -H "Authorization: Bearer $TOKEN" http://localhost:7000/api/data/locations/5a9e302c466021cfd8c4d136d12286a0
Response:
{
  "id": "5a9e302c466021cfd8c4d136d12286a0",
  "kind": "s3",
  "name": "s3test",
  "config": {
    // omitted
  },
  "groups": [],
  "version": 1
}
Notice that the kind in the response is s3, so this is the location we want. We now have the location ID (5a9e302c466021cfd8c4d136d12286a0) and the container id (hancocktest) of the item we want to harvest.
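The kind check can also be scripted. A minimal sketch, again assuming jq is installed; is_s3_location is a made-up name, and it only looks at the kind field shown in the response above.

```shell
# Hypothetical helper; assumes jq is installed.
# Reads a location object (the /api/data/locations/<id> response) on stdin and
# succeeds only if its kind is "s3".
is_s3_location() {
  [ "$(jq -r '.kind')" = "s3" ]
}

# Example (not run here):
#   curl -s -X GET -H "Authorization: Bearer $TOKEN" \
#     http://localhost:7000/api/data/locations/5a9e302c466021cfd8c4d136d12286a0 \
#     | is_s3_location && echo "s3 location"
```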
Step 4: look up the item's stow URL using the location ID, the container id, and the s3 key:
curl -X POST -H "Authorization: Bearer $TOKEN" http://localhost:7000/api/control/item-id -d '{"location_id":"5a9e302c466021cfd8c4d136d12286a0", "container_id":"hancocktest", "item_id":"5aba86635fc9cbec3a81839abc0fc2a9.jpg"}'
Note the poorly named item_id parameter: it is actually the s3 key.
Response:
{
  "stow_url": "s3://https://s3-us-west-2.amazonaws.com/hancocktest/5aba86635fc9cbec3a81839abc0fc2a9.jpg",
  "gm_item_id": "aada26b2ad2c397dc3d6e322caee5d33"
}
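The stow_url in this response is what the harvest request takes as item_stow_url. If your keys contain quotes or other awkward characters, hand-writing the JSON gets error-prone, so here is a sketch that lets jq do the quoting. The helper names are invented for this example and jq is assumed to be installed; the field names are the ones used in the requests on this page.

```shell
# Hypothetical helpers; both assume jq is installed. jq handles the JSON quoting.

# Body for POST /api/control/item-id.
# $1 = location id, $2 = container id, $3 = s3 key (sent as item_id).
item_id_payload() {
  jq -n -c --arg loc "$1" --arg container "$2" --arg key "$3" \
    '{location_id: $loc, container_id: $container, item_id: $key}'
}

# Body for POST /api/control/harvest, built from the item-id response on stdin.
# $1 = location id, $2 = container id.
harvest_payload() {
  jq -c --arg loc "$1" --arg container "$2" \
    '{location_id: $loc, container_id: $container, item_stow_url: .stow_url}'
}

# Example (not run here):
#   item_id_payload "$LOCATION_ID" hancocktest 5aba86635fc9cbec3a81839abc0fc2a9.jpg
```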
Step 5: start the harvest, passing the stow_url from the previous response as item_stow_url:
curl -X POST -H "Authorization: Bearer $TOKEN" http://localhost:7000/api/control/harvest -d '{"location_id":"5a9e302c466021cfd8c4d136d12286a0", "container_id":"hancocktest", "item_stow_url":"s3://https://s3-us-west-2.amazonaws.com/hancocktest/5aba86635fc9cbec3a81839abc0fc2a9.jpg"}'
Response:
{
  "request_id": "5ae7a2d8f4bd5c8f183c1fd5396caea4"
}
If all is well, you'll get a 201 response containing a request_id.
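curl doesn't show the status code by default, so a script has to ask for it. The sketch below uses curl's -w '%{http_code}' to capture the code and -o to keep the body separate. harvest_accepted and start_harvest are names invented here, the endpoint is the one from my setup, and TOKEN must already be set.

```shell
# Hypothetical helpers; the endpoint is the example one from this page.

# Succeed only when the HTTP status code in $1 is 201.
harvest_accepted() {
  [ "$1" = "201" ]
}

# POST the harvest body in $1; print the response body on success, fail otherwise.
# The function body runs in a subshell, so its variables don't leak to the caller.
start_harvest() (
  body_file=$(mktemp)
  status=$(curl -s -o "$body_file" -w '%{http_code}' -X POST \
    -H "Authorization: Bearer $TOKEN" \
    http://localhost:7000/api/control/harvest -d "$1")
  if harvest_accepted "$status"; then
    cat "$body_file"
    rm -f "$body_file"
  else
    echo "harvest request failed with HTTP status $status" >&2
    rm -f "$body_file"
    exit 1
  fi
)

# Example (not run here):
#   start_harvest '{"location_id":"...", "container_id":"...", "item_stow_url":"..."}'
```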