@korydraughn
Last active November 6, 2023 22:05
iRODS C++ REST API v1.0

Can the REST API be improved?

I feel the current REST API does not present a cohesive interface for users, and we need to investigate alternatives. It attempts to be simple, but exposes several options that may lead to incorrect usage or confusion. If we're planning on the REST API being absorbed by the iRODS server, then we'd better make sure it is clean and unambiguous. Otherwise, it could hurt the adoption of iRODS.

An Alternative Approach

I feel a better approach would be to expose the operations for each entity in iRODS under that entity's own path, hence /collections and /data-objects. That means each request carries an operation parameter alongside the path (the parameter isn't shown in the listing below). My hope is that users find this approach easier to understand. For example, if you want to create a collection, look at /collections. If you want to schedule a delay rule, look at /rules.

The listing below presents each URL path along with the operations that it can/should expose. Many of these operations can be translated directly to iRODS API calls, which is a good thing.

  • /auth
  • /collections
    • create
    • remove
    • move
    • list
    • adjust permissions
    • stat (i.e., view perms)
  • /config
    • view
    • modify (later)
  • /data-objects
    • create
    • remove
    • read
    • write
    • move
    • trim
    • replicate
    • adjust permissions
    • stat (i.e., view perms, replicas, etc.)
    • register
    • unregister
  • /metadata
    • add
    • set
    • modify
    • remove
    • list
  • /query
    • run general queries
    • run specific queries
    • list available specific queries
    • list available keywords
  • /resources
    • add
    • modify
    • remove
    • list
    • adjust hierarchies
    • rebalance, etc.
  • /rules
    • execute
    • list
    • remove delay rules
  • /tickets
    • create
    • modify
    • remove
  • /users
    • create
    • modify
    • remove
    • list
  • /groups
    • create
    • modify
    • remove
    • list
  • /zones
    • add
    • modify?
    • remove
    • list
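As a rough sketch of how the "path plus operation parameter" idea could be checked on the server side, the snippet below keeps a table of paths and the operations each one exposes. It is only an illustration: the `OPERATIONS` table, the `route` function, and the spellings of the operation identifiers (e.g. `set_permission`) are assumptions, not part of the proposal, and only a few paths are included.

```python
# Hypothetical routing table: URL path -> operations from the listing above.
# The identifier spellings are illustrative; the proposal does not fix them.
OPERATIONS = {
    "/collections": {"create", "remove", "move", "list", "set_permission", "stat"},
    "/rules": {"execute", "list", "remove_delay_rule"},
    "/tickets": {"create", "modify", "remove"},
}


def route(path, op):
    """Return (path, op) if the path exposes the operation, else raise ValueError."""
    supported = OPERATIONS.get(path)
    if supported is None:
        raise ValueError(f"unknown path: {path}")
    if op not in supported:
        raise ValueError(f"{path} does not support op={op}")
    return path, op
```

With a table like this, a request such as `/tickets?op=list` would be rejected up front, since the listing does not give /tickets a list operation.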

Parallel Transfer

To do parallel transfer using port 1247, the steps are as follows:

  1. Open the first stream.
  2. Capture the replica access token.
  3. Open secondary streams.
    • Each stream must use its own connection
    • They must target the same replica
    • They must use the same open flags
    • They must be opened using the replica access token
  4. Send bytes across streams.
  5. Close secondary streams without updating the catalog.
  6. Close first stream.
    • The first stream is responsible for updates to the catalog
    • The first stream is responsible for triggering policy
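The ordering of those steps can be mimicked with an in-memory stand-in. This is a toy model, not the iRODS client API: `FakeStream`, `parallel_write`, the token string, and the log messages are all made up for illustration, and a shared `bytearray` plays the role of the replica.

```python
class FakeStream:
    """In-memory stand-in for one stream/connection. Not an iRODS API."""

    def __init__(self, replica, token=None):
        self.replica = replica  # shared bytearray playing the replica
        # Only the first stream opens without a token and "mints" one.
        self.token = "replica-token" if token is None else token

    def write_at(self, offset, data):
        self.replica[offset:offset + len(data)] = data

    def close(self, update_catalog):
        return "catalog updated, policy triggered" if update_catalog else "closed quietly"


def parallel_write(data, stream_count):
    if stream_count < 1:
        raise ValueError("stream_count must be at least 1")
    replica = bytearray(len(data))
    first = FakeStream(replica)                      # 1. open the first stream
    token = first.token                              # 2. capture the token
    secondaries = [FakeStream(replica, token)        # 3. open secondary streams
                   for _ in range(stream_count - 1)]
    streams = [first] + secondaries
    chunk = -(-len(data) // stream_count)            # ceiling division
    for i, stream in enumerate(streams):             # 4. send bytes across streams
        stream.write_at(i * chunk, data[i * chunk:(i + 1) * chunk])
    log = [s.close(update_catalog=False) for s in secondaries]  # 5. secondaries first
    log.append(first.close(update_catalog=True))     # 6. first stream closes last
    return bytes(replica), log
```

The point of the model is the close order: every secondary stream closes without touching the catalog, and only the final close of the first stream finalizes the replica.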

A possible translation to REST

First, instruct the server to initialize the transfer state.

/data-objects?op=parallel-write-init&channels=4&lpath=/tempZone/home/rods/f.txt[&dst-resource=some_resc|&replica_number=3]

This returns a handle to state that is specifically needed for the upcoming transfer. That state will be shared across multiple requests. The state may contain things such as: connections, chunk sizes, offsets per stream, the replica access token, etc. The state is never exposed to the client of the REST API. This simplifies the interface.
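A client might assemble the init request shown above along these lines. The helper name `parallel_write_init_url` is hypothetical, and treating `dst-resource` and `replica_number` as mutually exclusive is my reading of the `[...|...]` notation, not something the proposal states outright.

```python
from urllib.parse import urlencode


def parallel_write_init_url(lpath, channels, dst_resource=None, replica_number=None):
    """Build the parallel-write-init URL; the two targeting options are optional."""
    # Assumption: the "[a|b]" notation means at most one of the two may be given.
    if dst_resource is not None and replica_number is not None:
        raise ValueError("pass dst_resource or replica_number, not both")
    params = {"op": "parallel-write-init", "channels": channels, "lpath": lpath}
    if dst_resource is not None:
        params["dst-resource"] = dst_resource
    if replica_number is not None:
        params["replica_number"] = replica_number
    # safe="/" keeps the logical path readable instead of encoding it as %2F.
    return "/data-objects?" + urlencode(params, safe="/")
```

Calling `parallel_write_init_url("/tempZone/home/rods/f.txt", 4, dst_resource="some_resc")` reproduces the example URL with the `dst-resource` option selected.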

For example, the response could resemble the following:

{
  "error_code": 0,
  "transfer_handle": "<UUID>"
}

Now, the client sends chunks that will be written to various locations within the replica. Notice that the content-length header informs the server of the body's length, the body being the bytes to write. It's possible for things such as the offset to be stored in the server. This example means there are always at least two API calls per write.

content-length: 8196

/data-objects?op=write&transfer-handle=UUID&offset=1000
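To make the offset and content-length bookkeeping concrete, here is a small sketch that plans the write requests for a payload of a given size. The `plan_writes` helper and the fixed chunk size are assumptions for illustration; the proposal leaves open whether the client or the server tracks offsets.

```python
def plan_writes(total_size, chunk_size, transfer_handle):
    """Return (url, headers) pairs covering total_size bytes in chunk_size pieces."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    requests = []
    for offset in range(0, total_size, chunk_size):
        # The final chunk may be shorter than chunk_size.
        length = min(chunk_size, total_size - offset)
        url = f"/data-objects?op=write&transfer-handle={transfer_handle}&offset={offset}"
        requests.append((url, {"content-length": str(length)}))
    return requests
```

Each planned request is independent of the others, so a client could spread them across the channels requested at init time.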

Once all bytes are transferred, send a request instructing the server to shut down the transfer.

/data-objects?op=parallel-write-shutdown&transfer-handle=UUID

Each operation returns a JSON response containing an iRODS error code and an error message if available. For example:

{
  "error_code": 0,
  "error_message": ""
}
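On the client side, a response of this shape could be handled with a small helper that raises on any non-zero error code. `IrodsRestError` and `parse_response` are hypothetical names, and the `-1` used in the usage note is a placeholder rather than a real iRODS error code.

```python
import json


class IrodsRestError(Exception):
    """Raised when a response carries a non-zero error_code."""

    def __init__(self, code, message):
        super().__init__(f"error_code={code}: {message}")
        self.code = code
        self.message = message


def parse_response(body):
    """Decode a JSON response body and raise if error_code is non-zero."""
    payload = json.loads(body)
    code = payload["error_code"]
    if code != 0:
        # error_message is only present "if available", so default to "".
        raise IrodsRestError(code, payload.get("error_message", ""))
    return payload
```

For instance, `parse_response('{"error_code": 0, "transfer_handle": "abc"}')` returns the decoded dict, while a body with `"error_code": -1` raises `IrodsRestError`.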
@trel

trel commented Apr 17, 2023

add/set/modify/remove do not work with how the atomic API is meant to be used

I'll request discussion on this point - not sure what you mean here.

Yes, extra 'levels' would mean more work/parsing/routing in the server (later, not necessarily now), but could be more approachable/expected for the programmer / user of the API.

@korydraughn
Author

Ideas:

  • Approval test suite
    • JSON file containing inputs and outputs
    • Libraries built on top of this can reuse the JSON file
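One possible shape for such an approval suite, sketched under assumptions: the JSON layout (`name`, `request`, `expected`), the `lpath` parameter on /collections, and the `run_cases` runner are all hypothetical. The `send` argument is whatever function a given library uses to issue a request and return the decoded JSON response.

```python
import json

# Hypothetical case file contents; a real suite would load this from disk.
CASES_JSON = """
[
  {"name": "create collection",
   "request": {"path": "/collections", "op": "create", "lpath": "/tempZone/home/rods/c"},
   "expected": {"error_code": 0, "error_message": ""}}
]
"""


def run_cases(cases_json, send):
    """Run every case through send() and return the names of the failing cases."""
    failures = []
    for case in json.loads(cases_json):
        actual = send(case["request"])
        if actual != case["expected"]:
            failures.append(case["name"])
    return failures
```

Because the cases are plain data, a client library in any language could read the same file and plug in its own `send`.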

Also, /data-objects needs to handle checksums.

@korydraughn
Author

korydraughn commented Apr 25, 2023

Apparently, HTTP states that custom status codes can be returned as long as they don't conflict with the standardized ones and the class of the status code is maintained. See the following:

Also, according to https://developer.mozilla.org/en-US/docs/Web/HTTP/Status#server_error_responses, HTTP servers are required to implement support for GET and HEAD.
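The custom status code rule mentioned above (stay in the right class, don't collide with a standardized code) could be checked roughly like this. Note the assumption: Python's `http.HTTPStatus` enum is used here as a stand-in for the set of standardized codes, which is an approximation of the official registry, and `is_acceptable_custom_status` is a made-up helper name.

```python
from http import HTTPStatus

# Approximation: treat the codes Python knows about as "standardized".
STANDARD_CODES = {status.value for status in HTTPStatus}


def is_acceptable_custom_status(code, status_class):
    """True if code is in the given class (e.g. 4 for 4xx) and not standardized."""
    in_range = 100 <= code <= 599
    in_class = code // 100 == status_class
    return in_range and in_class and code not in STANDARD_CODES
```

Under this check, 499 would pass as a custom client-error code, while 404 would be rejected because it is already standardized.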
