dmitrizagidulin/service-uris.md

## service-uris.md

      
Context

So, here's the context for this conversation, for those joining in.
The value proposition for DIDs is that they provide stable user-controlled identifiers that can help with two main things: management of cryptographic materials, and portability of service endpoints.
With crypto material management, the idea is that a user's identifier can stay the same while underneath, keys and the like can change -- be added, rotated, revoked, and so on.
Similarly, with portability of service endpoints, the idea is that some sort of DID-based URI stays the same for a given service, and meanwhile the user can migrate from one service to another (and that stable DID-based URI for a service does not have to change).
So for a trivial example, say that a user has a service that stores their user profile picture. And currently, they're storing it at: https://facebook.com/img/userpic.jpg (not a real url, btw). The moment that they want to change providers for that service, all of their existing URLs (for example, those stored in contacts lists) that point to that userpic break.
What you want to be able to do instead, is to use a service URI based on a DID, so the URL above now becomes something like:
<some DID-based URI for service of type #user_pictures>/img/userpic.jpg
And then inside that DID Document, in the service: section, the user can swap out the actual endpoint for a particular service, without changing the service URI.
So for example, if Alice's DID was did:example:123, and she started out with the following DID Doc section:
"service": [
  {
    "id": "#user_pictures",  // <- note that the format of this ID is under discussion
    "serviceEndpoint": "https://facebook.com"
  }
]
The userpic URI would be something like <something based on did:example:123><service with ID #user_pictures>/img/userpic.jpg.
When this URI is passed to a DID Resolver, it automatically resolves to: https://facebook.com/img/userpic.jpg (because the resolver fetches the DID Document for did:example:123, looks in the service section, finds the service with the id #user_pictures, and reads its serviceEndpoint property).
And later, when she migrated to another userpic service, the contents of the DID Document would change to:
"service": [
  {
    "id": "#user_pictures",  // <- still the same as previously
    "serviceEndpoint": "https://new-friendster.com"
  }
]
The picture's URI would remain the same, <something based on did:example:123><service with ID #user_pictures>/img/userpic.jpg
But now, a resolver would translate that URI to: https://new-friendster.com/img/userpic.jpg
So that's the general idea - to enable DID-aware apps to use stable URIs for well known service types, and those URIs would remain the same even if the user migrated to a different service.
(And yes, this whole setup does depend on the path part of the URI staying the same from service to service. But there's actually a surprising number of use cases where that would be true.)
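To make the lookup concrete, here is a minimal sketch of the translation step. It assumes the DID Document has already been fetched and parsed into a dict, and that the service id and path have already been separated out of the URI (how to do that is exactly what's under discussion). The function name `resolve_service_url` is made up for illustration.

```python
def resolve_service_url(did_document: dict, service_id: str, path: str) -> str:
    """Translate a (service id, path) pair into a concrete URL.

    Assumes did_document is the already-fetched DID Document as a dict.
    """
    for service in did_document.get("service", []):
        if service.get("id") == service_id:
            # Join the current endpoint with the stable path.
            return service["serviceEndpoint"].rstrip("/") + path
    raise LookupError(f"no service with id {service_id!r}")


# Alice's DID Document before and after she migrates providers.
doc_before = {"service": [{"id": "#user_pictures",
                           "serviceEndpoint": "https://facebook.com"}]}
doc_after = {"service": [{"id": "#user_pictures",
                          "serviceEndpoint": "https://new-friendster.com"}]}

print(resolve_service_url(doc_before, "#user_pictures", "/img/userpic.jpg"))
print(resolve_service_url(doc_after, "#user_pictures", "/img/userpic.jpg"))
```

The inputs to the lookup are identical in both calls; only the DID Document changed.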
The Arguments So Far

The argument is about two things:

How do we form the ids of each entry in the DID Doc's service section (this is actually issue #97), and
How do we define the structure of the service URIs, so that we can pass them to resolvers, which then translate those URIs to specific service endpoints (that's this issue).

Note that these are two separate topics. Let's examine the second one, the format of service URIs, because I will argue that the solution to the first issue should remain the same, regardless of what we decide.
Service URI Format

Each proposed approach needs to address two main things:

How do you signal to the resolver that this is a Service URI? (Meaning, how difficult will it be for a DID Resolver to tell that a given URI is a Service URI, and should be handed off to a different code path?)
How do you specify the service id or type, and the service path? (That is, once the Service URI has been parsed and the DID part separated from the service part.)

Option 1 - Semicolon based service URIs (current consensus)

The group consensus so far is to use one of the reserved URI delimiter characters, ; (the semicolon), to separate the DID part from the rest of the service URI. So, the structure is: <did>;<service id and path>.
Benefits:

A semicolon is a valid reserved character (a sub-delimiter) in a URI (see Section 2.2 of the URI RFC), specifically created to separate different components.
There is precedent for it -- semicolons are used as separators both in the Data URI Scheme and in the ni:/RFC6920 scheme.
It would be fairly easy for a DID Resolver library to test for the presence of a ; character (since, unlike with HTTP URLs, only ? and # chars are allowed in the DID scheme).

Downsides/Implications:

Using a ; as a separator between a DID and a service component means that it should be reserved (made an illegal character) in all Specific DID Method Schemes
As @davidlehn points out in the original description of this issue, semicolons are also used as "Matrix parameters" in existing URL paths, and they're a valid alternative to & for delimiting key/value pairs in query params. However, they are not a valid delimiter in the DID URI scheme, so this point might be moot.

Open Questions (Stuff that's being argued):
If the ; separates the DID part from the Service part, how do we specify the format for the Service ID or the Service Type, and the Service Path?
Option 1a: - service or type keyword, then Path (with no separator)
(This is the option currently proposed in PR #95 by @mikelodder7)
For example: did:example:123;service=user_pictures/img/userpic.jpg
or
did:example:123;type=UserPicService/img/userpic.jpg
The implication (at first glance) with this option is that the Resolver:

Splits on ;, first item is the DID, the rest is the Service component
Parses the Service component up to the first / - this is a key/value pair denoting the Service Locator (an id or a type)
Starting with the first /, everything that follows is part of the service path
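Those three steps can be sketched roughly as follows. This is just an illustration of the "at first glance" algorithm, not anything from the PR; the function name `parse_option_1a` is made up.

```python
def parse_option_1a(service_uri: str):
    """Naive Option 1a parse: returns (did, locator_key, locator_value, path)."""
    # Step 1: split on ';' -- the first item is the DID, the rest is the Service component.
    did, sep, service_component = service_uri.partition(";")
    if not sep:
        raise ValueError("not a service URI: no ';' found")

    # Step 2: everything up to the first '/' is the key/value Service Locator.
    slash = service_component.find("/")
    if slash == -1:
        locator, path = service_component, ""
    else:
        # Step 3: starting with the first '/', the rest is the service path.
        locator, path = service_component[:slash], service_component[slash:]

    key, eq, value = locator.partition("=")
    if not eq or key not in ("service", "type"):
        raise ValueError(f"bad service locator: {locator!r}")
    return did, key, value, path


print(parse_option_1a("did:example:123;service=user_pictures/img/userpic.jpg"))
print(parse_option_1a("did:example:123;type=UserPicService/img/userpic.jpg"))
```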

Except here's the problem. Although the examples in the spec and the PR give a single camel-cased string as a type (like UserPicService), in reality it's actually going to be a linked data URL. So it'll actually be something like type=https://schema.org/UserPicService. So the full URI will be:
did:example:123;type=https://schema.org/UserPicService/img/userpic.jpg
Notice what that does to the parsing algorithm. It's no longer possible to just parse up to the first / and assume that's the service locator key/value pair, and it becomes almost impossible to tell where the locator ends and path begins.
So the actual Option 1a would require URL-encoding the service locator, like this:
did:example:123;type=urlEncode(https://schema.org/UserPicService)/img/userpic.jpg
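Here's a rough sketch of what that looks like in practice, using Python's standard `urllib.parse.quote` as the `urlEncode` step (with `safe=""` so that `/` and `:` are escaped too). The helper names `build_option_1a` and `parse_encoded_option_1a` are made up for illustration; which characters actually need escaping would be up to the spec.

```python
from urllib.parse import quote, unquote


def build_option_1a(did: str, key: str, value: str, path: str) -> str:
    """Build an Option 1a URI, percent-encoding the locator value."""
    return f"{did};{key}={quote(value, safe='')}{path}"


def parse_encoded_option_1a(service_uri: str):
    """Split on ';', then on the first '/', then decode the locator value."""
    did, _, service_component = service_uri.partition(";")
    locator, slash, rest = service_component.partition("/")
    key, _, encoded_value = locator.partition("=")
    return did, key, unquote(encoded_value), slash + rest


# Without encoding, splitting on the first '/' cuts the type URL in half.
raw = "type=https://schema.org/UserPicService/img/userpic.jpg"
print(raw.partition("/")[0])

# With encoding, the first '/' really is the start of the service path.
uri = build_option_1a("did:example:123", "type",
                      "https://schema.org/UserPicService", "/img/userpic.jpg")
print(uri)
print(parse_encoded_option_1a(uri))
```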
Benefits:

Clear whether the service should be looked up by id or type

Drawbacks:

No separator between service and path, so you have to parse up to the first / (and make sure to URL-encode the locator)
Slightly wordier than alternatives below

Option 1b: - Service or Type directly, no keyword, then Path (no separator)
Example: did:example:123;user_pictures/img/userpic.jpg
or
did:example:123;urlEncode(https://schema.org/UserPicService)/img/userpic.jpg
The implication with this option is that the Resolver:

Splits on ;, first item is the DID, the rest is the Service component
Parses the Service component up to the first / (which means, slashes in service IDs or types are illegal) - this is serviceIdOrType
Starting with the first /, everything that follows is part of the service path
Resolver fetches the DID Doc, and first tries the parsed serviceIdOrType as a service id, and failing that, tries it as a service type. Because the set of services is un-ordered, in case there are multiple services for a given type, the resolver would return whichever one it got to first.
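The last step (try as id, then fall back to type) might look something like the sketch below. It assumes the DID Document is already fetched into a dict, and it strips a leading `#` from service ids before comparing, which is purely my assumption since the id format is still under discussion. The function name `find_service` is made up.

```python
def find_service(did_document: dict, service_id_or_type: str):
    """Look up a service by id first and, failing that, by type."""
    services = did_document.get("service", [])

    # First pass: treat the value as a service id.
    # Assumption: ids are written like "#user_pictures" in the DID Doc.
    for service in services:
        if service.get("id", "").lstrip("#") == service_id_or_type:
            return service

    # Second pass: treat the value as a service type.
    # Whichever matching service comes first wins.
    for service in services:
        if service.get("type") == service_id_or_type:
            return service

    return None


doc = {"service": [{"id": "#user_pictures",
                    "type": "https://schema.org/UserPicService",
                    "serviceEndpoint": "https://facebook.com"}]}

print(find_service(doc, "user_pictures"))
print(find_service(doc, "https://schema.org/UserPicService"))
```

Note that the caller can't tell from the result whether the match was by id or by type, which is the ambiguity that the keyword style in 1a avoids.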

Option 1c: - Service or Type directly, no keyword, : separator, then Path
This is the same as 1b, but uses a : (colon) to separate the service id or type and the service path, scp style.
Example: did:example:123;user_pictures:/img/userpic.jpg
or
did:example:123;urlEncode(https://schema.org/UserPicService):/img/userpic.jpg
The idea being, this makes it slightly easier to parse and separate service locator and path. Except just like with the previous options, you still end up having to URL-encode the service type (because of the colon in the http URL).
Option 1d: - Service or Type directly, no keyword, ; separator, then Path
If we switch to re-using the ; (semicolon) separator, we can now get away with not URL-encoding the service type:
did:example:123;https://schema.org/UserPicService;/img/userpic.jpg
This makes the parsing easier - split on ;, first item is the DID, second is the Service Locator, and the rest is part of the service path.
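As a quick sketch of how little parsing that takes (the function name `parse_option_1d` is made up, and I'm assuming any further semicolons belong to the service path):

```python
def parse_option_1d(service_uri: str):
    """Option 1d parse: returns (did, service_locator, service_path)."""
    # Split on the first two ';' only, so later semicolons stay in the path.
    parts = service_uri.split(";", 2)
    if len(parts) != 3:
        raise ValueError("expected <did>;<service locator>;<path>")
    did, locator, path = parts
    return did, locator, path


print(parse_option_1d(
    "did:example:123;https://schema.org/UserPicService;/img/userpic.jpg"))
print(parse_option_1d("did:example:123;user_pictures;/img/userpic.jpg"))
```

No URL-encoding or decoding step is needed, as long as the service locator itself never contains a semicolon.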
Option 2 - Query Params or Hash Fragment Query Params (considered and rejected)

As @Fak3 and others have asked, why not specify the service id/type and the service path as either query parameters or hash fragment query parameters?
In other words, why not do:
<did with path>?service=<service id>&path=<service path>&...<all the other DID query params>
or
<did>#service=<service id>&path=<service path>&... (similar to how OAuth2 Implicit flow passes back the token and state and so on, in the callback redirect).
Reasons not to go with this approach:

Not as easy for a DID Resolver to tell that this is a Service URI -- it would have to first parse the DID and then examine the query params for reserved keywords like service (which would not be allowed to be used in any other context), and only then pass it on to the code path that handled resolving service URIs.
More importantly, this approach goes against the URI spec. Query parameters (see Section 3.4 of the URI spec) are scoped to a particular scheme and naming authority. In other words, query parameters are only meaningful to a given server, and have no universal meaning across different URIs (even within the same URI scheme). The same goes for hash fragments (see Section 3.5): a fragment's format and resolution depend on the media type, and "Fragment identifier semantics are independent of the URI scheme and thus cannot be redefined by scheme specifications."

[Note to self: how to reconcile this with my recommendations for Integrity Protected URIs in JSON-LD?]
Option 3 - Magnet URI style (not yet considered?)

I'm not sure if this has already been discussed, but @cwebber brought up the possibility of using Magnet URIs for this (and other) use cases.
Magnet URIs basically consist of nothing but query parameters. And, unlike regular URIs, it is expected that query params will be registered as well-known keywords. So for example, the xt query param always stands for 'Exact Topic' (the hash of the target file).
So our Service URI would be something like:
magnet:?as=did:example:123&service_id=user_pictures&service_path=/img/userpic.jpg
This option would require us to define and reserve the query parameters (like service_id, service_type, service_path) and their semantics.
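Parsing this style is mostly off-the-shelf query string handling. A rough sketch using Python's standard `urllib.parse` (the function name `parse_magnet_service_uri` is made up, and the parameter names are just the ones from the example above, not anything registered):

```python
from urllib.parse import parse_qs, urlsplit


def parse_magnet_service_uri(uri: str) -> dict:
    """Pull the DID, service id and service path out of a magnet-style URI."""
    parts = urlsplit(uri)
    if parts.scheme != "magnet":
        raise ValueError("not a magnet URI")

    # parse_qs maps each parameter name to a list of values.
    params = parse_qs(parts.query)
    return {
        "did": params["as"][0],
        "service_id": params.get("service_id", [None])[0],
        "service_path": params.get("service_path", [""])[0],
    }


print(parse_magnet_service_uri(
    "magnet:?as=did:example:123&service_id=user_pictures"
    "&service_path=/img/userpic.jpg"))
```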
Summary

So, we still need to determine which of these options we should take. If we go with the current consensus (; as separator between DID and service component), we need to decide whether to use a separator between the service locator and service path, and decide whether the service locator should be key/value style (type=<ServiceType>) or not.
My personal vote is either Option 1d, or Option 3.