lyzadanger/group-ids.md

## group-ids.md

      
After a brief jaunt through the code and a think, I would summarize my reluctance to solve the need for a client-determined group identifier using pubid as follows. This doesn’t touch on the authorization or invite aspects of pubid, just some high-level things.
I don’t know whether I’ve dug deep enough to claim I’ve assessed how hard it would be to update pubid in all the various places, but I realize my argument stands even if group.pubid were easy to modify: I feel that it inherently does something different from what we’re after here. Here are my thoughts!

- One of the current core things about pubids is that they are known to be unique for a resource across all of our data. This is reflected in their use to identify a single resource in 17 different routes in our application at present.
- However, our client-determined identifier needs to enforce uniqueness per authority: the same identifier could be reused in multiple authorities. I believe this to be a design necessity; alert me if I’m wrong.
- If we changed pubid to be unique-per-authority, we’d have a bunch of routes that no longer point to unique resources, but possibly one URL that could represent multiple resources. Uh oh!
- Organization also makes use of a pubid. If we changed the way pubid worked for groups (vis-a-vis their uniqueness or composition), organization and group would be out of sync, which introduces some complexity: organization pubids could be relied upon to point to a single resource, but group pubids could not. Uh oh!
- Introducing a groupname field would be concordant with the way identifiers work for users (username), and could be brought into yet more harmony by the adoption of a groupid syntax in line with the way we compute userids.

We have this system, roughly, which can be applied to various resources:

1. An internal unique ID, never exposed (i.e. an auto-increment).
2. A service-generated, web-facing, guaranteed-unique pubid. This pubid has some extra meaning WRT group right now (having possession of it gives you powers). In the future, I would suggest this should perhaps not be the case, but I’m going to put that aside for now.
3. A client-determined, unique-per-authority identifier (user.username; proposed group.groupname). The client/authority owns managing these for uniqueness.
   - Thus: there is a clear line between pubid (owned/managed by the service) and *name (owned/managed by the authority’s client(s)).
4. A computed syntax that can be used to form a service-wide unique identifier by combining *name with the authority, e.g. the way we compute userid (account:<username>@<authority>). We could also have a groupid syntax.
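
To make the computed-syntax idea concrete, here is a minimal Python sketch of what a groupid could look like if it mirrored the userid form. The `group:` prefix matches the provisional example used later in this note; the function names and the validation pattern are assumptions for illustration, not existing code.

```python
import re
from typing import Tuple

# Hypothetical pattern, modeled on the userid form account:<username>@<authority>.
GROUPID_RE = re.compile(r"^group:(?P<groupname>[^@]+)@(?P<authority>[^@]+)$")


def format_groupid(groupname: str, authority: str) -> str:
    """Combine a client-owned groupname with its authority."""
    return f"group:{groupname}@{authority}"


def parse_groupid(groupid: str) -> Tuple[str, str]:
    """Split a groupid back into (groupname, authority)."""
    match = GROUPID_RE.match(groupid)
    if match is None:
        raise ValueError(f"not a valid groupid: {groupid!r}")
    return match.group("groupname"), match.group("authority")


groupid = format_groupid("somegroupidentifier", "myauthority.org")
```

Because the groupname is only unique within its authority, the pair is what makes the computed value unique service-wide.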

Extending this a little into the future (fuel for ruminating, not immediate):

- users don’t have pubids. Perhaps they should? Could be useful and consistent. Discuss? :)
- If our resources had a consistent combination of these, resource endpoints could ostensibly accept either a pubid or a *id (userid, groupid, etc.) syntax…


## Second Revision!

I’m starting to realize that there is an inherent incompatibility between all of the following requirements in play at once:

1. Have a client-definable, unique-to-authority identifier for a resource AND
2. That identifier must be mutable AND
3. We want an UPSERT endpoint that uses that mutable identifier as the, well, identifier for a resource

My hunch is that the first two requirements are necessary. The first is relatively self-explanatory (and is how we got here in the first place). The second follows naturally: as Rob pointed out (thank you), the username field on user is mutable, and this identifier is supposed to be client-owned, so the client should be in control and able to change that property’s value.
However, the third requirement arises from a desire to limit the number of requests made from the LMS app. UPSERT operations are not actually RESTful, and trying to impose them here is starting to show some fracture lines. Erp.
My earlier proposal of an endpoint in the style of PUT /api/groups/{groupid}, where groupid is a userid-like construct such as "group:somegroupidentifier@myauthority.org", is not sound, because the somegroupidentifier part could be mutated, potentially even in that very request. Yikes. That is not good.
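
The problem with addressing a resource by a mutable identifier can be shown with a toy example. This is only a sketch: the dictionary stands in for the service's routing, and `group_url`, the display name and the rename are made up for illustration.

```python
def group_url(groupname: str, authority: str) -> str:
    # Hypothetical URL builder for the PUT /api/groups/{groupid} style.
    return f"/api/groups/group:{groupname}@{authority}"


# Toy routing table mapping URL -> resource; stands in for the service.
resources = {}
url = group_url("somegroupidentifier", "myauthority.org")
resources[url] = {"groupname": "somegroupidentifier", "name": "Example group"}

# A PUT to `url` whose body also renames the group.
body = {"groupname": "renamed", "name": "Example group"}
group = resources.pop(url)
group.update(body)
new_url = group_url(group["groupname"], "myauthority.org")
resources[new_url] = group

# The URL the client just wrote to no longer identifies anything.
url_still_resolves = url in resources
```

After the request, the client holds a URL that has stopped pointing at the resource it just updated, which is the instability described above.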
(Note: I’m going to call the client-owned ID groupname and the service-owned ID pubid in the following but that doesn’t imply those names are baked; I’m also using an invention of groupid as defined above, similar to userid, but that’s provisional, of course, too).
So, what WOULD be RESTful?


- Create a new group: POST /api/groups (with a groupname in the request body); no change here.
- Retrieve a group:
  - GET /api/groups/{pubid} OR
  - GET /api/groups?groupname=somegroupidentifier&authority=myauthority.org OR
  - GET /api/groups?groupid=group:somegroupidentifier@myauthority.org
- Update a group: PATCH /api/groups/{pubid}
  - Totally legal to update the group’s groupname in this request.


In this model, we definitely need to retain something pubid-like. We need that truly unique, system-wide identifier. It needs to be something that cannot be changed by the client.
And in this model, the pubid is the unique reference to the resource; thus the modeling of retrieving-by-groupname-or-groupid is really a search function.
And how would LMS consume this?

With respect to groups, on launch:

1. Attempt to retrieve the group using the client-determined identifier: GET /api/groups?groupid=group:somegroupname@myauthority.org
   - If the response is a 404, the group doesn’t exist. If the current LTI user is an instructor: POST /api/groups with a request body containing the desired identifier (i.e. create the group).
2. At this point, we should have a 200 response with the group resource in it, which includes the pubid.
3. PATCH /api/groups/{pubid} with any fields you want to change in the body (or the same request body as group-create, if you want). In any case, this is a PATCH.
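
The launch steps above can be sketched as follows. `FakeGroupsApi` is a hypothetical in-memory stand-in for the HTTP API that counts requests, and `sync_group_on_launch` is an invented name; a real client would issue the GET, POST and PATCH calls instead.

```python
from typing import Dict, Optional


class FakeGroupsApi:
    """Hypothetical stand-in for the HTTP API; counts requests made."""

    def __init__(self) -> None:
        self.groups: Dict[str, dict] = {}  # keyed by pubid
        self.requests = 0

    def find(self, groupid: str) -> Optional[dict]:
        # Models GET /api/groups?groupid=...; None models the 404 case.
        self.requests += 1
        for group in self.groups.values():
            if group["groupid"] == groupid:
                return group
        return None

    def create(self, groupid: str, body: dict) -> dict:
        # Models POST /api/groups; the service assigns the pubid.
        self.requests += 1
        pubid = f"pub{len(self.groups) + 1}"
        self.groups[pubid] = {**body, "pubid": pubid, "groupid": groupid}
        return self.groups[pubid]

    def patch(self, pubid: str, body: dict) -> dict:
        # Models PATCH /api/groups/{pubid}.
        self.requests += 1
        self.groups[pubid].update(body)
        return self.groups[pubid]


def sync_group_on_launch(api: FakeGroupsApi, groupid: str, metadata: dict,
                         is_instructor: bool) -> Optional[dict]:
    group = api.find(groupid)
    if group is None:
        if not is_instructor:
            return None  # only instructors create missing groups
        group = api.create(groupid, metadata)
    # Address the update by pubid, never by the mutable identifier.
    return api.patch(group["pubid"], metadata)


api = FakeGroupsApi()
first = sync_group_on_launch(api, "group:somegroupname@myauthority.org",
                             {"name": "Example group"}, is_instructor=True)
```

In this sketch the PATCH always runs; a client that wants fewer requests could skip it when the retrieved resource already matches the metadata it holds.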

Yes, I concede (sorry!): this is 2 requests related to groups on app launch (with a possible maximum of 3 if the group doesn’t exist yet), but I think the API design may be more correct. It also gives the LMS app an opportunity to decide whether it wishes to perform metadata sync (i.e. updating) of resources on every launch or just sometimes/less frequently. It could also allow for a situation in which the requesting client wants to verify that the resource it retrieves is indeed the one it intended to retrieve before blindly updating it.
## In Summary


- I do believe we still need a distinction between a service-owned, non-mutable, unique reference (currently pubid) and a client-defined, authority-specific identifier (*name or whatever). I don’t see an easy way to retire pubids entirely without replacing them with something else in this (unique ID) vein.
- I’ve banged my head against models for an UPSERT operation for a couple of months now and I keep coming to RESTful dead ends. Sorry!
- In this model, technically, groupname is not an ID so much as just a regular field that can be used as an identifier in some use cases. See also: username. So our IDs are still pubids. This is important to note: we wouldn’t be introducing a new ID concept here.

This proposed direction satisfies two needs: the LMS does not have to store pubids in its database, and the client can define some form of unique identifier. It does not, however, implement UPSERT.

| pubids | Provider IDs |
| --- | --- |
| Server-generated | Client-provided |
| Required | Optional |
| Globally unique | Unique only within their authority |
| Short and URL-friendly | Long and provider-friendly (it's easiest if all kinds of digests, UUIDs, etc. are allowed) |
| Used in secret links, so must be un-guessable | Not used in secret links, so can be predictable (e.g. a SHA-1 of LTI launch params) |
| Are the same as organization pubids | Are the same as user provider IDs |