@MonkeyDo
Last active March 28, 2019 12:08
Auth
- No need for an application page: we will use MetaBrainz’ (currently on the MusicBrainz website: https://musicbrainz.org/account/applications).
- No need to generate or store tokens: we only handle auth with an existing token for an existing user, so no DB table is needed to store tokens.
- A UI change may be needed: a shortcut or simple page that redirects to MB’s applications page.
Tech stack
- Any good reason why we should use Koa over Express, considering we already use Express for the web server and can possibly reuse some code? https://github.com/koajs/koa/blob/master/docs/koa-vs-express.md
- I don’t think we need Kong. The auth and rate-limiting can be done in Express with the help of packages (links below). Kong will introduce extra requests = extra latency
- Caching with Redis can also be done with Express (links below)
Rate-limit for ExpressJS with Redis: https://www.npmjs.com/package/rate-limit-redis with https://github.com/nfriedly/express-rate-limit
Another rate limiter for Express https://github.com/animir/node-rate-limiter-flexible/wiki/Express-Middleware
Exponential back-off: https://www.npmjs.com/package/express-slow-down
The webserver uses Passport for Express to manage auth: http://www.passportjs.org/
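To illustrate why rate limiting does not require a separate gateway, here is a minimal sketch of a fixed-window limiter written as Express-style middleware (`(req, res, next)`). It is not the API of `express-rate-limit` or `rate-limit-redis`; the names `createRateLimiter`, `windowMs`, `max` and `now` are hypothetical, and the in-memory `Map` stands in for the Redis store a multi-process deployment would need.

```javascript
// Minimal fixed-window rate limiter in the shape of Express middleware.
// Assumption: counters live in process memory; a shared store such as
// Redis would be needed once several server processes are running.
function createRateLimiter({ windowMs = 60000, max = 60, now = Date.now } = {}) {
  const hits = new Map(); // client key -> { count, resetAt }

  return function rateLimit(req, res, next) {
    const key = req.ip; // Express exposes the client address as req.ip
    const t = now();
    let entry = hits.get(key);

    // Start a fresh window on the first request or once the old one expired.
    if (!entry || t >= entry.resetAt) {
      entry = { count: 0, resetAt: t + windowMs };
      hits.set(key, entry);
    }

    entry.count += 1;
    if (entry.count > max) {
      // Over the limit: answer 429 and tell the client when to retry.
      res.statusCode = 429;
      res.setHeader('Retry-After', String(Math.ceil((entry.resetAt - t) / 1000)));
      res.end('Too Many Requests');
      return;
    }
    next();
  };
}
```

With Express this would be mounted as `app.use(createRateLimiter({ windowMs: 60000, max: 60 }))`. The sketch never prunes old entries, which is one reason to prefer the packages linked above in practice.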
Search
- The search endpoint is likely to return only simplified representations (bbid, default alias) rather than full entities.
- Once Solr is implemented we can develop the schema and DB access to return full entities
What to return in the queries?
- Anything that is not an Entity needs to be returned with the associated Entity (if requested).
- Anything that is an Entity needs to be fetched as a separate request (no mechanism to automatically fetch linked entities)
- The aim is to save on SQL joins wherever possible. Passing a flag to request data linked to an entity, as described for relationships in the proposal, is fine.
- Instead of the ‘include’ subqueries that MB has (https://musicbrainz.org/doc/Development/XML_Web_Service/Version_2#Subqueries), get the client to batch multiple separate requests (fetch the author, process relationships by type, then send a new request to fetch the Work entities by an array of bbids). Avoiding SQL joins as much as possible and sending only specifically requested data = lower latency for everyone
- `GET publication/{id:BBID}/edition` for example does not follow that logic: instead, a user needs to fetch the publication with `editions` flag set to true, then fetch Editions by bbids from the array returned by the first request OR see below:
Entity endpoints should allow browsing by linked entities:
- For example: I should be able to search for Works by Author X using `GET /work?author={authorBBID}`
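As a rough sketch of what browsing by linked entity could look like behind `GET /work?author={authorBBID}`, the function below filters Works by an author BBID and returns only simplified representations (bbid, default alias). The in-memory `works` array, the `authorBBIDs` field and the name `browseWorksByAuthor` are hypothetical stand-ins for the real schema and DB access.

```javascript
// Hypothetical in-memory data standing in for the database.
const works = [
  { bbid: 'work-1', defaultAlias: 'First Work', authorBBIDs: ['author-x'] },
  { bbid: 'work-2', defaultAlias: 'Second Work', authorBBIDs: ['author-y'] },
  { bbid: 'work-3', defaultAlias: 'Third Work', authorBBIDs: ['author-x', 'author-y'] },
];

// Logic behind GET /work?author={authorBBID}: keep the Works linked to the
// given author and return simplified representations, not full entities.
function browseWorksByAuthor(authorBBID) {
  return works
    .filter((work) => work.authorBBIDs.includes(authorBBID))
    .map(({ bbid, defaultAlias }) => ({ bbid, defaultAlias }));
}
```

A client would then fetch any full Work it needs by bbid in a separate request, in line with the points above.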
@yvanzo

yvanzo commented Mar 20, 2019

Review of revision 2

Per item:

  • Auth: ACKed
  • Tech stack: no clue (BB specific)
  • Search: I am not sure I understand correctly: are you suggesting two development stages? If so, ACKed.
  • What to return in the queries: I think this is mostly what we agreed on during the latest MeB summit.

Overall:

  • Output format: XML? JSON? Both (as in MB)?
  • Schema: Any reason to make it search specific? Shouldn’t it be defined while coding so as to validate output with?
  • Auto-generated documentation? (which MB doesn’t have yet) Using iodocs?
  • Please wrap lines to make it more readable on gist 👀

@mwiencek

The auth bits make sense but I don't have much technical knowledge about Express or Koa. :/ One design issue with the current MB webservice is that it is very difficult to cache, so some more planning about how caching/cache invalidation will work might be useful.

Another thing that'd be useful to think about is how standardized/complete you want the format to be. IIRC BookBrainz has some -data project for DB access. It must return objects in some known format to render the React templates. You may want this format to be consistent with the API if you hope to be able to, for example, render some dynamic template on the client based on webservice data.

> Anything that is an Entity needs to be fetched as a separate request

Sounds like a good idea, if a user can request multiple entities of the same type (returning an array) in one request. It's something that's been requested for MB.

> Passing a flag to request data linked to an entity, as described for relationships in the proposal, is fine.

I can't find any proposal linked, so I'm not sure what this means in relation to the previous point, but I'd maybe think about how it affects caching, since every extra flag that can change the output makes caching harder and less efficient. Consider if entities had only one possible representation, and unbounded data associated with it (aliases, relationships) were in separate, paged resources. Though API ease of use is a concern too.

@MonkeyDo
Author

@mwiencek The proposal I was referring to was from 2018. In the meantime the same applicant has published a new proposal for this year.
Could you also have a look at that new proposal and chime in, especially concerning plans for ws/3 (I wasn't at the summit discussion so I only have a vague idea)?

> Consider if entities had only one possible representation, and unbounded data associated with it (aliases, relationships) were in separate, paged resources.

So instead of includes, for example have separate endpoints, something like /{entity}/aliases, /{entity}/relationships ?
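To make the idea concrete, here is a minimal sketch of how a paged sub-resource such as /{entity}/aliases could slice its data. The function name `paginate`, the `offset`/`limit` parameters, the default of 25 and the cap of 100 are all assumptions for illustration, not anything agreed in the proposal.

```javascript
// Hypothetical helper: returns one page of a linked collection
// (aliases, relationships) for a separate, paged resource.
function paginate(items, { offset = 0, limit = 25 } = {}) {
  const start = Math.max(0, offset); // ignore negative offsets
  const size = Math.min(Math.max(1, limit), 100); // clamp page size to 1..100
  return {
    total: items.length, // lets the client know how many pages remain
    offset: start,
    limit: size,
    items: items.slice(start, start + size),
  };
}
```

Since the entity itself would then have a single representation, only these page responses vary with query parameters, which should keep the cache keys simple.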
