@kadamwhite
Last active June 15, 2017 15:18
WCEU 2017 Contributor Day REST API architecture and request API lifecycle overview

REST API Contributor Day

Plan for the day

  • Improve documentation
    • Something we explain very poorly in the REST API handbook; hoping to explain it today to de-mystify
    • The handbook is what was on WP-API.org -- some of it is very good and some is out of date
    • We have some guides but they are not organized
    • We want to come up with a coherent content architecture
  • Expand the endpoints presented by the API
    • Plugin & Theme endpoints
    • Customizer team is also working on REST API stuff; after the intro, if you're more interested in coding, they might need some help. We'll be focused on more meta work and less code.
  • Authentication
    • Have started; but it doesn't work yet. If you're interested let Ryan know.

Attendees

  • Richard (sweeney?) -- best question asker!

How does the REST API work?

  • The wp-json/ "index" is the entry point to the REST API
    • Namespaces can quickly indicate what is available; is jetpack installed etc
    • a specific endpoint is a route with a method; /posts GET vs /posts POST
  • Each route represents a resource: a type of data or a collection of data
  • An endpoint is a function that responds to a request: it is a particular action (GET, POST) on one of those resources
  • Data flow:
    1. HTTP Request comes in
    2. Request goes to WP_Rewrite
      • Normal WP requests would then proceed to WP_Query
      • For the REST API, through a piece of core (not explained here) the request is hijacked and sent to WP_REST_Server. This avoids the whole post loop.
    3. WP_REST_Server cares about the method. A GET request is considered completely different from a POST request. This is why we hijack the request and route to WP_REST_Server instead of WP_Query
    4. Inside WP_REST_Server the request goes through serve_request to become a WP_REST_Request object. The data that comes in through $_SERVER is baked in to that REST request object.
    5. WP_REST_Request is forwarded to dispatch (a class method of WP_REST_Server)
    6. dispatch calls the "endpoint" (or "endpoint callback"), e.g. WP_REST_Posts_Controller's get_item method
    7. That endpoint method returns a WP_REST_Response object
      • Endpoints may return any value they like: an integer, an array, a string, anything that can be serialized as JSON. (But WP_REST_Response wraps that data and ensures it has everything needed to fulfill the response. [ed.: check this])
      • All the endpoints in core return a WP_REST_Response object, but custom endpoints do not have to. dispatch passes the return value through rest_ensure_response to coerce whatever the endpoint method returned into a WP_REST_Response
      • WP_REST_Response has body, headers, status code, and links
      • Dispatch can also handle WP_Error objects; 4xx means the client messed up, 5xx means the server messed up
    8. The WP_REST_Response object now ends up back in serve_request
    9. serve_request calls out to wp_json_encode, and also calls out to header to send any headers.
    10. serve_request eventually calls exit
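The flow above can be modeled in a few lines. This is a toy Python model, not WordPress code: `Response`, `ensure_response`, `dispatch`, `serve_request` and the `ROUTES` table are illustrative stand-ins for the PHP classes and methods named in the notes, and the route and payload are made up.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Response:
    """Stand-in for WP_REST_Response: body plus status, headers, links."""
    data: object
    status: int = 200
    headers: dict = field(default_factory=dict)
    links: dict = field(default_factory=dict)


def get_posts(request):
    # Endpoint callbacks may return plain data; dispatch wraps it.
    return [{"id": 1, "title": "Hello world"}]


# An endpoint is a (route, method) pair mapped to a callback.
ROUTES = {("/wp/v2/posts", "GET"): get_posts}


def ensure_response(value):
    """Coerce whatever the callback returned into a Response."""
    return value if isinstance(value, Response) else Response(value)


def dispatch(request):
    """Find the endpoint, run it, and always return a Response."""
    callback = ROUTES.get((request["route"], request["method"]))
    if callback is None:
        # Hypothetical error body; 4xx because the client asked for
        # something that does not exist.
        return Response({"code": "no_route"}, status=404)
    return ensure_response(callback(request))


def serve_request(method, route):
    """HTTP-facing wrapper: build the request, dispatch, serialize."""
    request = {"method": method, "route": route}
    response = dispatch(request)
    return response.status, json.dumps(response.data)
```

Calling `dispatch` directly, without `serve_request`, corresponds to an internal request in this model: the caller gets the response object back with no serialization and no headers sent.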
  • You can make requests internally, as well, in addition to making HTTP calls
    • rest_do_request calls dispatch directly, and returns the response object directly
  • Q: There's a lot of actions in the REST API code. Would there be any reason to hijack the flow?
    • Yes. There are some very important hooks. The biggest use is caching: inside dispatch there is a filter, rest_pre_dispatch. (If this does not return null, then that value will be returned instead. [ed.: check this]) There is also a filter called rest_post_dispatch. You can build a caching plugin that sits on rest_post_dispatch, serializes and hashes the request object, and uses that hash as a key that completely describes the incoming request (though you have to add the user making the request). The earlier filter can then look up that key and short-circuit the request.
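The caching idea in that answer might look roughly like this. It is a Python model of the logic only: `pre_dispatch` and `post_dispatch` stand in for the rest_pre_dispatch and rest_post_dispatch filters, the in-memory `_cache` dict and the request shape are assumptions, and caching only GET requests is a design choice made here, not something stated in the notes.

```python
import hashlib
import json

_cache = {}


def cache_key(request, user_id):
    """Hash everything that distinguishes one request from another."""
    payload = json.dumps(
        {
            "method": request["method"],
            "route": request["route"],
            "params": request.get("params", {}),
            "user": user_id,  # responses can differ per user
        },
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def pre_dispatch(request, user_id):
    """Models rest_pre_dispatch: a non-None return short-circuits."""
    return _cache.get(cache_key(request, user_id))


def post_dispatch(response, request, user_id):
    """Models rest_post_dispatch: store the finished response."""
    if request["method"] == "GET":
        _cache[cache_key(request, user_id)] = response
    return response


def handle(request, user_id, dispatch):
    """Run the real dispatch only when the cache has no answer."""
    cached = pre_dispatch(request, user_id)
    if cached is not None:
        return cached
    return post_dispatch(dispatch(request), request, user_id)
```

Including the user in the key matters because two users asking for the same route may be entitled to see different data.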
  • The links object describes relationships
    • Taken from the HAL standard, to which the API partially conforms
    • (there's some other slightly non-standard stuff in the API, such as how we don't accept the full range of ISO 8601 date formats)
    • ?_embed will allow some of those related resources to be embedded into the main response object, within an _embedded object. The _embedded object will contain response objects for each embeddable: true resource, ordered as the items in each _links member's array.
      • This data is retrieved by dispatching a request internally using rest_do_request; that request is made with method=GET and context=embed.
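On the client side, pairing links with their embedded responses could be sketched as below. The `post` document is a made-up, heavily trimmed example, and `embedded_for` is a hypothetical helper; the only behavior relied on is the one described above, that embedded responses follow the order of the corresponding links array.

```python
def embedded_for(resource, relation):
    """Pair each embeddable link with its embedded response, by position."""
    links = resource.get("_links", {}).get(relation, [])
    embedded = resource.get("_embedded", {}).get(relation, [])
    return [
        (link["href"], body)
        for link, body in zip(links, embedded)
        if link.get("embeddable")
    ]


# Hypothetical response for a single post, once embedding was requested.
post = {
    "id": 1,
    "author": 7,
    "_links": {
        "author": [
            {
                "embeddable": True,
                "href": "https://example.com/wp-json/wp/v2/users/7",
            }
        ]
    },
    "_embedded": {"author": [{"id": 7, "name": "Example Author"}]},
}

authors = embedded_for(post, "author")
```

A relation that is missing from either object simply yields an empty list, so callers do not need to check for the keys first.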
  • Q: If you could change the method signature that you're calling, can you... (did not catch full question, something about the main WP_Query)
    • WP_REST_Request contains three pieces of information: method, request, and headers. The _method parameter lets you override that method. It does not affect the regular WP request cycle.
  • Meta parameter _method will override the method that will be interpreted by the WP_REST_Server
    • _envelope parameter will wrap the response object in a simple object that includes the headers and status code, to make the API useful on servers that don't support custom headers or specific status codes.
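A client that leans on both meta parameters might be sketched like this. `build_url` and `unwrap` are hypothetical helpers, example.com is a placeholder, and the envelope is assumed to use the keys `body`, `status` and `headers`; the notes only say that it includes the headers and status code.

```python
from urllib.parse import urlencode


def build_url(base, route, method="GET", envelope=False, **params):
    """Build a URL that tunnels the real method through `_method`."""
    query = dict(params)
    if method not in ("GET", "POST"):
        # The request itself would be sent as a POST; the server reads
        # the intended method from this parameter.
        query["_method"] = method
    if envelope:
        query["_envelope"] = "1"
    url = base.rstrip("/") + route
    return url + ("?" + urlencode(query) if query else "")


def unwrap(envelope):
    """Split an enveloped body into (status, headers, body)."""
    return envelope["status"], envelope.get("headers", {}), envelope["body"]
```

With these two helpers a client never depends on the transport's real status code or headers, which is the situation `_envelope` is meant for.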

Code walkthrough

  • rest-api.php
    • This is inside wp-includes
    • this is a grab bag of rando functions, working out what they do and where they are used is interesting
    • Main one we talk about in this flow is rest_api_init and the things that flow from that
      • This hook callback is registered in default-filters.php
    • On the plugins_loaded hook (not init, despite the name) we register our rewrites -- so we can have re-writes but still hijack it -- and register a rest_route query var
    • we then add rewrite rules to handle various routing situations. doing all this at this time permits the API to function when pretty permalinks are not enabled
  • Q: After the rest hijack, none of the other WordPress stuff is loaded? you end with the exit, and that's it?
    • Correct. WP wants to map WP_Rewrite to WP_Query. The only way to avoid the default WP_Query running, which would be unnecessary DB requests, is to call exit once the API is done, before WP proper kicks in.
  • There is a REST_REQUEST constant, but it will only be true when fulfilling a REST request via HTTP; that constant will not be present when fulfilling a request via rest_do_request. (We can't set it in rest_do_request because we would need to unset it afterwards, and constants cannot be reassigned.)
    • We discourage the use of this constant; try to write your code to work consistently whether you know you are in the API or not. Using REST_REQUEST can yield very unexpected results when you begin using embedding or dispatching requests from PHP.
  • Back to rest_api_init
    • Once WP_Rewrite is hijacked, if rest route is empty, then we say to use the index. That will never happen with most requests, but ?rest_route= with no slash can lead to a strange path because of PHP slashing.
    • Then we get into serve_request
      • In serve_request we know we are taking over the whole request. In dispatch we do not; dispatch is just about retrieving and returning data.
      • serve_request sends content-type JSON; and handles jsonp.
  • Q: Why did you use _jsonp instead of callback, which is more standard?
    • Did not want to support jsonp too much; wanted to make it clear what was happening (callback felt too generic), and ?_jsonp was the standard for some library that was being referenced.
    • Generally speaking we don't want you to use it, because we have to run a whole bunch of extra code to validate the callback, and later we have to append some additional data. JSONP turns out to be a big problem. There was a famous attack, "Abusing JSONP with Rosetta Flash," to serve arbitrary things from a domain.
    • JSONP is also limited to a subset of things you can do, to avoid some of the security issues inherent in JSONP.
    • (there is also a filter to disable JSONP handling entirely)
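The callback validation and the extra data appended for JSONP could be sketched as follows. This is an assumption-heavy model: it assumes a callback is acceptable only if it consists of ASCII letters, digits, underscores and dots, and that the body is prefixed with an empty comment, which is the commonly cited Rosetta Flash mitigation rather than anything stated in the notes. The function names are hypothetical.

```python
import json
import re

# Anything other than ASCII word characters and dots is rejected.
_ILLEGAL = re.compile(r"[^\w.]", re.ASCII)


def is_valid_callback(name):
    """Return True if the callback name contains no illegal characters."""
    return bool(name) and _ILLEGAL.search(name) is None


def to_jsonp(callback, data):
    """Wrap JSON in a callback call, refusing suspicious callback names."""
    if not is_valid_callback(callback):
        raise ValueError("invalid JSONP callback")
    # The leading empty comment keeps attacker-chosen text from being
    # the first bytes of the response.
    return "/**/" + callback + "(" + json.dumps(data) + ")"
```

Restricting the callback to such a narrow character set is what limits JSONP to "a subset of things you can do": expressions, brackets and quotes are all rejected.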
  • We recommend using CORS instead of JSONP; we have full support for it. There's a function rest_send_cors_headers. The REST API is open to all origins by default, however (KAdam interjection: that does not match my experience)
    • _envelope won't get browser compatibility headers, so it won't get the meta-level headers like `Access-Control-*` etc.
    • We send no-cache headers if the user is logged in.
  • Q: Can you override access-control-allow headers?
    • Yes. The headers are "sent" in class-wp-rest-server.php, but that just buffers them; then you can replace or append later with send_header. This is how we allow modifying the content-type, which is set very early in the process.
  • Most of the API is quite extensible; it is very easy to add data to the API. But it is more complex (by design) to remove fields, because clients are going to be depending on the presence of certain fields and removing them might break core or other plugins.
    • By "more complex" we mean there is no deregister_rest_field method, not that it's impossible
  • we then create a WP_REST_Request by combining the received queries with $_SERVER data
    • This is also where we provide the method overrides
    • Also we check authentication
    • Then we have dispatch
      • filter on rest_pre_dispatch, to provide ability to retrieve data from cache etc (see above)
      • Then we check url parameters (from URL) and attributes (from route definition)
      • Then we do parameter validation (critically important part of the code, pretty complex on its own)
      • We run the permission callback, which does current_user_can checks; this means that permissions checks run internally as well.
        • If you are making a request internally and need data from a different user, you have to switch users to get that data.
      • Now we have done all the pre-checks! We can run the function.
        • We do still allow hijacks, so that you could put in middleware if you wanted. This is not common in WordPress, but prevalent in slim framework, Express, etc, so we provide the capability.
      • run dispatch request callbacks then run rest_request_after_ callbacks
        • you probably don't want the request before/after callbacks because they are quite low-level, but they are there.
      • If there is an error, we massage it into a WP_REST_Response; otherwise we wrap the response body in a WP_REST_Response. This class just wraps data that isn't already a WP_REST_Response.
    • Then dispatch ends, and we jump back up to serve_request; we do more error checks, handle converting errors as needed
    • Then we do rest_post_dispatch (which runs in serve_request, unlike rest_pre_dispatch which runs inside dispatch) -- this is inconsistent but generally matches the rest of WP, where a pre_ filter lets you hijack, and post_ filters are less common
    • The rest response is enveloped if necessary (wrapped in a secondary WP_REST_Response)
      • You could actually write a client that always uses _method and _envelope, and unwraps those abstractions
    • We send the status code to the browser
    • Then we begin sending the actual output
      • rest_pre_serve_request is yet another hook for hijacking the output; if rather than JSON you wanted to implement your own style of JSONP, you could use this hook. It would send all the same headers but you can control the body yourself.
      • We serialize as JSON; check for errors; then augment the response if it's JSONP to mitigate known attack vectors
    • we hit response_to_data.
      • Which says, if we're embedding, then we want to embed all the embeddable links
  • Q: What about shortcodes? Why are they in the rendered property?
    • If we have a look at the output of the API, some shortcodes are running; they are embedded in the rendered content. The way this works internally is that inside prepare_item_for_response, we take content and pass it through the the_content filter, which renders shortcodes. The raw shortcodes are not available outside of context=edit (and relevant permissions); this is to avoid leaking private data that may be present in shortcodes
      • Some shortcodes however expect JS or CSS to be loaded; that can't be communicated easily over the API.
      • This is one place you could use the REST_REQUEST constant to change what the shortcode does depending on context, but again, there are inconsistencies in how it behaves.
  • Last piece: how the user gets set up
    • We have a single hook that handles a ton of authentication work
    • In wp-settings.php, we call a method to set up the current user.
    • We eventually get to determine_current_user in user.php. If you're not logged in you get a user object representing user 0, not logged in.
    • Cookie authentication is hooked in to determine_current_user; every time you log in to WP within the regular login form, it hooks in here. But plugins can hook in here too, to handle other types of auth. This is where the OAuth plugins hook in. (OAuth uses determine_current_user and also rest_authentication_errors)
    • (in response to a Q) Because of the way the REST API works, it is unlikely that a plugin operating at a global level could affect the active user in the REST API without being written to do specifically that.
  • Authentication finally gets checked deep in dispatch; that permission callback returns true (can do this), false (can't do this), or a WP_Error (false with a specific reason). Almost every permission callback internally just uses current_user_can.
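The three possible outcomes of a permission callback can be modeled as below. This is a Python sketch of the logic, not WordPress code: `ApiError` stands in for WP_Error, `can_edit` is a hypothetical callback built on a capability check, and the error codes, messages and status numbers are illustrative.

```python
from dataclasses import dataclass


@dataclass
class ApiError:
    """Stand-in for WP_Error: a code, a message, and an HTTP status."""
    code: str
    message: str
    status: int


def can_edit(request):
    """Hypothetical permission callback built on a capability check."""
    user = request.get("user")
    if user is None:
        # "False with a specific reason"
        return ApiError("not_logged_in", "You must log in first.", 401)
    return "edit_posts" in user["caps"]  # plain True or False


def check_permission(callback, request):
    """Normalize True / False / ApiError into None (allowed) or an ApiError."""
    result = callback(request)
    if isinstance(result, ApiError):
        return result
    if result is False:
        return ApiError("forbidden", "Sorry, you cannot do that.", 403)
    return None
```

Because the check runs inside the dispatch step, the same outcome applies whether the request arrived over HTTP or was dispatched internally.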

Ryan: "The API is about getting consistent access to clean data." There are many ways to build clients that tailor that data to specific cases, such as more intuitive embedded data structuring, easier linking, pagination traversal, etc.
