Rudy Prefetching + Code-Splitting Strategy

We're gonna start by looking at how code splitting works, and build our understanding of the task from there.

The Big Picture problem we solve

CODE SPLIT EVERYTHING (NOT JUST COMPONENTS) -- AND DO IT BEFORE YOU NEED IT

The big hidden idea here is that with RUC it's challenging, not perfectly performant, and non-obvious how to code split non-component things like reducers, thunks, etc.

This is because usually the route has already changed, so you must re-trigger the route change to make the new reducers and thunks run.

Our setup described below helps you control the ORDER.

You can now guarantee you have reducers, thunks, and components before the enter state is reached. That's the big takeaway that might not be obvious here. That's the real problem we were trying to solve--not just come up with a route-based code splitting technique for the hell of it. In other words, it's not a solution looking for a problem. We had real problems here, and now they're solved.

codeSplit usage + implementation

What follows is complete usage and implementation of our v1 code-splitting middleware.

/* app/src/routes.js */
const routes = {
  FOO: {
    path: '/foo/:page',
    load: ({ params }) => import(/* webpackChunkName: "my-component" */ `../components/${params.page}/index.js`)
  }
}


/* app/src/configureStore */
import { createRouter, transformAction, enter, call } from 'rudy' // illustrative import path -- adjust to wherever these live in rudy
import codeSplit from 'rudy/middleware/codeSplit'
import routes from './routes'

const options = {} // e.g. supply `replaceReducer` here; it's used by the codeSplit middleware below

const { middleware, reducer, firstRoute, flushChunks } = createRouter(routes, options, [
  transformAction,
  codeSplit('load'),
  enter,
  call('thunk', { cache: true }),
])

// createStore etc


/* app/src/components/MyComponent/index.js */
import MyComponent, { Another } from './MyComponent'
import reducerKey from './reducer'
import thunk from './thunk'

const components = { MyComponent, Another }
const reducers = { reducerKey }
const chunk = 'my-component' // must match the webpackChunkName used in `route.load`

export { components, reducers, thunk, chunk }


/* rudy/src/middleware/codeSplit.js */
const codeSplit = (name = 'load') => (api) => async (req, next) => {
  const load = req.route && req.route[name]

  if (load) { // if the route has no `load`, skip straight to `next()`
    const parts = await load(req)
    addPartsToRuntime(req, parts)
  }

  return next()
}


const addPartsToRuntime = (req, parts) => {
  const { route, action, options, tmp, ctx, commitDispatch } = req
  const { components, reducers, chunk, ...rest } = parts

  if (ctx.chunks.includes(chunk)) return // chunk was already added to the runtime, so short-circuit

  if (reducers) {
    options.replaceReducer(reducers)
  }

  if (components) {
    // we need to modify `createReducer` to store `state.location.components`,
    // so after load they can be dynamically rendered within existing components!
    action.components = components
  }

  // if the route change action has already been dispatched, we need to
  // re-dispatch it so the new goodies are received
  if (tmp.committed && (components || reducers)) {
    // we need a flag to force this action through, so components are added to
    // state or the new reducers receive the action -- the `force` flag doesn't
    // exist yet; it's a placeholder for something we can already use to force
    // the action past the `isDoubleDispatch` check; we may have some other piece
    // of infrastructure that precludes needing to create a new custom flag
    action.force = true
    commitDispatch(action)
  }

  // `rest` lets you (optionally) tack on additional thunks, sagas, etc, to your
  // route object -- i.e. you can build the "plane" (aka route) while flying
  Object.assign(route, rest)
  ctx.chunks.push(chunk)
}

export default codeSplit
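
Aside: options.replaceReducer above is expected to combine the newly loaded reducers with the store's existing root reducer (see Next Steps below). Here's a minimal sketch of what a default could look like -- createReplaceReducer and staticReducers are assumed names, not existing Rudy API:

import { combineReducers } from 'redux'

// keep a registry of injected reducers and rebuild the root reducer each time
// the codeSplit middleware hands over a new batch
const createReplaceReducer = (store, staticReducers) => {
  const injected = {}

  return (newReducers) => {
    Object.assign(injected, newReducers)
    store.replaceReducer(combineReducers({ ...staticReducers, ...injected }))
  }
}

// usage: options.replaceReducer = createReplaceReducer(store, staticReducers)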

So what we can see is that the codeSplit middleware does 2 things:

  • calls route.load
  • supplies some automation to apply components, reducers and route aspects

The automation in the addPartsToRuntime function doesn't even have to be used, in which case the concept is simply calling route.load like any other thunk, and it's up to the user to replace reducers, make components globally available for use within other components, etc.

But what we propose is a structure to the entry point of the chunk where we discover a hash of components, a hash of reducers, a chunk name, and any key/vals you'd like to tack on to your route.

components

Components are added to the reducer state at state.location.components.fooForExample. Your component tree can thereby show a spinner until this component exists in state! Awesome!!
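
For concreteness, here's a minimal sketch of how the location reducer could pick up action.components from the (re-)dispatched route action -- the exact reducer shape is an assumption (see the createReducer task under Next Steps):

// merge any components carried on the route action into
// state.location.components, keyed by component name
const componentsReducer = (state = {}, action) =>
  action.components ? { ...state, ...action.components } : state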

Later down the line we will have our own <Route /> components, and they will automatically show a loading component until the async component is ready. E.g.:

<Route component='fooForExample' loading={<Spinner />} />

I have an API planned for this elsewhere, but it's future stuff; <Route /> components likely won't make it into launch, and that's fine because we are branding ourselves as the pro way to architect apps; React Router path-based routing is a Rails-esque/convention-based hack. It's helpful to us to continue to brand our approach as the truly professional, state-driven way custom apps create state machines. In addition, the thinking here is that when Rudy comes out (before Respond) we ultimately don't even want a huge influx of learners; so it's fine if we are inaccessible to developers mystified by the lack of Route components to start; we will need to take time to iron out kinks for pro developers that don't waste our time in GitHub issues.

Ok, so anyway, it's a pretty simple concept: components are injected into state, and then you can do this:

import React from 'react'
import { connect } from 'react-redux'
import Spinner from './Spinner' // any loading placeholder

const MyComponent = ({ AsyncComponent }) =>
  AsyncComponent ? <AsyncComponent /> : <Spinner />

export default connect(state => ({ AsyncComponent: state.location.components.fooForExample }))(MyComponent)

Yup, we store the component in state client side, even though that's typically not recommended.

The problem here is you can't do it server-side and transport it to the client (aka rehydrate it). That's because functions such as components and reducers can't be serialized by JSON.stringify. But that's actually fine, since if you're doing SSR, you will flush the names of chunks (described below) and ensure they exist client side on first load. Then the same route pipeline will run, but instantly load the chunk (and its components/reducers/etc) into the runtime before entering/committing the route state. So we are talking a 0-1ms additional delay, unless the codeSplit middleware and route.load are called after the route state is committed, in which case a few more ms are needed to re-dispatch the action. After we get our first draft done, we may be able to find a way to work around that.

flushChunks

/src/core/createRouter.js:

  const ctx = { busy: false, chunks: [] }
  const api = { routes, history, options, register, has, ctx }
  
  // the rest of createRouter implementation
  
  return {
    ...api,
    middleware,
    reducer,
    firstRoute: (resolveOnEnter = true) => {
      api.resolveFirstRouteOnEnter = resolveOnEnter
      return firstAction
    },
    flushChunks: () => ctx.chunks
  }

So server side, all we gotta do is provide the array of chunks rendered!

It's up to the user to name them properly to match what webpack has in stats. By no longer relying on babel-plugin-universal-import we thereby circumvent a host of problems (that you now know very well) related to matching chunk names to what webpack already has. This is good for us, and definitely the right move in the beginning. In addition, our route is simple: it doesn't need separate route.load, route.chunkNames, and route.resolve keys; it just has route.load, and the chunk name itself is moved to where it actually lives: the entry point module of the chunk. So overall that gives the appearance of very simple code splitting from the perspective of the routesMap--it's just one key: route.load!
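
As a rough illustration of the server-side half (the /static/${chunk}.js path is an assumption -- mapping chunk names to your actual webpack output is up to you, per the above):

// turn the chunk names collected by the codeSplit middleware during the render
// into script tags to embed in the served HTML
const renderScriptTags = (flushChunks) =>
  flushChunks() // e.g. ['my-component']
    .map(chunk => `<script src="/static/${chunk}.js"></script>`)
    .join('\n')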

multiple imports in load (advanced, not-yet-supported use case)

Say you wanted to do:

load: () => Promise.all([import('./a'), import('./b')]).then(([a, b]) => mergeParts(a, b)) // paths and `mergeParts` are illustrative

You would have to merge the components, reducers, and chunks into individual hashes, roughly as sketched below.
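
A minimal sketch of that merge (mergeParts is just an illustrative helper, and it assumes the single-chunk limitation described next):

const mergeParts = (a, b) => ({
  components: { ...a.components, ...b.components },
  reducers: { ...a.reducers, ...b.reducers },
  chunk: a.chunk // the middleware only handles a single chunk for now (see below)
})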

Based on the current implementation of the middleware, that's what you would have to do, with one thing missing:

the middleware currently only checks for the existence of (and adds) a single chunk. We could easily change that to support an array of chunks. Let's not worry about that for now. Keep in mind we're also building prefetching in this task--so let's start with the simplest version of the middleware. We can revisit how to handle loading multiple chunks in one route later. We have a few options there, such as automatically merging components and reducers. Another thing we could do is have route.chunkNames, which was actually part of the original plan here (but as described above, I felt it better to have a single code-splitting key: route.load).

But what that would mean is that instead of specifying the chunk names in the chunk's entry module itself, route.chunkNames could be a function that returns an array of potentially dynamic chunk names, themselves based on dynamic import path expressions, e.g.:

chunkNames: ({ params }) => ['some-component', `some-${params.page}`]

Then route.load would return a hash of this signature:

{
  'some-component': { reducers, components },
  [`some-${params.page}`]: { reducers, components } // keyed by the same chunk names returned from `chunkNames`
}

At which point, reducers, components, and chunk names could more easily be automatically merged, but at the expense of a more complicated API. Obviously we would support the single chunk interface as described above, but just having additional docs for the advanced use case adds mental overhead.

That's why we should more fully understand the basics of the problem here before trying to support advanced multi-chunk use cases out the gate :) ...we're pros, we know coding things efficiently is about seeing things as a process, and knowing during which steps/iterations to take on additional complexity. Let's not encumber ourselves now with the 20% use case. This is a middleware after all--other developers can modify it to create their own; and by providing a simple starting point, it will be easier to understand what code is necessary to extend it for more advanced use cases :).

aside: I've never personally needed to import multiple chunks; I try to make one chunk have all that's needed; but the use case is complex apps which have shared chunks. Even that use case is pretty much solved by having redundant modules between those chunks. This goes back to our unbundled future bundler, which provides ultimate efficiency in module re-use--we're not there yet, nobody is; those aren't the problems we're solving yet; some redundancy is fine.

Prefetching === action dispatch without "entering"

Here's a simplified version of our enter middleware:

export default (api) => (req, next) => {
  if (req.action.prefetch) {
    return next() // skip state change!!!
  }

  const res = req.enter() // commit history + action to state

  return res.then(next).then(() => res)
}

The Concept

If action.prefetch is truthy, we don't actually change the route. I.e. we don't change the state of state.location and we don't push a new history entry (which is what req.enter() does).

But this allows the whole middleware chain to run nevertheless.
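
In userland that looks something like this, using the FOO route from earlier (the page value is just illustrative):

// codeSplit + thunk middleware still run (so the chunk is loaded and the API
// response cached via `call('thunk', { cache: true })`), but enter.js bails
// before committing history/state
store.dispatch({ type: 'FOO', params: { page: 'bar' }, prefetch: true })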

The Implications

Your thunks/sagas/etc must be able to successfully run without being dependent on state.

In the original RFR, you would get path params out of getState().location.payload.someParam. Now--if you want prefetching to work--the promoted way to do so is through action.params.someParam. E.g.:

const routes = {
  SOME_TYPE: {
    path: '/foo/:someParam',
    thunk: ({ params }) => fetch(`/api/${params.someParam}`), // aside: thunk returns are automatically dispatched in Rudy
    // or equivalently:
    // thunk: (request) => fetch(`/api/${request.action.params.someParam}`)
  }
}

This enables prefetching, as API pings are no longer dependent on existing state, but rather on URL params/queries/hash/etc.

As an aside, the initial motivation for this was so you can perform API requests before the enter middleware changes the state (i.e. in onBeforeChange in RFR or beforeEnter in Rudy). In short, there are a lot of benefits to this approach, at minimal cost.

The exception is you can use long-standing state such as user tokens, IDs, etc, as they are pretty much guaranteed to be available after initial load of the app (or login) and stay consistent.
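
For instance (a sketch -- whether the request exposes getState exactly like this, and the user.token state shape, are assumptions):

thunk: ({ params, getState }) =>
  fetch(`/api/${params.someParam}`, {
    // long-standing state like an auth token is safe to read: it's stable
    // across prefetch runs and real visits alike
    headers: { Authorization: `Bearer ${getState().user.token}` }
  })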

The final implication is that the action.prefetch flag will need to be used in userland to prevent things like 3rd-party analytics pings from firing. There are a lot of cross-cutting concerns like that in real apps that will now need to be conditionally boxed in, so this is a weakness of this approach--we should seek to better automate it once we better understand the problem :)
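
E.g. (a sketch; trackPageView is a hypothetical analytics helper):

thunk: (req) => {
  // box in cross-cutting side effects so prefetch runs don't trigger them
  if (!req.action.prefetch) {
    trackPageView(req.action.type)
  }

  return fetch(`/api/${req.action.params.someParam}`)
}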

Next Steps

  • add the above middleware to src/middleware/codeSplit.js
  • I obviously never got it to work, so there are some things you will have to figure out; e.g. replaceReducer actually has to combine reducers with the existing reducer; for now options.replaceReducer hints at this; we should supply a default option that combines reducers (a rough sketch appears above, after addPartsToRuntime)
  • manually test and make codeSplit.js work (write some real tests when you're done, or perhaps TDD it using wallaby)
  • put tests in __tests__/integration/middleware/codeSplit.js
  • you will need to make src/core/createReducer.js able to store components (easy)
  • modify enter.js middleware to skip if action.prefetch
  • use one of our boilerplates to manually test that data is fetched (and chunks loaded)
  • use action.prefetch in the boilerplate to confirm it's a useful flag to prevent unrelated work from happening
  • write tests in __tests__/integration/middleware/prefetch.js
  • think about how we will automatically dispatch an array of actions with action.prefetch after loading a route.

So regarding that last point, the idea is that v1 of this task relies on users manually dispatching actions with action.prefetch set in order to prefetch potential next routes.

Well, we're gonna offer route.prefetch: (req) => [action, action], which is called (if available) after the route is complete, in return next(req).then within createRouter.js.
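
A rough sketch of both halves -- the shape of the prefetch callback's return value comes from above, but the page value and the dispatch wiring are assumptions:

// routesMap side: return the route actions worth warming up once FOO completes
const routes = {
  FOO: {
    path: '/foo/:page',
    load: ({ params }) => import(`../components/${params.page}/index.js`),
    prefetch: (req) => [{ type: 'FOO', params: { page: 'next' } }]
  }
}

// createRouter.js side, roughly: inside `return next(req).then(...)`, flag and
// dispatch each returned action so the pipeline runs without "entering"
const runPrefetches = (req, dispatch) => {
  const actions = req.route.prefetch ? req.route.prefetch(req) : []
  actions.forEach(action => dispatch({ ...action, prefetch: true }))
}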

Caveat

There is one caveat to the code splitting middleware: it can't simultaneously load a thunk/saga/etc while doing its data-fetching work in parallel!

I actually "built" this and could show you this code, but decided to strip all that code out and give you a far more simplified codeSplit middleware. The parallel version was like 15x more code. Im serious. Check it, here's what it did:

it dynamically added routes to your routesMap prefixed by /api-proxy and then from the client fetched both the chunk + "data" in parallel. Essentially the thunk, which would currently only exist on the server, was called as an API endpoint at the same time as the chunk was bringing the same thunk to the client. In other words, I made it so that your routesMap, which is used server side to render strings for different URLs, was also used as a fucking API! Pretty cool, right? Cuz, keep in mind, express passes all URLs to our server-renderer via *. So nothing is stopping us from using it to intercept other routes and serve JSON. That's exactly what I did. I made it otherwise short-circuit, and all you had to do was call express's request.json(), or pass a custom function as an option to createRouter that does similar.

So we gotta cut that from the strategy in order to ship. And we may never encumber ourselves with that. The thinking is that if you're effectively using prefetching, you won't benefit much from parallel loading. And also: most people won't be code splitting their thunks, just the reducers and components. Cuz, keep in mind, we have the addRoutes code splitting capability to add entire routes. So that means we have 2 forms of code splitting: single route + multi route. We're covering a whole lot of bases already, and would be getting extremely nit-picky and OCD by supporting parallel single route code splitting.

@ScriptedAlchemy

Multiple imports are now supported after a PR I merged to babel plugin
