Hello!
This is a commentary and discussion from walking through what happens on startup in [https://github.com/circleci/frontend], looking at this SHA in December 2014: [https://github.com/circleci/frontend/commit/273558250040ce197e44e683d5528381f36c4eea].
The first part is a literal walkthrough, where I'm tracing my reading of how the code works. This is a way to check my understanding, since of course I don't have all of your context. And it's also a way to echo back to you how someone else might experience this code without that context. The second part is about asking questions, making suggestions, and exploring some paths for making it even more awesome. I also mixed in a few asides in the walkthrough part where I jump ahead a bit.
Doing this with colleagues has worked well for me as a way to do in-depth code reviews about more abstract design and architectural questions. The goal of the the first part is to clarify we're all looking at the same thing, which creates the space and shared understanding to do a good critique in the second part where we both learn things. :)
One last note, since we don't know each other. :) My goal here is to learn more about these programming constructs and share some of the learnings I've had reading through this code. I'm sharing this with you because reading your work has already been enormously valuable to me, enough so that I believe it's worth it to work towards a more in-depth understanding of how you think about engineering.
So cheers! I'd welcome any thoughts or feedback you'd be willing to share back. :)
-Kevin
Ok, the entrypoint is core/setup!
, called from CoffeeScript.
After defining an initial state
, it creates a new history-imp
, which will listen for browser history events and will call sec/dispatch!
as part of starting up. I'm assuming there's an event loop tick before this gets called, since otherwise it looks like main
and define-routes!
won't have been called yet.
In main
, app/app
gets mounted into the DOM.
:navigation-point
isn't set on the app-state, so the app
rendering code shows an empty div.
The event loop yields, the route change is picked up by routes
and let's assume it picks up the dashboard route. It puts an event in nav-ch
with the navigation-point
and args
map pulled out of the URL.
In core
, nav-handler
picks this up and calls two multi-methods in the navigation
controller, one to update the state atom and another to perform other effects (like making API calls).
In defmethod navigated-to :dashboard
, (https://github.com/circleci/frontend/blob/master/src-cljs/frontend/controllers/navigation.cljs#L79), it takes the current state and returns an updated state (which the nav-handler
uses to do a swap!
). Here you can see the complexity of performing state transitions when everything is in a global atom, but also the benefits of this being dealt with explicitly (e.g., testable, clear seam for debugging).
(As an aside, the comment at the top of the navigation controller about needing some other abstraction here really rings true. My instinct is that it'd be more important to restrict the semantics of state transitions, rather than building machinery to automatically do parts of the transition for each handler. I actually think the way you've composed this with the
->
andassoc
is very clear).
After swapping to the new state, the nav-handler
calls post-navigated-to! :dashboard
(https://github.com/circleci/frontend/blob/master/src-cljs/frontend/controllers/navigation.cljs#L91). This does some updates immediately, like set-page-title!
and recording analytics. And most importantly, it performs several calls to the server and schedules a sequence of them.
post-navigated-to! :dashboard
may call api/get-projects
, and always calls ajax/managed-ajax :get builds-url)
. In a goroutine, it blocks until that call completes (it doesn't put the response into api-ch
the way api
functions do). Depending on the success of :get builds-url
it will put a failure message on the error channel or put a :recent-builds
message in the api-ch
, and then make two parallel server calls for the project's settings
and plan
. All of the api calls will put messages on api-ch
when they return. There's no ordering enforced here for when responses are put on the channel, the sequencing is only for the order of making the requests.
As response messages are put in the api-ch
, they are picked up in core
by api-handler
, which works similarly to the other handlers. It dispatches to a multimethod in the api
controller to transition the state atom. This transition is straightforward for projects
(https://github.com/circleci/frontend/blob/master/src-cljs/frontend/controllers/api.cljs#L98). The state transition for recent-builds
(https://github.com/circleci/frontend/blob/master/src-cljs/frontend/controllers/api.cljs#L107) is a bit more involved, and here there's logic that appears to be guarding against race conditions between updating the state atom with the response, and the state atom having changed during the time it took to service the request.
For project-settings
(https://github.com/circleci/frontend/blob/master/src-cljs/frontend/controllers/api.cljs#L239) and project-plan
(https://github.com/circleci/frontend/blob/master/src-cljs/frontend/controllers/api.cljs#L246), it looks like there's some other validation that these transitions make to check whether they make sense with other bits of the state atom (e.g., what the current project is), and no-op if they don't make sense. This is because the result of these server calls are being merged into the part of the state atom with data about the project.
If I read this code correctly, there's a race condition here, but it's a bit tricky so I'm not quite sure and would love a check on my understanding. :) There's an optimization in post-navigated-to! :dashboard
to not perform a get-projects
request if the project already exists in the state atom. But on first page load the controller will make this server call. Let's assume it takes 15 seconds. The call to :get builds-url
is made next and a goroutine in the controller blocks until it returns. Let's assume that request is fast and takes only 1 second. After that request completes, the state is updated and :recent-builds
is set in the state atom. The navigation
controller also makes requests for the project settings
and plan
. If those requests complete before the original projects
request completes, the guard in the api
controller will prevent the state atom from being updated with the results of the settings
and plan
calls. Then, after 12 seconds, the projects
call completes and it updates the state atom. If the project's plan was a trial plan, this would result in the trial message never being show in the UI.
(Slight aside: If I read things correctly there's no way to cancel the server calls once
post-navigated-to! :dashboard
has been called, so it looks to me like it's possible to run into other race conditions here without guarding all these state transition functions. It'd be nice to split this guard out from the transition itself. Or to move the semantic validation of state transitions to somewhere central, or to limit the transitions that can be performed by restricting access to functions that can change the state map, rather than justassoc
ing. Perhaps the guards written here handle all of those cases, but this is hard for me to see, so I'd be wary of race conditions hiding in here. I'm curious to check my understanding of this. Also would love to get some perspective from Clojure folk on the tradeoffs of using plain maps versus something that restricts state transitions to being semantically correct, and also the match of global state with pending server calls that are implicit state with no clear owner. But I'm getting ahead of myself :)).
(Another aside: It's interesting to see that in this case the more complex setup is done as part of the
navigation
controller. There's some complexity in otherapi
controller methods, likepost-api-event! [:build :success]
(https://github.com/circleci/frontend/blob/master/src-cljs/frontend/controllers/api.cljs#L143), so I'm curious to learn more about the rationale for deciding where to put the complexity, although I didn't look at this further yet).
So, as these API results come in, they'll update the state atom, and a view render will be triggered to update accordingly. But, backing up a bit, the first view change that would occur was from the first swap of navigation-point
done in defmethod navigated-to :default
. This would lead to the app
component mounting the dashboard
component as the dominant-component
(https://github.com/circleci/frontend/blob/master/src-cljs/frontend/components/app.cljs#L130).
The first render of the dashboard
component will show the spinner, since there won't be any builds
in the state atom yet. As the server calls complete, the state atom will transition and the view will update. After projects
and builds
are both loaded, there will be a prompt to add a new repository if there are no projects or builds. If the plan is loaded, and the state is such that the user should see a notice about a trial plan, the view will show that.
I can't quite follow where the results of the :project-settings
server call is being used. As far as I can tell it's not being used anywhere in the view components or models, but I'm also new to reading Clojure projects this large so I must just be missing it. :)
Ok, so that's about it for that code path. One more note before popping up the stack. There's also the ws-ch
, and possible that there are other paths for state changes that are related to projects and this code path that can happen through messages from the server that get put in that channel. I'll ignore that for now, but keeping of track of in-flight server requests when responding to messsages on the ws-ch
channel seems like another place that might be tricky.
It'll be great to get some more eyes on this and verify my understanding, but I'm going to reflect a bit, optimistically assuming that it's at least roughly correct. :)
As framing, my instinct is that what's most important when evaluating front-end architectures is that it maintains the flexibilty and speed to add unanticipated product features. My bias for how this is accomplished is that I think aggressively decoupling and abstracting at the local level is important, but I'm cautious about defining seams that enforce abstractions at the global level, since there's a much greater chance that the assumptions of those abstractions will be wrong. Refactoring is easy and mechanical; re-abstracting is difficult and requires deep understanding.
-
This project is amazing, and has a lot of really creative and powerful ideas to learn from! The pervasive use of pure functions leads to being able to reason about the program with only local knowlege in most of the app. Almost all rendering code is isolated from the code that changes app state (thanks React!), which has the enormous benefit of letting us mostly ignore time in rendering code. The functional idea of "data structures first" is really clear here, with pure static functions parsing, interpreting and decorating them. The organization of the model module and functions is intuitive and natural, with any behaviors or processes on that data defined separately. The edges of the browser like mouse input and URL state changes are pushed to the edges, and there are clear seams there. Using pub-sub points as the way for view code to send events into the rest of the app is really clear, and the mechanism of
handler(e)
->"update state in place" and "perform other effects like making requests"
makes it straightforward to reason about these state changes in isolation. -
It seems like as the state atom grows, the possible state transitions grow more complex as well. It feels to me like even in a project of this size, the enormous benefits of using plain data structures like maps and seqs start to become outweighed by the unbounded transitions that can be performed on them. My first instinct would be to abstract this by creating helper functions that transition parts of the state atom, and then composing those. Or by making a few number of higher-level functions for semantic transitions like
to-logged-out
orto-project-page
. This codebase has a fair bit of this already, which is very clear and simple to understand. The comment about needing help to simplify this rings true to me, but I'm more concerned about whether the semantics of the state transition are valid (including that they're composed correctly), than using middleware mechanics to change how they are composed. I wonder also if by using the global state atom, we actually make this problem much worse. It'd be interesting to look in more depth another time at how these state transitions work for this specific app, and whether there are natural seams to break this up along (even if they are stored in a single global atom). These seams might be along either the entities that are present (e.g., "transition between there being a current project or not"), or they might be along the pages themselves (e.g., "transition to state for dominant-component X"). -
Similarly, the architecture lines for the channels and the controllers (e.g., navigation/API/controls/errors) feel a bit forced, like they are organizing parts of the app based on the "what" or "how" rather than the "why". As an example, the code for responding to most API calls is organized into a single
api controller
module. The context of the "why" is hard to see - why the return value is what it is, why the code is doing this, or where it fits in the context of the app and user experience. The best place to see the "why" is in the navigation controller for the:dashboard
. But it's in anavigation
module, mixed with the "why" of the other pages. It seems more natural to me that the holistic higher-level "why" should be fully described as part of thedashboard
component. That component might make use of commonapi
ornavigation
code bits to perform its work, or split out the mechanics and "how" into separate modules. But it feels to me like the important semantic parts, the meaning and flow of the product, should be expressed directly and wholly in one place. My instinct is that the "dashboard" component is actually the right place for this. -
Continuing further, moving the orchestration of API requests into a component gives it a natural lifecycle for avoiding race conditions, even with a global state atom. For example, when the
dashboard
component is unmounted, any of its requests could be aborted. The callback code for updating the state atom can be owned and contained within the component's lifecycle. This provides the kind of guard that's accomplished now by manual checks in the API controller, and in a more natural way. As the components grow more complex, the mechanics for this are the bits of code that are split out to keep the complexity low. But by using a design that makes it hard to see the "why" and the impact on the product and UX, we actually make the problem of making new products and features harder. And the "why" and product can't be encoded or understood without thinking about view components. My instinct is that we shouldn't try to split the "why" out of view components. It's only the "how" details of server communication, mechanics of cache lookups (but not expiration and invalidation logic), logic of parsing model data, etc. should be split out. -
Using channels for tracking server communication feel like a mismatch for the problem here. This isn't about their local use, but about emitting events about state changes for requests on a pub-sub point. Using the example of the
navigation
controller's making calls for thedashboard
, it creates a parallel kind of application state that isn't made concrete in the state atom. The application state can't actually be entirely described by the state atom, there's also hidden state with in-flight server requests, and even further, goroutines that controller code has queued up, waiting to perform additional requests and state transitions when the server requests do complete. And there's no way to abort in-flight requests or scheduled effects based other state changes that occur. As an example, after the:dashboard
navigation event is emitted, let's say the user navigates somewhere else. The first dashboard request is still in flight, and when it completes, there is code already queued up to respond to that and make additional requests. These requests are irrelevant now (the user has told us they don't care about that data anymore), but there's no way to abort those queued effects. You can see this complexity leaking into the API controller code with those guards. Note: I'm not arguing against channels in general, in fact I think the mechanics or Promises/callbacks/channels are not particularly important. In the way channels are used locally, like with:get builds
in the navigation controller, it's not any different than using Promises or callbacks, that's cool. If we needed some more complex composition, local use of channels and goroutines seems awesome (although I haven't yet seen a compelling real-world use case for channels over Promises, other than the Golang example of fanning out and taking the fastest responders, and that doesn't translate well to the browser). My point here is about the hidden state of in-flight server requests, and code that's queued up to run, neither of which are encoded in the state atom.
The problems that affect me and my colleagues the most day-to-day are the "dynamic process" parts of the UI, and true distributed systems problems. I don't think applying ideas in CSP about how to model dynamic processes is a great fit for UIs, since writing correct concurrent systems is a different kind of work, and in my experience, not the kind of work that usually has the best product payoff for engineers that live close to the product and user experience.
I have a similar feeling about true distributed systems problems of data consistency. In practice, most products I've worked on benefitted by being explicit in their avoidance of these problems, since the engineering investment wasn't worth it for the product payoff. Simpler approximations are almost always sufficient. Like taking a snapshot of the server state and holding it as immutable for a period of seconds or minutes, predicting that the likelihook of it changing underneath the user is very low and the path to recovery when it does happen is very straightforward. Or embracing eventual consistency and decoupling, and allowing data to be updated in different views within seconds of each other rather than synchronously. Or by investing in improvements to the API serving layer so that server response times are reliably within 100s of milliseconds and it's more acceptable to always get fresh data or wait for ack responses.
But if we do want to make a product or feature that functions as a true distributed system, that of all things should be split out from the rest of the application. :) The idea that feels the most promising is making the "view of server state" immutable, and, within the application, recording facts about server state and changes in client state. This way view code can compute views of those facts, like merging optimistic client updates with the view of server state. And there's a first-class way to deal with failure, like retracting facts about optimistic updates that the server later rejects, etc. There's some interesting related work to get a Datomic-style in-memory database in the browser [http://tonsky.me/blog/decomposing-web-app-development/], which is what I picture the database looking like. But the source in that chat example doesn't use DataScript to store state about effects, and in fact the state about server calls for load-user
is kept on the stack in a separate goroutine, unknown to functions trying to transact database state. I think the real architectural achievement is when this kind of design is simple enough to work with that these guidelines still seem natural, intuitive, and unburdensome for small forms or chat apps.
This code is spectacularly well-crafted and reading it has been a fantastic learning experience. I'm sure I'll keep following the work you are doing, as in my mind you're doing some of the leading creative work in this area, and the kind of work that has the potential to really push us forward and help the whole field.
So my closing three thoughts are:
-
What if we reified effects, like in-flight server requests, into state? This would let the function that took
(prev-state) -> new-state
consider the state of effects as part of producing the new state value, and it could also perform effects as needed (e.g., abort or make server requests, kick off another iteration of a simulation). -
Would state transitions (including state of effects) be more natural if they fit within the lifecycle of view components?
-
The global state atom is being mutated in-place, which works well for a representation of the UI state since the DOM ultimately is a place that must be mutated. But it doesn't work as well for a sophisticated client-side view of server state. What if we split out the full client-side view of server state (including facts like optimistic updates) to an immutable data store in the browser?
Of course, I'm writing you because I admire your work tremendously, and would really value any feedback or thoughts you had the time to share. Keep doing (and sharing) your awesome work! :)
-Kevin
krobinson@twitter.com
@krob
I think that should be fine, the guard compares the context that was passed into the ajax/ajax function with the navigation data at the time the ajax call returns. It doesn't depend on the result of the projects api call.
This might be more info than you care about, but the projects api call isn't specific to a project, it returns information about all of the projects that a user is following. It's main use is to populate the sidebar with info about recent builds on all of the projects you are building.