My note-taking process, followed by an example taken from my own notes in 2016.

My process for designing systems and solving problems is fairly simple, and in a way it seems like an obvious approach, but the technique is far more effective than you'd intuitively expect. When most developers take notes while working out how to tackle some kind of architectural design challenge, it usually takes the form of a few whiteboard notes, some scribbles in a notepad, and so on. Usually this is fine, because the problem is small and constrained to a manageable level. If you want to tackle something bigger and more challenging, though, or if you want to design a much more effective system in the first iteration, rather than two or three years down the track after several deployments finally lead you to the conclusions you'd like to have reached up front, then this process works really well. It does feel like you're committing to a chore if you haven't done it before, and the reason it works isn't itself obvious, which is part of the reason I get pushback from a lot of developers before they commit to trying it. Of those who've tried it, though, I've had universally positive feedback. To top it off, as you do it and start to have ideas and breakthroughs, you'll feel satisfaction and begin to create a mental association between the process and feeling rewarded; not only will you no longer think of it as a chore, but you'll start to prefer it over simply diving into code.

The process doesn't work unless you have at least a reasonable awareness of the different techniques, algorithms, data structures, libraries and so forth that you could learn if you wanted to. You don't need to know how to use them, or have practical experience with them; it's more about turning "unknown unknowns", as Donald Rumsfeld famously put it, into "known unknowns". If you know, for example, that there are a number of data structures that do X, Y and Z, and that there is a suite of algorithms for solving problems in some category, then you have a set of starting points when trying to think of ways a problem could be solved. Once you know what you need, you can go deeper and find existing solutions, or teach yourself to implement the solutions you've identified as important to what you're doing. In fact, it's important not to try to learn how to implement every algorithm, data structure or design pattern. Doing so is inefficient and takes time and energy. Know what is out there, work out what you're trying to do, shortlist anything relevant for further investigation, and go deeper as needed.

So the first step is to start becoming aware of what's out there and, at the very least, to start making mental notes of interesting ideas that could have potential value in whatever you're doing. I recommend looking at Wikipedia's generalised lists of data structures and seeing what each does, looking up lists of graph-related algorithms and concepts, topics on logic in the Stanford Encyclopedia of Philosophy, and so forth. When you see something posted on, say, Hacker News that might be interesting but isn't really relevant to you, take a look anyway, and perhaps file it away in your toolbelt of future possibilities. Any time you have an idea, no matter how crazy, take an optimistic view. Assume that every problem has a solution you could find if you wanted to, and remember all of the advancements that were groundbreaking at the time but seem obvious in retrospect; Bitcoin and the blockchain are a good recent example. These sorts of advances are generally preceded by countless naysayers who "know better" because they are apparently "intelligent", able to assess the value of an idea with far less information than is adequate to do so. Intellectual humility is key in this process.

Also consider that most people stay in their comfort zones when it comes to their areas of expertise. People know that learning new things takes effort, and often involves frustration, and tend to rationalise their refusal to step outside the boxes they live in, more often than not, ironically, to the point of irrationality. When your brain hurts, when you're feeling intense frustration trying to figure something out, that's exactly the catalyst your brain uses to identify that its current configuration of neural connections is not optimised to solve this kind of problem, and it's why, when you step away from the problem and return later (often over multiple iterations), what you're trying to do becomes easier and more obvious. Feeling the frustration of learning is like "feeling the burn" when working out: you need to become frustrated in order for your brain to start optimising itself. If you want to design the most innovative solutions and tackle the hardest problems, expect frustration; chase it, even, and build a mental association between frustration, learning new things and becoming more effective at what you do. Don't ever give in to thoughts of "this is too hard" or "I can't do this". Sometimes you just need to break the problem down a bit more. The fact that so many people avoid the frustrations that come with stepping outside their comfort zone is, I believe, exactly why there are untold unrealised opportunities out there, hidden away and waiting for someone to combine two or more concepts that previously tended to be found in different silos of expertise, the ideas from one silo often dismissed as irrelevant by experts in the other.

As I said earlier, your brain tends to adapt to the way you use it. The more something is in focus, the more neurological resources your brain will allocate to it. The trick is therefore to use a process that optimises what is in focus. As you probably know, writing something down is far better at making it stick than just reading about it, watching a presentation, or listening to someone tell you about it. That's why university students are usually more successful when they take notes themselves than when they don't, or when they just "download the notes from the server" to take the easy way out. It's not about having the notes (though having them is still valuable); it's about the act of writing them down yourself. This creates a signal in the brain that places focus and emphasis on the subject matter.

You may also be aware of the old idea that the best way to learn a thing is to explain it to someone else. Doing so forces you to address the gaps in your own understanding and, in particular, works because you're forced to fully articulate the idea in a way that the other person can understand. More importantly, it highlights the exact gaps in your thinking, gaps which may have been easily glossed over if you abbreviated what you were saying or made your explanation too terse. Teaching someone about a thing is another signal to your brain that reinforces the emphasis that should be placed on the subject matter.

When you're trying to solve problems, your brain tries to expend the least amount of energy to get the job done. Anything that is focused and emphasised is going to be currently "hot", or "lit up", among your billions of neural connections. To expend less energy, the brain prioritises the active, fuelled areas for pattern matching and problem solving, gradually falling back to "colder" areas the less it is able to get the job done with what is currently "in the cache", so to speak. What's more, the more you are able not only to solve problems and come up with ideas, but to reinforce them in your brain and keep them emphasised and in focus, the more you'll create a positive feedback loop that allows an idea or realisation to piggyback on what came before it, and to do so faster and more effectively, because your brain is not having to expend resources to bring what's important into focus.

You may have heard of the old idea of "rubber duck debugging". In short, a senior developer, upon being asked by one of his programmers for help debugging a problem, placed a yellow rubber duck on the developer's desk and told him to explain his problem to the duck. Upon doing so, the answer became clear, and the senior developer's help was no longer required. We've all experienced arriving at an answer ourselves while trying to explain a problem to someone else. What matters is not explaining the problem to another person, but the complete articulation of the problem, which creates a level of neural emphasis that is not possible by only thinking about it.

So here's the process.

I like to use Markdown files myself. You can use whatever you like, but make sure it's a medium that allows you to hammer out a lot of text as quickly as you can, and to easily read back over it when you need to. Also remember that this is for you only. Spelling mistakes are irrelevant. Presentation is only important to the degree that you can mentally reload old thoughts and understand what you were talking about if you come back to your notes later. The format doesn't need to be consistent, and you don't need to finish a particular train of thought if a new one supersedes the old. Just leave whatever you write in place, and append, append, append.

First, start a new session by writing down a couple of fully-articulated sentences explaining what you're trying to solve, what you're having trouble with, where you're up to, or whatever. This highlights the problem domain to help you get started.

Next, write down a list of bullet points that are essentially a dump of what you currently know to be true. It's ok to repeat earlier thoughts, and it's ok to write down the obvious. Think of it as thinking out loud, but by writing it down, you'll force it into focus more than just thinking about it would do.

Any time you're stumped, or faced with indecision of some kind, or know there are problems you need to solve that you haven't gotten to, or that you know you're going to need to think about as you move forward, write them down as fully-articulated questions. "How can I make it so that when X happens, Y is the result?" or something to that effect. There is something particularly powerful about posing questions to yourself like this, fully-articulated in written form. I'm not sure exactly why it works, but somehow asking the question creates an open loop of sorts, which your brain treats as a kind of "extra strength" signal that a higher-than-usual level of resource expenditure is desired in order to close the loop, rather than leaving the question hanging.

As soon as you've done this, start writing down bullet points with every idea you have, no matter how simple or incomplete. Note that it's the ideas that may be simple or incomplete, not the sentences: make sure you write out your thoughts as full sentences. "Maybe if I added X to Y, it'd allow me to do Z", etc. If you find yourself having ideas that expand on a bullet point, indent from there with a second-level bullet list and all of the thoughts you're having.

When you don't know how to answer a question, engage in recursion - break the problem down. Start by writing down everything you do know, and what stumbling blocks you're currently perceiving. If you're still stumped, then you can break the problem down into even finer levels of granularity, and/or make a point of applying critical thinking to the process. Question the very nature of what you're doing. Ask questions like "is X even necessary?", "could I take a completely different approach?", "what other approaches could I investigate?", "what assumptions am I making about Y?" and "which assumptions that I'm currently making without thinking, could in fact be challenged altogether?". When you disqualify an idea, don't delete it. Write down something like "NO - I can't do X because that would lead to Y", and so forth.

When you have a breakthrough, highlight it, or make it bold, or something. Make it easy to sift the process from the lightbulb moments when you revisit your notes. Any time you need to switch into another medium, such as drawing a diagram or writing some code, make a note of what you did so you can cross-reference your notes with the related media in the future. If you draw on paper or on a whiteboard, be sure to take a photo with your phone, and have your phone set up so that your photos automatically sync to your Dropbox, Google Drive, or whatever else is relevant.

If you need to flesh out some code, often you don't actually need to write real code just yet; you just need to hypothesise about how your ideal solution would work. I like to use Markdown fenced code blocks for this, as they stay inline with my notes, and the editor won't complain about linting, compilation or missing dependencies. It gives me a nice way to paint a "what if" picture, where I can pretend my solution exists, have a go at seeing what it would be like to use (what function calls would I make? what arguments would I need? how would it flow? what members would this interface need? how would I compose these two functions or objects?), and then evaluate my approach and see what I've failed to consider, before wasting a bunch of time trying to code it for real.
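For example, a pretend-code block might look like the following. None of these functions exist; every name here is invented on the spot purely to see how the API would feel:

// Pretend-code: what would my ideal caching layer look like if it already existed?
const cache = createCache({ maxSize: 1000, ttl: '5m' });

// How would reads flow? Do I want a read-through helper like this?
const user = await cache.getOrFetch(`user:${id}`, () => db.users.find(id));

// What about invalidation? Is deleting by key enough, or do I need prefix sweeps?
cache.invalidate(`user:${id}`);

The point isn't correctness; it's that writing the call sites first exposes what the design is missing before any real code gets written.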

I like to start new sessions with a date heading. This gives me a sense of continuity and lets me separate one session from the next. After a file gets above a few thousand lines in length, I tend to start a new file and leave the old one where it is for reference.

As a bonus, I absolutely can't overstate how useful it is to also be able to look back over past notes that are structured as coherent, fully-formed sentences. I can read exactly what I was thinking and get back up to speed. There are many, many ideas and approaches I would never have been able to come up with if I hadn't been able to review my earlier thoughts in this way.

All of the above is derived from my own practical experience. I have written literally hundreds of thousands of words worth of notes in this way, so I have definitely put it to the test. Give it a go, and if anything's unclear, let me know.

======== Sep 11, 2016 ========

  • I need a way of identifying an unclaimed element as a priority candidate for subsuming
  • I need to write down how to identify the relevant elements for each slot type
  • Consider: an "active" property that doesn't remove the slot but simply removes the element from the DOM
  • Cleanup after patch applied, allowing existing elements to stay in place for reuse during slot patching
  • Root element is key because it already points somewhere
  • Previous and next slot siblings are key because they identify the range boundaries for eligible elements

As I traverse the tree:

  • I need to know where in the DOM I'm currently pointing
  • Root slot with no target will need to append as first child of document body
  • Keep track of current parent and next and previous element sibling
  • Ordering based on keys is an array-specific concern; siblings are always children, and children are always members of an array slot
  • Keep track of current range-exclusive element target bounds so that only valid candidate elements are chosen for subsuming

======== Sep 12, 2016 ========

  • Some slots are resolvers, some are physical
    • Resolvers point only to other slots, some of which are physical, either directly or indirectly
    • Physical slots point directly at something in the DOM
  • The root slot can conceivably move around (or have its siblings change) due to external influences
    • When we create the root slot, we're either pointing at an existing target element, or at a parent
    • A resolver slot without siblings will append to its parent
    • A resolver slot with no parent will append to document.body
  • How do I have a resolver slot "take over" for a physical child slot?
    • Though it's very unlikely that one kind of resolver would take over for another type, a new version of the same resolver type may need to take over.
    • If the above is true, the normal patch process would preserve the existing child as part of the type-specific patch function for the slot
    • Could I attain clean preservation of child dependencies via the as-yet-unimplemented "suitable target" selection function?
      • The parent/next/previous properties on the patch state could be relevant here.
      • I might be able to reuse an object reference by making sure I assign all of its properties each time
      • Alternatively maybe I should just pass an argument for each element boundary (parent/next/prev)
    • Could I have the patch process for resolver slots return a reference to what they've currently resolved as?

Node Types & Patching

A clean same-type patch can be called for every slot type if a new value of the same type is provided.

  • When an indirect type replaces a direct type, the old slot is passed as the initializer for the new slot
  • When a direct type replaces an indirect type, the old resolved slot replaces the indirect slot and is then patched
  • When an indirect type replaces another indirect type, first-time indirect slots are initialized with a void slot

Maybe I don't need to worry about resolved propagation and carry-over. The patch process will be carrying references to a parent element and exclusive element range boundaries. These references will not change as we descend through the hierarchy; they'll only be updated when hard targets are iterated over when patching direct slot types. If we just simply subsume the first element in the child range as the potential patch target, then the depth of resolution is irrelevant, and switching out one deferred emitter for another will not affect things; we'll patch the correct target automatically.

  • VOID
    • no value, no node, just an empty slot pointer
    • if replaced, the new type acts as though it is the first slot at this position
  • PRIMITIVE
    • plain value to be serialized as a text node
  • NODE
    • standard virtual node representing an element in the DOM
    • replaced if the selector is different
  • PROPERTY
    • applies a resolved value to the specified property
  • ARRAY
    • resolves to a variable number of nodes at the position that the slot points to
  • PROXY
    • standard attributes to be shared by whatever this node resolves as
  • STREAM
    • a slot pointer that resolves each time the stream emits a value
  • PROMISE
    • a slot pointer that resolves when the promise resolves
  • FUNCTION
    • a slot pointer that resolves each time the patch process is triggered
  • COMPONENT
    • an object for which a specified key will be queried to resolve the slot value
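Sketching the list above as code, purely to make the direct/indirect split explicit (names are placeholders; nothing is final):

// Hypothetical type tags for the slot kinds listed above.
const SlotType = {
  VOID: 'void', PRIMITIVE: 'primitive', NODE: 'node', PROPERTY: 'property', ARRAY: 'array',
  PROXY: 'proxy', STREAM: 'stream', PROMISE: 'promise', FUNCTION: 'function', COMPONENT: 'component'
};

// Streams, promises, functions and components all resolve indirectly,
// which is the resolver/physical split described in the notes above.
function isDeferred(type) {
  return type === SlotType.STREAM || type === SlotType.PROMISE ||
         type === SlotType.FUNCTION || type === SlotType.COMPONENT;
}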

Property Slots

A property slot could resolve through several levels of indirection. When a value is resolved normally, a child node will be created or updated, with the value translating to an entire physical DOM node (or nodes).

Range/target markers would be the solution to everything if it weren't for properties.

Property resolution cases:

  • A single style selector value, or a deferred reference to all of them
  • As above for HTML attributes and properties
  • Event listeners

In each of the above cases, the type of value that can be assigned is dependent on the type of patch being performed. Certain slot types are invalid at this level too, for example nodes, components and arrays are invalid as styles.

Slots are entirely abstract; they are simply value resolvers, and their behaviour depends on their type. I wonder if the way a value resolves, and the way it is applied, needs better separation of concerns (SoC) than what I've currently designed with slots.

  • I know now that I will use range markers to narrow the patch target while descending the tree
  • The process of descending the tree might be able to be performed in a big loop and with the use of a stack
  • The contents of arrays are known NOW; they are not deferred in any way

Case: two children. The first is a stream, the second is an array. The first stream emits its first value AFTER the range markers for the array are established. If the resolved stream value is deeper in the hierarchy than the array (which it will be), then propagating the range boundary to the neighbouring array won't be easy to do.

IDEA: What if slots are ONLY used for physical targets, and have an extra property "resolver"?

  • resolver would be void if the value is of a direct type
  • a resolver could be like its own slot, complete with a type, and a resolver of its own
  • a resolver would call back with the resolved value when it is available
  • if hooks are really just callbacks of a sort, maybe it is ok to keep resolvers as slots and have them NOT directly cause a subtree patch, but call back their parent when a distinct value has been resolved
  • range markers are cached in direct slots whenever they are updated, but don't directly cause patches
    • when a right range marker for one slot changes, it must propagate as the left slot marker
  • direct types
    • array
      • when used as a node's children, the range markers use the node as the parent and void for left/right
    • node
    • primitive
    • property

Hang on... if resolver slots aren't going to directly descend to further physical slots, then range markers aren't as important anymore. Instead, each slot can probably get away with referencing its neighbouring slot and parent slot. Or perhaps the slot can just store its owner and index. No... rearranging slots would make that a problem, because I'd have to update the indices of all the slots that follow it.

I think it might be ok to simply cache the left/right slot references in each neighbouring slot, just as I was going to do with range markers. So the range markers would remain, but they'd be slots rather than elements. During a patch loop, I'd also keep left/right element references for efficiency, but when a resolver triggers a subtree patch, I would have a utility function for taking a slot and returning the left/right element.

There are multiple ways to patch an element, particularly with respect to different platforms.

How should I structure a slot to allow the patching method to be delegated to a secondary function?

How should the equivalent of Snabbdom's modules be implemented?

  • Modules are called only when a vnode is being patched
  • Each module can potentially resolve to child nodes (children) or properties, or something else, such as event listeners.
  • Snabbdom simply calls each registered module during the patch process, passing the old and new vnodes, and allowing the module to look at the node's options to determine what changes to make.
  • Remember that hooks are not modules. Hooks are lifecycle events; they're not for actually performing patches.

Ok, so Snabbdom steps out of the patch process to apply node options, but xDOM needs the patch process to be active for options patching, because any individual option could need to be handled by a standard resolver.

The patch process for a custom option still needs to construct a slot subtree and resolve it in a standardized way. A property slot will be used to patch the target. A resolver instance (if necessary) will be required to trigger the patch. A hierarchy of options (e.g. {a: {b: 'foo', c: {d: 'bar', e: 'xyz'}}}) will need to be normalized into slots, in order to allow each option to be treated individually as the set of options evolves.
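As a thought experiment, normalizing a nested options object into one property slot per leaf might look something like this (the slot shape is a stub; the per-path breakdown is the point):

// Hypothetical: flatten {a: {b: 'foo', c: {d: 'bar', e: 'xyz'}}} into one
// property slot per leaf value, keyed by path, so each option can be
// treated individually as the set of options evolves.
function normalizeOptions(options, path = [], slots = new Map()) {
  for (const key of Object.keys(options)) {
    const value = options[key];
    const fullPath = path.concat(key);
    if (value !== null && typeof value === 'object' && !Array.isArray(value)) {
      normalizeOptions(value, fullPath, slots); // descend into nested options
    }
    else {
      slots.set(fullPath.join('.'), { type: 'property', raw: value });
    }
  }
  return slots;
}

// normalizeOptions({a: {b: 'foo', c: {d: 'bar', e: 'xyz'}}}) yields slots keyed
// 'a.b', 'a.c.d' and 'a.c.e', each individually patchable.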

  • A resolver is probably not the same as a slot. It's probably more like a persisted stack.
  • Could/should resolvers be treated like modules? i.e. the ways to resolve something treated in a pluggable fashion.
    • No, YAGNI with a performance cost
    • Reintroduce a generic deferred resolver with a callback?
    • Maybe streams and promises use deferred resolvers?
    • Maybe the existing function type is kind of pointless. I don't know where I'd ever use it. How about making it a generic deferred resolver?
    • A deferred resolver would assign a disposable state to itself
    • It would have a reference to the value, just like a slot
    • It would only be initialized if the value changes
    • When the value changes, additional stack layers can be discarded because there are no descendant nodes to preserve
    • It doesn't matter if a second-order value has the same reference as what was just discarded, because when it is resubscribed/resolved, it will probably emit the same value, and that value will be compared to the current value and ignored if it hasn't changed.

Slot: {
  type:      // so we know what to do with this slot. maybe this is as simple as 'deferred' or 'indirect'
  raw:       // the value from the vtree - this is always compared against first
  resolved:  // the resolved value (maybe as simple as ref.toString())
  state:     // { dispose() }
  parent:    // a slot
  left:      // a slot
  right:     // a slot
}

Once a resolver is established, we don't really need to call methods on it or introspect it; we just need to be able to dispose it when it's done. The state object with the dispose method is probably all that needs to stay in place. The shape of the state object can be consistent if, when it is in a certain shape, it is only accessed inside a code path where it is always that shape. That's why the type is important.

  • left/right are only relevant for physical nodes; property slots are not relevant
  • parent is always needed for hook/event propagation
  • left/right can be ignored for anything that does not sit in the DOM tree
  • streams can sit directly in the DOM tree, so maybe "resolvers" are in fact just slots, as I'd always intended

The actual resolved value should probably go in state. Platform-specific types and patch paths should probably go in state too, pretty much as I always intended.

The secret sauce is simply this: deferred types simply call resolve(parent, value). The resolve function is like the patch function, but is bottom-up, rather than top-down, and targets the "resolved" value, not the reference value.

The slot should have a prototype function resolve. Or maybe the state object should have the resolve function.

If the slot has the resolve function, it's as simple as calling parent.resolve(value). Even node slots would call this function.
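A minimal sketch of that bottom-up call chain, assuming the slot shape above (all names provisional):

// Hypothetical: resolution bubbles up from a deferred slot to the nearest
// slot that knows how to physically apply the value.
function makeDeferredSlot(parent) {
  return {
    parent,
    resolve(value) {
      // A deferred slot applies nothing itself; it hands the resolved
      // value up to its parent, however deep the deferred chain is.
      this.parent.resolve(value);
    }
  };
}

function makeNodeSlot(target) {
  return {
    resolve(value) {
      // A physical node slot terminates the chain by patching its target.
      applyToTarget(target, value);
    }
  };
}

function applyToTarget(target, value) { /* stand-in for the real, platform-specific patch path */ }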

A deferred slot would have two children:

  1. The resolver slot - the stream/promise/whatever that calls back with the resolution value
  2. The resolved slot - a fully-qualified slot that is based on whatever was resolved
  • A proxy slot might be deferred, or might just represent a static value that we need extra metadata for.
  • A bare stream might simply be a proxy slot with voided options... no, that would be hard to validate
  • A deferred slot would have the bare stream, promise, whatever as the reference value
  • A proxy slot results from calling context.wrap(), so the reference value will be of a clearly-identifiable shape

TYPE: deferred, node, property
SUBTYPE:
  • deferred: stream, proxy, callback
  • node: element, literal, array, void
  • property: whatever (platform-dependent)

Maybe I'm wrong about the deferred slot type being the mount point for child nodes. Instead, a deferred slot should be trying to resolve the parent...

So... a deferred slot has a resolvable state. When it receives a value, if the value is a deferred type, then we mount it as a child slot. If the value is a resolved type, we call parent.resolve(). In the latter case, if a deferred child slot exists, we destroy it.

KEY POINTS

  • deferred slot types ALWAYS call their parent back when a value is finally resolved
  • non-deferred slot types ALWAYS distinguish between a reference value and a resolved value

NO... hang on.

a deferred slot should always be able to be a mount point for a child node, especially given that a deferred slot might be an item in an array. calling the parent back won't help anyone.

OK... try again:

  1. non-deferred value received by a deferred slot
  2. is the parent slot a deferred slot, a property slot or a proxy slot?
    • yes: call parent.resolve(value)
    • no: call this.patch(value)
  • slot.update(value)
    • trigger prepatch
    • check equality (exits if equal)
    • store reference value
    • if a non-deferred type:
      • if the new value is of a compatible type:
        • call slot.patch(value) immediately
      • if incompatible:
        • dispose state
        • reassign type
        • slot.initialize(value) // distinct from patch, which is designed to modify the existing state
          • maybe call slot.patch() after the element has been created; "previous" values can be undefined.
          • alternatively, maybe patch and initialize both call into inner code paths.
    • if a deferred type: mount deferred state as needed (preserving the mount point for subsequent patch comparison)
  • slot.patch(value)
    • once we get to this point, we know that the new value is compatible.

Child slots are ALWAYS properties of state. I'm thinking a more functional approach would be appropriate:

function update(slot, value) {} // apply a new reference value
function patch(slot, value) {}  // apply changes now that the value is guaranteed to be of the correct type

Calling patch on a deferred type causes the deferred type to pass the value argument to state.child.update().

So the distinction here is that update is about the nature of the slot itself, and patch is about making changes when the slot has known characteristics. patch() MUST be called before any changes appear in the actual DOM. update() can result in all kinds of changes to the state of deferred emitters, but that's only about setting up the shape of the environment that leads to real changes to the DOM.
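Fleshing those two stubs out a little, the division of labour might look like this. A sketch only; the helpers at the bottom are stand-ins for machinery described elsewhere in these notes:

// update() is about the nature of the slot itself; patch() only runs once
// the value is known to be compatible, and is where DOM changes happen.
function update(slot, value) {
  if (slot.raw === value) return;      // equality check: exit if unchanged
  slot.raw = value;                    // store the reference value
  if (isDeferredValue(value)) {
    mountDeferred(slot, value);        // shape the environment; no DOM changes yet
  }
  else if (slot.type === typeOf(value)) {
    patch(slot, value);                // compatible type: patch in place
  }
  else {
    slot.state.dispose();              // incompatible: dispose state,
    slot.type = typeOf(value);         // reassign the type,
    initialize(slot, value);           // then initialize (distinct from patch)
  }
}

function patch(slot, value) { /* type-specific changes to existing state */ }
function initialize(slot, value) { /* create state/target, then patch with 'previous' undefined */ }
function mountDeferred(slot, source) { /* preserve mount point, hook up the resolver */ }
const typeOf = v => Array.isArray(v) ? 'array' : typeof v;
const isDeferredValue = v => !!v && typeof v.subscribe === 'function'; // e.g. a stream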

Patching a node's values

  1. Call update with a new node value. it checks out and does not have reference equality.
  2. If this is a new node, call init, to establish state and create the target element
    • this is a node slot, so init will call into platform.initNode, passing in the slot and the new vnode object
    • store the vnode in slot.state
    • store the element target in slot.state
  3. OTHERWISE, call patch, passing in the new vnode object.
    • this is a node slot, so call platform.patchNode, passing in the slot and the new vnode object.
    • replace the vnode in slot.state
  4. Each options module will have its own function to be called
    • for speed and YAGNI prevention, hard code the set of module types in the platform's patchNode function
    • use if(options.style) { ...
      • see if there's a state.style; if not:
        • if the value is a deferred type:
          • INTERJECTION: read below.
        • make a property slot and:
        • set state.property = 'style'
        • set state.target = slot.state.elementTarget (or similar)
      • call update(propertySlot, value) where value is the value of options.style.
      • follow this approach for descending the options object's hierarchy, setting the property subtype to whatever.
  5. The boolean return value from patch will tell us whether or not to call relevant hooks on the parent
    • emitting hook events is affected by the current subtree root from which the patch was triggered
    • don't worry about who/when/how hook events are being triggered just yet, just get the main code path in place first

I think I need a new slot type: properties. This would represent the set of variable options for the node. Certain options such as the key or the specified hooks and events are not deferrable, so would be handled by the main platform.patchNode function. For everything else though, create a properties slot, initialize it with the options object and run through the set of possible option types.

The deferred slot type would need only look at the parent's type. The parent will be properties or property, which is a non-deferred type. It should call patch on the parent, passing the resolved value.

ACTUALLY: Just use property or maybe replace it with properties. Either way it makes no difference.

NOTE: A deferred slot will only mount a new node as a child of itself if the parent is a node type. If the parent is a property type, or the parent is also a deferred type, we can ask the parent to call patch. Deferred nodes can be arbitrary and not cared about by the parent, which is why a deferred type needs to be able to take control of mounting a node. Deferred slots for properties can't exist without having a parent slot property though, and it makes no sense to mount an arbitrary child property slot without deferring to the parent.

======== Sep 16, 2016 ========

Item array patching/sorting

The basic algorithm for patching an array of items will be as follows (a stripped-down sketch follows the list):

  • Start with a reference to the parent, as the array items will all share the parent in common
  • Start with a void reference to the previous sibling, to be updated during iteration
  • When descending to patch nested children, we either pass a new reference context for parent/left/right, assigning the current element as the new parent and voiding the base left/right, or if it's only a virtual descent (such as an array within an array), the current context is maintained and continues to be updated, as the nested items are actually siblings of the outer adjacent items.
  • An array slot will use a Map of linked list nodes, keyed according to the keys of the items in the array.
  • Item keys will include BOTH the key and the selector (e.g. ${key};${selector})
  • Maintain index markers for left and right
  • The left marker will always advance first, with the aim being that the marker should stop on a non-keyed slot
    • The left marker can stop if the right marker becomes resolved
    • If the right marker becomes resolved, advance that marker forward and continue patching
  • Only advance the right marker if it has finished being patched
    • If the right marker had a key, add it to a set so that duplicate occurrences of that key can be disregarded
  • If the left marker points to a keyed slot and the right marker has not visited that key yet:
    • Add the key to a set of removal candidates
    • Increment the left marker
    • Continue patching
  • If the right marker points to a keyed slot, see if the map has that key:
    • YES: (it's an existing slot that has been repositioned)
      • Remove the key from the set of removal candidates
      • Relocate the referenced slot so that it is in the correct position
      • Apply platform-specific relocation of element targets, if relevant
        • Pass in the previous and next target references so that array slots can update their first/last elements
    • NO: (it's a new slot)
      • Create a new slot and insert it before the left marker
      • Add the key to the map
  • If the right marker points to an anonymous slot: (i.e. no key)
    • The left marker will currently be at EOF or at a non-keyed slot
    • If the two slots are of the same type, perform a patch and advance both markers
    • If the two slots are of differing types, destroy the left slot and replace it with a new slot of the right type
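The marker scheme above is close in spirit to a keyed-children diff (as in Snabbdom). A simplified sketch of just the bookkeeping, ignoring the left/right marker mechanics and element relocation; the stubs at the bottom stand in for machinery sketched elsewhere in these notes:

// Simplified keyed reconciliation: reuse keyed slots via map lookup, patch
// anonymous slots positionally, and destroy whatever was never revisited.
function patchArraySlot(oldSlots, newItems, keyOf) {
  const byKey = new Map();
  for (const slot of oldSlots) {
    if (slot.key != null) byKey.set(slot.key, slot);
  }
  const visited = new Set();
  const result = [];
  let cursor = 0; // stands in for the left marker over non-keyed slots
  for (const item of newItems) {
    const key = keyOf(item);
    let slot = key != null ? byKey.get(key) : undefined;
    if (slot) {
      visited.add(key);            // existing keyed slot, possibly repositioned
      patch(slot, item);
    }
    else if (key == null) {
      while (cursor < oldSlots.length && oldSlots[cursor].key != null) cursor++;
      const candidate = oldSlots[cursor++];
      if (candidate && sameType(candidate, item)) {
        patch(candidate, item);    // same type: patch and advance both markers
        slot = candidate;
      }
      else {
        slot = createSlot(item);   // differing type: replace the old slot
        if (candidate) destroySlot(candidate);
      }
    }
    else {
      slot = createSlot(item);     // new keyed slot
      byKey.set(key, slot);
      visited.add(key);
    }
    result.push(slot);
  }
  for (const [key, slot] of byKey) {
    if (!visited.has(key)) destroySlot(slot); // removal candidates never revisited
  }
  return result;
}

function patch(slot, item) { /* type-specific patch */ }
function sameType(slot, item) { return slot.type === (Array.isArray(item) ? 'array' : typeof item); }
function createSlot(item) { return { key: item && item.key, type: Array.isArray(item) ? 'array' : typeof item, raw: item }; }
function destroySlot(slot) { /* remove element target, dispose state */ }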

THINK ABOUT: deferred elements that should kind of be repositioned rather than just thrown away as a side effect of not being directly referenced...

  • A chain of deferred slots will always mount the resolved slot at the top
  • The resolved slot can be queried for the target that it references
  • A first/last node reference could be surfaced for the patch context, possibly via the event system
  • Sibling references could be broadcast both up and down the tree where applicable
  • Rather than calling redundant propagation functions internally, it might be more efficient to set up proper streams

Proxying arrays for filtering and slicing

Is there a way I can cause a subset of an array to be rendered without having to asynchronously/independently dispatch individual changes to each affected element? The "active" window of a large child array could be changed very quickly if I were able to simply emit a set of slice parameters and have them applied as a batch.

Is there a way I can control the sort order independently of the literal order of the items in the array? How can I cause new items to sort themselves directly into a correct location without causing redundant patch ops?

  • I would need to be able to specify an item's position independently of its literal position
    • It would be inefficient to keep emitting new arrays if I can't guarantee the order of those arrays
    • Is there a way to specify children with a better collection than an array?
  • Could the sort position be a computed value?
  • Could the node output itself be a computed value?
  • If sort/slice/filter/map functions are applied to a proxy, they could be pushed to the first resolved array slot
    • If a non-deferred slot exists between the proxy and the first nested array, array options would be disregarded
  • Could insert/remove options be streamed to an array node individually?
    • Could I manage the shape of an array in-place via collection-related functions?
    • Is this kind of thing best left to an opt-in (external) component?
  • Could slot items relate directly to data, with the renderer applied during patching?
    • The render (mapping) function supplied would need to take the data as an argument and return a patchable node type
  • Perhaps the ability to treat virtual nodes immutably would be quite useful?

QUICK IDEAS FOR LATER:

  • Skip lists applied to array slots for faster sorting?
  • Hierarchy of skip lists?

======== Sep 17, 2016 ========

Ok, let's assume for a minute that I'm ok to implement a basic immutable data structure, which will mean I can add immutable operations to different types of nodes. If I do this, where does that leave me with projected collections?

There are two ways I could apply projection options to an array:

  1. Proxy node options
  2. A function to wrap around the embedded array, taking options and an array as arguments

Things to consider implementing:

  • Range windowing (slicing)
    • This is a form of filtering based on the computed sort index of the collection
    • Can only be applied in tandem with active sort order
      • Invalid if start/end values don't match sort order
      • Sort order may be function-based, so inconsistent range values would just create weird results (can't be policed)
    • Can be quickly adjusted by understanding that it's a range filter
      • Only recompute items from head or tail of the active range, depending on the change in range
  • Filtering
    • Affects each item's active status
    • If the filter is changed, items must be recomputed
  • Mapping
    • If I dump a bunch of data items into a tree directly, they should be able to be rendered using a mapping function
    • If no mapping function is present, each item would be treated as a direct node value
  • Sorting
    • New items should be inserted directly in the correct sort position
    • Native index (order of inclusion) should be retained so that it can be restored if sorting is removed
    • Re-sorting a collection would probably be at least an O(N) operation, but would preserve trie nodes where possible
    • A hash map trie would be maintained for keyed instances
    • Default sort order would be via native index and simply mimic plain vector insertion order (standard appending)

Collection features required:

  • Immutable
  • Hash map (by key)
  • Indexed (according to sort order)
  • Insertion sort (trie should tolerate inserts into arbitrary positions)
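A sketch of the outward shape implied by those four requirements (illustrative only: structural sharing and the trie internals are the real work and are not shown, and updates that change an item's sort field aren't handled here):

// Hypothetical immutable, keyed, sort-indexed collection. Each write returns
// a new instance; a plain array stands in for the index trie.
class SortedKeyedCollection {
  constructor(compare, entries = new Map(), order = []) {
    this.compare = compare;  // active sort order
    this.entries = entries;  // hash map by key
    this.order = order;      // keys, maintained in sorted index order
  }
  set(key, value) {
    const entries = new Map(this.entries).set(key, value);
    const order = this.entries.has(key)
      ? this.order // existing key keeps its position in this sketch
      : insertSorted(this.order, key, (a, b) => this.compare(entries.get(a), entries.get(b)));
    return new SortedKeyedCollection(this.compare, entries, order);
  }
  get(key) { return this.entries.get(key); }                 // hash map access
  at(index) { return this.entries.get(this.order[index]); }  // indexed access
}

// Binary search so new items land directly in the correct sort position,
// rather than re-sorting the whole collection on every insert.
function insertSorted(order, key, compare) {
  let lo = 0, hi = order.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (compare(order[mid], key) < 0) lo = mid + 1; else hi = mid;
  }
  return order.slice(0, lo).concat(key, order.slice(lo));
}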

In StreamScope, I'm using Immutable.js heavily. How would those collections translate to xDOM collections?

  • I know what I've added or removed based on versioning information, so I could just propagate the changes accordingly

Refinement

Based on the above information, where do I start and what do I need to implement?

I think that the third argument for a node needs to be treated as the general child argument, irrespective of type:

  • arrays are children
  • primitives are single text node children
  • collections are children
  • components behave as per the default component key
  • streams could emit any of the above types
  • if there are two arguments, and the second is an object, it will always be treated as options, not as a component

const {wrap, collection} = context;

// curried:
const itemsOf = collection({
  sort: (a, b) => a.index - b.index,
  filter: item => item.name.startsWith('a'), // how do i tie this to a stream?
  map: renderItem,
  range: [50, 99, item => item.index]
})

var node = div('.items', itemsOf(item$))

I keep thinking that maybe I just separate this out as an Immutable.js thing, but then I remember the lack of in-place sorting during insertion.

https://infoscience.epfl.ch/record/169879/files/RMTrees.pdf

How can I design the library so that array management does not require O(n) traversal every time? Currently, at the very least, I need to step through the new array instance and check-and-compare every item against the old version of the array to see if the item has changed.

Existing techniques:

  • Use a collection manager such as Motorcycle Collections
  • Emit a new array whenever anything changes
  • Use thunks to avoid traversing unchanged list items
  • Emit a small array by pre-slicing it and then rendering the result
  • If the shape of a single item changes, emit a new array
  • Implement range filtering by keeping a state object that includes a list of items to manually add and remove from
  • Implement sorting by blowing everything away and starting again
  • Implement filtering as above (destructive)

How can xDOM make these techniques concise/efficient/clean?

The following questions need to be answered first:

  • How would adding/removing/replacing/updating items look?

IDEA: The mechanism that manages a node array's positions and slot state would react to changes in the source data rather than being a simple process of iterating over the array and comparing it to the old array. If I actually know what has changed in advance, then I can apply those changes to the DOM directly, rather than trying to do an iterative comparison. Can this be abstracted away in order to implement basic lists for now (Map/linked list) but leave the door open for advanced collections?

  • The filter stack would need to be a cascading list, with each holding a projected collection reference
  • The specifics of a change to the collection propagate to each stack layer, terminating propagation for non-matches
  • Filters would need to be applied efficiently by the developer
  • A sort filter would be its own nested stack (sort by ([[field, direction], ...]))
  • A range filter would be implemented by identifying range boundaries and iterating from there, stopping at no-change.
  • The items in the filter stack should each be individually streamable
  • Each filter is the head of the filters underneath it in the stack; replacement causes cascading recomputation
  • The filter stack would itself need to be an observable structure, and each item in the structure would need to be able to be fed by a stream

The filter at the top would take collection management events (add/remove/etc.) and update the source collection. It would then emit the new collection and the event that was applied to it.

An underlying filter would maintain a projected collection, initialized with the one emitted by its parent. If the filter is only a mapping and does not change the composition of the list, it would merely propagate a change, perhaps by transforming the incoming change event itself. Any change to the attributes of a given filter would cause a cascade of recomputation by each underlying filter. The top layer would always be the canonical collection. If a clean copy of a collection is required, it would need to be sliced away (cloned) via a special method which removes any filtering data without unrolling the changes caused by those filters.
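As a sketch, each layer in the stack might be a tiny stateful node that either transforms or swallows incoming change events (everything here is provisional):

// Hypothetical filter-stack layer: holds its projected view and forwards
// transformed change events downstream, terminating propagation for non-matches.
class FilterLayer {
  constructor(predicate, downstream) {
    this.predicate = predicate;   // e.g. item => item.title.startsWith('the')
    this.downstream = downstream; // the next layer, or the DOM-facing tail
    this.projected = new Map();   // this layer's view of the collection
  }
  onChange(event) { // event: { type: 'add' | 'remove', key, item }
    if (event.type === 'add' && this.predicate(event.item)) {
      this.projected.set(event.key, event.item);
      this.downstream.onChange(event);  // propagate the match
    }
    else if (event.type === 'remove' && this.projected.delete(event.key)) {
      this.downstream.onChange(event);  // only propagate keys this layer knows
    }
    // anything else terminates here, as described above
  }
}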

  • Should the patch system use a tail filter to receive and project changes to the DOM?
  • Can I make it so that all arrays are instead managed as filtered collections?
  • Should the filters in the stack each be a special kind of virtual node?
  • How can I preserve the physical element collection for comparison when the filter stack changes?
    • The actual node collection would be derived from the root array, not the projected array
    • Nodes that have been filtered out would still exist; they'd just be removed from the DOM

Use Case: filtered collection with the "view" of each item changed after filtering

  • I wouldn't want to have to re-render every item immediately (lazy evaluation preferred)

  • I would want to be able to recompute items based on siblings and so forth (top position)

  • If different views are possible, their heights could all be precalculated at the start

    • When working with linked data, the types of views may not yet be available, or their characteristics may change
  • Deferred slots unfold as a chain, then resolve as a child of the root deferred slot

    • Where does the idea of streamed filters converge with deferred slots?
    • Deferred slots are generally supposed to represent DOM subtrees that have yet to resolve or which can change
    • Could the criteria controlling the shape of the collection itself be unfolded like a deferred chain?
      • The final filter in the stack would cause it to resolve back to the subtree root
      • When resolving, it would emit a set of patch instructions to be followed sequentially
      • A raw, unfiltered array could resolve in the same way
    • When emitting new instances of an array, the normal approach is compare and rearrange
      • The new approach could be to abstract that comparison away as a default approach, then output the patch commands
  • Thunks are a data-based approach for determining whether a subtree should be reevaluated

    • The key point is that we're using data as the basis for determining whether the subtree should change
    • What if we treat the data as more of a first-class idea during composition?
    • If a node is data-driven, it can include metadata controlling how it renders:
      • How should the data be compared for changes?
      • What function will render it?

Need to be careful to not weave too much magic into the library. Many of these concerns might simply be better handled externally as part of the application's stream composition.

Sidestep: The hook system in Snabbdom means that you can perform a mutation on a node in response to its lifecycle movements within the DOM.

  • hooks: ['create', 'insert']
  • hooks: [['create', el => { ... }], 'insert']

Providing a function would not change the fact that I get a stream I can plug into, but it would allow me to affect propagation... actually, is this even useful? Why do I care? Tylor uses hooks to mess with nodes before they are applied by Snabbdom. That may be unnecessary for xDOM.

...Back to the data-driven stuff above...

Because streams can be embedded in the DOM, the data-driven approach is free; I don't need to support it explicitly. Instead of using thunks to prevent a re-render, I can either just use streams, or, if I want to patch a whole subtree chunk but only parts of it have changed, I can simply allow a data object and a comparator to be supplied.

LIGHTBULB:

Collection filters output the changes to make to the next projected layer. This is really just a patch changeset, and is basically the same idea as what I talked about for resolving to a set of patch instructions at the subtree root.

xDOM is supposed to be friendly to arbitrary collections of things that could be rendered, though naturally it favours serialization to materialised DOM nodes as much as possible. The filter process is a data-driven way of efficiently managing the patching of multiple children.

Ok, so it seems like there are two primary ideas here:

  1. Collections are arrays of DOM items that are updated by executing sets of precalculated patch instructions
  2. Filters create transformed projections of a collection without losing the details of the source collection

And that bonus idea:

  • Rather than having a component slot type, have the context configured with a fallback function. When encountering a value (generally an object) which does not have a default way of being expressed, pass it to the fallback function to resolve to another value. The slot type would be something like data or general. It would store the raw value as usual, and have a child slot that is the result of resolving the value.
  • When embedding a component in a view, the previous idea was to extract the default sink key as the embeddable value, and make the remaining sinks easily accessible from the view's API. The idea will be changed as follows:
    • The context fallback function will be used to resolve the component's default view key
    • The value will still be accessible from within the view, as it currently is
    • Wrapping the value via the standard wrap method will make the sinks available as normal

Another idea:

  • Prevent needless propagation of hook events by being aware of when a subscription begins and checking if a subscription exists before emitting event data.

Collections

  • Can be initialized with an array of initial items
  • No longer need to manage stream lifecycles like switchCollection did:
    • The collection will simply be a list of actual data items
    • I need a way of passing changes to the data item to a component as streams, rather than reconstructing the component
    • Every item will be identified via a key
    • For plain arrays/lists, the key is internally-generated
    • The metadata for a list item includes its index, siblings and the data itself

... hmmm... one of the good ideas I originally had was being able to easily get secondary sinks out of a view. The examples I specified did not really consider that case with respect to the components being members of a collection though. The original idea actually still survives without needing changes. How do I extend the idea so that merging and combining can be easily applied to the entire collection? I want the solution to be clean, simple and powerful. Less is more. The correct solution exposes the minimum feature set required in order to allow for full functionality.

  • A collection node is exposed with an API, just as a regular node is.
  • The API could expose a stream/callback that itself emits the changesets that are used internally for rendering
  • The filter stack is a property of the collection type
  • Each filter can itself be expressed as a stream
  • Collection output only occurs when all filters have resolved to a filter (same idea as deferred slots)
    • Filters are deferred slots?
    • Maybe the collection is like a slot production line
      • A filter slot calls patch on the next adjacent slot
      • The last filter slot is what is actually rendered
      • Deferred slots unfold vertically as normal, and resolve back at the top like normal

Evolving idea

A collection can be created in the normal kind of way. Call map, filter, etc. to get a projected collection. Internally, the projected collection is built upon the previous filter stack layer.

Add a new slot type: pipe. A pipe looks like an array but is generally very short and can thus be compared sequentially if the entire pipe is rebuilt. Just as node options are handled by slot trees but don't directly cause child nodes to be rendered, pipes only render the last item.

div('.list', pipe([
  movies$, // raw data collection, resolved as a deferred slot
  sort$, // emits something like: movies => movies.sort('rating')
  filter$, // movies => movies.filter(m => m.title.startsWith('the'))
  capture([

  ])
]))

// devil's advocate:

const zeroMovies = collection();


combine((movies, sort, filter) => {
  sort(movies)
}, [movies$, sort$, filter$])

const movies = collection();
const filtered = movies.filterBy(prefixOf);
const sorted = filtered.sortBy(titleOf);
const paged = sorted.range(boundsOf)

// Somewhere down here the final projection still needs to include all of the items in the collection.
// Also, I need to be able to extract streams at any point in the stack. For example at the top level, I want all movies
// added or removed. At the bottom level, I want all movies, but I also need to know which ones are active.
// Each layer is initialized by the first emission by the preceding layer, or the latest emission when the layer's
// parameters are updated. i.e. Each layer is initialized with the current value of the previous layer each time the
// current layer's parameters are updated. When the previous layer is changed, it emits the new structure and the ops
// that were applied to create the new state. If the layer is already initialized, it holds the previous layer's value
// internally and simply applies the operation to the active value. Any layer that is initialized should cause the next
// layer to reinitialize also. Projected lists cannot have items directly added or removed. Trie leaf nodes representing
// data items should include a reference to the synonymous node in the previous layer.

// It might be an idea to hang node pointers as additional leaf nodes. First, this would allow very quick traversal of the underlying list, but it would also make it easy to update the sort order independently of the actual base list.

//

const list$ = events
  .loop((movies, op) => {
    // apply the operation to the collection, then:
    return {
      seed: movies,
      value: [movies, op]
    }
  }, zeroMovies)
  .loop(([movies, op]))

I think the initial collection is not something we deal with directly, but rather it's kind of a starting point. Internally, it would maintain an immutable collection as state, but the API methods are actually building a composed set of streams.

So the collection would internally remember the changes made to it. When returning from the aggregation function, the persisted collection would "commit" the changes as a set of instructions. Next time the aggregation operator receives a value, a fresh changeset would be constructed.
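The "commit" idea might look roughly like this (method names invented for illustration; a mutable Map stands in for the internal immutable collection):

// Hypothetical: the collection records the operations applied between
// emissions, then commits them as an instruction set alongside the new state.
class StreamingCollection {
  constructor() {
    this.state = new Map();
    this.pending = []; // the changeset under construction
  }
  add(key, item) {
    this.state.set(key, item);
    this.pending.push({ op: 'add', key, item });
    return this;
  }
  remove(key) {
    if (this.state.delete(key)) this.pending.push({ op: 'remove', key });
    return this;
  }
  commit() {
    const changeset = this.pending;
    this.pending = []; // a fresh changeset for the next aggregation step
    return { state: this.state, changeset };
  }
}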

Each additional filter layer is an API that works kind of like the static Collection methods, but retains the initialization information received by the parent layer, and wires up the internal streams of the previous layer, holding the latest emitted collection instance for future initializations, and applying changesets as per the filter parameters.

A projection can be broken out and turned back into a full collection via a simple toCollection() call. Ok, so what we're doing here is starting with a collection, which is actually a streaming wrapper for an internal collection, and providing operations for creating projections, each of which generate further projections.

The question is this: At what point can I turn the data into components and have the changes each exposed as a stream?

  • A dispatch operation will be required in order to direct events efficiently
  • Changesets would need to include the changed metadata (index, next/previous sibling, data changed, etc.)
  • If a mapping function can receive a bundle for the target item and have the bundle include only the changed metadata, then the changeset could be used by the developer to build sources, without forcing Cycle to be an integral part of the way the library is designed.
  • The changeset would include the original index, the current index, the left and right sibling references, etc. In fact it would be super easy to create streams of those items, complete with their metadata, if desired.
  • A mapping projection would receive the changesets from the previous projections and these would be used as the source inputs for components. Initialization would create streams that start with the metadata in the trie nodes.
  • It's important to remember that a projection creates information that is layered over the original tree. It's not important to know intermediate sorted indexes; only the actual current index and original ordinal index. Therefore projections simply create additional metadata nodes, such as next/previous links to visible, actual, ordinal and sorted.
  • It might be useful to prefix metadata node keys by an id that represents the filter layer. That way streams can be created for the metadata of any layer. If not specified, the key would just be the filter depth. e.g. 2.next.visible.
  • Initializing a layer causes reinitialization of secondary layers, but we want to preserve components as much as possible, so when a mapping layer is initialized and its parameters have not changed, it would take the metadata of the new tree node, compare it with the old node and generate changeset information without destroying the existing instance.
  • The only time a component is added or removed is when it happens explicitly at the top.
  • With the component internally having access to a stream telling it whether or not it has been filtered out, it can propagate that value to a view's active property, or it could change the view's style to make the view faded out.
  • When sorting, I only need to propagate change events for nodes that have actually moved.
    • If the last item moves to the start, every index will change, but I only need one operation to be performed in the DOM.
    • The index in the DOM is unimportant. The sibling chain is what counts. If only two items have their nextSibling values altered, only two changes need to propagate to the DOM.
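
A minimal sketch of what a changeset entry might look like, following the metadata described above (all of these names are hypothetical, not an existing API):

interface ChangesetEntry<T> {
  key: string;                   // stable per-item key, independent of index
  data?: T;                      // present only when the item's data changed
  ordinalIndex?: number;         // index in the underlying base list
  currentIndex?: number;         // index after filtering/sorting
  prevSibling?: string | null;   // key of the left sibling, if altered
  nextSibling?: string | null;   // key of the right sibling, if altered
  visible?: boolean;             // filter-layer visibility flag
}

type Changeset<T> = ChangesetEntry<T>[];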

Quick thought: Most has a very simple, elegant design. Would it be possible to implement operators using Most's architecture but allow something that is not Most to drive the lifecycle of those operators? It would mean I can provide flexible integration with Most, without tying myself to it. I'd lose access to the built-in operators, but if what I'm doing is kept simple, that may be an acceptable trade-off.

Another thought: when a raw stream (or other deferred type) is encountered in a DOM structure, perhaps we create a mount point, i.e. a slot of type computed or something. That slot is designed to manage a data source and a child for whatever that data source resolves to. If the source is a deferred slot, then the deferred slot will call resolve as needed. By doing this, it means I can create consistent behaviour for deferred slots, no matter the context. A deferred slot would always call its parent to resolve the unfolded value. When the raw value changes, a simple equality check would be made, and if different, passed as a patch value to the deferred slot. In this way, a deferred slot will only ever have a single child, and can always safely call its parent.

Ok, so I have a way to create fast, projected collections that translate into views.

  1. How do I make other component sinks easily accessible for consumption by the application?
  2. How do I get changeset information to propagate from the component mapping layer to the DOM?
    • The DOM cares only about whether or not an item exists and what its active siblings are
    • A component reference will just be an extra metadata item
    • The mapping layer that builds a component will propagate any metadata that has changed, including sibling info
    • If a component reference doesn't change, the DOM won't be notified
  • I think that projected layers should not prefix metadata keys with a filter layer key. Instead, the keys should be standard and intuitive, and that each leaf node should simply contain a pointer to the version of the node in the previous layer. Because nodes are copied when changed, the frontmost layer will contain direct pointers to each of the old metadata values, which will only be replaced if directed to be replaced explicitly as part of the filter operation.
  • A mapping layer actually replaces the data value of the node with a component reference. This is why that particular type of change will not propagate to the next layer, or the patch system, if the mapping layer does not produce a value that is different to the old.
  • The changeset is itself an immutable (persistent) object: since it needs to be amended as it propagates, each layer derives a new version rather than mutating the one it received.
  • A mapping layer for components would simply remove changeset items pertaining to a change in data. If this leaves only positioning metadata in the changeset, then that is all that will propagate. If the only change was the data being mapped, then nothing further would be emitted.
  • At the end of the chain, the DOM mutation process could plug itself in as a projection of sorts.
    • Each layer provides a changeset stream
    • A change type can be INIT, with the collection reference as the value
    • Rather than returning a stream, we actually return an API that wraps around the stream and provides standard filter functions. IDEA: If each of these is just a function, they could be chained together using thru. A fluent interface could be provided though, if desired. The methods would be higher order functions. If this.stream$ is available, the operation would be applied in place. If not, a new function that takes a stream would be returned.
    • So we have a stream of changeset information. Filter it to only include:
      • The data item
      • Changes to sibling positions
      • Visibility
  • Can I make it so that the patch system has a simple API for managing a child collection? (An interface sketch follows this list.)
    • insert, append, remove, move
      • get(key)
      • insertBefore(value, nextSlot) // parent slot not required because it's already a private field of the API
      • append(value)
      • removeSlot(slot|key)
      • moveBefore(slot, nextSlot)
      • replace(value, slot)
    • Each item has a unique key, irrespective of index
    • If a series of instructions is available for managing the child set:
      • Key per item
      • Each item in a slot
      • Each slot references its siblings
      • All slots in the list hash mapped for fast lookup
  • The previous basic idea for diff/patching arrays:
    • See lines 260-300 in this file for a more detailed overview
    • Two markers, each pointing at a slot; one left (old version), one right (new version)
    • Right marker never advances unless its slot has been patched
    • We always want to advance the right marker as early as possible
    • Left marker advances only as much as is needed to make it possible to resolve the right marker
    • Reposition a slot when the position of the new slot is known, relative to its previous sibling
    • Maintain a reference to the previously-patched slot so that when the next slot is resolved, the previous slot's sibling reference can be updated
      • Updating half of the slot on iteration i, then the other half on iteration i+1, makes for a nastier API
      • Instead, MOVE the previously-patched slot after the current slot has been patched
      • Set the last slot's nextSibling property to nothing, once it has been moved
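
To make that child-collection API concrete, here is a minimal interface sketch based on the method list above (names and signatures are illustrative only):

interface Slot<T> {
  key: string;                   // unique key, irrespective of index
  value: T;
  prevSibling: Slot<T> | null;   // slots reference their siblings directly
  nextSibling: Slot<T> | null;
}

interface ChildCollection<T> {
  get(key: string): Slot<T> | undefined;              // all slots hash-mapped for fast lookup
  insertBefore(value: T, nextSlot: Slot<T>): Slot<T>; // parent slot is a private field of the API
  append(value: T): Slot<T>;
  removeSlot(slot: Slot<T> | string): void;           // accepts a slot or its key
  moveBefore(slot: Slot<T>, nextSlot: Slot<T>): void;
  replace(value: T, slot: Slot<T>): Slot<T>;
}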

Side note: when building a tree for the first time, the root element may already have a DOM structure in place from the server. The very first patch of a given slot will not have an existing slot to compare against. In this case, we should be able to acquire the next element in the DOM and subsume it as the element patch target. The insertion event should still fire; the fact that it was inserted earlier doesn't matter. It is only important that the element has a chance to be initialized.
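
A quick sketch of that first-patch adoption, assuming the caller tracks a cursor into the parent's children (every name here is a placeholder):

interface PatchSlot { element: Element | null; }

function adoptOrCreate(
  slot: PatchSlot,
  parent: Element,
  cursor: { index: number },
  create: () => Element,
  onInsert: (slot: PatchSlot) => void
): void {
  const existing = parent.children[cursor.index] ?? null; // next unclaimed server-rendered element
  if (existing) {
    slot.element = existing;    // subsume it as the patch target
    cursor.index++;             // advance past the claimed element
  } else {
    const el = create();
    slot.element = el;
    parent.appendChild(el);
  }
  onInsert(slot);               // the insertion event still fires either way
}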

One thing I didn't really discuss above is handling of non-view component sinks.

  1. Build a projected collection of data
  2. Map the collection to a component set
  3. Map the components to a single output value per item, for rendering

Before step 3 I need to get the other sinks out of the component.

var view = div('.items', items$) // items$ is a stream of items projected to components

Actually maybe that's just not important. The collection is managed in a stream anyway.

Or...

var view = div('.items', data$.scan((items, item) => items.add(customComponent(item)), emptyItems)) // emptyItems: assumed initial collection (scan needs a seed)

The above example won't work very well because it'll miss out on all of the extra metadata.

// create a lazy collection. internally we're just deriving from data$ and emitting [list, changes]
var things = collection(data$, (list, data) => list.addRemoveWhatever(data.key, data.value));

// creates a projected stream internally and returns it wrapped in a projection api
var filtered = things.filter(data => data.isCoolYo);

// rather than a function, supply a deferred value that can be resolved to a filter
var filtered = things.filter(filter$|promise|callback);

I wonder if this whole idea of deferred slots is better represented as a simple deferred state resolver. It would take a callback function (usually the parent's resolve method) and unfold/resolve as intended. Slots would be unnecessary for this, and the mechanism could then be more purpose-specific for properties, collections, etc. The number of slot types would be reduced too, which would simplify the API. The single computed slot type (or whatever I call it) becomes a catch-all for anything that is normally deferred.
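
As a rough illustration (the DeferredSource union and resolveDeferred are placeholder names, covering a promise, a callback-style function, or anything stream-like with a subscribe method):

type DeferredSource<T> =
  | Promise<T>
  | ((cb: (value: T) => void) => void)
  | { subscribe(cb: (value: T) => void): void };

// The resolver owns nothing but a callback (usually the parent's resolve method)
// and forwards each unfolded value to it.
function resolveDeferred<T>(source: DeferredSource<T>, resolve: (value: T) => void): void {
  if (source instanceof Promise) {
    source.then(resolve);       // resolve once when the promise settles
  } else if (typeof source === 'function') {
    source(resolve);            // callback style: the source decides when to resolve
  } else {
    source.subscribe(resolve);  // stream-like: resolve on every emission
  }
}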

QUICK THOUGHT: Can I separate concerns so cleanly that the individual components seem useful on their own and can be tested entirely in isolation without any great need for integration tests?

// stashing this here because it might be useful but I don't need it right now.
// Classic SWAR bit-counting: counts the set bits in a 32-bit integer without a loop.
function hammingWeight(v: number) {
  v = v - ((v >> 1) & 0x55555555);                        // pair-wise bit counts
  v = (v & 0x33333333) + ((v >> 2) & 0x33333333);         // nibble-wise sums
  return ((v + (v >> 4) & 0xF0F0F0F) * 0x1010101) >> 24;  // byte sums, totalled in the top byte
}
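
A quick sanity check of the above (example values of my own):

hammingWeight(0b10110010); // → 4
hammingWeight(-1);         // → 32 (all bits set in two's complement)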

Things to implement

  • Immutable data structures
    • Persistent vector trie (RRB Tree)
    • Hash trie
    • Metadata leaf nodes
    • Outer wrapper for combining inner structures
  • Deferred value resolver
  • Collection projections

brucou commented Nov 4, 2017

I have been making a more structured and disciplined effort to aggressively note down the evolution of my thoughts. That turned out to work pretty well, especially in my case: I never have more than 3 hours in a row to dedicate to thinking/architecting/programming, so context switching is a significant cost for me, and the structured writing technique alleviates it a lot.

I have to say, though, that I still haven't found the best organization for the miscellaneous files generated by this process. Paper is what works best for me, in fact. I still haven't found anything that is as mobile and high-res as paper while being more easily browsable.

Last comment: as a big fan of D. Knuth and literate programming, I write more pseudo-code than code in the investigation phase, and before writing code I have all the types and interfaces defined. Implementation is really the last thing that comes to me. It works for me; I don't know whether it works for others.
