@threepointone
Last active June 1, 2023 18:35
Implementing a client for feature flags

On implementing a client for feature flags in your UI codebase

This document isn't an explainer on Feature Flags; you can find that in my amateur writeup, or in literally hundreds of better writeups out there.

This document is also agnostic to the choice of service you'd use: LaunchDarkly or split.io or optimizely or whatever; that's orthogonal to this conversation.

Instead, this document is a list of considerations for implementing a client that uses Feature Flags in User Interface development. Service providers usually give you a simple fetch-and-use client and that's it; I contend that there's a lot more to care about. Let's dive in.

To encourage usage, we'd like for the developer experience to be as brutally simple as possible. So, this should be valid usage:

import flag from "fflags";

// ...

const flagValue = flag("flag-name", defaultValue);

This basic experience already raises some implementation details:

  • We would like for this call to be typechecked, so that defaultValue errors if it's of the wrong type. The type data for 'flag-name' lives inside the service, which means we need a tool that pulls this data from the service and generates a local fflags.d.ts signature (during build? periodically on developers' machines?).
  • The flag() call is synchronous. Ideally, all required flags would be fetched in one go when the application starts. That approach involves statically analysing the entire codebase to figure out which flags are used.
    • This also implies that only string literals can be passed as flag names to flag() (i.e. no variables or anything dynamic).
    • This also implies that this should work even if the requests for flags are inside node_modules.
    • This also gives us the opportunity to verify that the flag is spelled correctly.
  • Alternatively, we could use a stale-while-revalidate approach, where any existing value is used immediately while the flag is fetched again in the background, and any future use (i.e. after a page refresh) gets the fresh flag value (stored in localStorage or something). (Pros and cons of either approach are left as an exercise for the reader.)
  • You will notice that the call doesn't require any user identifier. This is because it's not needed for client side data; the 'thread' is separate from other users' 'threads' (where thread = browser process).
  • However, this doesn't hold true if you're planning on using the flags in a server context (which is true if you're doing isomorphic code, or just serving different data for different users based on flags). There, multiple user requests will be passing through the same process (specifically with node.js). It would suck if we had to thread a request context through every flag() call, and it would make the isomorphic story a bit icky.
  • Instead, we could use async_hooks/AsyncLocalStorage to establish request contexts, and on the server, flag() should read this context to read the right flag value for a user.
    • The caveat here is that there are some circumstances in which this may be a bit buggy (particularly around using pooled objects; see nodejs/node#34401 for more context). I don't think it's a deal breaker, but it's definitely something to look out for.
    • Which means we'll load different versions of 'fflags' for the client side and server side.
  • And of course, feature flags might not be tied to a user context. They could also be tied to the product context, or something else. For example, (your design system) might be using a feature flag to roll out a feature. Products could opt in to using a feature or not. We may have to introduce "product" (or url?) as a first class type with the service we use. (TODO: discuss this further)
  • We shouldn't use stale-while-revalidate on the server side. That would imply that we're holding flag values in memory for multiple users over a period of time. Let's just not do that: it costs memory, and it's not great in a serverless/multi-server world.
    • This implies your feature flag service should be physically close to your actual servers, or it'll end up being a bottleneck to all requests.
  • I think it would also be useful to have a mechanism to explicitly override flags, probably with a url scheme. So a url like jpmchase.com/some/page?fflags=feature-A:true&feature-B:false would set flag values for feature-A and feature-B, regardless of what the service returned for them.
  • One of the concerns with feature flags is that a codebase eventually gets littered with flags over time. I do believe that's a better situation to be in than branching hell, but I commiserate and empathise.
    • One tool we can build to make this better is a scanner that goes through a codebase and tells us which flags are at 100%. This makes it easier to do cleanup rounds, where the non-matching code is removed from the codebase.
    • A cool feature would be if PRs were automatically generated that removed dead branches.
    • It's also important to follow this up by removing the actual flag from the service (after its actual usage drops to near 0).
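
The server-side request context mentioned in the list above could be sketched roughly like this. It is only an illustration under stated assumptions, not a real 'fflags' implementation: runWithFlags, FlagValues and the "new-checkout" flag are hypothetical names, and the per-request flag values are assumed to have been fetched already. The only real API used is AsyncLocalStorage from Node's async_hooks module.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Hypothetical shape: the flag values already resolved for one user/request.
type FlagValues = Record<string, unknown>;

// One store per process; each request sees only its own value inside it.
const requestFlags = new AsyncLocalStorage<FlagValues>();

// Run a request handler with that request's flag values in scope.
export function runWithFlags<T>(flags: FlagValues, handler: () => T): T {
  return requestFlags.run(flags, handler);
}

// Same call shape as on the client: no user or request argument needed.
export function flag<T>(name: string, defaultValue: T): T {
  const flags = requestFlags.getStore();
  if (flags === undefined || !(name in flags)) return defaultValue;
  return flags[name] as T;
}

// A handler that does some async work before reading a flag.
async function handleRequest(): Promise<boolean> {
  await new Promise((resolve) => setTimeout(resolve, 10)); // simulate I/O
  return flag("new-checkout", false);
}

// Two concurrent "requests" for different users share the same process,
// but each one reads its own flag values.
export const results = Promise.all([
  runWithFlags({ "new-checkout": true }, handleRequest),
  runWithFlags({ "new-checkout": false }, handleRequest),
]);
```

The client-side build of 'fflags' would export a flag() with the same signature but read from a plain in-memory map, which is why two versions of the module would be loaded.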

Bundlers

With the way we use this and the current state of bundlers (webpack, parcel, etc), you'll see that we ship the code for multiple branches inside a bundle. This isn't a very critical problem for us at (heavily client side rendered software firm); we don't optimise as much for Time To First Interaction. But, there are a couple of options we can take, each with its own tradeoffs.

  • Use the flag as a signal to dynamically import code, like flagValue ? import('feature-A') : import('feature-B'). While this means less code loaded overall, this might mean even slower initial load time, since you'll have a waterfall of js requests before the app starts up for good.
  • Invest in bundler/serving infrastructure that considers feature flags as a first class concept. A naive approach would be to generate bundles for every combination of flags, but a better approach would be to break the bundles up into chunks that can be loaded dynamically and stitched together in the browser. This is more of a long term plan of course, but we can get there.
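
The first option (using the flag to choose a dynamic import) might look something like the sketch below. loadVariant and Loader are hypothetical names, and the loaders are passed in as functions so the example stays self-contained; 'feature-A' and 'feature-B' are the placeholder module names from the text above.

```typescript
// A loader lazily produces a module, e.g. () => import("feature-A").
type Loader<M> = () => Promise<M>;

// Request only the chunk that matches the flag value; the other loader
// is never called, so its code is never fetched.
export async function loadVariant<M>(
  flagValue: boolean,
  whenOn: Loader<M>,
  whenOff: Loader<M>,
): Promise<M> {
  return flagValue ? whenOn() : whenOff();
}

// In an app this might be used as:
//   const feature = await loadVariant(
//     flagValue,
//     () => import("feature-A"),
//     () => import("feature-B"),
//   );
```

Keeping the import() calls as literal arrow functions at the call site matters: bundlers generally need to see the string literal to create a separate chunk for each branch.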

Analytics

It's important to tie this closely with our analytics story, whatever that may be.

  • We should send all used flag values with every analytics event. It would be tempting to try to deduce them post-facto, but that way dragons lie.
  • This will be particularly useful later for tracking down errors and issues, if and when we can slice and dice analytics by flag values.
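
One possible way to attach used flag values to every event is for flag() itself to record what it returned. This is a rough sketch only: flagValues stands in for whatever the flag service returned, and buildEvent, AnalyticsEvent and the "new-nav" flag are hypothetical names rather than part of any real analytics library.

```typescript
type FlagValue = boolean | string | number;

// Hypothetical stand-in for the values fetched from the flag service.
const flagValues: Record<string, FlagValue> = { "new-nav": true };

// Every flag value actually handed out in this session, keyed by flag name.
const usedFlags: Record<string, FlagValue> = {};

export function flag(name: string, defaultValue: FlagValue): FlagValue {
  const value = name in flagValues ? flagValues[name] : defaultValue;
  usedFlags[name] = value; // remember what this session actually saw
  return value;
}

interface AnalyticsEvent {
  name: string;
  payload: Record<string, unknown>;
  flags: Record<string, FlagValue>;
}

// Build an event carrying a snapshot of the flag values used so far,
// so nothing has to be deduced after the fact.
export function buildEvent(
  name: string,
  payload: Record<string, unknown> = {},
): AnalyticsEvent {
  return { name, payload, flags: { ...usedFlags } };
}
```

Recording at the point of use means the event reflects the value the user really experienced, including defaults used when the service had no value.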

TODO: Flesh this out in more detail

Education for developers

We can't just ship this feature and assume it'll be used perfectly right away. This is a massive culture change for (our firm), and we'll have to guide everyone through this journey (which may last for months, if not years).

  • We'll want to write resources on how to use flags, complete with examples and recipes. A debugging helper may also be nice, as a chrome extension or something.
  • It's also important that we show how not to use flags. There will be an initial temptation to use flags for business logic features, since they'll look so similar to 'configuration', and we must push back against that.
  • We also want to show how to write tests for code that uses flags. This may be as simple as showing how to mock 'fflags' and writing tests against different values, but it must still exist.
  • Because it can get complicated, we specifically want to show how to debug issues with async_hooks when feature flags are used on the server.
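
As one illustration of testing code against different flag values, here is a small sketch that uses an override helper instead of a mocking library. Everything in it is hypothetical: withFlags, the overrides map, greeting() and the "friendly-greeting" flag are made-up names, and a real 'fflags' client would consult the service where this sketch only knows about overrides.

```typescript
type FlagValue = boolean | string | number;

// Test-only overrides; assumed to be empty in production.
let overrides: Record<string, FlagValue> = {};

export function flag(name: string, defaultValue: FlagValue): FlagValue {
  // A real client would look up the service value here; this sketch
  // only checks overrides and otherwise falls back to the default.
  return name in overrides ? overrides[name] : defaultValue;
}

// Run a synchronous `fn` with the given flag values, then restore the
// previous ones even if `fn` throws.
export function withFlags<T>(
  values: Record<string, FlagValue>,
  fn: () => T,
): T {
  const previous = overrides;
  overrides = { ...previous, ...values };
  try {
    return fn();
  } finally {
    overrides = previous;
  }
}

// Code under test: behaves differently depending on a flag.
export function greeting(): string {
  return flag("friendly-greeting", false) ? "Hey there!" : "Hello.";
}
```

A test would then call greeting() once inside withFlags({ "friendly-greeting": true }, ...) and once outside it, covering both branches without touching the network.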

Education for SDLC

The above is useful for the actual act of writing code, but there's also education to be developed in the broader context: how to use feature flags to ship software effectively.

  • How does one ship a feature with feature flags? How does one ship parts of the feature incrementally and validate it safely in a production context before turning on the feature fully for everyone?
  • What are the caveats and problems that may arise in these scenarios? Say when it comes to flags that seem mutually exclusive but subtly clash together in an unexpected manner? And so on.
  • How does this affect the QA cycle? What does it mean when one can ship a feature to production totally turned off, and then have QA validate it there?
  • Similarly, how can we then gradually roll out features in production, testing that we haven't broken anything, and have the freedom to roll it back?

NoriSte commented Jan 18, 2021

Some small typos:

-with with removing
+with removing

-the non-matching code is removed the codebase
+the non-matching code is removed from the codebase

-to ship softeare effectively
+to ship software effectively

Anyway: thanks a lot for sharing ❤️


yozlet commented Jan 20, 2021

Hi, I'm with LaunchDarkly. We agree with most of the ideals you've stated. Here's how we implemented them in our SDKs:

  1. The client-side Javascript SDK does asynchronous initialization, during which it opens a connection to our servers (or your proxy) and downloads all the flag values for the provided user context. When it's ready it fires an event, and keeps the server connection open to receive flag updates as server-sent events. Note that this doesn't require any static evaluation of your code to know which flags to download; it just pulls all the flags from your LaunchDarkly project. Unless your project has several hundred flags, it won't be a lot of data, and it's coming in asynchronously.
  2. The part about receiving flag updates as SSEs is really important for multiple reasons:
    1. It prevents the client from having to repeatedly poll, which would be significantly costly for both client and server
    2. It means that flag evaluation calls (flag() for you, variation() for us) are synchronous, instant, and pretty much free
    3. It means that all connected SDKs - both client- and server-side - receive flag updates instantly, usually within about 200ms
  3. If you want to evaluate flags before initialization has completed, bootstrapping is a neat trick. You can also use a LocalStorage cache or similar, but it does depend on the client having already connected once, and on none of the important flags having changed since the cache was built.
  4. To change user/product/whatever contexts, call identify(context_object). On the client-side SDKs, a new set of flag values will be downloaded.
  5. Server-side SDKs download the entire flag configuration at initialization, and they have our targeting rules engine built in. This enables them to do flag evaluations themselves in memory, even when contexts change. They also maintain server connections so they receive flag changes instantly. (I hope this simplifies your section about contexts, threads, and the server-side -- I confess that I don't quite follow the problem you're trying to solve.)
  6. It can make sense to have separate user and product context objects, but it may be easier to simply add a product attribute to the user context (i.e. "this is the product we're in right now"). Remember that a context object is just a JS Object which can have as many custom attributes as you like.
  7. To find out which flags are used where in your code, use our Code References tool in your CI process. The problem of flag litter is a common one, and we still don't have a silver bullet for this. (Uber's Piranha looks fascinating.) However, Code References help a lot, as do...
  8. ... flag Insights graphs, which are live. All our SDKs send back flag evaluation info within a few seconds by default unless disabled.

But this is all specific to our SDK. If you're rolling out a flag system across an engineering org, there are many benefits to using a service-agnostic wrapper. Not only will it make it much easier to migrate the backend if needed, but it also provides room for custom hooks, observation, etc. There are quite a few wrappers out there (such as FlopFlip for React) but it's not hard to write your own. Just make sure that it enables the kinds of usage you need.

As mentioned earlier, it's possible - in fact, it's likely - that flags will update while code is executing. This is usually a good thing. To make the most of it, ensure that flag evaluations are basically free (instant and non-blocking) and then prefer re-evaluation over caching. Even better, ensure that your client/wrapper fires events when flags update. However, there are also cases where you don't want a flag value to change within a critical section of code. In most of those cases, the critical section is small enough that you can just evaluate the flag once at the start and then cache it in a variable.

There are many options for bundling. We don't provide anything special for this, but I know that some of the people working on Webpack module federation are looking explicitly at using flags to choose dynamic imports.

As for education: I'm happy to talk more and provide answers and opinions if you want. (I've already passed my limit for how much I want to write in one comment!) Bear in mind that, just like the rest of software engineering practice, there's quite a diversity of opinion around what works and what doesn't. We've started building a library of best practice guides and tutorials, and we have much more advice on the blog.
