@Skyblueballykid
Forked from alexcjohnson/LICENSE
Created July 30, 2021 22:24
Working with React and D3 together

React + D3

The key challenge in integrating D3 code into a React app is how to divide responsibility for DOM manipulation, so that the two libraries don’t step on each other’s toes while updates stay efficient and reliable. There are many ways to go about this, and which is best may depend on the specifics of your application. The key is to draw a very clear line between the responsibilities of React and D3 and never let either one cross into the other’s territory. React will always provide the overarching structure and D3 the details of the chart, but the exact boundary can be drawn in several places.

One other note: most of the discussion below (except, for example, react-faux-dom, which is tailored to D3) applies just as well to integrating other packages or JS components inside a React app.

Approaches

Lifecycle methods wrapping regular D3 code:

  • Have React create a container element but put nothing in it
  • Attach D3 (or any other JavaScript) code to React lifecycle methods to manipulate the contents of the container
  • This is what the official react-plotly.js component does
  • It's good practice to create your D3 code separately, with an API you can call from the React component. This isn't strictly necessary, though: a simple chart could be coded entirely within the React component.
  • Example: http://nicolashery.com/integrating-d3js-visualizations-in-a-react-app/

Pros:

  • Can use arbitrary code & packages from outside the React ecosystem
  • Can reuse the D3 code outside React
  • Easy for developers already familiar with D3
  • Good performance, potentially the best (especially for partial updates), but at a cost in complexity

Cons:

  • Significant code in lifecycle methods. For simple use this is essentially boilerplate, but once performance and complex interactions start to matter it goes beyond boilerplate and can require tricky logic and operations.
  • Not React-idiomatic: doesn’t benefit from React diffing inside the plot
  • No server-side rendering (SSR)

react-faux-dom

Pros:

  • Can use D3 idioms
  • Can use D3 code built outside of React (mostly - some references to the faux DOM end up sprinkled in with the D3 code)
  • Allows SSR

Cons:

  • Slower (two fake DOMs), although some clever usage can mitigate this at least partially.
  • Only pure D3 is intended to work: not all of the DOM API is supported, so arbitrary JS may or may not succeed.

Create/delete with React, style/update with D3

Pros:

  • Managing element creation/deletion is often easier with JSX than D3
  • You can use more of D3 than just the math, in particular transitions (with a caveat about exit transitions).
  • Good performance

Cons:

  • Hard to separate code cleanly - React & D3 mixed together
  • Can be tricky to know which parts of D3 you can/can’t use
  • No SSR

React for the DOM, D3 for the math

  • Use the mathematical parts of D3 to calculate attributes
  • Then pass those attributes to React for actual rendering
  • It's completely orthodox to use D3 this way. Supporting this kind of use, and not just reducing your bundle size, is one of the big reasons D3 reorganized from one big package in v3 to many subpackages in v4. In fact, the great majority of D3's subpackages don't touch the DOM; they're just there to help with all the little manipulations and edge cases needed to turn data into visual attributes.
  • Example: https://www.smashingmagazine.com/2018/02/react-d3-ecosystem/ (which also contains examples of several of the other strategies, as well as an appraisal of a few React-specific charting libraries)

Pros:

  • Pure React output
  • Allows SSR
  • Good performance

Cons:

  • No reuse of outside D3 code, unfamiliar to D3 devs
  • Need to use D3 at a fairly low level
  • Need to reimplement the pieces of D3 that do create/manipulate DOM elements (which are some of the toughest pieces, like drawing axes)
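
As a rough illustration of the "D3 for the math" split, here is a minimal sketch in which a pure function turns data into attributes and React would only render them. To keep it dependency-free, `linearScale` is a hand-written stand-in for a D3 linear scale (an assumption for this sketch, in a real app you would use D3's own scale), and `barAttributes` is a hypothetical helper name.

```javascript
// Hand-written stand-in for a D3 linear scale, so the sketch runs
// without any dependencies. In a real app, use D3's scale instead.
function linearScale(domain, range) {
  const [d0, d1] = domain;
  const [r0, r1] = range;
  return value => r0 + ((value - d0) / (d1 - d0)) * (r1 - r0);
}

// The "math" step: turn raw data into the attributes of each bar.
// No DOM access happens here, so this also works for SSR.
function barAttributes(data, width, height) {
  const barWidth = width / data.length;
  const barHeight = linearScale([0, Math.max(...data)], [0, height]);
  return data.map((d, i) => ({
    x: i * barWidth,
    y: height - barHeight(d),
    width: barWidth,
    height: barHeight(d)
  }));
}

// The "DOM" step belongs to React, for example:
//   <svg width={width} height={height}>
//     {barAttributes(data, width, height).map((a, i) => <rect key={i} {...a} />)}
//   </svg>
const attrs = barAttributes([5, 1, 3, 4, 10], 200, 100);
```

Because `barAttributes` never touches the DOM, it can be unit tested on its own and reused between client and server rendering.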

Deep dive on lifecycle method wrapping

The rest of this discussion will delve into the first approach, lifecycle method wrapping, as it’s the most general and flexible (and the only real option for incorporating packages like plotly.js that use generic JS as well as D3). The articles above do a thorough job explaining the other options, and in particular the D3-only-for-the-math approach should already be quite familiar to a React developer.

The general pattern for this approach is:

  • Create a container element in render; the D3 operations will be constrained to operate within it.
  • Use ref to pass this element to D3.
  • Create the D3 visualization in componentDidMount
  • Tear it down in componentWillUnmount
  • Update it in componentDidUpdate
  • Make sure the D3 component has its dynamic appearance (including all user interactions except maybe transients like hover effects that you never want to impact any other components) fully specified by its state object(s)
  • Pass these state objects down from the React props

On the React side this looks like:

import React, { Component } from 'react';
// RadarPieD3 is the plain D3 module shown in the next listing.

class RadarPie extends Component {
  componentDidMount() {
    RadarPieD3.create(this.el, this.props.figure);
  }

  componentWillUnmount() {
    RadarPieD3.destroy(this.el);
  }

  componentDidUpdate() {
    RadarPieD3.update(this.el, this.props.figure);
  }

  render() {
    // React only creates the container; D3 owns everything inside it.
    return (
      <div ref={el => this.el = el} />
    );
  }
}

And on the D3 side, something like:

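// Written against the d3 v3 API (d3.svg, d3.scale, object form of .attr);
// assumes d3 v3 is available as `d3`.
const RadarPieD3 = {};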

RadarPieD3.create = (el, figure) => {
  // Create any structure and attributes that are independent
  // of the chart's attributes
  const svg = d3.select(el).append('svg');

  svg.append('text')
    .classed('title', true)
    .attr({
      'text-anchor': 'middle',
      y: 30
    });
  
  RadarPieD3.update(el, figure);
};

RadarPieD3.update = (el, figure) => {
  const width = figure.width || 400;
  const height = figure.height || 500;
  const title = figure.title || '';
  
  const xCenter = width / 2;
  const yCenter = (height + (title ? 50 : 0)) / 2;
  const maxRadius = Math.min(xCenter, height - yCenter);
  
  const svg = d3.select(el).select('svg')
    .attr({
      width: width,
      height: height
    });
  
  svg.select('.title')
    .attr('x', xCenter)
    .text(title);
  
  const len = figure.data.length;
  
  const slices = svg.selectAll('path').data(figure.data);
  
  // In d3v3, appending to enter() also adds the new elements to `slices`
  slices.enter().append('path');
  slices.exit().remove();
  
  const arc = d3.svg.arc() // this is for d3v3, it moved to just d3.arc in d3v4
    .innerRadius(0);
  
  const colors = d3.scale.category20();
  const angularScale = d3.scale.linear()
    .domain([0, len])
    .range([0, 2 * Math.PI]);
  
  const radialScale = d3.scale.sqrt()
    .domain([0, d3.max(figure.data)])
    .range([0, maxRadius]);
  
  slices.each(function(d, i) {
    d3.select(this).attr('d', arc({
      startAngle: angularScale(i),
      endAngle: angularScale(i + 1),
      outerRadius: radialScale(d)
    }))
    .attr('fill', colors(i));
  })
  .attr('transform', 'translate(' + xCenter + ',' + yCenter + ')');
};

RadarPieD3.destroy = (el) => {
  // Nothing to do in this case, but if you create something disconnected,
  // like a WebGL context or elements elsewhere in the DOM (plotly.js does
  // this as an off-screen test container for example) it should be
  // cleaned up here.
};

Now we can render our RadarPie component with props such as {figure: {data: [5, 1, 3, 4, 10], title: 'Sectors'}}, and updates to any of the figure options will be reflected on screen.

All of that is fairly straightforward. The challenges come from performant incremental updates to the D3 component, events generated inside the D3 component, and large data sets vis-a-vis mutable/immutable data structures.

Update performance

The normal D3 enter/exit/update pattern is already a good start at ensuring high performance. Reusing elements efficiently is important. If you’re using animation for object constancy you don’t have a choice about this: you must use a .data key function that uniquely identifies each object being modified. But if you aren’t animating, you can often see gains by omitting the key function entirely (which results in the array index being used as the key, i.e. maximal element reuse) and just restyling the same elements with each update.

The next level of performance improvements comes from short-circuiting updates, or pieces of an update, that won’t change anything. For example, if you’re just changing color there’s no need to resize the elements; if you have several sets of bars and only one has new data, only that one needs to be updated. plotly.js accomplishes this by running its own diffing algorithm (within the Plotly.react method) that determines the minimal update path needed.
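
A minimal sketch of this kind of short-circuiting might look like the following. It is not plotly.js's diffing algorithm, just an illustration: the figure shape (`width`, `height`, `colors`, `data`), the `planUpdate` and `update` names, and the `steps` object of drawing callbacks are all assumptions made up for this example.

```javascript
// Compare the previous and next figure and report which pieces of the
// update are actually needed. Identity checks only, so this assumes
// `colors` and `data` are replaced rather than mutated in place.
function planUpdate(prev, next) {
  return {
    resize: prev.width !== next.width || prev.height !== next.height,
    recolor: prev.colors !== next.colors,
    redraw: prev.data !== next.data
  };
}

// Run only the steps the plan calls for. `steps` holds the (hypothetical)
// D3 routines that do the real work on the container element.
function update(el, prev, next, steps) {
  const plan = planUpdate(prev, next);
  if (plan.resize) steps.resize(el, next);
  // A new size changes the geometry, so shapes must be redrawn too.
  if (plan.resize || plan.redraw) steps.redraw(el, next);
  // Newly drawn data may include new elements that still need a color.
  if (plan.recolor || plan.redraw) steps.recolor(el, next);
  return plan;
}
```

With this split, a color-only change never touches element geometry, and a props update that changes nothing does no DOM work at all.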

There are caveats to this, particularly with charts: often a change in one object will have ripple effects on the others that aren’t really apparent in the data structure. For example, with an autoranged axis, adding a new high point to one data series will require rescaling all points in all series. Or, in a stacked bar chart, changing data in a series in the middle of the stack will require shifting all the higher series but not the lower ones. This kind of coupling is much less common in the regular HTML portions of a React app. This is partly a result of the explicit layout required for SVG, but largely it’s inherent in the fact that encoding data in visual attributes needs to place that data in the context of all other related data.
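
The autorange case can be sketched as follows. This is a simplified illustration under stated assumptions: each series is a plain array of numbers sharing one axis, and `autorange` and `seriesToReposition` are hypothetical helper names.

```javascript
// Autoranged domain shared by every series: [min, max] over all values.
function autorange(seriesList) {
  const all = seriesList.flat();
  return [Math.min(...all), Math.max(...all)];
}

// Return the indices of the series whose elements must be repositioned.
function seriesToReposition(prevSeries, nextSeries) {
  const [prevMin, prevMax] = autorange(prevSeries);
  const [nextMin, nextMax] = autorange(nextSeries);
  const everyIndex = nextSeries.map((_, i) => i);
  // If the shared range moved, every series is affected, even those
  // whose own data did not change.
  if (prevMin !== nextMin || prevMax !== nextMax) return everyIndex;
  // Otherwise only series whose array identity changed need work.
  return everyIndex.filter(i => prevSeries[i] !== nextSeries[i]);
}
```

The point is that a per-series identity check alone is not enough: the shared range has to be compared first.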

Events from inside D3

A D3 chart component can generate a lot of internal events: hover, selection, zoom/pan, toggling visibility… the list goes on. There are broadly speaking three approaches to dealing with these in a React app. In all cases you want to keep the state changes associated with these events in sync with the app state:

  1. Have React bind to these events without the D3 component doing any DOM manipulation in the event handler, and use them to update the props at the app level, which then passes them back down to the D3 component, which then updates through componentDidUpdate. This is the most React-idiomatic method, but it can take some extra effort to ensure adequate performance of the D3 component’s update. It’s also generally not possible to do this with 3rd-party components or those written for use outside React.
  2. Have React bind to these events without the D3 component doing any DOM manipulation in the event handler, use them to update props that are then applied to a React (non-D3) sibling element, so the D3 component does not update at all. This can be a good solution for high-rate updates like hover effects that can be overlaid on the D3 output rather than integrated with it, but again is generally only possible in D3 components that are purpose-built for integration with React.
  3. The D3 component updates its own state (and resulting DOM) and then emits an event. The React component binds to the event and reads or calculates the updated state. This gets incorporated into the React app state and passed back down to the D3 component via componentDidUpdate. The trick then is to ensure the D3 component recognizes this state as unchanged from the state it already prepared for itself, so it doesn’t re-render. In the worst case a re-render leads to an infinite loop of events and DOM updates, and it can be necessary to include some basic identity checks in shouldComponentUpdate to prevent this.
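
For the third approach, the identity check that breaks the feedback loop can be sketched like this. The `chart` object with its `onStateChange` and `render` methods is a hypothetical interface invented for the example, not the API of plotly.js or any other library, and `createChartAdapter` is likewise a made-up name.

```javascript
// Adapter between a self-updating chart and React-style one-way data flow.
function createChartAdapter(chart, onChange) {
  let lastEmitted = null;

  // The chart has already updated its own DOM; remember the state it
  // produced and report it upward so the app state stays in sync.
  chart.onStateChange(state => {
    lastEmitted = state;
    onChange(state);
  });

  return {
    // Call this from componentDidUpdate with the state held in props.
    // Returns true if the chart was actually re-rendered.
    receiveProps(state) {
      // Same object the chart emitted: this is its own change coming
      // back down, so skip the render and avoid a feedback loop.
      if (state === lastEmitted) return false;
      chart.render(state);
      return true;
    }
  };
}
```

This only works if the app stores and passes back the very object the chart emitted; if the app copies or normalizes it, a deeper comparison or a revision marker is needed instead.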

Approaches 1 & 2 are great for new D3 code you are writing explicitly for a React app: they fit the one-way data flow paradigm, making it easy to clearly and predictably update both the component that generated the events and any other coupled components, for example down-selecting the data in one chart based on a selection in another. For a non-React-specific integration like plotly.js, though, these approaches won’t work, so we use the third approach and put significant effort into ensuring the component knows what constitutes a real change vs its own change feeding back in. Which brings us to our last point:

Large data and immutability

Immutable data structures, or at least immutable usage, are very common in React apps because they make the diffing process, central to efficient DOM updates, a simple identity check at each node of the state tree. But if you have large data sets that change quickly (streaming data or user edits, for example), immutable updates may not be feasible, for either speed or memory reasons, and you also can’t do a full element-by-element diff of these large data arrays with each update. This concern isn’t really specific to D3: it could come up in a pure React app, but it’s more likely when D3 or other data visualization packages get involved, since in pure HTML it’s difficult to display this much data on a single screen.

React’s declarative data model doesn’t allow us to annotate specific changes: all you have is the old state and the new state. You can’t insert a flag like “the y data changed”, because repeated updates with that same state would erroneously tell us to keep updating. The solution that plotly.js uses is a datarevision property. The value of this property is arbitrary: it could be a hashed version of the data or a serial number that gets incremented whenever the data changes, for example. We just know that if this property changed there is an update somewhere in one of the data arrays in the plot, and if it didn’t change, the data arrays are the same as in the previous state.
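
As a minimal sketch of the idea (not plotly.js code), assume a hypothetical figure object that carries its own `datarevision` serial number next to a large `data` array. The `appendPoint` and `dataChanged` names are made up for the example.

```javascript
// Producer side: mutate the large array in place (no copy), then hand
// back a new figure object with the revision bumped.
function appendPoint(figure, value) {
  figure.data.push(value);
  return { ...figure, datarevision: figure.datarevision + 1 };
}

// Consumer side: decide whether the data arrays need to be re-read by
// looking only at the revision marker, never at the array contents.
function dataChanged(prevFigure, nextFigure) {
  return prevFigure.datarevision !== nextFigure.datarevision;
}
```

Note that the old and new figure share the same array, so an identity check on `data` would report no change; the revision marker is the only signal.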

This concept can be extended to whatever optimized update pathways your component makes available. If, for example, your component can update more efficiently when new data is appended to the end of the data arrays than if existing data have been altered, you could make two properties like datarevision and dataextent. You would increment datarevision only when previously existing data is changed, and dataextent when appending new data. If each data series has its own update pathway, give each series a separate revision property.
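
Choosing between the two pathways could then look something like this. It is only a sketch: `chooseUpdatePath` and the returned labels `'full'`, `'append'` and `'none'` are hypothetical names, and `datarevision` and `dataextent` are assumed to sit directly on the figure object.

```javascript
// Pick the cheapest update pathway that is still correct.
function chooseUpdatePath(prevFigure, nextFigure) {
  // Existing points were altered: everything must be recomputed.
  if (prevFigure.datarevision !== nextFigure.datarevision) return 'full';
  // Only new points were added at the end: draw just the new ones.
  if (prevFigure.dataextent !== nextFigure.dataextent) return 'append';
  // Neither marker moved: the data arrays are untouched.
  return 'none';
}
```

The full path is checked first, since a state in which both markers changed still requires recomputing everything.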
