
Putting the Viz in Dataviz

January 08, 2016

Julian Deborre

Creating a custom visualisation framework using D3.js

When considering data visualisation in web applications there might only be a few underlying technologies to choose from, but the landscape of frameworks and libraries utilising HTML5 and JavaScript can quickly become confusing.

At the start of our journey to create visually appealing, but most importantly user-centric visualisations for effective communication of information, we decided on a small set of options:

  1. Premium or open source frameworks that offer a set number of pre-built visualisation types and sometimes additional options for customising design and interaction.
  2. Building our own framework, perhaps by utilising libraries for better data manipulation and handling.

The appeal of the first option is obvious: you’ll hit the ground running by simply making sure your data is in the framework’s expected format and by passing some configuration telling it what type of visualisation and surrounding ink to throw back at you.

Certainly, simple use cases will benefit from this option. But when settling for a framework, do make sure it meets your long-term requirements:

  • Will it support the types of plots you need?
  • Is the data structure it expects easy enough for you to generate?
  • Can you modify or configure the design and interactions to suit your needs?

Evaluating our options and use cases, we soon realised no out-of-the-box solution would quite cut it. And whilst altering or customising an existing framework could be considered an option, it also means taking on some risk: merging future updates to the underlying framework with your own adjustments, whilst keeping the code maintainable, can be a big challenge. So instead of ending up with a hard-to-manage hybrid of framework and hack code, the clear choice was to build something bespoke.

During our research, it was impossible to avoid mention of D3.js (short for Data-Driven Documents) when reading about flexible data visualisation on the web. It quickly became the obvious choice, presenting itself as one of the most mature viz libraries available.

However, do not mistake D3 for just another viz framework. It is a toolkit that enables you to manipulate data and create highly custom data visualisation from scratch. Although the core is well-documented and there are plenty of examples of different use cases, getting started will require some more in-depth tech savvy.

So, how did we go about planning and ultimately building a bespoke visualisation framework? Firstly, we gathered our requirements, which at a high-level looked something like this:

  • We need a framework allowing us to render an extensible number of visualisation types from a common data structure.
  • Visualisations should be able to interpret a single, standard data structure independently and render this model after a chart type’s minimum data requirements have been met.
  • Additional configuration needs to allow alternative interpretation of the data as well as custom display and interaction settings.
  • The framework needs to be easily extensible to new visualisation types and interactions.
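
To make the idea of a "minimum data requirement" concrete, here is a minimal sketch. It assumes a shared data structure with a `nodes` array (the samples further down also read `nodes`); the field names, the chart types and the `meetsMinimumRequirements` helper are hypothetical and only serve as illustration.

```javascript
// Hypothetical shared data model: every chart type receives the same shape.
const dataset = {
  nodes: [
    { id: "a", x: 10, y: 4 },
    { id: "b", x: 35, y: 9 }
  ]
};

// Hypothetical declaration of the fields each chart type needs at minimum.
const minimumRequirements = {
  scatter: ["x", "y"],
  histogram: ["x"],
  network: ["id"]
};

// Returns true when data.nodes is a non-empty array and every node carries
// all fields required by the given chart type. Unknown chart types fail.
function meetsMinimumRequirements(chartType, data) {
  const required = minimumRequirements[chartType];
  if (!required || !data || !Array.isArray(data.nodes) || data.nodes.length === 0) {
    return false;
  }
  return data.nodes.every(function (node) {
    return required.every(function (field) {
      return node[field] !== undefined && node[field] !== null;
    });
  });
}
```

A chart would only be rendered once this check passes for its type, so one dataset can feed several visualisations without each of them duplicating the validation.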

Important in achieving this flexibility was getting the basic structure of our framework right. So we decided to structure the framework into components as follows:

  • Dependencies: These are libraries or applications our framework depends on.
  • Core: This is the core code base of our framework, which all visualisations have access to. It provides the common functionality shared across the framework.
  • Modules: These are the pluggable components – mainly the different visualisation types themselves. Removing any one of them leaves the framework fully functional and only removes that particular visualisation type.
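
One way to picture the "pluggable" part is a small registry in the core that maps a visualisation type to its module. This is only a sketch under that assumption; `registerModule`, `unregisterModule`, `resolveModule` and `availableTypes` are hypothetical names, not part of D3 or of our actual code base.

```javascript
// Hypothetical registry kept by the core: chart type -> module object.
const registry = new Map();

function registerModule(type, module) {
  registry.set(type, module);
}

// Removing a module only removes that chart type; everything else stays usable.
function unregisterModule(type) {
  return registry.delete(type);
}

// Returns the module for a type, or null when that type is not installed.
function resolveModule(type) {
  return registry.has(type) ? registry.get(type) : null;
}

function availableTypes() {
  return Array.from(registry.keys());
}

// Usage: two modules plugged in, one removed again.
registerModule("histogram", { name: "histogram" });
registerModule("scatter", { name: "scatter" });
unregisterModule("histogram");
```

After the last call the registry still serves the scatter module, while a request for a histogram simply resolves to `null` and can be handled by the core.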

So let’s have a look at these three areas in a little more detail.

Dependencies

As I touched on earlier, D3 is not really just another viz framework. It is much more a library that helps us first to work with data in JavaScript and ultimately to display it visually. To do that, D3 supplies a whole range of helper functions we can use to structure and sort data, but also to apply common mathematical computations.
Further, the underlying idea of D3 is not to give you out-of-the-box viz functionality, but to help you “bind arbitrary data to a Document Object Model (DOM), and then apply data-driven transformations to the document” (from: Introduction to D3.js).

In a nutshell, that means creating DOM elements that correspond to your data and then manipulating them visually and interactively using CSS and JavaScript to create data visualisations. I hope by now you can imagine we are great fans of D3.js, and it really is our only main dependency.

Core

At the core of our framework sits a thin service layer and a common controller. Both are available to all modular components and supply a whole range of methods and options to use and handle all the groundwork required to set up a visualisation instance. That includes for example generating the basic common markup, retrieving data from APIs, supplying colour schemes and offering methods for number formatting and binding common interactions.

For illustration purposes, here is an example skeleton with potential core functions of such a framework. Variables are prefixed with $.

/*
A dedicated service to load d3.js and all other d3 related dependencies allows us to inject
this service as a dependency into other parts of our application. Further we can use promises
for example to ensure our dependencies have successfully loaded before attempting to make use of them.
*/
d3Service
/*
A common controller can be used with all types and instances of visualisations across the application.
It supplies general methods, functions and processes that we can maintain from one central location.
The main components could include the following:
*/
visualisationsController
/*
Internally we can declare and chain methods for initial setup, data retrieval, build and process.
i.e.
*/
visualisationsController
.setupVariables()
.setupVizElement()
.toggleLoading()
.queryData();
/*
Listeners for events are a great way to respond to user interactions or changes in the environment.
Setting them up in one central location makes them more maintainable and allows us to apply them to
all instances and types of visualisations.
*/
visualisationsController.listeners();
/*
A method for formatting commonly expected data types is a good thing to keep central in order to ensure
uniform and efficient processing.
It internally utilises other methods such as visualisationsController.formatNumber(), visualisationsController.parseDate() etc.
*/
visualisationsController.formatData();
/*
As we chain and invoke methods we can use simple conditionals to make them optional. For example a method invoking
data processing functions that can be defined as a standard in the common controller or overwritten
from modular components if needed.
*/
if( typeof visualisationsController.preprocessData === 'function' )
visualisationsController.preprocessData();
/*
Similarly a setup function that can be defined as a standard in the common controller
or overwritten from modular components if needed.
*/
if( typeof visualisationsController.setupViz === 'function' )
$setupStatus = visualisationsController.setupViz();
/*
And ultimately methods invoking a paint function that is used to actually render the visualisation’s relevant elements on the page.
As this is likely to be a very different process for the different visualisation types this would be defined in the particular modules.
*/
if( $setupStatus === true && typeof visualisationsController.paintViz === 'function' )
$paintStatus = visualisationsController.paintViz();
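
To show how these pieces could fit together, here is a small runnable sketch of the chain. The controller below is a hypothetical stand-in that only records which steps ran, and `runPipeline` is a name made up for this illustration; the method names are the ones used in the skeleton above.

```javascript
// Hypothetical stand-in for the common controller. Each step records itself
// and returns the controller so that the calls can be chained.
const visualisationsController = {
  log: [],
  setupVariables: function () { this.log.push("setupVariables"); return this; },
  setupVizElement: function () { this.log.push("setupVizElement"); return this; },
  toggleLoading: function () { this.log.push("toggleLoading"); return this; },
  queryData: function () { this.log.push("queryData"); return this; }
};

// Runs the fixed chain, then the optional hooks. A hook that is not defined
// is skipped, and painting only happens when setup returned true.
function runPipeline(controller) {
  controller.setupVariables().setupVizElement().toggleLoading().queryData();

  if (typeof controller.preprocessData === "function") {
    controller.preprocessData();
  }

  let setupStatus = false;
  if (typeof controller.setupViz === "function") {
    setupStatus = controller.setupViz();
  }

  let paintStatus = false;
  if (setupStatus === true && typeof controller.paintViz === "function") {
    paintStatus = controller.paintViz();
  }

  return { setupStatus: setupStatus, paintStatus: paintStatus };
}
```

Calling `runPipeline(visualisationsController)` on the bare controller runs the four chained steps and reports both statuses as `false`, because no setup or paint hook exists yet. Once a module supplies `setupViz` and `paintViz`, the same call reports their return values.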

Modules


We extend the core of the framework by setting up modules for each of our custom visualisations (e.g. there is a histogram module).


One approach to help make the code more maintainable would be to adhere to a fixed structure for these modules. By using a naming convention for frequently used methods we would be able to invoke these from the core of the framework rather than within the modules themselves. We can then handle event chaining and running order in one place, the core, and avoid introducing repetition of this process within the modules (see the conditional calls to preprocessData, setupViz and paintViz at the end of the core code sample above).
One downside of this approach is the slightly tighter coupling between the core and the modules. In this setup, the automatic chains do not run if the module methods do not follow the naming convention. However, as they are conditional, they fail gracefully and modules can invoke their own methods if an alternative approach is needed.
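
The trade-off can be sketched in a few lines. This is an assumption about how a module might be layered over core defaults, not a description of our actual implementation; `coreDefaults`, `createInstance`, `callIfPresent` and `histogramModule` are hypothetical names.

```javascript
// Hypothetical core default: pass the data through unchanged.
const coreDefaults = {
  preprocessData: function (rawData) { return rawData; }
};

// Builds one controller instance: module methods override core defaults,
// and anything neither side defines simply stays undefined.
function createInstance(defaults, module) {
  return Object.assign({}, defaults, module);
}

// Calls a convention-named method if it exists, otherwise does nothing.
function callIfPresent(target, methodName, args) {
  if (typeof target[methodName] !== "function") {
    return undefined;
  }
  return target[methodName].apply(target, args || []);
}

// A module that follows the convention for setup but not for painting.
const histogramModule = {
  setupViz: function () { return true; },
  drawBars: function () { return true; } // not a convention name, so the core never calls it
};

const instance = createInstance(coreDefaults, histogramModule);
```

Here the core would find and run `setupViz`, fall back to the default `preprocessData`, and quietly skip `paintViz`. The module stays free to call `drawBars` itself, which is the graceful failure described above.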

On a high level the convention for methods within a module could look like this:

  • Process Data: A method that interprets our data structure in the way needed for the type of visualisation. This is an optional method as not all modules will require it.
  • Setup Viz: A method setting up all basic functions and generating all necessary base DOM elements required to paint this type of viz.
  • Paint Viz: A method actually painting the visualisation. This function needs to be set up in a way that it can be repeated in case of data change for example. This then needs to redraw the viz without adding unnecessary repetition.
/*
We’re using a pluggable module for each type of visualisation.
These are set up using a standard architecture and can make use of all the methods
and services supplied by the common visualisationsController.
*/
vizModule
/*
Make sure to utilise what your frameworks have to offer i.e. D3 methods for manipulating data structure and format
-> https://github.com/mbostock/d3/wiki/Arrays
-> https://github.com/mbostock/d3/wiki/Math
-> https://github.com/mbostock/d3/wiki/Formatting
-> https://github.com/mbostock/d3/wiki/Layouts
*/
vizModule.preprocessData = function( $rawData ){
/*
As an example we could make sure the data elements this type of visualisation depends on are actually present
within the data available.
*/
if( typeof $rawData !== 'object' || typeof $rawData.nodes !== 'object' )
return false;
var $processedData = $rawData;
/*
Additional structuring or filtering of the data could happen here.
*/
vizModule.processedData = $processedData;
return $processedData;
};
/*
This is where we set up functions to plot certain elements, create axes -> https://github.com/mbostock/d3/wiki/SVG-Axes
insert the basic DOM elements needed for this type of viz and generally do all operations that should only have to be
carried out once to enable painting.
*/
vizModule.setupGraph = function(){
/*
We can invoke methods from the core of the framework to make use of commonly used operations like clearing the contents
of the element we are drawing the viz in before setup.
*/
visualisationsController.clearCanvas();
/*
The setup method can also be used to create any additional methods we will need to paint the viz later.
For example we can prepare our scales and axes here.
*/
vizModule.xScale = d3.scale.linear().range([0,100]);
vizModule.xAxis = d3.svg.axis().scale(vizModule.xScale).orient('bottom').ticks(10);
/*
Let’s assume that visualisationsController.canvas holds the DOM element we are using to hold our viz and it is
a simple DIV we will use as a container. We now want to insert our blank SVG element to draw on later.
*/
vizModule.svg = d3.select(visualisationsController.canvas).append('svg');
/*
Since our SVG is now all set up we can already draw our xAxis, as we will not want to have to repeat that step
every time we paint the other viz elements.
*/
vizModule.svg.append('g').attr('class','x-axis').call(vizModule.xAxis);
/*
Returning true is important to our method chaining we discussed earlier in the core of the framework.
It lets us check that steps depending on each other were executed correctly.
*/
return true;
};
/*
This is where the final magic happens and we actually draw our plots.
Make sure you understand and utilise D3 data joins and make use of the .enter() and .exit() methods.
This method will be called whenever there is a change in data or the plot needs to be redrawn.
So we want to make sure to only run operations that are necessary to avoid repetition.
-> Great introduction here: http://bost.ocks.org/mike/join/
*/
vizModule.paintGraph = function(){
/*
Let’s include some nodes in our SVG; we can do that by making use of the D3 data join methods.
Here we basically join the SVG elements matching the CSS selector “.node” to our data structure.
*/
vizModule.nodes = vizModule.svg.selectAll('.node').data(vizModule.processedData.nodes);
/*
If there are too few corresponding DOM elements for our dataset we add more.
*/
vizModule.nodes.enter().append('circle').attr('class','node');
/*
And if there are too many corresponding DOM elements for our dataset we can remove them.
*/
vizModule.nodes.exit().remove();
/*
Now that we have exactly the right amount of DOM elements to match our data we can manipulate them to represent
our data. In this case let’s position the circles we have appended to the SVG element along our x-axis.
*/
vizModule.nodes.attr('cx',function( $nodeData ){
return( vizModule.xScale( $nodeData.x ) );
});
/*
Again we can finish the painting method by returning true and letting the core of the framework know that
everything went smoothly.
*/
return true;
};
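
If data joins are new to you, it can help to see the three groups they produce without any DOM involved. The following is a conceptual sketch in plain JavaScript, not D3's implementation; `joinByKey` and its parameters are hypothetical, and it matches items by a key, comparable in spirit to passing a key function as the second argument to `.data()`.

```javascript
// Conceptual only: splits a dataset into the three groups of a keyed data join.
// existingKeys: keys of elements already on the page
// data:         the new dataset
// keyOf:        function returning the key of a datum
function joinByKey(existingKeys, data, keyOf) {
  const existing = new Set(existingKeys);
  const incoming = new Set(data.map(keyOf));
  return {
    // data without an element yet: these would be appended
    enter: data.filter(function (d) { return !existing.has(keyOf(d)); }),
    // data with a matching element: these would be repositioned
    update: data.filter(function (d) { return existing.has(keyOf(d)); }),
    // elements without data: these would be removed
    exit: existingKeys.filter(function (k) { return !incoming.has(k); })
  };
}

// Usage: elements a, b, c exist; the new data holds b, c, d.
const result = joinByKey(
  ["a", "b", "c"],
  [{ id: "b" }, { id: "c" }, { id: "d" }],
  function (d) { return d.id; }
);
// result.enter  -> [{ id: "d" }]
// result.update -> [{ id: "b" }, { id: "c" }]
// result.exit   -> ["a"]
```

This is why the paint method can be called again and again: on each call only the entering items create elements and only the exiting ones are removed, while everything else is simply updated in place.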

We believe that this general architecture and approach allows for the framework to be extended quickly through additional modular components, or by rolling out common features through the core layer of the framework.

Our main application is built using AngularJS, so naturally we also make use of some of its features in our visualisation framework. More on integrating D3.js and AngularJS in a future blog post!