Sankey Diagrams

Whenever you want to show the flow of a quantity (such as energy or money) through a network of nodes you can use Sankey diagrams:

“A Sankey diagram is a directional flow chart where the width of the streams is proportional to the quantity of flow, and where the flows can be combined, split and traced through a series of events or stages.”
(source: CHEMICAL ENGINEERING Blog)
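
As a minimal illustration of the "width proportional to quantity" rule in that definition, the Python sketch below converts flow quantities into drawing widths using one shared scale factor. The function name stream_widths, the max_width budget and the sample quantities are assumptions made up for this example; they do not come from any particular Sankey tool.

```python
def stream_widths(flows, max_width=100.0):
    """Map each flow quantity to a drawing width using one shared scale.

    flows: dict of stream name -> non-negative quantity.
    max_width: width given to the largest stream (hypothetical drawing budget).
    """
    largest = max(flows.values())
    if largest <= 0:
        raise ValueError("at least one flow must be positive")
    scale = max_width / largest
    # Every stream uses the same scale, so widths stay proportional to quantities.
    return {name: quantity * scale for name, quantity in flows.items()}


# Made-up quantities: one inflow that splits into two outflows.
flows = {"input": 50.0, "useful": 30.0, "waste": 20.0}
widths = stream_widths(flows)
```

Because a single scale is applied to all streams, the widths of the two split streams add up to the width of the stream they came from, which is what lets the eye trace a flow through its stages.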

One area where this can be applied very well is that of costing. By modeling the flow of cost through a company one can analyze the aggregated cost and thus determine the profitability of individual products, customers or channels. Using the principles of activity-based costing one can create a cost-assignment network linking cost pools or accounts (as tracked in the General Ledger) via the employees and their activities to the products and customers. Such a Cost Flow can then be visualized using a Sankey diagram:

Cost Flow from Accounts via Expenses and Activities to Products

The direction of flow (here from left to right) is indicated by color: each node passes its color on to its outflowing streams. Note also the intuitive notion of zero-loss assignment: for each intermediate node, the sum of the inflowing streams equals the sum of the outflowing streams (= the height of that node). Hence all the cost is accounted for and nothing is lost. If you stacked the nodes of any one stage on top of one another, each stack would rise to the same height. (Random data for illustration purposes only.)
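
The zero-loss property can also be checked mechanically. The following Python sketch is an illustration under assumptions, not the Mathematica code used for the diagram: the node names, the amounts, and the helpers node_balance and is_zero_loss are all hypothetical. It represents a cost flow as a list of (source, target, amount) links and verifies that every intermediate node passes on exactly what it receives.

```python
from collections import defaultdict

# Hypothetical cost-flow links: (source, target, amount).
# Stages: accounts -> expenses -> activities -> products.
LINKS = [
    ("Salaries", "Staff", 60.0),
    ("Rent", "Facilities", 40.0),
    ("Staff", "Assembly", 35.0),
    ("Staff", "Support", 25.0),
    ("Facilities", "Assembly", 30.0),
    ("Facilities", "Support", 10.0),
    ("Assembly", "Product A", 40.0),
    ("Assembly", "Product B", 25.0),
    ("Support", "Product A", 15.0),
    ("Support", "Product B", 20.0),
]


def node_balance(links):
    """Return {node: (inflow, outflow)} summed over all links."""
    inflow = defaultdict(float)
    outflow = defaultdict(float)
    for source, target, amount in links:
        outflow[source] += amount
        inflow[target] += amount
    nodes = set(inflow) | set(outflow)
    return {node: (inflow[node], outflow[node]) for node in nodes}


def is_zero_loss(links, tol=1e-9):
    """True if every node that has both inflows and outflows is balanced."""
    for inflow, outflow in node_balance(links).values():
        is_intermediate = inflow > 0 and outflow > 0
        if is_intermediate and abs(inflow - outflow) > tol:
            return False
    return True


balance = node_balance(LINKS)
# Pure sources (accounts) have no inflow; pure sinks (products) have no outflow.
total_in = sum(out for inp, out in balance.values() if inp == 0)
total_out = sum(inp for inp, out in balance.values() if out == 0)
```

With these made-up numbers, the accounts on the left put in the same total that arrives at the products on the right, and is_zero_loss(LINKS) holds. Removing or shrinking any single link between intermediate nodes would make the check fail, which is a quick way to catch cost that has been dropped from the assignment network.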

The above diagram was created in Mathematica using modified source code originally written by Sam Calisch, who posted it in 2011 here. Sam also included a “SankeyNotes.pdf” document explaining the details of the algorithms encoded in the source, such as how to arrange the node lists and how to draw the streams.

I find these notes a perfect example of how a hand drawing can go a long way toward illustrating the ideas behind an algorithm, which makes the source code a lot easier to understand and reuse. Thanks to Sam for this code and documentation. Sam, by the way, used the code to illustrate the efficiency of energy use (vs. waste) in Australia:

Energy Flow comparison between New South Wales and Australia (Sam Calisch)

Note the sub-flows within each stream to compare a part (New South Wales) against the whole (Australia).

Another interesting use of Sankey diagrams, this one about campaign finance flow, was published a few weeks ago on ProPublica. It is particularly useful because it is interactive (click on image to get to interactive version).

Tangled Web of Campaign Finance Flow

Note the campaigns in green and the Super-PACs in brown. The data is sourced from the FEC and the New York Times Campaign Finance API. In the interactive version you can click on any source on the left to see its outgoing streams, or on any destination on the right to see its incoming streams.

Finance Flow From Obama-For-America

Finance Flow to American Express

Here are some more examples. Sankey diagrams are also used in the flow visualization reports of Google Analytics (called Events Flow, Goal Flow and Visitors Flow). I wouldn’t be surprised to see Sankey diagrams make their way into modern data visualization tools such as Tableau or QlikView, perhaps even into Excel some day… Here are some Visio shapes and links to other resources.