Hierarchically structured information is everywhere in the real world (e.g., file system folder structures, animal/plant classification, company management), and a tree is a natural data structure for storing it, so an effective method of tree visualization is necessary in many applications. This article goes over the visualization methods we considered and describes their pros, cons, and most suitable use cases.
Recently, our team faced this visualization challenge in a project for a big tech company. The company needed a solution that let managers see an overview of the whole organization or its parts and find areas for improvement. For this task, we needed to visualize the organization structure and color-code each node in the tree based on its metric value.
The data was a tree of teams and projects that could be up to hundreds of levels deep (teams can be nested inside teams, and projects inside projects). The challenge was to visualize this organization structure when nodes have many children. If a diagram consists of a small number of nodes, the data is easy to read and understand; as more nodes are displayed, it becomes harder to scale them and use the space effectively. Traditional node-link diagrams are impractical for large data volumes because of limited screen space, so they could not be used for our purpose, and we needed to find an alternative for visualizing corporate data.
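To make the shape of the problem concrete, here is a minimal sketch of such a tree and of metric-based color-coding. The `OrgNode` shape, the assumption that the metric is normalized to the range 0 to 1, and the red-to-green scale are all illustrative choices, not the client's actual schema or palette.

```typescript
// Hypothetical node shape: the real schema is not described in this article.
interface OrgNode {
  name: string;
  kind: "team" | "project";
  metric: number; // assumed to be normalized to the range [0, 1]
  children: OrgNode[];
}

// Height of the tree (a single node has height 1).
// Uses an explicit stack instead of recursion to cope with very deep trees.
function treeHeight(root: OrgNode): number {
  let height = 0;
  const stack: Array<{ node: OrgNode; depth: number }> = [{ node: root, depth: 1 }];
  while (stack.length > 0) {
    const { node, depth } = stack.pop()!;
    height = Math.max(height, depth);
    for (const child of node.children) {
      stack.push({ node: child, depth: depth + 1 });
    }
  }
  return height;
}

// Map a metric to a color by linear interpolation from red (0) to green (1).
// Values outside [0, 1] are clamped.
function metricColor(metric: number): string {
  const t = Math.min(1, Math.max(0, metric));
  const red = Math.round(255 * (1 - t));
  const green = Math.round(255 * t);
  return `rgb(${red}, ${green}, 0)`;
}
```

Every chart discussed in this article can reuse the same color function; only the geometry of the nodes differs.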
To find the most suitable solution for the problem, our team investigated and evaluated how different hierarchical data visualizations would work with the given dataset.
The visualizations we evaluated can be used not only for this case but for any problem where a tree data structure needs to be visualized and each node carries metadata that also needs to be shown. Examples are file system folder structures, family trees, taxonomic trees (e.g., animal/plant classification), the hierarchy of teams and projects inside a company, company management hierarchies, and many others.
We evaluated a number of hierarchical visualizations with datasets of different sizes to better understand their pros, cons, and suitable dataset sizes. The options we describe in this article are the Icicle chart, the Sunburst chart, the Rectangular (Squarified) treemap, and the Circular treemap.
We created a handy React + D3 proof-of-concept application that allows you to try out different datasets and charts. Try it here.
The Icicle chart, also called a flame chart or partition chart, displays a hierarchical structure where nodes of a tree are represented by adjacent rectangles laid out progressively according to their depth. You can view it here.
The pros of this chart are that it is easy to see the clusters, their size, and what level they are at. Additionally, Icicle charts are great for exploring relationships within data, especially with interactive features such as zooming and reclustering.
This chart doesn’t use the screen space as effectively as sunburst or treemap charts, so it’s more suitable for smaller-size datasets.
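The layout idea behind this chart can be sketched in a few lines. The following is a simplified illustration, not the code of our proof-of-concept app: the `TreeNode` and `Rect` types and the `icicleLayout` function are hypothetical names, and leaf `value` fields are assumed to hold the node sizes. Each depth level gets one row, and each node's width is proportional to the total value of its subtree.

```typescript
interface TreeNode {
  name: string;
  value?: number; // size of a leaf; ignored for nodes that have children
  children?: TreeNode[];
}

interface Rect {
  name: string;
  depth: number;
  x0: number;
  x1: number;
  y0: number;
  y1: number;
}

// Sum of leaf values in a subtree (recomputed on demand to keep the sketch short).
function total(node: TreeNode): number {
  if (!node.children || node.children.length === 0) return node.value ?? 0;
  return node.children.reduce((sum, child) => sum + total(child), 0);
}

// Number of levels in the tree (a single node has 1 level).
function maxDepth(node: TreeNode): number {
  const kids = node.children ?? [];
  return 1 + kids.reduce((deepest, child) => Math.max(deepest, maxDepth(child)), 0);
}

// One row per depth level; children split their parent's horizontal span
// in proportion to their subtree totals.
function icicleLayout(root: TreeNode, width: number, height: number): Rect[] {
  const rowHeight = height / maxDepth(root);
  const rects: Rect[] = [];

  function place(node: TreeNode, depth: number, x0: number, x1: number): void {
    rects.push({
      name: node.name,
      depth,
      x0,
      x1,
      y0: depth * rowHeight,
      y1: (depth + 1) * rowHeight,
    });
    const sum = total(node);
    let x = x0;
    for (const child of node.children ?? []) {
      const w = sum > 0 ? ((x1 - x0) * total(child)) / sum : 0;
      place(child, depth + 1, x, x + w);
      x += w;
    }
  }

  place(root, 0, 0, width);
  return rects;
}
```

The sketch also shows where the wasted space comes from: every row is as tall as every other, so a tree with many levels ends up with very thin rows, and branches that end early leave the area below them empty.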
The Sunburst visualization displays a hierarchical structure where each data node of a tree is represented by an angular segment within multi-layered rings. Basically, this chart is an Icicle chart with arcs instead of rectangles, which allows the Sunburst to use the space more effectively. You can view it here.
The pros are similar to those of the Icicle chart: cluster size, level, and relations are visible at a glance, and zoom allows quick and easy exploration. The Sunburst also uses the space effectively, so it’s suitable for larger datasets than the Icicle chart.
The main con is higher visual complexity: this chart is somewhat harder for first-time users to read than the Icicle chart because the direction of segments changes. Arc segments also make labels inside them harder to read.
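The relationship between the two charts can be sketched as a coordinate mapping. The snippet below assumes a partition cell whose horizontal span is expressed as fractions of the root's width (0 to 1); the `Cell` and `Arc` types and both functions are hypothetical names used only for illustration. The horizontal span becomes an angle, and the depth becomes a ring.

```typescript
// A cell of a partition (icicle) layout; x0 and x1 are fractions of the
// root's width in the range [0, 1]. This normalization is an assumption.
interface Cell {
  x0: number;
  x1: number;
  depth: number;
}

interface Arc {
  startAngle: number; // radians
  endAngle: number; // radians
  innerRadius: number;
  outerRadius: number;
}

// Horizontal position maps to an angle, depth maps to a ring of fixed width.
function toArc(cell: Cell, ringWidth: number): Arc {
  return {
    startAngle: cell.x0 * 2 * Math.PI,
    endAngle: cell.x1 * 2 * Math.PI,
    innerRadius: cell.depth * ringWidth,
    outerRadius: (cell.depth + 1) * ringWidth,
  };
}

// Length of the outer edge of an arc segment (angle times radius).
function outerArcLength(arc: Arc): number {
  return (arc.endAngle - arc.startAngle) * arc.outerRadius;
}
```

Because arc length grows with the radius, a node that covers the same fraction of its level gets a longer outer edge on each deeper ring. Deeper levels usually hold more nodes, so this is one way to see why the Sunburst tends to handle larger trees better than the Icicle chart.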
Treemap charts take a different approach to representing hierarchy: they display child nodes inside their parents. This helps first-time users understand the relationship at a glance when visualizing hierarchies like folder structure or budget allocation because the relation of one thing inside another matches real world scenarios.
There are a few different options for treemap visualizations, described below.
The Rectangular or Squarified treemap displays a hierarchical structure where nodes of a tree are represented as nested rectangular tiles. You can view it here.
The main advantage of this chart is space usage. Rectangular segments can use most of the available monitor space, so a rectangular treemap chart will fit the most information on a single screen compared to all the other charts reviewed here. Other pros are good parent-child relation visibility and easy comparison of node sizes. This chart also becomes easy to read and explore when interactive zoom and pan are added.
Cons of this chart stem from its pros to a certain degree: nesting of children inside parents can make the treemap chart hard to read color-wise, and parent-child segments of the same color are hard to distinguish without additional borders or depth color-coding. Additionally, effective screen space usage can lead to informational overload of the chart for large and complex datasets; however, this can be avoided by showing/hiding tree levels based on zoom.
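The nesting principle can be sketched as follows. Note that this is not the squarified algorithm: for brevity it uses the simpler slice-and-dice split, which alternates between horizontal and vertical cuts at each depth and tends to produce thinner tiles than squarification does. The `TreeNode` and `Tile` types and the `sliceAndDice` function are hypothetical names, and leaf `value` fields are assumed to hold the node sizes.

```typescript
interface TreeNode {
  name: string;
  value?: number; // size of a leaf; ignored for nodes that have children
  children?: TreeNode[];
}

interface Tile {
  name: string;
  depth: number;
  x: number;
  y: number;
  width: number;
  height: number;
}

// Sum of leaf values in a subtree.
function total(node: TreeNode): number {
  if (!node.children || node.children.length === 0) return node.value ?? 0;
  return node.children.reduce((sum, child) => sum + total(child), 0);
}

// Each child tile lies inside its parent's tile. Even depths place children
// side by side; odd depths stack them vertically.
function sliceAndDice(root: TreeNode, width: number, height: number): Tile[] {
  const tiles: Tile[] = [];

  function place(node: TreeNode, depth: number, x: number, y: number, w: number, h: number): void {
    tiles.push({ name: node.name, depth, x, y, width: w, height: h });
    const sum = total(node);
    if (sum <= 0) return;
    let offset = 0;
    for (const child of node.children ?? []) {
      const share = total(child) / sum;
      if (depth % 2 === 0) {
        place(child, depth + 1, x + offset, y, w * share, h);
        offset += w * share;
      } else {
        place(child, depth + 1, x, y + offset, w, h * share);
        offset += h * share;
      }
    }
  }

  place(root, 0, 0, 0, width, height);
  return tiles;
}
```

In this sketch the children cover their parent's tile completely, which is exactly why parent and child tiles of the same color are hard to tell apart. Shrinking each child's area by a small padding, or drawing borders, is the usual remedy.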
The Circular treemap displays a hierarchical structure where nodes of a tree are represented as nested circles.
This chart is structurally similar to the rectangular treemap chart and has similar advantages: parent-child relationships are visible at a glance and node sizes are easy to compare visually. It doesn’t use the screen space as effectively, but this can be an advantage: circles don’t get as tightly packed as rectangles, which can make the chart easier to read. You can view it here.
As you can see, there are many options for visualizing hierarchical datasets, and each of them has pros and cons that you need to know to choose the right one for your particular case. The best way to find a suitable visualization is to play around with your dataset in different visualizations in the POC app. We did this for our client’s data and chose the squarified treemap because it used the screen space most efficiently and showed the “parent includes children” relation best.