Combing the Hairball: An Ode to the Node

BioFabric is all about giving network edges the recognition they deserve in network visualizations. Traditional node-link diagrams are happy to pile edges all on top of each other willy-nilly, thereby making individual edges impossible to trace. The innovation of edge bundling spruces things up quite a bit, but that approach is happy to pile all the edges on top of each other in a clean, well-planned, organized fashion... thereby making individual edges impossible to trace. And while adjacency matrices do put the edges front and center, they manage to do so by depicting the edges basically as itsy-bitsy little points; I happen to think that each edge deserves more ink than that. I like my network edges with some meat on their bones.

Yet while BioFabric gives edges the center stage, it still gives nodes their due as well, drawing them as horizontal lines lurking behind the edge wedges. I thought my credentials as a real edge fanatic were solid, until I was recently asked an interesting question: why bother to show the nodes at all in a BioFabric network?

What's the point of all those horizontal lines, anyway?

The flippant answer to that question is that of course you need to show them: they are, after all, the nodes in a node-link diagram of a network! But it is a general rule that you don't want to add ink to your visualization unless it serves a purpose, and so it is indeed a valid question that deserves some serious thought... so should we just ditch the node lines entirely and let the edge wedges do all the talking? Maybe turn-about is fair play, and we should let the nodes take it on the chin for once after hogging the limelight for so long?

There is no question that by drawing in the nodes, we are creating a ginormous number of line intersections; this adds significantly to the visual complexity of the presentation. If no node lines were drawn, there would be no intersections at all, and a much cleaner view. And maybe since you really have no business exploring a large network unless you are using an interactive software tool, perhaps the edge wedges provide enough information all by themselves, and mousing over the links to find out the node names would be sufficient.

So what are the benefits of explicitly depicting the nodes? I can think of a few. First, one could argue that a BioFabric visualization is already somewhat more abstract than the traditional nodes-as-points depiction of the network. It is asking people to stop thinking of nodes as compact, individual entities, and instead to think of them as something a little more removed from common experience. Joe is no longer that nice round little circle over there, he is now a line on the plane. So I can argue that having Joe entirely disappear from the visualization might be a bridge too far, and will make it even harder for people to gain an intuitive understanding of what they are looking at.

Secondly, I happen to feel that the node lines provide a very useful implicit coordinate grid that helps the eye to trace horizontally across the network over long distances, while maintaining a sense of context. I find this can help a lot when I am scanning to find common link endings. As an aside, I contend that the color cycling BioFabric uses to draw nodes and links is what makes it possible to maintain context while eyeballing across long distances. For this reason, the user cannot assign colors to nodes and links in BioFabric to convey additional information about them, and that is not a bug, but a feature.

Furthermore, it is true that a BioFabric network can easily create a situation where there are 100 million line intersections being shown. But I think the current implementation has served as a proof of concept to demonstrate that it is possible for the user to simply ignore the highly repetitive (fabric-like!) pattern this creates, and home in on the important stuff that contrasts with this repetitive background. Perhaps a parallel situation is where people have no problem working with plots that are drawn on engineering graph paper with a one millimeter grid pattern?
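To make it concrete where all those intersections come from, here is a minimal Python sketch. It is not BioFabric's actual code; it just assumes the basic layout model, where each node gets a row, each link gets its own column, and a node line runs from the first to the last column that holds one of its links. The function name `count_crossings` and the toy network are made up for illustration.

```python
def count_crossings(node_rows, links):
    """Count points where a link line crosses a node line it does not end on.

    node_rows: dict mapping node name -> row index.
    links: list of (source, target) pairs; the list index is taken to be
           the link's column (an assumption of this sketch).
    """
    # A node line spans from the first to the last column holding one of its links.
    extents = {}
    for column, (a, b) in enumerate(links):
        for node in (a, b):
            first, last = extents.get(node, (column, column))
            extents[node] = (min(first, column), max(last, column))

    crossings = 0
    for column, (a, b) in enumerate(links):
        top, bottom = sorted((node_rows[a], node_rows[b]))
        for node, (first, last) in extents.items():
            row = node_rows[node]
            # The link passes over this node line if the node's row lies
            # strictly between the link's endpoints and the node line is
            # actually drawn at this column.
            if top < row < bottom and first <= column <= last:
                crossings += 1
    return crossings


# Made-up four-node toy network, purely for illustration.
toy_rows = {"A": 0, "B": 1, "C": 2, "D": 3}
toy_links = [("A", "C"), ("B", "D"), ("A", "D"), ("B", "C")]
toy_total = count_crossings(toy_rows, toy_links)
```

Even in this tiny toy, four links produce three crossings. Since a long link can pass over every node line between its two endpoints, it is easy to see how the count balloons on a big network.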

Finally, the presence or absence of a node line passing through a link end in BioFabric is a crucial piece of information, because there is an underlying rule that node lines "are only as long as they have to be". To understand this, consider this detail from our old standby example, the Les Miserables network:

For example, the fact that Simplice has a node line emerging from the Valjean-Simplice link end and disappearing off to the right immediately tells us that she is not a one-degree node, but has other links as well that we cannot see. Compare this to e.g. Labarre and Gervais, who we know for certain only have links to Valjean. We can tell this because node lines are only drawn as long as needed, so nodes with no other connections have the shortest line possible, which is none at all. This line of thinking extends to higher-degree nodes as well, since we can say that when a node line ends, no further connections will appear. Of course, if we were not drawing nodes, the absence of a node line would tell us nothing at all about whether additional links exist.
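The "only as long as they have to be" rule is easy to sketch in a few lines of Python. Again, this is an illustration under assumptions rather than BioFabric's own implementation: I assume each link occupies one column, given here by its position in a list, and the names `node_line_extents` and `continues_right_of` are hypothetical. The first three links echo the detail described above; the fourth is a placeholder standing in for "some other link" of Simplice.

```python
def node_line_extents(links):
    """Map each node to the (first, last) column of its incident links.

    links: list of (source, target) pairs; the list index is taken to be
           the link's column (an assumption of this sketch).
    """
    extents = {}
    for column, (source, target) in enumerate(links):
        for node in (source, target):
            first, last = extents.get(node, (column, column))
            extents[node] = (min(first, column), max(last, column))
    return extents


def continues_right_of(extents, node, column):
    """True if the node's line extends past `column`, so more links follow."""
    return extents[node][1] > column


links = [
    ("Valjean", "Labarre"),       # column 0
    ("Valjean", "Gervais"),       # column 1
    ("Valjean", "Simplice"),      # column 2
    ("Simplice", "SomeoneElse"),  # column 3: placeholder, not real data
]
extents = node_line_extents(links)

# Labarre's line starts and ends in the same column: a one-degree node.
labarre_is_degree_one = extents["Labarre"][0] == extents["Labarre"][1]

# Simplice's line keeps going past the Valjean-Simplice link.
simplice_has_more = continues_right_of(extents, "Simplice", 2)
```

The point is that the extent of the line is itself data: a zero-length extent says "this is the only link", and a line that runs on to the right says "keep looking".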

On balance, I think the scales tip in favor of rendering nodes, with one caveat: when the network is drawn at very small scales (i.e. zoomed out to show distant, global views) it might make sense to drop the rendering of nodes. This is because BioFabric rendering has not been optimized to take into account the zoom level, and currently zoomed-out renderings need some work to improve the brightness and contrast of large full-network images. At large distances, the advantage of showing nodes decreases, and they just tend to reduce the contrast of the network image.

By the way, I was asked this interesting question about maybe not showing nodes during a busy poster session, and unfortunately I cannot properly thank that person or give them appropriate credit here for getting me to think about this question... but I do want to thank you!

I'm glad that we are knocking nodes off the pedestal a bit here, but I'm not yet ready to toss them in the rubbish bin. So I'm always going on about how (nodes == lines) -> !hairballs, but perhaps I will need to sometimes change my tune to (nodes == null) -> !hairballs?

What do you think?


Thursday, April 4, 2013
