Sunday, February 24, 2013

Oh, the Shark, Babe, has Such Teeth, Dear...

A few years ago, I wondered what song was at the top of the Billboard Hot 100 on the day I was born. Who doesn't love Wikipedia? The answer was just a few clicks away: Bobby Darin's classic, memorable cover of Mack the Knife was dominating the charts in the fall of 1959, holding the top spot for nine weeks! Nine weeks out of ten, that is: the Fleetwoods knocked it out of the top spot for one week in November with a tune called Mr. Blue, and that was the week I was born. The Fleetwoods had another #1 hit earlier in 1959, Come Softly To Me, which has stood the test of time and is still heard today. But I had never heard Mr. Blue until I tracked it down (well, maybe I heard it over the car radio when my parents were driving me home from the hospital). Temperamentally, I'm a lot more like a Mr. Blue than a Mack the Knife (ask my wife), but still, it would have been a lot cooler to have been born under the sign of Mack the Knife.


Anyway, the opening line from Mack the Knife in today's title refers to shark's teeth, and that's germane to this post because I want to call your attention to the sharp end of the stick for a BioFabric plot: the upper left-hand corner. The image below is a detail of the sharp end of the wiki-vote network I discussed in a post a few days ago:  
BioFabric Network Visualization: Wiki-Vote network: left detail
It's always fun and informative to study that part of the network, and the eye is drawn naturally to it up there in the upper left corner. So what's going on here? (You might want to fire up the network, available here in a ZIP archive, in BioFabric to follow along with this discussion.)

Again, it is important to remember that the default layout (used here) assigns node rows using a breadth-first search of the network, starting at the node with the highest degree, and visiting neighbors in order from highest degree to lowest. So that first node wedge on the far left (node 2565) is for the highest-degree node (in the case of a tie, we fall back on alphabetical ordering of the names), and in this example, all the edge wedges to the right of it are for nodes that are first neighbors of 2565.
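To make the row assignment concrete, here is a minimal Python sketch of the ordering just described. It is an illustration only, not BioFabric's actual code: the function name `default_node_order`, the dictionary-of-sets input, and the toy network are all made up for this example, and the way disconnected components are handled (restart at the highest-degree unvisited node) is an assumption.

```python
from collections import deque


def default_node_order(adjacency):
    """Return nodes in row order (row 0 first) for a dict of node -> set of neighbors."""

    def rank(node):
        # Highest degree first; ties broken alphabetically by name.
        return (-len(adjacency[node]), str(node))

    order = []
    seen = set()
    # Assumption: each disconnected component is started from its
    # highest-ranked unvisited node.
    for start in sorted(adjacency, key=rank):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            order.append(node)
            # Visit neighbors from highest degree to lowest.
            for neighbor in sorted(adjacency[node], key=rank):
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
    return order


# Hypothetical toy network, not the wiki-vote data.
toy = {
    "hub": {"x", "y", "z"},
    "x": {"hub"},
    "y": {"hub", "z"},
    "z": {"hub", "y"},
}
print(default_node_order(toy))  # ['hub', 'y', 'z', 'x']
```

Here `hub` gets the top row because it has the highest degree, and its neighbors `y` and `z` (degree 2) are placed before `x` (degree 1).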

A few days ago I talked about learning to read the top bumps (a.k.a. BioFabric Phrenology), and today I want to stress the importance of learning to read edge wedges. I think that edge wedges are one of the great features of BioFabric: they provide a clear visual representation of the connectivity of the different nodes in the network, and they make it easy to compare, in a very visual fashion, how two nodes match up in terms of their connectivity. Admittedly, the same ability is provided by comparing two columns in an adjacency matrix, but I contend that the extra visual bulk we get by drawing the edges as lines instead of as points makes the comparison easier. The edge wedges also benefit from being able to use the slope of the left side of the wedge to gather clues about the connectivity patterns.

Always remember, unless you have multiple edges between two nodes, the shallowest angle you can get from the left side of a wedge for a single node is 45 degrees. This is a byproduct of the regular, square grid used for both nodes and edge lines, and the way the layout algorithm assigns edge columns according to increasing length for a given node. So if you see a shallower wedge angle, you are either looking at a directed graph with lots of reciprocal edge pairings, or a multigraph with multiple edges between two nodes (i.e., links tagged with different edge attributes). Also, note I qualified the 45-degree statement by saying I was referring to a single node's wedge. Multiple contiguous nodes can clearly create a run of wedges that produce a shallower angle than 45 degrees on the bottom of the plot, and this is the norm: that's why the bottom edge of the plot typically flattens out as we travel to the right.
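The 45-degree limit can be checked with a small sketch. The Python below is a simplified stand-in for the edge layout, not BioFabric's implementation: the names `assign_edge_columns` and `wedge_drops` and the toy data are hypothetical, and it assumes each edge belongs to the wedge of its upper endpoint, with a node's edges ordered by increasing length. On a square grid, a drop of one row per column is exactly 45 degrees; a drop of zero means two edges end on the same row, which flattens the wedge.

```python
def assign_edge_columns(node_order, edges):
    """Return (column, top_row, bottom_row) for each edge, leftmost column first."""
    row = {node: i for i, node in enumerate(node_order)}
    # Group edges by their upper endpoint, then order by increasing length.
    spans = sorted((min(row[u], row[v]), max(row[u], row[v])) for u, v in edges)
    return [(column, top, bottom) for column, (top, bottom) in enumerate(spans)]


def wedge_drops(placed, top_row):
    """Rows descended between consecutive columns of one node's wedge."""
    bottoms = [bottom for _, top, bottom in placed if top == top_row]
    return [b - a for a, b in zip(bottoms, bottoms[1:])]


# Hypothetical row order and edges (rows: hub=0, y=1, z=2, x=3).
order = ["hub", "y", "z", "x"]
simple_edges = [("hub", "x"), ("hub", "y"), ("hub", "z"), ("y", "z")]
# Same network plus one reciprocal directed edge from y back to hub.
directed_edges = simple_edges + [("y", "hub")]

print(wedge_drops(assign_edge_columns(order, simple_edges), 0))    # [1, 1]
print(wedge_drops(assign_edge_columns(order, directed_edges), 0))  # [0, 1, 1]
```

With single edges only, every step in the hub's wedge drops at least one row, so the wedge can never be shallower than 45 degrees. The reciprocal pair introduces a zero drop, which is the flat step that makes the wedge shallower.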

The wiki-vote graph shown above is a directed graph. If you look closely at the leftmost edge wedge, you can discern that the wedge angle is a little shallower than 45 degrees at the start, but pretty close to 45 near the end. Thus, we can see that node 2565 has some reciprocal directed relationships with its high-degree first neighbors, but this falls away to one-way relationships with its low-degree neighbors. In BioFabric, you can make this guess while looking at a wide-angle zoom, and quickly zoom in to confirm this hypothesis with a few keystrokes.

Doing inter-wedge comparisons is another skill to become comfortable with. For example, compare the second wedge from the left (for node 1549) with the first wedge (for node 2565) in the above diagram. Node 1549 clearly shares lots of 2565's high-degree neighbors, since the wedge angles are similar near the top, but this similarity drops off with 2565's low-degree neighbors, as the 1549 wedge angle becomes almost vertical. Also look at the ratio of shared and unshared edges in the 1549 wedge: we can see that about 60% of 1549's neighbors are shared with 2565 (the top/left part of the wedge), and 40% are different (the bottom/right part).
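If you want to back up that kind of visual estimate with a number, the shared fraction is easy to compute from neighbor sets. The Python sketch below is only an illustration: `shared_neighbor_fraction` is a made-up helper, the tiny network is hypothetical (it is not the wiki-vote data), and edge direction is ignored.

```python
def shared_neighbor_fraction(adjacency, node, reference):
    """Fraction of node's neighbors (other than reference) that also neighbor reference."""
    neighbors = set(adjacency[node]) - {reference}
    if not neighbors:
        return 0.0
    shared = neighbors & set(adjacency[reference])
    return len(shared) / len(neighbors)


# Hypothetical network: q has five neighbors besides p, and three of them
# (a, b, c) are also neighbors of p.
example = {
    "p": {"a", "b", "c", "q", "r"},
    "q": {"a", "b", "c", "d", "e", "p"},
}
print(shared_neighbor_fraction(example, "q", "p"))  # 0.6
```

A result of 0.6 corresponds to the kind of 60/40 split you can read directly off the top/left and bottom/right parts of a wedge.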

Moving further over to the right to look at the other wedges, you can get an idea of which nodes have similar connectivity by looking for similar edge wedge shapes. Note also how the 13th node from the left (4037) has a small wedge of nodes that are connected only to it, and nobody else; this pops out at you because it creates the visible small gap in the node rows. These gaps are usually pretty common, and provide a nice visual navigation aid as you move around the network.

So pay attention to the shark's teeth (babe) when looking at a BioFabric plot!
