Tuesday, March 26, 2013

Petite rue Picpus, No. 62 (On Becoming a Domain Expert)

In a recent post, I used Mother Innocent as an example to demonstrate how shadow links are a way to show a complete inventory of links incident on a node, and also to ensure that each node has a presence on the main diagonal of a BioFabric graph:
 
Mother Innocent on the Diagonal (BioFabric)

Though a node named Mother Innocent may seem a little strange, it all makes sense when you know that she is a character in Victor Hugo's famed novel Les Miserables, and the network is Knuth's well-known graph of character occurrences in that novel [D. E. Knuth, The Stanford GraphBase: A Platform for Combinatorial Computing, Addison-Wesley, Reading, MA (1993)]. 

In that post, I talked about the links between Valjean, Fauchelevent, and Mother Innocent. But when I speculated on the latter's popularity and residence, I was coming from a position of (almost) complete ignorance about the actual domain that is modeled by the network. Since I had a conference trip coming up (go take a look at my posters!), I figured that the 1194-page novel (Julie Rose translation) was a great way to pass the travel time. Plus, it allows me to conduct a little experiment. When I first played with the graph data and ran it through the various BioFabric layout alternatives, I was completely in the dark, but I could start to get a feel for the data and how the characters might interact. Now, by the time I get to the end of Hugo's novel, I will be enough of a domain expert to be able to look at the data with a different set of eyes.

As of this moment, I am only on page 480. And Hugo famously went off on tangents in this novel: Adam Gopnik, in the book introduction, refers to them as "the gassy bits", the parts that are in contrast to the dramatic sections of the novel.  In fact, I've just waded through 36 pages discussing the details of convent life and how monasticism was an anachronism in the modern world of the mid-19th century. But even with that, it's been a great read.

But I have covered enough ground to be able to say with authority that [SPOILERS AHEAD] Mother Innocent indeed had a home. She lived in the Convent of the Bernardines of Perpetual Adoration, at No. 62 petite rue Picpus, Paris. Few of the nuns interacted with the outside world, but Mother Innocent (aka Mademoiselle de Blemeur) was the prioress, and therefore was the one who could talk with Fauchelevent (the convent's gardener) and the soon-to-be assistant gardener Valjean. This discussion took place because Jean Valjean was not buried alive in the supposedly empty coffin in Vaugirard Cemetery. And I might argue that there should be a link between Gribier and Valjean in the network, since the former was quite stubbornly (albeit unknowingly) trying to accomplish the live-burial of the latter.

So, although I've recently been spotty with BioFabric postings due to travel, I've been putting my spare time to good use: reading an 1862 French novel to further the cause of network visualization.

Interesting side note: I seem to have cornered the market on image searches for "Mother Innocent" (in quotes). Go take a look: Google Image Search

Thursday, March 21, 2013

Broaden Your Thinking: Equal Rights for Edges!

I'm blogging to you today from the Broad Institute in Cambridge, Massachusetts. I'm at the VIZBI 2013 conference, where I have two posters to spread the word that network edges should not be second-class citizens! They are both posted online, so go take a look. I presented the science one yesterday, and the Art and Biology one will be on display tonight. The online version of the latter is particularly nice, since you can zoom all the way in and read the node names.

Tuesday, March 12, 2013

Lamont Cranston Gives Mother Innocent a Home

In the last installment of our exciting radio drama, we learned that I was innocently occupied absorbing blatant Madison Avenue attempts to get me to nag my Mom and Dad to buy a brand-new RCA color TV, whilst said parents were happily enjoying the nostalgia of listening to The Shadow in the other room. Fans of The Shadow were well aware that perhaps the most famous identity our hero used to conceal himself was that of Lamont Cranston, a "wealthy young man about town."

This posting will talk about an important superpower possessed by Lamont, err... I mean the Shadow, umm... I mean BioFabric shadow links.  I introduced shadow links in my last posting, so if this is making even less sense than usual, go check that out first. But with that intro under our belts, let's pick up and continue working with the Les Miserables network from last time. Here it is again, in the non-shadow, default layout version:


BioFabric Network Visualization: default layout of Les Miserables network

It almost seems as if Lamont could be a name that shows up in the network along with Valjean, Gervais, and Labarre, but unfortunately he is not in there. Someone who is present, however, is Mother Innocent. She shows up over on the left near the bottom of Valjean's edge wedge.  Here is a close-up:

Mother Innocent in the BioFabric Les Miserables network

One important feature of BioFabric node lines is that they are only as long as they have to be to get the job done. The node line starts when the first incident link is drawn, and ends once the last link is drawn.  This feature is actually what gives the non-shadow version of the network its distinctive shape. 
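The "only as long as it has to be" rule is easy to sketch in code. The following is a minimal illustration, not BioFabric's actual implementation: it assumes the links have already been assigned columns from left to right, and the five-node network and the function name `node_line_extents` are made up for the example.

```python
def node_line_extents(link_order):
    """Map each node to the (first, last) link column its node line spans.

    link_order is a list of (node_a, node_b) pairs; the list index is
    taken to be the column in which that link is drawn.
    """
    extents = {}
    for col, (a, b) in enumerate(link_order):
        for node in (a, b):
            # Keep the first column already recorded; extend the end to here.
            first, _ = extents.get(node, (col, col))
            extents[node] = (first, col)
    return extents


# Hypothetical toy network, links listed in drawing order (columns 0..5).
toy_links = [("A", "B"), ("A", "C"), ("A", "D"),
             ("B", "C"), ("B", "E"), ("C", "E")]

for node, (first, last) in sorted(node_line_extents(toy_links).items()):
    print(node, first, last)
```

In this toy case node D has a single link in column 2, so its node line starts and stops right there, while A's line stops at column 2 even though the network continues out to column 5.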

So, it turns out that Mother Innocent is not that popular. (Or maybe she is? I have not read the book, nor seen the play or the movie.)  She has only two connections in the network: one on the left end with Valjean, as shown above, and one with Fauchelevent, as shown below. The following detail shows the right end of Mother Innocent's node line. That's Mother Innocent's node line coming to an end in the lower right of the figure, after having the link to Fauchelevent laid down. She doesn't get to have her name in lights over there on the right edge, because she expires before she gets there:

Mother Innocent in BioFabric Les Miserables network: no shadow links

So this is something to keep in mind when looking at a default layout BioFabric plot without shadow links: not every node ends up having a presence on the right/upper edge.  If it turns out that a node does not have any links to another node that has not yet been seen in the breadth-first search layout, the node line will quietly disappear before it shows up on the right/upper edge. Of course, that is also true of all those nodes with only one link; see e.g. Gervais, Isabeau, and Labarre in the top detail. But it can be easy to forget in the case of nodes with two or more incident links. Without shadow links, you will get empty row gaps on the right edge instead of a dedicated labeled "node zone".

So what happens when you introduce shadow links? Well, every node now gets a home on the main diagonal. That's because even though the "real" links got drawn somewhere over on the left, the shadow links, by design, are drawn as part of the dedicated node zone that appears on the diagonal. Below we show Mother Innocent's node zone in the shadow link version: she does not have any new "real" links (i.e. below the diagonal) to offer, but her links to Valjean and Fauchelevent show up as shadow links above the diagonal:

Mother Innocent in BioFabric Les Miserables network: with shadow links

And note that all the other one-link worthies I mentioned above (Gervais, Isabeau, and Labarre) also appear here on the diagonal. So this is another compelling reason to get comfortable toggling to the shadow link display: it guarantees that by scanning along the main diagonal you will be sure to encounter the entire inventory of nodes.

So, just remember this important superpower of The Shadow Links, because Lamont Cranston can indeed ensure that Mother Innocent has a home!

Thursday, March 7, 2013

The Shadow Knows!

"Who knows what evil lurks in the hearts of men? The Shadow knows!" So begins the popular radio drama The Shadow that ran from 1930 until 1954. When I was a tiny kid in the early '60s, the local radio station would rerun episodes on Sunday nights, much to my parents' delight. Of course, the mystery of The Shadow was lost on me, since I just wanted to go and watch Walt Disney's The Wonderful World of Color on TV!


Although the Shadow had "...the power to cloud men's minds so they cannot see him", an important feature of BioFabric, called shadow links, is intended to make things clearer instead of cloudier. So today's posting will provide a short introduction to get you started with shadow links.


The Les Miserables network I debuted in an earlier posting serves as a nice, compact example. But this time, I will present it with the default layout, which is how it would look after you first imported it from a .sif file. You might want to go back to the earlier posting to compare this version with the custom cluster-based layout:
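For readers who have not met the .sif (simple interaction format) file type: each line names a source node, a relationship label, and one or more target nodes, separated by tabs (or by whitespace if the file contains no tabs). Here is a minimal reader sketch. The function name `parse_sif`, the relation label "co", and the two sample lines are assumptions made for illustration; this is not the file used for the figures, and not BioFabric's own import code.

```python
def parse_sif(text):
    """Parse SIF text into (nodes, edges).

    nodes is a list in order of first appearance; edges is a list of
    (source, relation, target) tuples.
    """
    nodes, edges = [], []
    use_tabs = "\t" in text  # tabs, when present, are the delimiter
    for line in text.splitlines():
        if not line.strip():
            continue
        raw = line.split("\t") if use_tabs else line.split()
        fields = [f.strip() for f in raw if f.strip()]
        source = fields[0]
        if source not in nodes:
            nodes.append(source)
        if len(fields) >= 3:
            relation = fields[1]
            for target in fields[2:]:
                if target not in nodes:
                    nodes.append(target)
                edges.append((source, relation, target))
    return nodes, edges


# Tab-delimited sample, so that a node name may contain a space.
sample = "Valjean\tco\tMother Innocent\nFauchelevent\tco\tMother Innocent\n"
nodes, edges = parse_sif(sample)
print(nodes)
print(edges)
```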

BioFabric Network Visualization: Les Miserables

Remember, Valjean gets top billing in the default layout because he is the highest degree node, and all of the links incident on Valjean get drawn before we move on to the highest-degree neighbor, Gavroche. Then, when we are done with Gavroche, all the remaining links incident on that node have been drawn as well. This means that when we get to e.g. the fifth node, Thenardier, the edge wedge we see for that node is not a complete inventory of all the links incident upon it, but is missing the four previous links attaching Thenardier to Valjean, Gavroche, Marius, and Javert. This is to be expected, since if we are drawing each link only once, we are not going to get a contiguous region of incident edges appearing for every node in the network: down that path lies the dread hairball.
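The ordering just described can be sketched as a breadth-first search that starts at the highest-degree node and visits neighbors in order of decreasing degree, laying down each node's not-yet-drawn links as its turn comes. This is a rough sketch under assumptions, not BioFabric's actual code: the alphabetical tie-break, the ordering of links within a node's turn, and the toy network are all made up for illustration.

```python
from collections import deque


def default_layout(edges):
    """Return (node_order, link_order) for an undirected edge list."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    def by_degree(node):
        # Highest degree first; node name is an assumed tie-breaker.
        return (-len(neighbors[node]), node)

    # Breadth-first search, restarted for any disconnected component.
    node_order, seen = [], set()
    for start in sorted(neighbors, key=by_degree):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            node_order.append(node)
            for nb in sorted(neighbors[node], key=by_degree):
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)

    # Each link is drawn once, when the first of its endpoints comes up.
    row = {node: i for i, node in enumerate(node_order)}
    link_order, drawn = [], set()
    for node in node_order:
        for nb in sorted(neighbors[node], key=row.get):
            link = frozenset((node, nb))
            if link not in drawn:
                drawn.add(link)
                link_order.append((node, nb))
    return node_order, link_order


# Hypothetical toy network.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"),
             ("B", "C"), ("B", "E"), ("C", "E")]
node_order, link_order = default_layout(toy_edges)
print(node_order)
print(link_order)
```

In the toy network, node C has three links, but only C-E is left to draw by the time C's turn arrives. That is the same effect that leaves Thenardier's edge wedge short of a full inventory.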

Given the distributed nature of drawing links that is a natural outcome of treating nodes as potentially infinitely long lines, is there a solution? Yes, and that's what the Shadow knows! Instead of drawing each link only once, we draw it twice: one "real" link, and one "shadow" link.

When BioFabric is first started, shadow links are turned off.  To see them, go to the main menu and select Edit->Set Display Options...:

Step one to add shadow links in BioFabric

In the dialog box that appears, check all three middle boxes, providing shadow links, minimal submodel links, and node zone shading.  Click OK:

Step two to add shadow links in BioFabric


The result adds shadow links to the network, as well as providing alternating subtle shading of the node zones to make the links belonging to each node stand out more distinctly:

Les Miserables network with shadow links

Compare this shadow version with the one at the top of this posting. See how the part of the network below the main diagonal just looks like a stretched version of the non-shadow network? In fact, all the links below the diagonal are the "real" links, while the ones above the diagonal are the duplicate "shadow" links. And since there are now twice as many links in the network, the network is twice as wide.

Let's take a closer look at some of the later nodes to be laid out, focusing on Joly, who stands out over there on the right side of the network. This is what it looks like in the original, no-shadow version (though node shading is still active):

Joly detail in non-shadowed BioFabric network


While this is what the same node zones look like in the shadowed version:

Joly detail in shadowed BioFabric network

The part below the main diagonal matches the original, while the shadow links incident on each node are drawn above the diagonal, and to the left of the "real" links for the node.
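To make the column bookkeeping concrete, here is a small sketch of how a shadowed column order could be built: for each node in row order, first the shadow copies of links to nodes in earlier rows, then the "real" links to nodes in later rows. It is an illustration under assumptions, not BioFabric's actual code. The toy network, the given row order, and the ordering of links inside a zone are all hypothetical.

```python
def shadow_layout(node_order, edges):
    """Return the link columns, left to right, as (zone_node, other, kind)."""
    row = {node: i for i, node in enumerate(node_order)}
    neighbors = {node: set() for node in node_order}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    columns = []
    for node in node_order:
        ordered = sorted(neighbors[node], key=row.get)
        # Shadow links go first (to the left of the node's real links).
        shadows = [(node, nb, "shadow") for nb in ordered if row[nb] < row[node]]
        reals = [(node, nb, "real") for nb in ordered if row[nb] > row[node]]
        columns.extend(shadows + reals)
    return columns


# Hypothetical toy network and row order.
toy_order = ["A", "B", "C", "D", "E"]
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"),
             ("B", "C"), ("B", "E"), ("C", "E")]

columns = shadow_layout(toy_order, toy_edges)
print(len(toy_edges), len(columns))
for col in columns:
    print(col)
```

Six links turn into twelve columns, which is why the shadowed network is twice as wide. Node E, whose two links both go to earlier rows, still gets a zone of its own made up entirely of shadow links.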

The thing to keep in mind about the upper non-shadowed version is that although there is a prominent node label on the contiguous group of links incident on a node (e.g. Joly), that group is not the full set of links incident on that node! Without shadows, I tend to think of each of those regions (what BioFabric calls the node zone) as the death-rattle of each node, the last chance to make its mark on the world before it rides off into the sunset. Because of the way the default layout works, that region is only where the last links for a node are drawn. Though they are prominently labeled (because it would be perverse not to provide such a label on such a prominent feature), they are not the whole picture. To get the whole picture, consult the shadow links version!

The shadow links version is very egalitarian: everybody gets to share, and so each node has the full complement of incident links in its node zone. If you are not careful, you might look at the non-shadow version and say that Joly is degree six, but looking at the shadowed version, you can see immediately that Joly is much more popular, with 12 links! Indeed, the entire right end of the shadow network shows those nodes to be much better connected than an untrained eye looking at the non-shadow version might be led to believe.
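The gap between what a node zone shows without shadows and the node's true degree can be computed directly. The sketch below counts, for each node, the links that land in its zone in each version. It assumes a link is drawn in the zone of whichever endpoint sits in the earlier row; the toy network and row order are hypothetical and are not the Les Miserables data.

```python
def zone_sizes(node_order, edges):
    """Return (no_shadow, with_shadow): links per node zone in each version."""
    row = {node: i for i, node in enumerate(node_order)}
    no_shadow = {node: 0 for node in node_order}
    with_shadow = {node: 0 for node in node_order}
    for a, b in edges:
        # Without shadows, the link appears only in the earlier node's zone.
        earlier = a if row[a] < row[b] else b
        no_shadow[earlier] += 1
        # With shadows, both endpoints get a copy, so the count is the degree.
        with_shadow[a] += 1
        with_shadow[b] += 1
    return no_shadow, with_shadow


# Hypothetical toy network and row order.
toy_order = ["A", "B", "C", "D", "E"]
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"),
             ("B", "C"), ("B", "E"), ("C", "E")]

no_shadow, with_shadow = zone_sizes(toy_order, toy_edges)
for node in toy_order:
    print(node, no_shadow[node], with_shadow[node])
```

Only the first node laid out is guaranteed to show matching counts; every later node can under-report in the non-shadow version, just as Joly shows six links there and 12 with shadows.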

So why not have BioFabric only support the shadow link version of the network?  That's a good question, but I would argue that the regular version is certainly more compact, provides a cleaner profile of the network structure, is more faithful to the true topology of the network, and is even preferable with smaller networks where all the links can be scanned in a single view. The clustered version of the Les Miserables network is one such example.

So, in stark contrast to the famous radio character, shadow links have instead "... the power to clear men's minds so they can see..."!  

Who shows what links connect to the hearts of nodes? The Shadow Links show!