Monday, June 24, 2013

"This only is the wedge-craft I have used."


OK, the title for this posting is not exactly what Shakespeare said (Othello, Act 1, Scene 3). But it's becoming clear to me as I continue to use BioFabric that putting lots of thought into creating informative edge wedge shapes (wedge-craft?) can pay off. It is well known that many aspects of form (e.g. length, width, size, and curvature) are detected via preattentive processing, which implies that BioFabric's edge wedges can take advantage of this perceptual "fast track" to enhance network visualizations.

I expect this to be my final post on my World Bank network (phew!), and I want to use it to show how edge wedges allow the viewer to quickly extract information from a BioFabric network. If you've been following this blog for a couple of months, the World Bank network should be familiar to you. But if not, I'll provide a quick summary of the posts. The network was introduced on April 11th, and explained further on April 13th. It was used as an example to describe how to build submodels on April 21st, and also how to use the Compare Multiple Nodes feature on May 20th.

But the most important background for today's post is the May 11th Pitching Wedges post, where I talked about how the edge wedge for each borrower country is organized. Reading that post is a prerequisite for understanding this one! Finally, the post on May 22nd highlighted a network feature (thick vs. thin umbilicals to global player suppliers) that is closely related to today's topic, and is good background as well.

This post will show how the judicious organization of edge wedges can make it easy to quickly compare how each country contracts out to suppliers using its World Bank loans. I'm going to do that by showing four different countries, which are shown below together in a single view. From left to right, the countries are Nicaragua, Madagascar, Peru, and Ethiopia. Let's look at each one in turn, though in a different order.

Comparing all four countries at once

OK, last chance to go back and review the Pitching Wedges post, or the following is not going to make any sense, since I'm not going to redefine terms like "global players", nor review how the wedges are organized. Also, the images below (though you can click on them to make them larger) are low resolution, so loading up the network directly in BioFabric using the downloadable .bif file is the best way to explore these countries in detail.

First, look at Peru, which is shown in detail below. From the shape of the edge wedge, i.e. the slope changes along the bottom of the edge wedge, we can instantly see that maybe 90% (just eyeballing it here) of suppliers are in Peru, and maybe two thirds of suppliers have single contracts. There are a fair number of suppliers with two to three contracts, but there are very few super-suppliers with many contracts. A close look shows that there are only 11 suppliers with more than three contracts, with one supplier having seven, and the biggest supplier having 11. Finally, we see the tiny number of contracts with the global-player suppliers. Note: if you look closely at the Peru data inside BioFabric, you will see that half of those few global-player suppliers in this case are actually based in Peru, so it is important to keep in mind that "global player" is not synonymous with a contract out of a country, though that is almost always the case.

Peru: Almost all in-country, few super-suppliers

Quickly contrast this with Nicaragua, shown below. A much larger fraction are multi-contract suppliers; maybe half, with about 40% single-contract suppliers. Furthermore, the very sharp point on the left instantly reveals some super-suppliers with many contracts, which is in stark contrast to Peru's profile. Close inspection reveals that there are nine suppliers with 14 or more contracts, and the biggest supplier has 37 contracts. Finally, just like Peru, and in fact just like most Latin American countries (see the May 22nd post), Nicaragua has very few contracts with the global player suppliers.

Nicaragua: Almost all in-country, some big super-suppliers

Next, let's look at Madagascar. It looks like about half the contracts go out of the country, and a big chunk (maybe 15 percent?) go to the global player suppliers. Most of the out-of-country contracts are to multi-contract suppliers. And, like Nicaragua, the sharply pointed left side indicates a few big super-suppliers.

Madagascar: About half to out-of-country suppliers

Finally, consider Ethiopia. Maybe 80% of all the contracts go to suppliers out of the country, and the large multi-contract suppliers are out of country as well. There are no in-country super-suppliers, which we can glean from the blunt angle on the left of Ethiopia's wedge. Finally, like other African countries (again, see the May 22nd post), Ethiopia has many contracts with the global player suppliers.

Ethiopia: 80% out of country, many global player suppliers

One caveat on this World Bank network example I need to acknowledge is that I have just been talking about the number of contracts, and not in any way accounting for the dollar amounts of each contract (you can see dollar amounts in the tag on each contract edge when you view it in BioFabric). One can argue that dollar amount is the relevant metric to be using here, and I won't disagree. But I've been trying to keep this example simple, and it is possible to address dollar amounts in another fashion that I will cover in a future blog post.
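
For anyone who wants to check eyeballed, count-based readings like the ones above against actual numbers, here is a minimal Python sketch. It is not part of BioFabric and does not read .bif files; it assumes you have already extracted one borrower's contracts as a list of (supplier name, supplier country) pairs, one pair per contract. The function name `wedge_profile`, the field names, and the toy data are all hypothetical, and the global-player category is left out for simplicity.

```python
from collections import Counter


def wedge_profile(contracts, borrower):
    """Summarize one borrower's contracts by count (dollar amounts ignored).

    contracts: iterable of (supplier_name, supplier_country) pairs,
               one pair per contract (assumed input format).
    borrower:  name of the borrower country, used to decide what is in-country.
    """
    contracts = list(contracts)
    if not contracts:
        raise ValueError("no contracts supplied")

    # Number of contracts held by each distinct (name, country) supplier.
    per_supplier = Counter(contracts)
    n_suppliers = len(per_supplier)

    single = sum(1 for count in per_supplier.values() if count == 1)
    in_country = sum(1 for (_, country) in per_supplier if country == borrower)
    out_contracts = sum(1 for (_, country) in contracts if country != borrower)

    return {
        "suppliers": n_suppliers,
        "contracts": len(contracts),
        "in_country_supplier_fraction": in_country / n_suppliers,
        "single_contract_supplier_fraction": single / n_suppliers,
        "out_of_country_contract_fraction": out_contracts / len(contracts),
        "largest_supplier_contracts": max(per_supplier.values()),
    }


# Made-up toy data, NOT taken from the World Bank network.
toy = [
    ("A", "Peru"), ("A", "Peru"), ("A", "Peru"),
    ("B", "Peru"),
    ("C", "Peru"),
    ("D", "Spain"),
]
profile = wedge_profile(toy, "Peru")
```

With the toy data, three of the four suppliers are in-country, three of the four hold a single contract, and the largest supplier holds three contracts. The same function could be run per country to put numbers next to the wedge shapes.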

Finally, this particular network has the unusual feature that most nodes (the suppliers) are each uniquely connected to just one borrower country, which allows those supplier nodes to be ordered independently, and precisely, on a country-by-country basis. Thus, this network lends itself to the compact, highly organized edge wedges that I have been showing. Most network topologies are not nearly as cooperative, but it is still possible to organize meaningful edge wedges. One powerful tool for doing that easily is link groups, which are mentioned in the BioFabric paper. I will cover those in a future blog post, as well.

I hope this example has shown that a careful and well-thought-out approach to organizing BioFabric edge wedges allows the viewer to rapidly extract and compare network features. With that, it's time to finally move past the World Bank network and on to other data sets I have in the pipeline for future blog posts, but keep this network in mind as you go forth and practice entrancing wedgecraft.

Saturday, June 22, 2013

BioFabric Boffin BoF!

My blog postings have been mighty thin (as in non-existent) so far this month, as I've been traveling for work. I'm now writing my next post, and it should show up soon (and I've got about a half-dozen in the pipeline right now).

This post is just a heads-up that I'm organizing a BioFabric Birds of a Feather (BoF) session at the upcoming 21st Annual International Conference on Intelligent Systems for Molecular Biology/12th European Conference on Computational Biology, i.e. ISMB/ECCB 2013. The conference will be in Berlin, Germany from July 21-23, 2013. The BoF sessions are scheduled for Monday July 22, 5:40 PM - 6:40 PM, with the rooms still to be announced. Hope to see you there!