Saturday, April 13, 2013

The Ongoing Contemplation of Entangled Banks


Let's continue the discussion of the World Bank Major Contract Award network that I introduced in my last blog post.

To recap, the network comes from a database of contract awards. Each row in the database creates an edge of the form (Borrower Country)  --> (Supplier Country:Supplier), so every node is either a borrower country or a supplier. By my count, there are 156 borrower "countries", though that group includes supranational entities like "Africa", "South Asia", "East Africa and Pacific", and even "World". The rest of the 44,213 nodes are suppliers. Here's the 30,000-foot BioFabric view of the whole network:

World Bank Major Contract Awards 2007-2013: BioFabric


And here again is a close-up of one wedge, for the country Niger:


Niger Screenshot, World Bank Major Contract Awards 2007-2013: BioFabric



There are a few important observations we can make about the nodes:

  1. A large fraction of the suppliers only show up once in the database. In terms of the network, that means those supplier nodes have degree one, with only one inbound edge. In terms of the BioFabric version, it means there is no visible node line for those nodes. The inbound edge terminates at the end glyph on the node's assigned row, and the node label is shown, but there does not have to be a line drawn. Thus, the network as a whole is mostly devoid of node lines.
  2. For those suppliers who do have multiple inbound edges, almost all of those are suppliers to a single borrower country. The net result of this fact, combined with the preceding one, is that for each borrower country, the great majority of attached nodes are exclusive to that country. From the BioFabric perspective, each of the 156 borrower countries has its own separate edge wedge, and those wedges are mostly self-contained communities consisting of the borrower country and its exclusive suppliers.
  3. Among the suppliers, there are some "global players" who have contracts with more than one country. In the BioFabric network, these global players show up at the very bottom of the view, with node lines that span a good fraction of the width of the network. It turns out that almost every borrower country has some contracts with these global players, and it is these edges that appear as the long vertical "umbilical cord" leading down from each borrower to the common substrate of these global suppliers.
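
The three observations above amount to sorting suppliers into three bins: those with a single contract, those with several contracts from one borrower, and the global players. Here is a minimal Python sketch of that bookkeeping. The `awards` rows and supplier names are made up for illustration, and `classify_suppliers` is a hypothetical helper, not part of BioFabric.

```python
from collections import defaultdict

# Hypothetical sample rows, one (borrower, supplier) pair per contract award.
# These are NOT taken from the World Bank database.
awards = [
    ("Niger", "France:Supplier A"),
    ("Niger", "Niger:Supplier B"),
    ("Niger", "Niger:Supplier B"),
    ("Bolivia", "Bolivia:Supplier C"),
    ("Kenya", "France:Supplier A"),
    ("Kenya", "Kenya:Supplier D"),
]

def classify_suppliers(edges):
    """Split suppliers into single-contract, exclusive, and global sets."""
    contracts = defaultdict(int)   # supplier -> number of inbound edges
    borrowers = defaultdict(set)   # supplier -> distinct borrower countries
    for borrower, supplier in edges:
        contracts[supplier] += 1
        borrowers[supplier].add(borrower)

    single, exclusive, global_players = set(), set(), set()
    for supplier, degree in contracts.items():
        if len(borrowers[supplier]) > 1:
            global_players.add(supplier)   # contracts with several borrowers
        elif degree == 1:
            single.add(supplier)           # degree one: no node line needed
        else:
            exclusive.add(supplier)        # several edges, one borrower
    return single, exclusive, global_players

single, exclusive, global_players = classify_suppliers(awards)
print(sorted(single), sorted(exclusive), sorted(global_players))
```

The first two bins together make up each borrower's self-contained wedge; the third is the shared substrate at the bottom of the view.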

Note that I said that "almost every" borrower country has contracts with the global players. How many don't? That's pretty easy to answer with just the scrollbar and Ctrl-mouse drags (Command-mouse drags on the Mac) when viewing the model in BioFabric: I count 13 countries that don't link to the global players. Almost all of these are the tiny wedges near the lower right. The biggest wedge meeting this description is Bolivia, which you can spot with the naked eye even in the low-resolution global view above (after you click to enlarge!). It sits about 70% of the way across, going left to right.

User Tip: By the way, Ctrl-mouse drags, or Command-mouse drags on the Mac, are essential navigation tools!  With really big networks, the scroll bars become too sensitive to be really useful when you are zoomed in.  But those mouse drags while holding down the Ctrl or Command keys are always useful and scale-appropriate.
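
The count of borrowers with no links to the global players, done by eye above, could also be cross-checked against the raw edge list. The sketch below assumes the edges are available as (borrower, supplier) pairs. The sample rows are invented, and `borrowers_without_global_links` is a hypothetical helper name.

```python
from collections import defaultdict

# Hypothetical sample rows, one (borrower, supplier) pair per contract award.
awards = [
    ("Niger", "France:Supplier A"),
    ("Niger", "Niger:Supplier B"),
    ("Bolivia", "Bolivia:Supplier C"),
    ("Kenya", "France:Supplier A"),
    ("Kenya", "Kenya:Supplier D"),
]

def borrowers_without_global_links(edges):
    """Return borrowers none of whose suppliers serve any other borrower."""
    borrowers_of = defaultdict(set)
    for borrower, supplier in edges:
        borrowers_of[supplier].add(borrower)

    # A global player is a supplier linked to more than one borrower.
    global_suppliers = {s for s, bs in borrowers_of.items() if len(bs) > 1}

    all_borrowers = {b for b, _ in edges}
    linked = {b for b, s in edges if s in global_suppliers}
    return sorted(all_borrowers - linked)

print(borrowers_without_global_links(awards))
```

In the toy data only one borrower is isolated from the shared suppliers; on the real database the same set difference would give the list to compare against the visual count.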

The network is drawn using a custom layout that was created simply by specifying a special node-row ordering. The country nodes were ordered, top to bottom, by decreasing degree, which is why the wedges get smaller as you go from top to bottom. Immediately below each country node, the supplier nodes exclusive to that country were laid out. Finally, the global supplier nodes were assigned to the bottom rows, again according to decreasing degree.
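
The ordering rule described above is simple enough to sketch in a few lines of Python. This is an illustration under assumptions, not the script actually used: the sample rows are invented, `node_row_order` is a hypothetical name, and the order of exclusive suppliers within a wedge (decreasing degree, ties broken by name) is a guess, since only the country and global-supplier orderings are stated.

```python
from collections import Counter, defaultdict

# Hypothetical sample rows, one (borrower, supplier) pair per contract award.
awards = [
    ("Niger", "France:Supplier A"),
    ("Niger", "Niger:Supplier B"),
    ("Niger", "Niger:Supplier B"),
    ("Bolivia", "Bolivia:Supplier C"),
    ("Kenya", "France:Supplier A"),
    ("Kenya", "Kenya:Supplier D"),
]

def node_row_order(edges):
    """Return node names top to bottom: wedges first, global suppliers last."""
    degree = Counter()
    borrowers_of = defaultdict(set)
    for borrower, supplier in edges:
        degree[borrower] += 1
        degree[supplier] += 1
        borrowers_of[supplier].add(borrower)

    def by_degree(name):
        # Decreasing degree; the name tie-break is an assumption.
        return (-degree[name], name)

    countries = sorted({b for b, _ in edges}, key=by_degree)
    order = []
    for country in countries:
        order.append(country)
        # Suppliers whose only borrower is this country sit right below it.
        exclusive = [s for s, bs in borrowers_of.items() if bs == {country}]
        order.extend(sorted(exclusive, key=by_degree))

    # Suppliers shared by several borrowers go on the bottom rows.
    shared = [s for s, bs in borrowers_of.items() if len(bs) > 1]
    order.extend(sorted(shared, key=by_degree))
    return order

for row, name in enumerate(node_row_order(awards)):
    print(row, name)
```

The result is just an ordered list of node names, one per row, which is all the custom layout needs as input.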

Thus, we have the distinctive shape of the network, and an immediate optimization leaps to mind. Since the country wedges are completely independent above the shared global substrate at the bottom, we could collapse the vertical dimension of the network by reusing node rows across countries. In other words, the long umbilicals could all be eliminated, with all the country wedges sitting directly over the shared global nodes. The current version of BioFabric can't do that, since it is hardwired to give every node in the network its own exclusively assigned row, even if the node has degree one and does not require an explicitly drawn node line. This is what produces the stair-step appearance, since each of the 44,213 nodes goes into its own row.

Allowing node rows to be shared is an interesting possibility for a future enhancement, but I am of two minds about it, since it trades the iron-clad "one node per row" rule for a more compact, but more ambiguous, representation. I'm also pondering the possibility, and advisability, of allowing edge column sharing as well. The fact is, the first BioFabric prototypes allowed this kind of sharing in an attempt to compress the representation, and the results were confusing, not compelling. This network, however, is a special case where the reverse may be true, so perhaps the feature should be allowed, but not encouraged?
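
To get a feel for how much vertical space row sharing might save, here is a back-of-the-envelope Python sketch. It assumes the shared layout stacks every wedge directly over the global rows, so its height is the tallest wedge plus the number of global suppliers. The sample rows are invented and `row_counts` is a hypothetical helper; nothing here reflects how BioFabric itself would implement the feature.

```python
from collections import defaultdict

# Hypothetical sample rows, one (borrower, supplier) pair per contract award.
awards = [
    ("Niger", "France:Supplier A"),
    ("Niger", "Niger:Supplier B"),
    ("Bolivia", "Bolivia:Supplier C"),
    ("Kenya", "France:Supplier A"),
    ("Kenya", "Kenya:Supplier D"),
]

def row_counts(edges):
    """Return (rows with one node per row, rows if wedges share rows)."""
    borrowers_of = defaultdict(set)
    for borrower, supplier in edges:
        borrowers_of[supplier].add(borrower)

    countries = {b for b, _ in edges}
    wedge_rows = {c: 1 for c in countries}   # each wedge starts with its country
    n_global = 0
    for supplier, bs in borrowers_of.items():
        if len(bs) > 1:
            n_global += 1                    # shared supplier keeps its own row
        else:
            (only_borrower,) = bs
            wedge_rows[only_borrower] += 1   # exclusive supplier joins the wedge

    one_per_node = len(countries) + len(borrowers_of)
    shared = max(wedge_rows.values()) + n_global
    return one_per_node, shared

print(row_counts(awards))
```

With one node per row the height grows with the total node count, while with sharing it grows only with the largest wedge. The cost is the ambiguity discussed above: a given row would no longer identify a single node.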

Also, I mentioned above that this custom layout only required that I specify the node order. In fact, it frequently seems to be the case that, once the node order is specified, the default edge-drawing algorithm does perfectly well at producing a good custom layout.

That's it for tonight. I've tried to describe the logic behind the large-scale structure of the BioFabric version of the network. The next installment will dive in and look at the detailed properties of the edge wedges, and what they can tell us about each borrower country!
