Friday, October 4, 2013

How do I View Yeast? Let me Count the Ways

In my previous posts on the Caltech dorm network, I used link groups to separate intra- and inter-dorm links into distinct sets. I now want to shift away from social networks for a bit and look at biological ones. My goal is to show how link groups can also be used to compare multiple experimental results in a single network view. That will take a couple of posts; this first one sets the stage, and the follow-on will show how link groups can be applied. To do this, I will use network data from this paper from the Institute for Systems Biology:
Smith, J.J., Ramsey, S.A., Marelli, M., Marzolf, B., Hwang, D., Saleem, R.A., Rachubinski, R.A., and Aitchison, J.D. Transcriptional responses to fatty acid are coordinated by combinatorial control. Molecular Systems Biology, 2007, 3:115.
The experimental data in the paper consists of two networks, which detail the targets of four different transcriptional regulators: Oaf1p, Pip2p, Oaf3p and Adr1p. They are labeled as O, P, Y, and A respectively. One of the networks is obtained under a yeast growth condition with low (0.1%) glucose, and the other is from a time point five hours after the sole carbon source has been switched to oleate (a fatty acid).

The original networks (Figure 1 in the paper) show that relatively few targets are under combined control in the glucose condition, while control is more complex in the oleate condition. Go take a look at those networks, and then have a look at the BioFabric versions of these two networks. First, the glucose condition:


BioFabric Network Visualization of The Targets of Four Yeast Regulators
Note how the node ordering of the BioFabric network has been set so that the regulators appear in the top four node rows. (In both these examples, the node ordering was specified in a file using the Layout Using Node Attributes function.) All the target nodes are then arranged in the rows below these four regulators, in a very specific order, so that target nodes with the same input combinations will appear together in distinct, contiguous horizontal bands. You can think of it this way: with four inputs, there are 16 distinct input combinations (2^4). But since we are not showing any targets that are not regulated at all by these four, the 0000 state is omitted, leaving 15.

The 15 different input combinations can be represented by four-bit binary numbers, one bit per regulator in the order O, Y, P, A, with a 1 indicating that a target is regulated by that input and a 0 indicating that it is not. With this scheme, we can sort the nodes using this number. Targets with all four inputs get the binary number 1111 (= 15) and are placed in the top target rows of the fabric, just under the four regulators. Targets regulated only by A get the binary number 0001 (= 1) and are placed at the bottom of the fabric. (When nodes have the same inputs, we sort them alphabetically by name.) Symbolically, the stack of sorted binary numbers looks like this:

OYPA
1111
1110
1101
1100
1011
1010
1001
1000
0111
0110
0101
0100
0011
0010
0001
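
As a rough illustration of this ordering scheme, here is a minimal Python sketch. It is not how BioFabric itself does the layout (the ordering for these networks was supplied in a file); the function names and the target names are made up for the example, and the only things taken from the post are the OYPA bit order, the decreasing sort, and the alphabetical tie-break.

```python
# Bit order follows the OYPA column header: O is the high bit, A is the low bit.
REGULATORS = "OYPA"


def input_code(inputs):
    """Return the 4-bit code for a set of regulator labels, e.g. {"O", "A"} -> 0b1001."""
    code = 0
    for label in REGULATORS:
        # Shift in a 1 if this regulator is an input to the target, else a 0.
        code = (code << 1) | int(label in inputs)
    return code


def order_targets(target_inputs):
    """Sort target names by decreasing input code, then alphabetically by name."""
    return sorted(
        target_inputs,
        key=lambda name: (-input_code(target_inputs[name]), name),
    )


# Hypothetical target names, for illustration only (not data from the paper).
example = {
    "geneC": {"A"},
    "geneA": {"O", "P"},
    "geneB": {"O", "P"},
    "geneD": {"O", "Y", "P", "A"},
}

for name in order_targets(example):
    print(format(input_code(example[name]), "04b"), name)
```

With this toy input, geneD (1111) comes first, geneA and geneB (both 1010) follow in alphabetical order, and geneC (0001) lands at the bottom, which is the same banding you see in the stack above.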


If you now compare this stack, a decreasing sort of the binary numbers from 15 down to 1, with the BioFabric glucose network above, you will see the same pattern. You can check and see that there are no targets with all four inputs (they would be in the top rows just under the four regulators). It is also obvious that the vast majority of targets have only one input, with O (Oaf1p) being the clear winner. You can also spot pretty quickly that there are only four targets with three inputs, just like in the original network diagram in the paper.

The next picture is the BioFabric version of the network under the oleate condition:

BioFabric Network Visualization of The Targets of Four Yeast Regulators


This diagram uses the same ordering scheme as the first. You can clearly see from the top node rows that there are now many targets with all four inputs (you can count 28). In fact, now all 15 of the different input states are represented. Finally, not only are there many more targets compared to the glucose condition, but the fraction of targets under the control of more than one regulator has increased.
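
If you would rather compute these comparisons than read them off the picture, a small Python sketch along the following lines would do it. The helper names and the toy data are hypothetical (they are not the paper's targets); the sketch just assumes each target is mapped to the set of regulators that control it.

```python
from collections import Counter


def count_by_input_number(target_inputs):
    """Count how many targets have 1, 2, 3, or 4 regulator inputs."""
    return Counter(len(inputs) for inputs in target_inputs.values())


def multi_input_fraction(target_inputs):
    """Fraction of targets controlled by more than one regulator."""
    if not target_inputs:
        return 0.0
    multi = sum(1 for inputs in target_inputs.values() if len(inputs) > 1)
    return multi / len(target_inputs)


# Made-up toy network, for illustration only (not data from the paper).
toy_network = {
    "t1": {"O"},
    "t2": {"O", "P"},
    "t3": {"A"},
    "t4": {"O", "Y", "P", "A"},
}

print(count_by_input_number(toy_network))
print(multi_input_fraction(toy_network))
```

Running the same two functions on the glucose and oleate target lists would give the numbers behind the visual impression: the count of four-input targets and the share of targets with combined control in each condition.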

So that's an introduction to the data and to the basic approach I'm using for ordering the node rows. In the next post, I'll discuss how to use link groups to take these two separate networks and combine them into one.