Saturday, September 7, 2013

New Kids in Town?


My attempts to keep the blog fresh and current this summer? EPIC FAIL! But I have a backlog of post topics, so I hope that this post marks the end of the dry spell.

In my last post, so very long ago, I introduced my BioFabric version of the Caltech Dorms Facebook Network, where nodes are students, edges are Facebook friend relationships, and the students have been grouped by dorm. Additionally, the edges for each student are grouped into two separate edge wedges: the first (left) one is for friend connections within a dorm, and the second (right) one is for connections between dorms. For a better view of that network, go to the scrollable version, but be warned that it is kinda big: 3.8 MB. 
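To make the edge wedge idea concrete, here is a minimal Python sketch of the grouping step: each student's edges are split into a within-dorm list and a between-dorm list. This is not BioFabric's actual code; the function name `edge_wedges`, the student labels, and the toy data are all made up for illustration.

```python
from collections import defaultdict


def edge_wedges(edges, dorm_of):
    """Split each student's edges into within-dorm and between-dorm groups."""
    wedges = defaultdict(lambda: {"within": [], "between": []})
    for a, b in edges:
        # An edge belongs to the "within" wedge only if both ends share a dorm.
        kind = "within" if dorm_of[a] == dorm_of[b] else "between"
        wedges[a][kind].append(b)
        wedges[b][kind].append(a)
    return wedges


# Toy data (hypothetical, not taken from the Caltech data set).
dorm_of = {"s1": 4, "s2": 4, "s3": 5, "s4": 5}
edges = [("s1", "s2"), ("s1", "s3"), ("s3", "s4"), ("s2", "s3")]

wedges = edge_wedges(edges, dorm_of)
for student in sorted(wedges):
    print(student, wedges[student])
```

In the layout, the "within" list would correspond to the left wedge and the "between" list to the right wedge.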

There are a few interesting things in that network, so I'll be spending a couple of blog posts covering them. The first one is pretty simple, and you can spot it easily while scrolling across the network. It's at the tail end of the Dorm 5 cluster, and it shows up in the following figure. The figure shows two separate pieces of the network, divided by the vertical blue line, that are aligned so the nodes match up:

[Figure: BioFabric Caltech Dorm Network Example]
Take a look at circled students 144 and 85, who are in Dorm 5; they are in the right half of the figure. What's interesting about them is that they look more like members of Dorm 4 than Dorm 5. For comparison, the left half of the figure shows some Dorm 4 students, and the horizontal red lines show the extent of the Dorm 4 cluster. Clearly, 144 and 85 have most of their friends in Dorm 4. And as the following detail shows, they know very few people in Dorm 5 (though they do know each other):

[Figure: BioFabric Caltech Dorm Network Detail]

So perhaps we can hazard a guess that 144 and 85 are recent arrivals to Dorm 5, both coming from Dorm 4?
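The pattern spotted by eye here can also be checked by counting: tally each student's friends by dorm and flag anyone who has more friends in some other dorm than in their own. The Python sketch below does that on made-up toy data; the function names and student labels are hypothetical, and the counts have nothing to do with the real numbers for students 144 and 85.

```python
from collections import Counter


def friends_by_dorm(student, edges, dorm_of):
    """Count a student's friends per dorm."""
    counts = Counter()
    for a, b in edges:
        if a == student:
            counts[dorm_of[b]] += 1
        elif b == student:
            counts[dorm_of[a]] += 1
    return counts


def out_of_place(edges, dorm_of):
    """Map each student with more friends in another dorm to that dorm."""
    flagged = {}
    for student, home in dorm_of.items():
        counts = friends_by_dorm(student, edges, dorm_of)
        if not counts:
            continue  # no friends at all, nothing to compare
        top_dorm, top_count = counts.most_common(1)[0]
        # Strictly more friends elsewhere than at home (ties are not flagged).
        if top_count > counts[home]:
            flagged[student] = top_dorm
    return flagged


# Toy data (hypothetical): x and y live in dorm 5 but mostly know dorm 4.
dorm_of = {"p": 4, "q": 4, "r": 4, "x": 5, "y": 5, "z": 5}
edges = [
    ("p", "q"), ("q", "r"), ("p", "r"),
    ("x", "p"), ("x", "q"), ("x", "r"),
    ("y", "p"), ("y", "q"), ("y", "r"),
    ("x", "y"), ("y", "z"),
]

flagged = out_of_place(edges, dorm_of)
print(flagged)
```

A count like this only says where a student's friends are; it says nothing about why, which is where the guess comes in.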

Of course, this sort of visual analysis can also be done using adjacency matrices that have been ordered to show the dorm groups on the diagonal. However, I will argue that the visual cues provided by the two-dimensional edge wedges of BioFabric make students like these stand out better than a one-dimensional column of the matrix does. This is particularly true when the resolution of the adjacency matrix falls below the threshold of one pixel per student, as we would expect in larger networks. Furthermore, at such resolutions, I think it would be very difficult to spot that a single column has a set of pixels in one set of rows (Dorm 4) while simultaneously missing pixels in another set of rows (Dorm 5).
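For comparison, here is a minimal Python sketch of the adjacency matrix view: students are sorted by dorm so that each dorm forms a block on the diagonal, and a single student then corresponds to a single column. The function name `ordered_adjacency` and the toy data are made up for illustration and are not from the Caltech data set.

```python
def ordered_adjacency(edges, dorm_of):
    """Build a symmetric 0/1 adjacency matrix with students sorted by dorm."""
    order = sorted(dorm_of, key=lambda s: (dorm_of[s], s))
    index = {student: i for i, student in enumerate(order)}
    n = len(order)
    matrix = [[0] * n for _ in range(n)]
    for a, b in edges:
        matrix[index[a]][index[b]] = 1
        matrix[index[b]][index[a]] = 1
    return order, matrix


# Toy data (hypothetical): x lives in dorm 5 but mostly knows dorm 4.
dorm_of = {"p": 4, "q": 4, "r": 4, "x": 5, "y": 5, "z": 5}
edges = [
    ("p", "q"), ("q", "r"), ("p", "r"),
    ("x", "p"), ("x", "q"), ("x", "r"),
    ("x", "y"), ("y", "z"),
]

order, matrix = ordered_adjacency(edges, dorm_of)

# The column for x: filled in the dorm 4 rows, mostly empty in the dorm 5 rows.
x_col = [row[order.index("x")] for row in matrix]
print(order)
print(x_col)
```

In this tiny example the column is easy to read, but it is one cell wide; the argument above is about what happens to that single column once the matrix is scaled down.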

Now let's see if I can manage to get my rate of blog posts back up to speed...

3 comments:

  1. Your interpretation of the students moving doesn't mesh with how the House system works at Caltech: students there don't move from a House where they do have friends to one where they don't, as you posited (if I understood your speculation correctly). A more likely explanation is that these people are in one House (the one _without_ their friends) but instead hang out with people in the other House. For some Houses, such a person might have become a "social member" or might eventually become one. (I think only some of the Houses do this.) But you could suggest that these people might _eventually_ change affiliation to the House where their friends are in practice, so this is something that could suggest _future_ moves rather than indicate recent ones.

    Replies
    1. And to expand upon my comment below, I finally put two and two together, and realized you are indeed a true domain expert on the paper I'm using. Thanks for following along! I've got a couple more posts on this network in the pipeline. Hope you find the BioFabric visualizations useful for investigating your data set.

  2. Agreed, that is a much better explanation. Great example of how domain knowledge is crucial to understanding a system. Thanks for the comment.
