Proteomics 2012, 12, 1669–1686. DOI 10.1002/pmic.201100454
REVIEW
Visualization of the interactome: What are we looking at?
David C. Y. Fung1, Simone S. Li1, Apurv Goel1, Seok-Hee Hong2 and Marc R. Wilkins1
1 New South Wales Systems Biology Initiative and School of Biotechnology and Biomolecular
Sciences, The University of New South Wales, New South Wales, Australia
2 School of Information Technologies, Faculty of Engineering and Information Technologies,
The University of Sydney, New South Wales, Australia
Network visualization of the interactome has become routine in systems biology research.
Not only does it illustrate the cellular organization of protein–protein interactions, it also
serves as a biological context for gaining insights from high-throughput data. However,
producing an effective visualization is challenging because the scale, biological context and
dynamics of any given interactome are too large and complex to be captured by a single
visualization. Visualization design therefore requires a pragmatic trade-off between capturing
biological concepts and remaining comprehensible. In this review, we focus on the biological
interpretation of different network visualizations. We draw on examples predominantly from our
own experience but elaborate on them in the context of the broader field. A rich variety of
networks is introduced, including interactomes and the complexome in 2D, interactomes in 2.5D
and 3D, and dynamic networks.
Keywords:
Bioinformatics / Interactome / Network visualization / Protein–protein interactions /
Systems biology / Visual analytics
Received: August 31, 2011
Revised: November 28, 2011
Accepted: December 19, 2011
1 Introduction
Intense interest in understanding the cellular organization of protein–protein interactions
(PPIs) has motivated large-scale PPI mapping projects using a variety of methods [1–3]. The
massive scale of the binary interaction data generated has enabled the construction of
interactomes. It has also spurred the need for in silico network visualization because it
provides a simplified summary of what is otherwise a lengthy adjacency list of protein pairs
[4]. Network visualization has thus become an essential part of systems biology, not least as
an analytical tool.
Correspondence: Professor Marc Wilkins, New South Wales Systems Biology Initiative, School of
Biotechnology and Biomolecular Sciences, The University of New South Wales, New South Wales,
2052, Australia
E-mail: m.wilkins@unsw.edu.au
Fax: +612-93851483
Abbreviations: GO, Gene Ontology; PPI, protein–protein interaction
Ideally, an effective interactome visualization should leverage the investigator’s ability to
comprehend the collaborative roles of various proteins in delivering cellular functions.
However, this ideal is challenging to meet. As will be reviewed in this paper, there is no
universal method for visualizing a PPI network or an interactome. Each method has its strengths
and limitations. The choice of method depends on human and material factors. The human factors
are the investigator’s analytical objective, expert knowledge of the research discipline and
cognitive capacity to resolve visualization scale and complexity. The material factors are
screen space and the computational tractability of rendering the visualization.
In this review, we will not cover popular tools that have been used for generating network
visualizations, e.g. Cytoscape [5], VisANT [6], Osprey [7], ProViz [8] and Patika [9], or
commonly used graphical layouts and their variants, e.g. circular [7, 10], force-directed
[11, 12], hierarchical [13] and parallel level layouts [14]. These have been reviewed elsewhere
[15]. For a more up-to-date review of visual network analytics of ‘omics data, we recommend a
more recent publication [16]. We think that it will be of more value to the molecular cell
biologist (hereafter, the investigator) if our review focuses on the biological interpretation
of different visualized PPI networks and interactomes. Only graph-based visualization will be
discussed because most investigators first familiarized themselves with the node-edge
representation of metabolic networks in the orthogonal layout. Graph-based visualizations
therefore match their mental picture of PPIs much better than the alternative adjacency
matrices.
© 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim www.proteomics-journal.com
The remainder of this review is divided into five sections.
In Section 2, we give a formal definition of the PPI network vi-
sualization. In Section 3, we review the various visualizations
of the most basic representation of a PPI network, i.e. undi-
rected networks representing only the physical PPIs with-
out explicitly displaying any a priori biological knowledge. In
Section 4, we give our formal definition of an interactome be-
fore reviewing the various visualizations. These are visualiza-
tions that involve not only physical PPIs but other intracellu-
lar networks, e.g. signal transduction networks. In Section 5,
we review timescale dynamic network visualizations. These
are ones that visualize changes in network topology induced
by the temporal variation in protein abundance. In Section 6,
we review visualizations that explicitly present biological con-
text as part of the interactome. Finally, we will discuss the
general challenges of generating effective visualizations of
interactomes for biological analysis. In the review, we will
draw on examples predominantly from our experiences in
network visualization but elaborate on them in the context
of the broader field. The visualizations discussed in Sections
3.1, 4.2, 4.4, 5.2 and 6.3 were generated by the authors using
GEOMI [17]. The Interactorium mentioned in Section 4.3 is
an OpenGL/C++ application known as SkyRail. It should be
noted that, throughout this review, we do not use the terms
‘PPI network’ and ‘interactome’ interchangeably because we
have defined them differently.
2 Visualization is a model
In the simplest terms, a PPI network can be defined mathe-
matically as a node-edge graph denoted as G(V, E), also known
as the graph-theoretic model [18]. This model contains a com-
bination of nodes V and edges E where each node v represents
a certain protein. Each edge e represents the physical interac-
tion between the populations of two protein species v1 and v2,
i.e. physical PPIs in short. Rather than being an exact replica,
the graph G is only a symbolic form of a real PPI network
that approximates the variety of PPIs known to occur within
a living cell. Network visualization is therefore the process
of mapping the graph-theoretic model of the PPI network to
the visual elements drawn on a screen. In other words, the
node-edge graph is not visualized until its visual representa-
tion is being drawn by computation. That is why researchers
in the information visualization field recognize PPI network
visualization as largely a graph drawing problem.
Although the word ‘graph’ has often been used inter-
changeably with ‘network’, they are not the same. A graph
is just a combinatorial model of nodes and edges which, by
itself, does not model the function of a PPI network [19].
On the other hand, a network does because its edges repre-
sent interactions required for the functioning of one or more
pathways. For this reason, the graph-theoretic model of a PPI network is usually a labelled
graph. It has node and edge attributes attached, i.e. the graph G is a set of V, E, Φ and Ψ,
denoted as G = (V, E, Φ, Ψ), where Φ denotes the set of node attributes and Ψ denotes the set
of edge attributes. The node
attributes can include the protein symbol, expression level
and node colour. The edge attributes can include the inter-
action mode (activation, inactivation, induction, repression
or physical), affinity, weights quantifying a certain statistical
score (e.g. correlation coefficient), or time span [20]. These
data, when retrieved and co-visualized, allow the investigator
to have a better understanding of the network.
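As a minimal sketch of the labelled graph described above, the four sets can be held in a small
Python data structure. The class name, method names, protein symbols and attribute values below
are illustrative assumptions and are not taken from any of the tools discussed in this review.

```python
from dataclasses import dataclass, field


@dataclass
class LabelledGraph:
    """Sketch of G = (V, E, Phi, Psi); all names here are illustrative."""
    nodes: set = field(default_factory=set)          # V: protein symbols
    edges: set = field(default_factory=set)          # E: unordered protein pairs
    node_attrs: dict = field(default_factory=dict)   # Phi: node -> attributes
    edge_attrs: dict = field(default_factory=dict)   # Psi: edge -> attributes

    def add_protein(self, symbol, **attrs):
        self.nodes.add(symbol)
        self.node_attrs.setdefault(symbol, {}).update(attrs)

    def add_interaction(self, a, b, **attrs):
        self.add_protein(a)
        self.add_protein(b)
        edge = frozenset((a, b))  # undirected: (a, b) and (b, a) are the same edge
        self.edges.add(edge)
        self.edge_attrs.setdefault(edge, {}).update(attrs)
        return edge


# Hypothetical proteins P1 and P2 with made-up attribute values.
g = LabelledGraph()
g.add_protein("P1", expression=2.5, colour="red")
edge = g.add_interaction("P1", "P2", mode="physical", weight=0.8)
```

Keeping the attributes apart from the combinatorial structure mirrors the distinction drawn above
between a graph and a network: a renderer can read the attribute dictionaries to choose node
colour or edge style without altering V or E.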
Network visualization is built from glyphs, colour hues, lines or arcs and their layout on
screen. It is meant to serve as a visual analysis tool for the investigator to interact with,
not just a graphical representation of PPI data. As such, it can be used for visualizing the
complex spatial or spatio-temporal interactions between proteins. Those features raise the
investigator’s curiosity about the general properties of an interactome, such as asymmetric
self-organization
[21], modularity [22] and fault tolerance due to functional
degeneracy [23]. Asymmetric self-organization refers to the
heterogeneous preferential attachment among proteins lead-
ing to the uneven distribution of hubs within a network [24].
Modularity refers to the property by which a network can be
subdivided into interconnected subnetworks with each serv-
ing a certain biological function [25,26]. Functional degener-
acy refers to the overlapping functions of different modules
hence compensating for the lost function when one of the
modules fails [23]. Not least, a PPI network visualization can
form the basis of an interactome which involves (sub)model
integration [27]. This submodel can be a signal transduction
network or a metabolic network. As will be discussed in Sec-
tion 4, the integrated visualization of the physical PPI to its
corresponding signalling network or its underlying gene reg-
ulatory network is much more informative than just the PPI
network alone.
Since visualization is only a model rather than the exact
replica of a real interactome, there is a need to prioritize the
network properties for visualization purpose. In other words,
what should be the focus of a visualization? Network theo-
rists in complex systems research called this the centrality
of a model [27]. In simpler terms, what is the pre-conceived
biological concept that the model is trying to represent? It is
the major consideration in deciding on a visualization design
especially when it is intended for visual analytics [28]. Visual
analytics is the science of analytical reasoning supported by
highly interactive visual interfaces. It shares the same tenet
as data or information visualization, i.e. using an interactive
visualization or a system of visualizations that allow the in-
vestigator to gain insight into the information hidden in the
large-scale data [29]. As the investigator’s analytical objective changes, so does the
biological concept represented by the visualization.
3 Visualization on the cellular scale
3.1 Undirected network visualization
The most common visualization of a PPI network is an undi-
rected network in the force-directed layout [11]. The yeast
PPI network shown in Fig. 1A is a classic example. Each
protein is visualized as a coloured spherical node and the
physical interaction between two proteins is visualized as a
solid line representing an edge. This visualization provides
a static view of the known physical PPIs in the unicellular
Saccharomyces cerevisiae, devoid of interaction dynamics. It
represents all PPIs as undirected edges giving the investiga-
tor the false impression that each protein interacts with its
partners constitutively and collectively. For this reason, one
can argue that the undirected network is the coarsest approx-
imation of its real-life counterpart. It should be noted that
while the visualization exposes the pairwise interaction be-
tween two proteins, it does not mean that there is only one
molecule of a given protein interacting with its partners. In
effect, it ignores the stoichiometry of the protein complexes.
Rather, it is meant to be a concise summary of the interaction
between the populations of two proteins of interest.
Although it is only a coarse approximation, the undirected
network visualization does provide insights into the global
properties of the yeast PPI network which would otherwise
be obscured in the original binary interaction data set [30].
The most obvious one, as shown in Fig. 1A, is the heterogeneous connectivity of proteins. Some
nodes have more incident edges than others, which means that some proteins are more frequently
engaged in PPIs than others. A conspicuously higher number of incident edges therefore
indicates higher usage of that protein within the network. In the yeast PPI network, the number
of interaction partners can vary substantially from one protein to another.
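The heterogeneous connectivity described above can be checked directly on the adjacency list of
protein pairs before anything is drawn. The sketch below counts distinct interaction partners per
protein. The protein names are placeholders, and skipping self-interactions is our simplifying
assumption rather than a rule from the cited studies.

```python
from collections import Counter


def node_degrees(pairs):
    """Count distinct interaction partners per protein from a list of pairs."""
    partners = {}
    for a, b in pairs:
        if a == b:
            continue  # assumption: self-interactions are not counted as partners
        partners.setdefault(a, set()).add(b)
        partners.setdefault(b, set()).add(a)
    return {protein: len(found) for protein, found in partners.items()}


# Toy adjacency list; ("B", "A") repeats ("A", "B") and is counted once.
pairs = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "A")]
degrees = node_degrees(pairs)

# Degree distribution: how many proteins have each number of partners.
distribution = Counter(degrees.values())
```

In this toy list, protein A has three partners and D has one, which is the kind of disparity that
appears visually as nodes with many or few incident edges.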
Another obvious feature of the undirected PPI network is
that some parts of the network are denser, giving the entire
network a heterogeneous topology (Fig. 1A). The denser parts
contain multiple subnetworks called cliques which are en-
riched in protein complexes [20]. They often represent large
protein complexes, e.g. DNA polymerase or a proteasome.
The subunits within each complex have been denoted as party
hubs [26], which are more highly connected to one another
within the complex than without. The sparser parts of the
undirected PPI network contain some spoke-and-hub forma-
tions where each centre node is connected to multiple nodes
that do not necessarily interact with one another. The cen-
tre node has been denoted as a date hub protein which has
been proposed to interact with its partners more dynamically
than party hubs [26]. Their disparity in interaction dynamics
has recently been explained in terms of differences in their
3D structures. Date hubs have one or two interaction inter-
faces whereas party hubs have three or more [31]. Proteins
serving as connectors between multiple complexes have been
denoted as bottleneck proteins [32]. It has been suggested that
bottleneck proteins can be classified into hub-bottleneck and
non-hub bottleneck proteins.
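The visual difference between a clique and a spoke-and-hub formation can also be expressed
numerically. The sketch below uses the local clustering coefficient, i.e. the fraction of a
node's neighbour pairs that are themselves connected. This measure is our illustrative choice
and is not the classification method of the studies cited above; the node names are
placeholders.

```python
def build_adjacency(edges):
    """Undirected adjacency sets from a list of node pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def local_clustering(adj, v):
    """Fraction of pairs of v's neighbours that interact with each other."""
    nbrs = sorted(adj[v])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i, a in enumerate(nbrs)
        for b in nbrs[i + 1:]
        if b in adj[a]
    )
    return 2 * links / (k * (k - 1))


# A four-node clique (P, Q, R, S) and a hub H with three unconnected spokes.
clique = [("P", "Q"), ("P", "R"), ("P", "S"), ("Q", "R"), ("Q", "S"), ("R", "S")]
spokes = [("H", "X1"), ("H", "X2"), ("H", "X3")]
adj = build_adjacency(clique + spokes)

clique_member = local_clustering(adj, "P")  # every neighbour pair is connected
hub_centre = local_clustering(adj, "H")     # no neighbour pair is connected
```

A value near one suggests membership of a densely connected group, whereas a value near zero
matches the spoke-and-hub picture. The coefficient alone does not separate date hubs from
bottleneck proteins.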
Visually identifying date and party hubs is not without its caveats. The investigator may
notice that it is easier to distinguish between date hubs and party hubs in a large-scale PPI
network than to distinguish between date hubs and the different types of bottleneck proteins.
One example
is highlighted in Fig. 1A (dark blue boxed inset) where the dis-
tinctly sized protein node can be recognized as a party hub,
a date hub and also a hub-bottleneck protein. Even harder
to distinguish are the date hub and the non-hub bottleneck
protein. That is because both appear in a hub-and-spoke for-
mation in the force-directed layout. An example is also high-
lighted in Fig. 1A (green boxed inset). Because of its higher
edge density, a clique often draws the investigator’s atten-
tion more than the hub-and-spoke formation. If the yeast PPI
network is reduced to the size seen in Fig. 1B, the investi-
gator should find it easier to distinguish between the date
hub PCNA, and the hub-bottleneck protein CDC6 (Fig. 1B).
Here lies the limitation of undirected network visualization in the force-directed layout.
Sampling scale and/or bias can affect the investigator’s perception of the local topology of
any protein within the wider PPI network, and may lead to highly subjective interpretation.
3.2 PPI network visualization with topological
emphasis
The limitations seen with undirected network visualization
are largely due to its lack of biological context. An alternative
design adaptable to PPI network visualization is the between-
ness fast layout (BFL) which uses the biological relevance of
the shortest path betweenness centrality as a layout optimiza-
tion criterion [33]. The metric of betweenness centrality was
originally used for measuring the number of shortest paths
going through a certain node [34]. The design criterion of
BFL was based on the proposition that betweenness central-
ity is a useful predictor of essential proteins [32]. In the yeast
PPI network, nodes with the highest betweenness centrality
have been found to be bottleneck proteins that serve as con-
nectors between two or more complexes. It has been further
suggested that complexes interacting via bottleneck proteins
are in fact functional modules [35]. A recent study has suggested that the biological context
of betweenness centrality is also applicable to Caenorhabditis elegans [36], thus increasing
the confidence that betweenness centrality is a universally applicable metric across species.
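To make the metric concrete, the sketch below computes shortest-path betweenness for a small
unweighted, undirected network using the standard breadth-first accumulation scheme usually
attributed to Brandes. It illustrates the metric only and is not the BFL implementation of [33].
The toy network, two triangles joined through a single connector X, is hypothetical.

```python
from collections import deque


def build_adjacency(edges):
    """Undirected adjacency sets from a list of node pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def betweenness_centrality(adj):
    """Shortest-path betweenness for an unweighted, undirected graph."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        dist = {s: 0}                    # BFS distance from s
        sigma = {v: 0 for v in adj}      # number of shortest paths from s
        sigma[s] = 1
        preds = {v: [] for v in adj}     # predecessors on shortest paths
        order = []                       # nodes in order of discovery
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}    # dependency of s on each node
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was visited from both ends, so halve the totals.
    return {v: total / 2 for v, total in score.items()}


edges = [("A", "B"), ("A", "C"), ("B", "C"),   # first triangle
         ("D", "E"), ("D", "F"), ("E", "F"),   # second triangle
         ("C", "X"), ("X", "D")]               # connector X between them
bc = betweenness_centrality(build_adjacency(edges))
```

Here X lies on every shortest path between the two triangles and receives the highest score
despite having only two neighbours, which is the bottleneck behaviour described above.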
The BFL algorithm optimizes the positioning of high betweenness nodes as the first priority,
followed by node density, edge length and edge crossing minimization. An example of a murine
gene regulatory network visualization generated by this method is shown in Fig. 2A. The
visualization effectively highlights the multi-scalar nature of the gene regulatory network by
rendering the nodes of low betweenness as smaller in size than their high betweenness
counterparts. The effect is a reduction in the drawing area occupied by the hub-and-spoke or
radial formation of the date hubs with their interaction partners, but with much longer edges
than those in the force-directed layout (Fig. 2B).
Figure 1. Visualization of the yeast PPI network in the force-directed layout generated with
the use of GEOMI. (A) The largest connected component of the yeast PPI network, representing
1256 proteins and 1803 interactions. The network was generated using yeast two-hybrid data
compiled by Bertin et al. [78]. An example of a clique is highlighted in red and bound by the
dark blue box. The larger node may represent a party hub, date hub or hub-bottleneck protein.
An example of a date hub is highlighted in the green box. (B) Visualization of the DNA
replication (GO:0006260) PPI network, representing 55 proteins and 83 interactions [96]. Inset:
PCNA hub protein (indicated by the green arrow) and its interactions.
The BFL network visualization requires a prior understanding of the biological context of
betweenness centrality, which may not be widely known among investigators. Some may find it
more intuitive to relate node degree to lethality. Hence, multi-plane or concentric spherical
layouts that stratify the PPI network by node degree ranges could be usable alternatives for
them [37]. An equally useful alternative is to highlight network motifs in the PPI network
[38]. This has been applied to gene regulatory networks but should also be applicable to PPI
networks. The investigator should note that in an undirected PPI network, network motifs
represent the probable PPIs that give rise to protein complexes [30].
Figure 2. Visualization of the mouse gene regulatory network using the betweenness fast layout
algorithm [33]. (A) The size of each blue coloured node corresponds to its magnitude of the
shortest path betweenness centrality score. Inset: Close-up view of the mouse gene regulatory
network. (B) The same network drawn using the force-directed layout.
4 Visualization of an interactome
4.1 Definition of an interactome
In reality, a cellular interactome contains not only physical PPIs but also those in other
interaction modes, e.g. signalling interaction, transcriptional regulatory interaction and
metabolic reaction [19, 20]. Because the latter are initiator-effector relationships in which
the initiator acts on its effector(s), they can be represented by directed edges and visualized
as solid arrows. An interactome should therefore be modelled mathematically as a node-edge graph
containing both directed and undirected edges [30], i.e. a semi-directed network [39].
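One way to hold such a semi-directed network in memory is to keep the two edge types in separate
containers, as in the sketch below. The class and method names, the node names and the
interaction modes assigned to them are hypothetical and serve only to illustrate the definition.

```python
class SemiDirectedNetwork:
    """Sketch of a graph with both undirected and directed edges."""

    def __init__(self):
        self.undirected = set()   # physical PPIs as unordered pairs
        self.directed = {}        # (initiator, effector) -> interaction mode

    def add_physical(self, a, b):
        self.undirected.add(frozenset((a, b)))

    def add_directed(self, initiator, effector, mode):
        self.directed[(initiator, effector)] = mode

    def effectors(self, initiator):
        """Nodes that the given initiator acts on."""
        return {e for (i, e) in self.directed if i == initiator}

    def physical_partners(self, node):
        """Nodes joined to the given node by an undirected edge."""
        return {other for pair in self.undirected if node in pair
                for other in pair if other != node}


# Hypothetical nodes: K acts on S and T; S also binds T physically.
net = SemiDirectedNetwork()
net.add_directed("K", "S", mode="signalling")
net.add_directed("K", "T", mode="transcriptional regulation")
net.add_physical("S", "T")
```

A renderer could then draw entries of `directed` as solid arrows and entries of `undirected` as
plain lines, as described above.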
4.2 Overlapping network visualization
In practice, the visualization of multiple interaction types
within the same network is cognitively challenging to com-
prehend unless the investigator can mentally decompose the
interactome into layers of heterogeneous networks. Figure
3 shows how such a mental picture can be captured effec-
tively with the use of overlapping network visualization in
the parallel plane layout [40,41]. The interaction data for the
TGFβ signal transduction network was sourced from Cui et
al. [42] and that for the nuclear PPI network was sourced from
BioGRID [43] and ECHO [44] databases. The visualization not
only represents the interactions in the human TGFβ-activated
signalling network and in the nuclear PPI network, but also
the PPIs that participate in both. Each network is constrained to a 2D plane. The planes are
stacked along the z-axis, hence the name parallel plane layout [40]. The oblique view shows the
mapping between the signalling and the nuclear PPI networks (Fig. 3B). The signalling network
is drawn in a grid layout with fixed coordinates assigned to each node, whereas the nuclear PPI
network is drawn in the force-directed layout. Nodes in the signalling network
which share identical Gene Symbols with their correspond-
ing nodes in the PPI network are drawn on the middle plane
which forms the ‘overlap’ network. The edges within the over-
lap network are derived from both the signalling and the PPI
networks. Contrasting colour coding has been used to create
a visual segregation of the different networks. In order to re-
duce the scale of the nuclear PPI network, only the largest
connected component has been drawn [41].
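The data preparation behind the overlap plane can be sketched as follows, assuming that nodes are
matched purely by identical symbols and that the PPI network is first reduced to its largest
connected component. The function names and the placeholder symbols G0 to G9 are our own; this
is not the GEOMI implementation.

```python
def largest_component(edges):
    """Node set of the largest connected component of an undirected edge list."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    seen, best = set(), set()
    for start in adj:
        if start in seen:
            continue
        component, stack = {start}, [start]
        while stack:                      # depth-first traversal from start
            v = stack.pop()
            for w in adj[v]:
                if w not in component:
                    component.add(w)
                    stack.append(w)
        seen |= component
        if len(component) > len(best):
            best = component
    return best


def overlap_layer(signalling_edges, ppi_edges):
    """Shared nodes, plus the edges of both networks that join shared nodes."""
    sig_nodes = {n for edge in signalling_edges for n in edge}
    ppi_nodes = {n for edge in ppi_edges for n in edge}
    shared = sig_nodes & ppi_nodes
    edges = [("signalling", a, b) for a, b in signalling_edges
             if a in shared and b in shared]
    edges += [("physical", a, b) for a, b in ppi_edges
              if a in shared and b in shared]
    return shared, edges


signalling = [("G0", "G1"), ("G1", "G2"), ("G2", "G5")]   # directed pairs
ppi = [("G1", "G2"), ("G2", "G3"), ("G3", "G4"), ("G8", "G9")]

keep = largest_component(ppi)
ppi_main = [(a, b) for a, b in ppi if a in keep and b in keep]
shared, overlap_edges = overlap_layer(signalling, ppi_main)
```

The shared nodes would be placed on the middle plane, with inter-plane edges linking each one to
its copies on the signalling and PPI planes.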
The top and oblique views inform the investigator on how
the phosphorylation signal is being diffused from the sig-
nalling network to the nuclear PPI network extensively via the
highly connected hubs, e.g. ATM, CDK4, FOXO1A, HDAC1
and EP300 (Fig. 3A) [41]. The overlap layer reveals the proteins
commonly represented in the TGFβ signalling and nuclear
PPI networks, e.g. EP300, RB1 and HDAC1 (Fig. 3B). From
this, one can see how the parallel plane layout highlights
the physical connection between TGFβ signalling and the
nuclear PPI networks using inter-plane edges while expos-
ing their difference in interaction types [40]. Furthermore, it
decomposes a large semi-directed network into two smaller
and more comprehensible networks [41]. Although the exact
dynamics of TGFβ-regulated PPIs have not been explicitly shown in the nuclear PPI network, the
functional dependency implied by the visualization would prompt the investigator to construct
preliminary hypotheses worthy of further investigation.
Overlapping network visualization does have its limitations. The first is the dimensional
increase of the drawing from 2D to 2.5D. 2.5D is a representation in which graph drawing is
constrained to the first two dimensions, with the third dimension being used for a different
purpose [45]. The investigator may find it challenging to navigate through the visualization
using a mouse pointer device because mapping its 2D movement to motion in 3D space is not
straightforward [46]. The second is the growth in visual complexity as the network size on
each plane increases. We found that large PPI networks of over 1000 nodes and 2000 edges are
poor choices for visualization of this kind. The sheer amount of edge cluttering and node
occlusion can make the visualization unreadable. The key design criterion lies in restricting
the size of the network drawn on the top layer.
Figure 3. Visualization of the human signalling-nuclear PPI overlapping network using the
parallel plane layout [40] generated with the use of GEOMI. (A) Top view. This view exposes the
TGFβ signalling layer with a semi-transparent view towards the overlap layer, and the nuclear
PPI network layer underneath. The direction of each arrow represents the direction of a
phosphorylation reaction. The nodes representing ATM, CDK4, FOXO1A, HDAC1 and EP300 in the TGFβ
signalling layer are indicated by red arrows. (B) Oblique view. The TGFβ signalling network is
drawn on the top plane and stacked over the nuclear PPI network on the bottom plane. The
signalling proteins on the top plane are represented by blue-coloured nodes and the signalling
interactions are represented by blue arrows. On the bottom plane, proteins and their physical
interactions are represented as green-coloured nodes and edges, respectively. The network in
the middle plane represents the overlap between the signalling and the nuclear PPI networks.
Red-coloured nodes in this plane represent proteins common to the two networks. The blue lines
represent the signalling interactions, and the green lines represent physical PPIs. Nodes that
represent the same proteins are connected by yellow edges; examples (EP300, RB1 and HDAC1) are
indicated by red arrows.
Figure 4. Yeast DNA-binding protein Rap1 and its interaction partners in the nucleus, generated
with the use of the Interactorium [47]. Protein nodes are represented by circles in bright red;
those with structural data are highlighted by a small green cross next to their gene name. PPIs
are represented as the light pink solid lines; the thickness of the lines correlates to the
evidence score of the interaction. The two crystal structures of the Rap1-DNA complex, 3CZ6
(left) and 1IGN (right), are sourced from the PDB [48]. They are shown in the ribbon and string
conformation.
4.3 Interactive 3D visualization
Although the 2.5D overlapping network can effectively capture the functional relationship
between heterogeneous networks, it does not capture the multi-scalar physical modularity of an
interactome. An interactome comprises numerous subnetworks localized in a highly organized set
of compartments. The compartment can be a protein complex, an organelle
or any subcellular localization. One tool that enables the
investigator to navigate in between an overview and a de-
tailed view of an interactome, and trigger on-the-fly protein
data display on demand is the Interactorium [47]. The entire
visualization resembles a video game application that pro-
vides smooth multi-scale navigation, including fly-through
(zooming in and out) and fly-over, throughout the network,
and provides automatic centering on any selected protein.
Any given protein, along with its known interactions, can
be viewed in the context of a virtual cell, a virtual organelle
or a protein complex. The latter two are represented by dif-
ferent geometric shapes highlighted by object radiance (Fig.
4). The multi-scale visualization extends to the level of protein 3D structures. On-the-fly
display of 3D structures can be triggered by the mouse pointer action over the green cross
next to the Gene Symbol (Fig. 4).
Figure 4 shows a view focusing on the S. cerevisiae DNA-
binding protein Rap1 with three of its five interaction part-
ners. Rap1 is shown to be part of a single protein complex
with Rif1, Rif2 and Gcr1, since all four nodes are bound by the
red dashed circular node. One can further deduce that this
complex is located in the nucleus; since the red dashed cir-
cular node is itself bound within a large spherical node with
a contoured surface representing the cell nucleus. Figure 4
also shows how relevant structural information of a selected
protein can be visualized. Where multiple 3D structures exist
for a particular protein, these can be compared side by side.
Figure 4 shows the two PDB entries of Rap1 with 3CZ6 on
the left and 1IGN on the right [48]. Structure 3CZ6 shows the
structure of the C-terminus of Rap1 [49] whereas 1IGN mod-
els the interaction of Rap1 with telomeric DNA sequences
[50].
Interactive visualization is particularly useful in exploratory analyses whose primary aim is
hypothesis generation [51]. With large networks, this can be cognitively taxing due to the
overwhelming amount of information present, not to mention the complexity of the visualization.
Network filtering and selective information display remain indispensable for alleviating
cognitive burden, and hence for reducing the steep learning curve encountered when exploring
complex networks. The Interactorium provides various filtering criteria
for scale reduction [47]. The investigator can filter networks
by cellular localization, quality and/or quantity of evidence,
or membership in protein families or protein complexes, or
a combination of these. It should be noted that network fil-
tering will lead to a loss of information. Therefore, caution
on the investigator’s part is required when analyzing the fil-
tered network at multiple scales. The closest 2D alternative
that we know of is Patika which uses compound graphs. It
has been used for visualizing selected pathways instead of a
global interactome [9].
4.4 Complexome network visualization
Recent advances in detecting PPIs have enabled the systematic identification of protein
complexes [52–55], which place proteins in a cellular context. Initial complexome
representations were difficult to interpret and could not accommodate proteome-wide studies
[56–58]. With the progression of protein interaction studies to higher organisms that have over
2000 proteins, the need for scalable methods of network visualization is becoming increasingly
clear.
Since many processes inside the cell are orchestrated by
protein complexes, a shift in visualization from the protein
molecular level to that of the complex offers an alternative and
meaningful representation that increases our understanding
of cellular function. Unlike interactomes, a node in the com-
plexome network represents a protein complex (a unique set
of proteins), but what should an edge represent here? A con-
Figure 5. Visualization of the yeast complexsome [62] generated
with the use of GEOMI. (A) Towards complexome visualization. In
PPI networks, a highly interconnected group of proteins is often
thought to denote a single protein complex, which may or may
not be biologically correct. In this example, the network actually
represents two complexes (shown by the orange and turquoise
circles) that share one protein. This parity relationship can be rep-
resented as a pair of nodes, representing two complexes, linked
by an edge. (B) Visualization of the yeast complexome repre-
senting 398 complexes and 992 parity relationships using the
force-directed layout. Parity relationships involving only core and
module protein subunits are displayed.
sensus method for intuitively linking protein complexes to-
gether has yet to be established. This is due to the fact that
proteins within a complex may not necessarily have a direct
interaction with all other subunits. Network representations
to date have connected protein complexes using: (i) binary
interaction data from other studies (such as yeast two-hybrid)
[59, 60] or (ii) the presence of common subunits [61, 62]
(Fig. 5A). Although biologically intelligible, these methods
have their respective limitations. Insufficient overlap between
current PPI and protein complex data sets [63] means that the
resulting networks may lack cohesion. Furthermore, interac-
tion detection methods are often biased towards proteins with
particular physicochemical properties. Consequently, result-
ing networks often contain highly connected parts, represent-
ing proteins with many binary interactions that are present

in multiple complexes, which may or may not be biologi-
cally relevant. One alternative complex-centric visualization
involves connecting complexes that share protein subunits.
This, in effect, is a parity network. The term ‘parity’ originally
referred to the quality of sameness or equivalence in message
transmission from node to node through a computer network
[64]. Using this type of network to represent the complexome
can still be visually complex and incomprehensible. The high
degree of subunit sharing between complexes imparts a high
edge density to the network visualization, giving rise to large
amount of edge over edge, and edge over node, crossings.
These issues can be resolved by limiting the number of con-
nections drawn using an arbitrary threshold [59–61], e.g. if
the complexes share at least two common protein subunits,
or if they contain subunits with at least two known binary
PPIs. This, however, may result in the loss of biologically rel-
evant connections in the network. Yet another alternative is
to present only those protein subunits that are strongly as-
sociated with the complex (deemed ‘cores’ or ‘modules’) and
exclude those that are not (‘attachments’) [54]. A recent study
demonstrated that a comprehensible and biologically accu-
rate complexome network representation can be achieved by
using shared core or module proteins to build inter-complex
connections [62] (Fig. 5B).
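
The construction of a parity network described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the method of the cited studies: the function name `parity_network`, the `min_shared` threshold parameter, the role labels and the toy complexes are all assumptions introduced here. It connects two complexes when they share enough subunits, optionally counting only subunits labelled as core or module.

```python
from itertools import combinations

def parity_network(complexes, min_shared=1, roles=None,
                   keep_roles=("core", "module")):
    """Link complexes that share subunits (hypothetical helper).

    complexes: dict mapping complex name -> set of protein identifiers
    roles: optional dict mapping (complex, protein) -> role label;
           when given, only subunits whose role is in keep_roles count
    """
    def members(name):
        if roles is None:
            return set(complexes[name])
        return {p for p in complexes[name]
                if roles.get((name, p)) in keep_roles}

    filtered = {name: members(name) for name in complexes}
    edges = {}
    for a, b in combinations(sorted(filtered), 2):
        shared = filtered[a] & filtered[b]
        if len(shared) >= min_shared:
            edges[(a, b)] = shared  # edge annotated with shared subunits
    return edges

# Toy data, invented for illustration only.
complexes = {
    "C1": {"P1", "P2", "P3"},
    "C2": {"P2", "P3", "P4"},
    "C3": {"P3", "P5"},
}
roles = {
    ("C1", "P1"): "core", ("C1", "P2"): "core", ("C1", "P3"): "attachment",
    ("C2", "P2"): "core", ("C2", "P3"): "module", ("C2", "P4"): "core",
    ("C3", "P3"): "attachment", ("C3", "P5"): "core",
}

all_edges = parity_network(complexes)                 # every shared subunit
thresholded = parity_network(complexes, min_shared=2) # arbitrary threshold
core_only = parity_network(complexes, roles=roles)    # cores/modules only
```

In this toy case the promiscuous attachment protein P3 links all three complexes, whereas restricting the edges to core or module subunits leaves a single connection.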
5 Dynamic interactome visualization
5.1 Purpose
The visualizations discussed so far can only provide a static
view of detectable PPIs. They can depict the combinatorial
complexity of a given interactome but not its temporal dy-
namics due to fluctuations in protein abundance. It is highly
unlikely that every PPI will be constitutively active through-
out cellular life in the face of external challenges. Rather,
PPIs are orchestrated in a temporal order. For this reason,
visualization of dynamic PPI networks is urgently needed
[65]. Time-course data sets have been generated with the use
of high-throughput microarray technology, providing a data
source of sufficient scale for the construction of dynamic net-
works. Hence, gene expression rather than protein expression
data have been commonly used as a proxy for protein abundance.
What is visualized in a dynamic network is the temporal
fluctuation in transcript abundance superimposed on
a PPI network.
The visualization of temporal networks can provide insight
into the dynamic processes of the living cell [66,67]. The
topological changes (often known as phase transitions) [68] can
inform the investigator of the functional ordering of
subnetworks through time, thereby tracking the pathway
dependencies of a developing phenotype or disease progression. In
the immediate future, we would expect dynamic networks to
have great potential applications in areas where the evolution
of cell lineages is being studied [65], e.g. cell transformation
in malignancy [69], lineage commitment of pluripotent stem
cells [70], dose–response studies or in case-control studies
where time-course measurements have been made [71, 72].
It is noteworthy that the visualization methods introduced in
this section are applicable to any comparative studies involv-
ing multiple organisms, tissues or cell types, e.g. correlated
gene expression dynamics between mouse tissues [73].
There are two ways of visualizing time-course dynamics.
One approach is to apply the 2.5D visualization in the parallel
plane layout by stacking the network drawings at different
time points together, thereby displaying the gradual transi-
tion from one to another topology through time (Fig. 6). The
advantage of this approach is the explicit display of all the
topological changes throughout the time course in a single
drawing, but it shares the limitations seen with 2.5D
visualizations (Fig. 3). Application of this approach, to the best
of our knowledge, has been limited to metabolic pathway
dynamics [74].
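
The parallel plane layout can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `stack_layers`, the `spacing` parameter and the toy coordinates are invented here, and a 2D layout is assumed to have been computed beforehand by some other means. Each time point simply receives its own z-level while the x-y positions are reused unchanged.

```python
def stack_layers(layout_2d, time_points, spacing=1.0):
    """Return 3D coordinates keyed by (time point, node).

    layout_2d: dict mapping node -> (x, y), shared by all time points
    time_points: iterable of sortable time labels
    spacing: distance between consecutive planes along the z-axis
    """
    stacked = {}
    for level, t in enumerate(sorted(time_points)):
        for node, (x, y) in layout_2d.items():
            stacked[(t, node)] = (x, y, level * spacing)
    return stacked

# Toy layout and 2-day sampling intervals, invented for illustration.
layout = {"A": (0.0, 0.0), "B": (1.0, 0.0)}
layers = stack_layers(layout, [0, 2, 4])
```

Because every plane reuses the same x-y coordinates, a node can be followed vertically through the stack.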
5.2 Animated network visualization
The other approach is to dynamically render the relevant
nodes and edges of the PPI network at different time points;
and then visualize the time course as an animation [75]. Nodes
and edges are hidden or made opaque without changing node
positioning, thus maintaining the investigator’s initial mental
picture built in the very first time frame. This type of
animation, known as a static flip book [76], allows one to
direct his/her attention to the subnetworks that exhibit
topological changes. An example is shown in Fig. 7, exhibit-
ing the dynamics of DNA mismatch proteins in cell cycle
progression based on a published time-course data set [77]. The
underlying network is a published yeast PPI network [78]
drawn in a force-directed layout.
The temporal sequence in Fig. 7A shows the expression
dynamics of the DNA mismatch proteins as node colour
changes only. The re-colouring of nodes is triggered only
upon a non-zero change in expression value from one time
point to another. Visual inspection of Fig. 7A would help the
investigator to identify the four similarly expressed proteins,
i.e. Msh2, Msh6, Pms1 and Pol30, which interact with their
statically expressed partners. The temporal sequence also ex-
poses the fluctuation of the above proteins during cell cycle
progression. At the 30 min time point, all four proteins are
highly expressed followed by a markedly decreased expres-
sion at the 45 min time point. The three proteins then regain
their expression levels at the 75 min time point. Therefore the
decrease in expression occurs at the G2/M phase of the cell
cycle. The most obvious limitation of this visualization lies
in the use of chromatic scaling for representing expression
dynamics, whereas human perception is more adept at detecting
dimensional scaling, e.g. node size, node shape and edge
length [79]. An alternative design could use dynamic scaling
on node size in a dual colour mode throughout the temporal
sequence.

Figure 6. 2.5D network visualization of the glycolytic-Krebs cycle pathway dynamics in Hordeum vulgare during seed development
over a period of 20 days [45]. The visualization was generated with the use of Wilmascope [112]. Pathway drawings at successive
time points are stacked sequentially along the z-axis with each pathway being drawn on the x-y plane. Each level represents a
2-day interval starting from day zero. Each disc-shaped node represents a metabolite on the glycolytic pathway schema (right);
its size is scaled to the empirical quantity of that metabolite. The flux dynamics between fructose-1,6-bisphosphate and the
3-phosphoglycerate through time is represented by small magenta arrows between the three cylinders (indicated by the green
arrow). In the schema (right), each metabolite is presented by a rectangular node coloured according to its pathway membership.
Red = glycolysis; green = Krebs cycle; blue = amino acid biosynthesis.
The same time course can also be visualized as an animation
[75]. This gives a much clearer impression of the
disruption of the DNA mismatch recognition protein com-
plex represented by this subnetwork throughout the cell cycle
(Fig. 7B). Nodes and edges are visually hidden by manipulat-
ing opacity in the animation using a user-selectable thresh-
old. This functionality maps to the investigator’s assumption
that a certain interaction may be disrupted if the participat-
ing proteins are expressed below a certain level. The nodes
representing Msh2, Msh6, Pms1 and Pol30 are hidden based
on the assumption that their underexpression will eliminate
PPIs among themselves and with their interaction partners.
It is known that Msh2 and Msh6 are subunits of the
MutSα complex and that Pms1 interacts with Mlh1 to form the
MutLα complex [80]. Pol30 is known to interact with the
MutSα and MutLα complexes, and acts as the docking site
for subunits required for DNA replication and repair [81].
The animated sequence suggests that the DNA
mismatch repair function is downregulated during the
G2/M phase. This deduction complements the current
understanding that DNA mismatch repair recognition proteins
are upregulated during the preceding S phase for efficient
mismatch recognition and repair [82].
The above example demonstrated how the animation of
network changes can elicit biological insight. It allows the
investigator to identify the dynamics of different parts of the
interactome in cell cycle progression. One will then be able
to deduce the regulatory mechanism underlying the PPI dy-
namics observed.
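
The thresholding logic behind such a flip-book animation can be sketched as below. This is a hedged illustration rather than the GEOMI implementation: the function name `flip_book_frames`, the data structures, the toy edge list and all expression values are assumptions invented for this example. Only the protein names and the 0.45 threshold are taken from the text and Fig. 7. Node positions are assumed to be computed once elsewhere and never altered, so each frame records only what is hidden.

```python
def flip_book_frames(edges, expression, threshold):
    """Compute which nodes and edges are shown in each animation frame.

    edges: list of (u, v) protein pairs from a static PPI network
    expression: dict mapping time point -> {protein: expression level}
    threshold: proteins expressed below this level are hidden
    Proteins without a measurement at a time point are treated as
    statically expressed and stay visible (an assumption of this sketch).
    """
    frames = {}
    for t in sorted(expression):
        levels = expression[t]
        hidden = {p for p, level in levels.items() if level < threshold}
        # An edge is drawn only if both of its end points are visible.
        visible_edges = [(u, v) for u, v in edges
                         if u not in hidden and v not in hidden]
        frames[t] = {"hidden_nodes": hidden, "visible_edges": visible_edges}
    return frames

# Toy subnetwork and made-up expression values (minutes as time points).
edges = [("Msh2", "Msh6"), ("Pms1", "Mlh1"), ("Pol30", "Msh6")]
expression = {
    30: {"Msh2": 0.9, "Msh6": 0.9, "Pms1": 0.9, "Pol30": 0.9},
    45: {"Msh2": 0.1, "Msh6": 0.2, "Pms1": 0.1, "Pol30": 0.3},
}
frames = flip_book_frames(edges, expression, threshold=0.45)
```

With these invented values every edge is visible in the 30 min frame and none in the 45 min frame, which mirrors the disappearance of the subnetwork described above.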
5.3 Dynamic focus + context visualization
A recently published tool for dynamic network visualization
is TVNViewer [83]. It does not offer any novel network
layouts, but its functionality allows the spatio-temporal
dynamics of a PPI network to be visualized as a multidimensional
model. The design concept is closely related to the idea of

Figure 7. Visualizing the dynamic gene expression of DNA mismatch recognition proteins during the cell cycle [75] generated with the use
of GEOMI. (A) Rendering using changes in node colour. Red = upregulation; green = downregulation. (B) Animated dynamics by real-time
rendering. Note that the nodes representing MSH2, MSH6, PMS1 and POL30, along with their interactions, are hidden throughout the
visualization as their expression levels are below the preset threshold of 0.45.

Figure 8. TVNViewer dynamic network visualization [83]. Temporal changes in interactions between the hypothetical COMPLEX_2424
(indicated by the green arrow) and various GO nodes through the cell cycle phases.
using a meta-network as the first dimension and the
underlying physical network as the second dimension in an
attempt to better expose the dynamic yet complex functional
dependencies among proteins. The same concept has been
applied previously to the modelling of socio-technological
complex systems for interests in national defense and security
[84]. The meta-network here is the Gene Ontology (GO) [85]
informational network containing GO nodes and meta-edges.
Each GO node encompasses a set of proteins annotated with
the node-specific GO term. The GO term can be a member
of the Biological Process, Molecular Function or Cellular
Component category. Each meta-edge, visualized as a solid curve,
represents a meta-interaction. This is an abstraction of the in-
teractome where the actual PPIs are not explicitly visualized
but the ‘meta-interactions’ are, such that two GO nodes are
considered to be interacting if and only if they share the same
set of PPIs (see [27] for the original definition) [86]. Since
many proteins are annotated with multiple GO terms, the re-
lationship between a PPI and a GO meta-interaction should
be an m:n relationship. In other words, a given GO meta-
interaction is an abstraction of multiple PPIs and a given PPI
may be abstracted by multiple GO meta-interactions.
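
The m:n relationship between PPIs and GO meta-interactions can be made concrete with a small sketch. The rule used here is an assumption adopted for illustration, not necessarily the definition of [27] or the TVNViewer implementation: two distinct GO nodes are linked whenever at least one PPI has one partner annotated with each term. The function name `go_meta_edges`, the placeholder term labels T1 to T3 and the toy proteins are likewise invented.

```python
from collections import defaultdict

def go_meta_edges(ppis, annotations):
    """Map each pair of GO terms to the set of PPIs it abstracts.

    ppis: list of (protein, protein) pairs
    annotations: dict mapping protein -> set of GO term labels
    """
    meta = defaultdict(set)
    for u, v in ppis:
        for term_u in annotations.get(u, ()):
            for term_v in annotations.get(v, ()):
                if term_u != term_v:  # PPIs within one term are ignored
                    meta[frozenset((term_u, term_v))].add((u, v))
    return dict(meta)

# Toy data: protein B carries two annotations.
ppis = [("A", "B"), ("B", "C")]
annotations = {"A": {"T1"}, "B": {"T2", "T3"}, "C": {"T1"}}
meta = go_meta_edges(ppis, annotations)
```

Here each of the two meta-edges abstracts both PPIs, and each PPI is abstracted by both meta-edges, because B is annotated with two terms.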
From the perspective of information visualization, TVN-
Viewer is a tool for providing Focus + Context visualization
in which the GO informational network provides the biologi-
cal context around the protein nodes, i.e. the focus, selected by
the investigator [87]. Figure 8 provides an example generated
from a synthetic cell cycle data set [83]. It shows the result of
using the mouse pointer action to achieve details on demand
by exposing protein members hidden within a GO node. The
transformed network has the selected protein nodes and their
GO counterparts arranged in a two-level circular layout with
the former node type being positioned along the circumfer-
ence of the outer level and the latter being positioned on the
inner level. The temporal sequence shows the stage-specific
dynamic interactions among different GO nodes through the
five phases of the cell cycle. Temporal changes in interactions
among the hypothetical COMPLEX_2424 and the various GO
nodes are displayed as the dynamic rendering of edge opacity.
In this way, the context of any selected focus, whether a single
protein node or a PPI subnetwork, is never lost.
Generally speaking, the biggest barrier against usability
of dynamic network visualization is the erosion of the
investigator’s mental picture [88]. The longer the animated
sequence or the larger the network, the more challenging it is to
preserve the investigator’s original view. This is even more
pronounced with dynamic Focus + Context visualization be-
cause of its higher visual complexity than the mere interac-
tome in a semi-directed network. The other limitation lies in
the use of transcriptome data as a proxy for protein abun-
dance. While transcript abundance correlates with protein
abundance in a broad sense, the impact of degradation,
translational control and post-translational modifications on
individual proteins cannot be ignored. Recent work demonstrated
that the correlation between transcriptomic and proteomic
expression variation is either weak [89, 90] or highly
conditional [91]. Therefore, the temporal dynamics visualized
do not necessarily reflect the real-time PPI dynamics. The
increased availability of proteome-scale time course protein
abundance data in the future (e.g. [92]) will help address this
issue.
6 Contextual visualization
6.1 Purpose
From the viewpoint of visual analytics, a visualization that
exposes one or more types of PPIs is merely displaying a set
of interaction data in a graphical form. It does not explicitly
communicate the consensus knowledge about a given PPI
network. The investigator will need to rely either on his/her
own knowledge and access to public resources to construct
hypotheses or trigger certain mouse pointer events to retrieve
the hidden information if available. To help the investigator
understand its biological relevance,
a PPI network needs to be co-visualized with some kind of
biological context, e.g. membership in pathways and/or
subcellular localization. To date, GO categories, especially the
GO Process and GO Component categories, are commonly
used as a proxy for biological context [93]. As will be shown
in the following sections, co-visualization of PPIs and GO
annotations provides added benefits for understanding the
modular nature of the PPI network.
6.2 Colour-coded visualization
While the Focus + Context visualization shown in Section
5.3 is one way of imparting biological context on a PPI net-
work, there are alternative approaches. One of these is to
colour code the GO Process or GO Component terms map-
pable to individual proteins. If both partners of a pairwise
interaction share the same context, it can be assumed that
the corresponding nodes and the edge should be painted
in the same colour (Fig. 9). This method is very useful for
highlighting functional homo- or heterogeneity among pro-
tein complexes (or subnetworks) when GO Process is being
used as the context (Fig. 9A). It is equally useful for high-
lighting the intracellular localization of certain interacting
complexes (Fig. 9B) [94]. Although apparently easy to im-
plement, the challenge is to select an informative subset of
ontology terms out of the GO hierarchy for the purpose of con-
textual mapping. Since human vision cannot easily discern
a broad colour spectrum, the set of applicable GO terms is
selected from the top hierarchical levels of GO Slim. This leads
to the underutilization of the GO hierarchy within a
visualization, and hence other approaches, e.g. the Focus + Context
visualization using TVNViewer discussed in Section 5.3, are
needed.
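
The colouring rule described in this section can be sketched as follows. This is a minimal sketch and not the GEOMI implementation: the function name `colour_network`, the use of grey for edges whose partners differ in context, and the toy proteins are assumptions made here. The two palette entries follow the colour key of Fig. 9A.

```python
def colour_network(edges, context, palette, default="grey"):
    """Assign colours to nodes and edges from a biological context.

    edges: list of (u, v) protein pairs
    context: dict mapping protein -> context term (e.g. a GO Slim term)
    palette: dict mapping context term -> colour name
    An edge takes the shared colour only when both partners have the
    same context; otherwise it falls back to the default colour.
    """
    node_colour = {n: palette.get(c, default) for n, c in context.items()}
    edge_colour = {}
    for u, v in edges:
        cu, cv = context.get(u), context.get(v)
        if cu is not None and cu == cv:
            edge_colour[(u, v)] = palette.get(cu, default)
        else:
            edge_colour[(u, v)] = default
    return node_colour, edge_colour

# Toy proteins; palette entries taken from the Fig. 9A colour key.
palette = {"transcription": "orange", "translation": "purple"}
context = {"P1": "transcription", "P2": "transcription", "P3": "translation"}
edges = [("P1", "P2"), ("P2", "P3")]
node_colour, edge_colour = colour_network(edges, context, palette)
```

Functionally homogeneous subnetworks then appear as uniformly coloured patches, while default-coloured edges mark links between different contexts.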
Figure 9. Contextual visualization of the yeast PPI network [57]
generated with the use of GEOMI. (A) The Gene Ontology Slim
biological process annotation of each protein node is represented
by colour. Orange = transcription; yellow = transport; light green
= DNA metabolic process; green = protein modification process;
teal = RNA metabolic process; cyan = conjugation; light blue =
cytoskeleton organization; blue = cytokinesis; navy blue = gener-
ation of precursor metabolites and energy; purple = translation;
magenta = other; red = process unknown. (B) The subcellular
localization [94] of each protein node is represented by colour.
Orange = cytoplasm; yellow = nucleus; light green = bud neck;
green = nucleolus; cyan = spindle pole; light blue = endoplasmic
reticulum; blue = nuclear periphery; navy blue = actin; purple =
mitochondrion; magenta = other; red = localization unknown.

Figure 10. Visualization of the human DNA replication PPI network in the clustered circular layout [96] generated with the use
of GEOMI. Inset (a): Close-up view of the chromatin (GO:0000785) and alpha DNA polymerase:primase (GO:0005658) complex clusters.
MCM2,3,7 and their edges are highlighted in red. Inset (b): Close-up view of the perinuclear region of cytoplasm (GO:0048471)
cluster. SET, HMGB2 and their intra-cluster edge are highlighted in red. Inset (c): Close-up view which encompasses the DNA
replication factor C complex (GO:0005663), intracellular (GO:0005622) and chromatin assembly complex (GO:0005678) clusters.
PCNA, CHAF1A and their inter-cluster edge are highlighted in red.
6.3 Clustered network visualization
Another method is to exploit the nested modularity of the PPI
network meaning that any functional module, e.g. a pathway
or a biological process, can be further subdivided into mul-
tiple modules according to the physical localization of its
member proteins [95]. This type of organization can be cap-
tured by the context-specific clustered network visualization
[96].
Fig. 10 shows the visualization for the DNA replication
PPI network in the clustered circular layout. Each disc-shaped
cluster node represents a subcellular region, e.g. GO:0048471
perinuclear region; or an organelle, e.g. GO:0005634 nucleus;
or a protein complex, e.g. GO:0005663 DNA replication fac-
tor C. A protein known to localize in multiple subcellular
regions or organelles is represented as multiple nodes in dif-
ferent clusters. Because every node has been assigned a fixed
coordinate, the layout is highly reproducible on repeated ren-
dering. This is an advantage over the force-directed layout
since the investigator does not need to cognitively re-adapt to
a new layout. The strength of the clustered circular layout lies
in its ability to capture three types of biologically relevant
PPIs [96]. The first is the PPI(s) between two protein com-
plexes. The second type is the PPI(s) between a subunit of a
protein complex and other proteins localized in an organelle.
The third type is the PPI(s) that can occur in multiple or-
ganelles or subcellular regions.
An example of the first type can be seen in Fig. 10 inset
(c). PCNA and its partners RFC2,3,4,5 are localized in
the cluster node labelled ‘GO:0005663 DNA replication fac-
tor C complex’. The other two mutually interacting partners
of PCNA, i.e. CHAF1A,1B, are localized in the cluster node
labelled ‘GO:0005678 chromatin assembly complex’. The in-
teraction between the two complexes is represented by the
inter-cluster edge between PCNA and CHAF1A, suggesting
that PCNA is more likely to be a hub-bottleneck protein than
a date hub.
Fig. 10 inset (a) gives an example of the second type of PPI.
It shows that MCM3 is bound within the cluster node labelled
‘GO:0005658 α DNA polymerase:primase complex’, whereas
MCM2,7 and ORC2L are bound within the cluster node
labelled ‘GO:0000785 chromatin’. The interactions among
these proteins, as represented by the inter-cluster edges,
imply that the α DNA polymerase:primase complex localizes
with the chromatin. They also imply that MCM2,7 and ORC2L
are chromatin-bound. These deductions map to the current
knowledge on the molecular structure of the pre-replication
complex in which MCM2,3,7 are subunits that impart DNA
helicase activity [97] and ORC2L is one of the subunits that
recognizes the origin site of DNA replication [98].
The whole network shown in Fig. 10 with insets (a) and
(b) together gives an example of the third type of PPI. The
protein node HMGB2 shown in inset (a) is bound within
the ‘GO:0000785 chromatin’ cluster node whereas in inset
(b), it is shown to be bound in the cluster node labelled
‘GO:0048471 perinuclear region of cytoplasm’. In the latter
cluster, HMGB2 is linked to the protein node SET indicating
that they are interaction partners. The pair is also duplicated
in the cluster node labelled ‘nucleus’, strongly suggesting
that the SET-HMGB2 dimer coexists in both the nucleus and
the perinuclear region of cytoplasm. Furthermore, the cluster
membership of HMGB2 suggests that it is the subunit
that interacts with the chromatin. Both deductions have been
verified experimentally [99, 100]. It has been found that the
SET-HMGB2 dimer is part of the larger SET complex sus-
pected to have the function of nucleosome assembly. It may
have been assembled in the perinuclear region of the endo-
plasmic reticulum for regular transport into the nucleus [99].
The biggest limitation of the clustered circular layout is its
less compact drawing as compared to the force-directed lay-
out. It took 148 protein nodes, 13 cluster nodes and 153 edges
to represent the DNA replication PPI network. Yet the net-
work itself contains only 55 proteins and 83 interactions. The
substantially inflated network size seen is caused by node re-
dundancy and the positioning of protein nodes being limited
to the circumference of the cluster node. The other limitation
is the loss of the original topology seen in the force-directed
layout, making the identification of date and party hubs more
challenging [96], but this is compensated by a more effective
exposure of bottleneck proteins.
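
Both the reproducibility and the node redundancy of this layout follow from how positions are assigned, as the sketch below illustrates. It is a simplified illustration under stated assumptions, not the published algorithm of [96]: the function name `clustered_circular_layout`, the cluster centres, the radius and the reduced cluster memberships are invented here. Each protein receives one node per cluster it belongs to, placed at a fixed angle on that cluster's circumference.

```python
import math

def clustered_circular_layout(memberships, cluster_centres, radius=1.0):
    """Place one node per (cluster, protein) pair on a circle.

    memberships: dict mapping cluster label -> list of member proteins
    cluster_centres: dict mapping cluster label -> (x, y) centre
    Positions depend only on sorted names, so repeated rendering
    always yields the same drawing.
    """
    positions = {}
    for cluster in sorted(memberships):
        cx, cy = cluster_centres[cluster]
        proteins = sorted(memberships[cluster])
        for i, protein in enumerate(proteins):
            angle = 2 * math.pi * i / len(proteins)
            positions[(cluster, protein)] = (
                cx + radius * math.cos(angle),
                cy + radius * math.sin(angle),
            )
    return positions

# Reduced toy memberships; the centres are arbitrary.
memberships = {
    "chromatin": ["HMGB2"],
    "perinuclear region of cytoplasm": ["HMGB2", "SET"],
    "nucleus": ["HMGB2", "SET"],
}
centres = {
    "chromatin": (0.0, 0.0),
    "perinuclear region of cytoplasm": (5.0, 0.0),
    "nucleus": (0.0, 5.0),
}
positions = clustered_circular_layout(memberships, centres)
distinct_proteins = {protein for _, protein in positions}
```

In this toy case two proteins give rise to five drawn nodes, a small-scale version of the inflation from 55 proteins to 148 protein nodes noted above.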
7 Challenges in interactome visualization
7.1 Biological fidelity
As researchers in information visualization rightly pointed
out, any visualization is only as good as the data that one pro-
vides to it [101]. The biological fidelity of a visualized interac-
tome is very much affected by technical and representational
artefacts. Both reduce the reliability of the visualization. Tech-
nical artefacts come from the false positives and negatives
generated by the experimental techniques. Representational
artefacts come from the underlying graph-theoretic model
and the layout of the visualization. The inclusion of false pos-
itive PPIs introduces noise to the visualized network in the
form of extraneous edges whereas false negative PPIs can dis-
tort global topology by underestimating protein connectivity.
Technical artefacts are introduced during the detection
of PPIs. False positives and/or false negatives come from
a variety of sources, i.e. the analytical technology employed,
the experimental design, laboratory conditions and the
operator’s competence in sample handling. The tandem affinity
purification-mass spectrometry technique, for example, tends to
be biased towards the detection of high-affinity PPIs [102]. Since
this technique requires the in vitro processing of protein ex-
tract, it is especially prone to operator error in sample han-
dling. The other commonly used detection method, yeast
two-hybrid assay, detects PPIs occurring in vivo but can un-
derdetect membrane-associated PPIs and those dependent
on post-translational modification [102]. There is the percep-
tion that large-scale curation of low-throughput experiments
should give a more reliable interactome, but a comparative
study showed that human PPIs collected from single low-
throughput studies are of poorer quality when compared to
high-quality data sets produced by stringent yeast two-hybrid
and PPI assays [103].
The biggest source of representational artefacts comes
from the graph-theoretic model used. It has been shown ex-
perimentally, e.g. by affinity purification, that protein com-
plexes are not exclusively dimeric. Yet the graph model de-
composes the m:n interaction stoichiometry, common among
PPIs, to a 1:1 relationship [104], regardless of whether the
PPIs are truly dimeric or not. This results in a failure to
capture the multi-scalar nature of PPIs in complexomes and
similarly the complexome interactions in the global network.
Hence, there is the argument that PPIs may be best rep-
resented by hypergraphs. A hypergraph denoted as H(V, E)
consists of a set of nodes V and a set of hyperedges E. Ev-
ery hyperedge e represents the physical interaction among
the populations of k proteins v1 to vk [104]. However, the
widespread use of graphs instead of hypergraphs has to do
with computational intractability and higher visual complex-
ity of the latter when the network size increases. Hence, the
dilemma of choosing between biological fidelity and com-
prehensibility is now confronting the investigator and also
visualization researchers.
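
The information lost when a hypergraph is reduced to a graph can be shown with a short sketch. The decomposition used here, in which every hyperedge is replaced by all pairwise edges among its members (a clique expansion), is one common choice and is an assumption of this example rather than the specific model of [104]. The function name `clique_expansion` is also introduced here.

```python
from itertools import combinations

def clique_expansion(hyperedges):
    """Reduce hyperedges to the set of pairwise (1:1) edges.

    hyperedges: iterable of sets, each holding the proteins v1..vk
    that interact together
    """
    pairs = set()
    for members in hyperedges:
        for u, v in combinations(sorted(members), 2):
            pairs.add((u, v))
    return pairs

# One trimeric complex versus three separate dimers.
one_trimer = [{"v1", "v2", "v3"}]
three_dimers = [{"v1", "v2"}, {"v1", "v3"}, {"v2", "v3"}]

graph_from_trimer = clique_expansion(one_trimer)
graph_from_dimers = clique_expansion(three_dimers)
```

Both inputs produce the same three pairwise edges, so the resulting graph cannot tell a single three-protein complex from three independent dimeric interactions.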
7.2 Conceptual focus
For any interactome visualization to become a useful analyti-
cal tool, there is a need to match the investigator’s knowledge
precept with the biological relationships exposed by the visu-
alization. One contentious issue is whether the primary focus
of any visualization should be the interactome or complex-
ome. If it is the former, the focus will be protein connectivity
in the network. The assumption will be that the functioning of
the interactome relies on how frequently each individual pro-
tein engages in PPIs. One can even argue that preferential
attachment explains the biological relevance of the interac-
tome rather than functional or physical modularity. If it is
the latter, the focus will be on protein complex connectivity
or even core-module complex-to-attachment protein connectivity.
The assumption will then be that the functioning of an interactome
relies largely on how frequently each protein complex
engages either in multi-complex interactions or the dynamic
transaction of subunits. This issue can only be resolved once
current representations of the complexome are tested in the
community, and their utility and relevance established.
From an even broader perspective, the current understanding
of network biology is mostly derived from the yeast
interactome. It is not clear whether one can interpret a
mammalian interactome visualization just like its yeast
counterpart. These are conceptual problems yet to be resolved. In
the face of conceptual uncertainty, the issue at heart is which
conceptual model the visualization researcher should present
to the investigator. A possible solution is to
implement knowledge tracking as a functionality of a
visualization tool [105].

7.3 The curse of scale
Even for the single-cell eukaryote S. cerevisiae, the
interactome has been estimated to contain approximately
5000 proteins with 20 000 physical interactions [3] forming
a probable 800 complexes [53]. The size of the human PPI
network is even bigger with an estimated 22 500 proteins in-
volved in a possible 130 000 physical interactions [103]. This
is more than six times the size of the yeast PPI network.
Visualization on such a big scale will only increase the in-
vestigator’s cognitive burden and stall his/her effort towards
extracting biological insight, let alone generating testable hy-
potheses. It can be expected that iterative scale reduction,
through the use of network filtering and abstraction [58,106],
will continue to play an important role in interactome
visualization. Computational methods such as pathway-based
enrichment analysis [107] can be used as the preceding step
to provide guidance on network exploration.
8 Visual analytics is the future
This review has shown that no single visualization can
faithfully represent the biological context, the scale and the
dynamics of an interactome, partly because of incomplete data
sets and partly because an interactome is probably multi-scale
[108]. Therefore, the visual analysis of a single visualization
can only provide limited biological insight. The need for a vi-
sual analytical framework is becoming increasingly pressing
to advance systems biology research. The ideal framework
will not only need to provide multiple visualizations of an
interactome but also provide the statistical confidence [109],
evidence score [47] or the quality assessment [103] of every
PPI represented. The framework should facilitate hypothe-
sis construction by enabling the collaborative use of multiple
visualizations.
The biggest challenge yet to be met is to find a
set of usability heuristics which can serve the broad range
of research interests among investigators. Usability heuristics
are the common design features that make a visualization
effective, as derived from past experiences [110].
By effectiveness, we refer to the ability of the visualization
to amplify the investigator’s cognition with the aim of en-
hancing one’s analytical capability. If available, the heuristics
can serve as a guide for designing a usable visual analytical
framework. Although there have been usability heuristics pro-
posed by information visualization researchers [111], it is not
clear which of them will be most suitable for systems biology
applications.
As visual analytics becomes an integral part of systems
biology research, the need for user-participatory design be-
comes increasingly important. Information visualization re-
searchers need to know the investigator’s requirements. In
return, the investigator needs the expertise from the informa-
tion visualization community to design visualization(s) that
can capture the biological knowledge (or concept) of interest.
The hurdle, however, is that biology is a knowledge-intensive
field of science. It is difficult for an expert in information
visualization to grasp the contextual richness of an interac-
tome in a limited time frame. It is equally challenging for
an investigator to understand the biological relevance of a
visualized interactome. Therefore, the availability of interdis-
ciplinary experts able to bridge the two communities will be
critical to their open collaboration. One way to foster interdis-
ciplinary collaboration is to include information visualization
as part of the bioinformatician’s training. For the moment,
the development of information visualization for systems biology
research will need extensive experimentation. This field
is still very much in its infancy and opportunities for further
growth abound.
The research reviewed in this paper was supported by the Aus-
tralian Research Council Linkage Grant Scheme, the New South
Wales Office for Scientific and Medical Research, the EIF Super
Science Scheme, the University of Sydney and the University of
New South Wales.
The authors have declared no conflict of interest.
9 References
[1] Stelzl, U., Worm, U., Lalowski, M., Haenig, C. et al., A hu-
man protein-protein interaction network: a resource for
annotating the proteome. Cell 2005, 122, 957–968.
[2] Rual, J. F., Venkatesan, K., Hao, T., Hirozane-Kishikawa, T. et
al., Towards a proteome-scale map of the human protein-
protein interaction network. Nature 2005, 437, 1173–1178.
[3] Yu, H., Braun, P., Yildirim, M. A., Lemmens, I. et al., High-
quality binary protein interaction map of the yeast inter-
actome network. Science 2008, 322, 104–110.
[4] Merico, D., Gfeller, D., Bader, G. D., How to visually inter-
pret biological data using networks? Nat. Biotech. 2009,
27, 921–924.
[5] Cline, M., Smoot, M., Cerami, E., Kuchinsky, A. et al., Inte-
gration of biological networks and gene expression data
using Cytoscape. Nat. Protoc. 2007, 2, 2366–2382.
[6] Hu, Z., Hung, J.-H., Wang, Y., Chang, Y.-C. et al., VisANT
3.5: multi-scale network visualization, analysis and infer-
ence based on the gene ontology. Nucleic Acids Res. 2009,
37, W115–W121.
[7] Breitkreutz, B. J., Stark, C., Tyers, M., Osprey: a network
visualization system. Genome Biol. 2003, 4, R22.
[8] Iragne, F., Nikolski, M., Mathieu, B., Auber, D. et al., ProViz:
protein interaction visualization and exploration. Bioinfor-
matics 2005, 21, 272–274.
[9] Dogrusöz, U., Erson, E. Z., Giral, E., Demir, E. et al.,
PATIKAweb: a Web interface for analyzing biological
pathways through advanced querying and visualization.
Bioinformatics 2006, 22, 374–375.
[10] Lee, R. E., Megeney, L. A., The yeast kinome displays scale
free topology with functional hub clusters. BMC Bioinfor-
matics 2005, 6, 271.

[11] Fruchterman, T. M. J., Rheingold, E. M., Graph drawing by
force-directed placement. Software Pract. Exper. 1991, 21,
1129–1164.
[12] Eades, P., A heuristic for graph drawing. Congressus Nu-
merantium 1984, 42, 142–160.
[13] Sugiyama, K., Tagawa, S., Toda, M., Methods for visual un-
derstanding of hierarchical system structures. IEEE Trans.
Syst. Man. Cybernetics. 1981, 11, 109–125.
[14] Barsky, A., Gardy, J. L., Hancock, R. E. W., Munzner, T.,
Cerebral: a Cytoscape plugin for layout of and interaction
with biological networks using subcellular localization an-
notation. Bioinformatics 2007, 23, 1040–1042.
[15] Suderman, M., Hallett, M., Tools for visually exploring bi-
ological networks. Bioinformatics 2007, 23, 2651–2659.
[16] Gehlenborg, N., O’Donoghue, S. I., Baliga, N. S., Goes-
mann, A. et al., Visualization of omics data for systems
biology. Nat. Methods 2010, 7, S56–S68.
[17] Ahmed, A., Dwyer, T., Forster, M., Xu, K. et al., GEOMI:
geometry for maximum insight. Lect. Notes Comput. Sci.
2006, 3843, 468–479.
[18] Pavlopoulos, G. A., Secrier, M., Moschopoulos, C. N.,
Soldatos, T. G. et al., Using graph theory to analyze bi-
ological networks. BioData Min. 2011, 4, 10.
[19] Emmert-Streib, F., Dehmer, M., Networks for systems biol-
ogy: conceptual connection of data and function. IET Syst.
Biol. 2011, 5, 185–207.
[20] Christensen, C., Thakar, J., Albert, R., Systems-level in-
sights in cellular regulation: inferring, analyzing, and mod-
eling intracellular networks. IET Syst. Biol. 2007, 1, 61–77.
[21] Barabási, A. L., Oltvai, Z. N., Network biology: understand-
ing the cell’s functional organization. Nat. Rev. Gen. 2004,
5, 101–114.
[22] Rives, A. W., Galitski, T., Modular organization of cellu-
lar network. Proc. Natl. Acad. Sci. USA 2003, 100, 1128–
1133.
[23] Whitacre, J. M., Bender, A., Networked buffering: a basic
mechanism for distributed robustness in complex adap-
tive system. Theor. Biol. Med. Model 2010, 7, 20.
[24] Jeong, H., Mason, S. P., Barabási, A. L., Oltvai, Z. N., Lethal-
ity and centrality in protein networks. Nature 2001, 411,
41–42.
[25] Tornow, S., Mewes, H. W., Functional modules by relating
protein interaction networks and gene expression. Nucleic
Acids Res. 2003, 31, 6283–6289.
[26] Han, J. D., Bertin, N., Hao, T., Goldberg, D. S. et al., Evi-
dence for dynamically organized modularity in the yeast
protein-protein interaction network. Nature 2004, 430, 88–
95.
[27] Patel, M. I., Nagl, S., The Role of Model Integration in Com-
plex Systems Modeling: An Example from Cancer Biology,
Springer, Berlin, 2010, pp. 64–83.
[28] Saffer, J. D., Burnett, V. L., Chen, G., van der Spek, P., Visual
analytics in the pharmaceutical industry. IEEE Comput.
Graph. Appl. 2004, 24, 10–15.
[29] van Wijk, J. J., Guest editor’s introduction: special section
on IEEE symposium on visual analytics science and tech-
nology. IEEE Trans. Visual. Comput. Graphics 2011, 17,
555–556.
[30] Pržulj, N., Protein-protein interactions: making sense of
networks via graph-theoretic modeling. Bioessays 2010,
33, 115–123.
[31] Kim, P. M., Lu, L. J., Xia, Y., Gerstein, M. B., Relating three-
dimensional structures to protein networks provide evo-
lutionary insights. Science 2006, 314, 1938–1941.
[32] Yu, H., Kim, P. M., Sprecher, E., Trifonov, V. et al., The im-
portance of bottlenecks in protein networks: correlation
with gene essentiality and expression dynamics. PLoS
Comput. Biol. 2007, 3, e59.
[33] Hashimoto, T. B., Nagasaki, M., Kojima, K., Miyano, S.,
BFL: a node and edge betweenness based fast layout algo-
rithm for large-scale network. BMC Bioinformatics 2009,
10, 19.
[34] Freeman, L. C., Borgatti, S. P., White, D. R., Centrality in val-
ued graphs: a measure of betweenness based on network
flow. Soc. Networks 1991, 13, 141–154.
[35] Valente, A. N., Cusick, M. E., Yeast protein interactome
topology provides framework for co-ordinated functional-
ity. Nucleic Acids Res. 2006, 34, 2812–2819.
[36] Zou, L., Sriswasdi, S., Ross, B., Missiuro, P. V. et al., Sys-
tematic analysis of pleiotropy in C. elegans early embryo-
genesis. PLoS Comput. Biol. 2008, 4, e1000003.
[37] Ahmed, A., Dwyer, T., Hong, S-. H., Murray, C. et al., Visualization
and analysis of large and complex scale-free
networks. Proc. Eurographics – IEEE VGTC Symp. Visual-
ization 2005, 1–8.
[38] Huang, W., Murray, C., Shen, X., Song, L. et al., Visualisa-
tion and analysis of network motifs. Proc. 9th Intl. Conf.
Info. Vis. 2005, 697–702.
[39] Meyers, L. A., Newman, M. E. J., Pourbohloul, B., Predict-
ing epidemics on directed contact networks. J. Theor. Biol.
2006, 240, 400–418.
[40] Fung, D. C. Y., Hong, S-. H., Koschützki, D., Schreiber, F.
et al., 2.5D visualization of overlapping biological net-
works. J. Integr. Bioinform. 2008, 5, 90.
[41] Fung, D. C. Y., Hong, S-. H., Koschützki, D., Schreiber, F.
et al., Visual analysis of overlapping biological network.
Proc. 13th Intl. Conf. Info. Vis. 2009, 337–342.
[42] Cui, Q., Ma, Y., Jaramillo, M., Bari, H. et al., A map of
human cancer signaling. Mol. Syst. Biol. 2007, 3, 152.
[43] Breitkreutz, B. J., Stark, C., Reguly, T., Boucher, L. et al.,
The BioGRID Interaction database: 2008 update. Nucleic
Acids Res. (database issue) 2008, 36, D637–640.
[44] Hsu, C-. N., Lai, J-. M., Liu, C-. H., Tseng, H-. H. et al., De-
tection of the inferred interaction network in hepatocellu-
lar carcinoma from ECHO (Encyclopedia of hepatocellular
carcinoma genes online). BMC Bioinformatics 2007, 8, 66.
[45] Brandes, U., Dwyer, T., Schreiber, F., Visualizing related
metabolic pathways in two and a half dimensions. Lect.
Notes Comput. Sci. 2004, 2912, 111–122.
[46] Tory, M., Möller, T., Human factors in visualization research.
IEEE Trans. Visual. Comput. Graphics 2004, 10,
72–84.

[47] Widjaja, Y. Y., Pang, C. N. I., Li, S. S., Wilkins, M. R. et al.,
The Interactorium: visualising proteins, complexes and in-
teraction networks in a virtual 3D cell. Proteomics 2009, 9,
5309–5315.
[48] Berman, H. M., Westbrook, J., Feng, Z., Gilliland, G. et al.,
Protein data bank. Nucleic Acids Res. 2000, 28, 235–242.
[49] Feeser, E. A., Wolberger, C., Structural and functional stud-
ies of the Rap1 C-terminus reveal novel separation-of-
function mutants. J. Mol. Biol. 2008, 380, 520–531.
[50] Konig, P., Giraldo, R., Chapman, L., Rhodes, D., DNA-
binding domain of Rap1 in complex with telomeric DNA
site. Cell 1996, 85, 125–136.
[51] Kelder, T., Conklin, B. R., Evelo, C. T., Pico, A. R., Finding the
right questions: exploratory pathway analysis to enhance
biological discovery in large datasets. PLoS Biol. 2010, 8,
e1000472.
[52] Ho, Y., Gruhler, A., Heilbut, A., Bader, G. D. et al., System-
atic identification of protein complexes in Saccharomyces
cerevisiae by mass spectrometry. Nature 2002, 415, 180–
183.
[53] Gavin, A. C., Bosche, M., Krause, R., Boesche, M. et al.,
Functional organization of the yeast proteome by system-
atic analysis of protein complexes. Nature 2002, 415, 141–
147.
[54] Gavin, A. C., Aloy, P., Grandi, P., Krause, R. et al., Proteome
survey reveals modularity of the yeast cell machinery. Na-
ture 2006, 440, 631–636.
[55] Krogan, N. J., Cagney, G., Yu, H., Zhong, G. et al., Global
landscape of protein complexes in the yeast Saccha-
romyces cerevisiae. Nature 2006, 440, 637–643.
[56] Krause, R., von Mering, C., Bork, P., Dandekar, T., Shared
components of protein complexes – versatile building
blocks or biochemical artefacts? BioEssays 2004, 26, 1333–
1343.
[57] Ho, E., Webber, R., Wilkins, M. R., Interactive three-
dimensional visualization and contextual analysis of pro-
tein interaction networks. J. Proteome Res. 2008, 7, 104–
112.
[58] Hu, Z., Mellor, J., Wu, J., Kanehisa, M. et al., Towards
zoomable multidimensional maps of the cell. Nat. Biotech-
nol. 2007, 25, 547–554.
[59] Benschop, J. J., Brabers, N., van Leenen, D., Bakker, L. V.
et al., A consensus of core protein complex compositions
for Saccharomyces cerevisiae. Mol. Cell 2010,
38, 916–928.
[60] Wang, H., Kakaradov, B., Collins, S. R., Karotki, L. et al.,
A complex-based reconstruction of the Saccharomyces
cerevisiae interactome. Mol. Cell Proteomics 2009, 8,
1361–1381.
[61] Hart, G. T., Lee, I., Marcotte, E. M., A high-accuracy consensus
map of yeast protein complexes reveals modular
nature of gene essentiality. BMC Bioinformatics 2007, 8,
236.
[62] Li, S. S., Xu, K., Wilkins, M. R., Visualization and analysis
of the complexome network of Saccharomyces cerevisiae.
J. Proteome Res. 2011, 10, 4744–4756.
[63] Fasolo, J., Sboner, A., Sun, M. G., Yu, H. et al., Diverse
protein kinase interactions identified by protein microar-
rays reveal novel connections between cellular processes.
Genes Dev. 2011, 25, 767–778.
[64] Karris, S. T., Networks: Design and Management, Orchard
Publications, Fremont, 2002, pp. 2–5.
[65] Przytycka, T. M., Singh, M., Slonim, D. K., Toward the dy-
namic interactome: it’s about time. Brief Bioinform. 2010,
11, 15–29.
[66] Komurov, K., White, M., Revealing static and dynamic
modular architecture of the eukaryotic protein interaction
network. Mol. Syst. Biol. 2007, 3, 110.
[67] de Lichtenberg, U., Jensen, L. J., Brunak, S., Bork, P., Dy-
namic complex formation during the yeast cell cycle. Sci-
ence 2005, 307, 724–727.
[68] Bianconi, G., Barabási, A. L., Bose-Einstein condensation
in complex networks. Phys. Rev. Lett. 2001, 86, 5632–5635.
[69] Edelman, E. J., Guinney, J., Chi, J-. T., Febbo, P. G. et al.,
Modeling cancer progression via pathway dependencies.
PLoS Comput. Biol. 2008, 4, e28.
[70] Kirouac, D. C., Ito, C., Csaszar, E., Roch, A. et al., Dynamic
interaction networks in a hierarchically organized tissue.
Mol. Syst. Biol. 2010, 6, 417.
[71] Dupuy, D., Bertin, N., Hidalgo, C. A., Venkatesan, K. et
al., Genome-scale analysis of in vivo spatiotemporal pro-
moter activity in Caenorhabditis elegans. Nat. Biotechnol.
2007, 25, 663–668.
[72] Jiao, Y., Tausta, S. L., Gandotra, N., Sun, N. et al., A transcriptome
atlas of rice cell types uncovers cellular, functional
and developmental hierarchies. Nat. Genet. 2009,
41, 258–263.
[73] Keller, M. P., Choi, Y., Wang, P., Davis, D. B. et al., A gene ex-
pression network model of type 2 diabetes links cell cycle
regulation in islets with diabetes susceptibility. Genome
Res. 2008, 18, 706–716.
[74] Dwyer, T., Rolletschek, H., Schreiber, F., Representing ex-
perimental biological data in metabolic networks. Proc.
2nd Asia-Pacific Bioinform. Conf. 2004, 29, 13–20.
[75] Goel, A., Li, S. S., Wilkins, M. R., Four-dimensional visu-
alization and analysis of protein-protein interaction net-
works. Proteomics 2011, 11, 1–11.
[76] Moody, J., McFarland, D., Bender-deMoll, S., Dynamic
network visualization. Am. J. Sociol. 2005, 110, 1206–1241.
[77] Spellman, P. T., Sherlock, G., Zhang, M. Q., Iyer, V. R. et
al., Comprehensive identification of cell cycle-regulated
genes of the yeast Saccharomyces cerevisiae by microar-
ray hybridization. Mol. Biol. Cell 1998, 9, 3273–3297.
[78] Bertin, N., Simonis, N., Dupuy, D., Cusick, M. E. et al.,
Confirmation of organized modularity in the yeast inter-
actome. PLoS Biol. 2007, 5, e153.
[79] Tufte, E. R., The Visual Display of Quantitative Information,
2nd Edn., Graphics Press LLC, Cheshire 2001.
[80] Kadyrov, F. A., Holmes, S. F., Arana, M. E., Lukianov, O. A.
et al., Saccharomyces cerevisiae MutLalpha is a mismatch
repair endonuclease. J. Biol. Chem. 2007, 282, 37181–
37190.

[81] Marti, T. M., Kunz, C., Fleck, O., DNA mismatch repair and
mutation avoidance pathways. J. Cell Physiol. 2002, 191,
28–41.
[82] Kunkel, T. A., Erie, D. A., DNA mismatch repair. Ann. Rev.
Biochem. 2005, 74, 681–710.
[83] Curtis, R. E., Yuen, A., Song, L., Goyal, A. et al., TVNViewer:
an interactive visualization tool for exploring networks
that change over time or space. Bioinformatics 2011, 27,
1880–1881.
[84] Pestov, I., Verga, S., Dynamical networks as a tool for sys-
tem analysis and exploration. Proc. IEEE Symp. Comput.
Intell. Security and Defense Appl. 2009 (CISDA 2009), pa-
per no. 05356527.
[85] Gene Ontology Consortium, The Gene Ontology (GO)
project in 2006. Nucleic Acids Res. (database issue) 2006,
34, D322–326.
[86] Dotan-Cohen, D., Letovsky, S., Melkman, A. A., Kasif, S.,
Biological process linkage networks. PLoS One 2009, 4,
e5313.
[87] Card, S. K., MacKinlay, J. D., Shneiderman, B., Card, M.,
Readings in Information Visualization: Using Vision to
Think, Morgan Kaufmann Publishers, San Francisco 1999,
pp. 1–32.
[88] Branke, J., Dynamic graph drawing. Lect. Notes Comput.
Sci. 2001, 2025, 228–246.
[89] Lu, P., Vogel, C., Wang, R., Yao, X. et al., Absolute pro-
tein expression profiling estimates the relative contribu-
tions of transcriptional and translational regulation. Nat.
Biotechnol. 2007, 25, 117–124.
[90] Raj, A., Peskin, C. S., Tranchina, C. S., Vargas, D. Y. et
al., Stochastic mRNA synthesis in mammalian cells. PLoS
Biol. 2006, 4, e309.
[91] Lee, M. V., Topper, S. E., Hubler, S. L., Hose, J. et al., A
dynamic model of proteome changes reveals new roles
for transcript alteration in yeast. Mol. Syst. Biol. 2011, 7,
514.
[92] Ghaemmaghami, S., Huh, W-. K., Bower, K., Howson, R.
W. et al., Global analysis of protein expression in yeast.
Nature 2003, 425, 737–741.
[93] Rachlin, J., Cohen, D. D., Cantor, C., Kasif, S., Biological
context networks: a mosaic view of the interactome. Mol.
Syst. Biol. 2006, 2, 66.
[94] Huh, W. K., Falvo, J. V., Gerke, L. C., Carroll, A. S. et al.,
Global analysis of protein localization in budding yeast.
Nature 2003, 425, 686–691.
[95] Alon, U., Network motifs in developmental, signal
transduction, and the neuronal networks, in: An In-
troduction to Systems Biology: Design Principles of
Biological Circuits, Chapman & Hall/CRC Mathemati-
cal and Computational Biology Series 2007, pp. 97–
134.
[96] Fung, D. C. Y., Wilkins, M. R., Hart, D., Hong, S-. H., Using
clustered circular layout as an informative method for vi-
sualizing protein-protein interaction network. Proteomics
2010, 10, 2723–2727.
[97] DaFonseca, C. J., Shu, F., Zhang, J. J., Identification of two
residues in MCM5 critical for the assembly of the MCM
complexes and Stat1-mediated transcription activation in
response to IFN-γ. Proc. Natl. Acad. Sci. USA 2001, 98,
3034–3039.
[98] Gonzalez, M. A., Tachibana, K. K., Laskey, R. A., Coleman,
N., Control of DNA replication and its potential clinical
exploitation. Nat. Rev. Cancer 2005, 5, 135–141.
[99] Fan, Z., Beresford, P. J., Zhang, D., Lieberman, J., HMG2 in-
teracts with the nucleosome assembly protein SET and is
a target of the cytotoxic T-lymphocyte protease granzyme
A. Mol. Cell. Biol. 2002, 22, 2810–2820.
[100] Bustin, M., Regulation of DNA-dependent activities by the
functional motifs of the high-mobility-group chromoso-
mal proteins. Mol. Cell. Biol. 1999, 19, 5237–5246.
[101] Amar, R. A., Stasko, J. T., Knowledge precepts for design
and evaluation of information visualization. IEEE Trans.
Visual. Comput. Graphics 2005, 11, 432–442.
[102] Brückner, A., Polge, C., Lentze, N., Auerbach, D. et al., Yeast
two-hybrid, a powerful tool for systems biology. Int. J.
Mol. Sci. 2009, 10, 2763–2788.
[103] Venkatesan, K., Rual, J-. F., Vazquez, A., Stelzl, U. et al.,
An empirical framework for binary interactome mapping.
Nat. Methods 2009, 6, 83–90.
[104] Klamt, S., Haus, U-. U., Theis, F., Hypergraphs and cellular
networks. PLoS Comput. Biol. 2009, 5, e1000385.
[105] Tipney, H. J., Schuyler, R. P., Hunter, L., Consistent visual-
izations of changing knowledge. Summit Translat. Bioin-
form. 2009, 2009, 129–132.
[106] Praneenararat, T., Takagi, T., Iwasaki, W., Interactive, mul-
tiscale navigation of large and complicated biological net-
works. Bioinformatics 2011, 27, 1121–1127.
[107] Kelder, T., Conklin, B. R., Evelo, C. T., Pico, A. R., Finding the
right questions: Exploratory pathway analysis to enhance
biological discovery in large datasets. PLoS Biol. 2010, 8,
e1000472.
[108] Hase, T., Tanaka, H., Suzuki, Y., Nakagawa, S. et al., Struc-
ture of protein interaction networks and their implications
on drug design. PLoS Comput. Biol. 2009, 5, e1000550.
[109] Braun, P., Tasan, M., Dreze, M., Barrios-Rodiles, M. et al.,
An experimentally derived confidence score for binary
protein-protein interactions. Nat. Methods 2009, 6, 91–97.
[110] Nielsen, J., Heuristic Evaluation, in: Nielsen, J., Mack, R.
L. (Eds.), Usability Inspection Methods, John Wiley and
Sons Inc., New York 1994, pp. 25–62.
[111] Zuk, T., Schlesier, L., Neumann, P., Hancock, M. S. et al.,
Heuristics for information visualization evaluation. Proc.
2006 AVI workshop on beyond time and errors; novel eval-
uation methods for information visualization, Association
for Computing Machinery, New York 2006.
[112] Ahmed, A., Dwyer, T., Murray, C., Song, L. et al., Info-
Vis 2004 Contest: Wilmascope graph visualization. Proc.
IEEE Symposium on Information Visualization 2004 (In-
foVis 2004), IEEE Computer Society, Los Alamitos 2004,
p. r4.
