This document discusses different methods for visualizing protein-protein interaction (PPI) networks and interactomes. It begins by defining a PPI network as a graph model containing nodes (proteins) and edges (interactions). Simple undirected network visualizations ignore interaction dynamics but reveal global properties like heterogeneous connectivity. More advanced methods incorporate biological context. Betweenness fast layout emphasizes bottleneck proteins that connect functional modules. Integrated interactome visualizations combine PPI networks with signaling and gene regulatory networks for more insight. Dynamic and modular visualizations capture temporal changes and biological functions. Effective visualization requires balancing biological fidelity with comprehensibility.
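The idea of a bottleneck protein can be made concrete with betweenness centrality, the number of shortest paths that pass through a node. The following is a minimal sketch, not the layout method the document describes: it runs Brandes' algorithm on a toy undirected graph. The node names `A` to `G` and the two-module topology are hypothetical, chosen only for illustration.

```python
from collections import deque

def betweenness(adj):
    """Unnormalized betweenness for an undirected, unweighted graph (Brandes' algorithm)."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma)
        # and recording each node's predecessors on those paths.
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        preds = {v: [] for v in adj}
        sigma[s], dist[s] = 1, 0
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate path dependencies in reverse BFS order.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Every undirected pair was counted once from each endpoint.
    return {v: c / 2 for v, c in score.items()}

# Hypothetical toy network: two triangles (modules) joined through node D.
adj = {
    "A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B", "D"],
    "D": ["C", "E"],
    "E": ["D", "F", "G"], "F": ["E", "G"], "G": ["E", "F"],
}
bc = betweenness(adj)
bottleneck = max(bc, key=bc.get)
print(bottleneck, bc[bottleneck])
```

In this toy graph the connecting node `D` scores highest, which is the property a betweenness-driven layout would emphasize.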
This document proposes using a Dynamic Bayesian Network approach to integrate multiple omics datasets (genomics, proteomics, metabolomics) to reconstruct gene regulatory networks and signaling pathways. It involves:
1) Learning separate Bayesian Networks from transcriptomics and other omics data
2) Using semi-parametric distributions like Gaussian Mixtures for local probabilities to account for multi-modality in omics data
3) Automatically incorporating prior biological knowledge and epigenetic variations into the network learning
4) Merging the learned networks using a consensus approach to produce the final causal network.
This integrated approach aims to more accurately reconstruct key cancer signaling pathways for applications in personalized medicine.
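As a rough illustration of why a Gaussian mixture can capture multi-modality where a single Gaussian cannot, the sketch below evaluates a one-dimensional two-component mixture density. The weights, means, and standard deviations are made-up values, and the function names are hypothetical; the summary does not specify how the local distributions are parameterized or fitted.

```python
import math

def normal_pdf(x, mean, std):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def mixture_pdf(x, weights, means, stds):
    """Density of a Gaussian mixture: weighted sum of component densities."""
    return sum(w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, stds))

# Assumed example parameters: two equally weighted modes at -2 and +2.
weights, means, stds = [0.5, 0.5], [-2.0, 2.0], [1.0, 1.0]
for x in (-2.0, 0.0, 2.0):
    print(x, round(mixture_pdf(x, weights, means, stds), 4))
```

The density is higher at each mode than at the midpoint between them, which is the bimodal shape a single Gaussian would smooth away.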
Clustering high-dimensional data, which now arises in almost every field, is becoming increasingly difficult. The key disadvantage of high-dimensional data is the curse of dimensionality: as the dimensionality of a dataset grows, the data points become sparse and the density of any region decreases, which makes the data hard to cluster and degrades the performance of traditional clustering algorithms. Semi-supervised clustering algorithms aim to improve clustering results using limited supervision. The supervision is generally given as pairwise constraints; such constraints are natural for graphs, yet most semi-supervised clustering algorithms are designed for data represented as vectors [2]. In this paper, we unify vector-based and graph-based approaches. We first show that a recently proposed objective function for semi-supervised clustering based on Hidden Markov Random Fields, with squared Euclidean distance and a certain class of constraint penalty functions, can be expressed as a special case of the global kernel k-means objective [3]. A recent theoretical connection between global kernel k-means and several graph clustering objectives enables us to perform semi-supervised clustering of data. In particular, some methods have been proposed for semi-supervised clustering based on pairwise similarity or dissimilarity information. In this paper, we propose a kernel approach for semi-supervised clustering and present in detail two special cases of this kernel approach.
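One common way to fold pairwise constraints into a kernel is to add a reward to the kernel entry of each must-link pair and subtract a penalty from each cannot-link pair, then measure point-to-cluster distances entirely through the kernel. The sketch below assumes a linear kernel on one-dimensional points and a single constraint weight `w`; both are illustrative choices, and the abstract does not state the exact penalty class or any diagonal shift the authors use.

```python
def constrained_kernel(xs, must_link=(), cannot_link=(), w=5.0):
    """Linear kernel K[i][j] = xs[i] * xs[j], adjusted by pairwise constraints.

    Must-link pairs gain +w similarity, cannot-link pairs lose w (assumed scheme).
    """
    n = len(xs)
    K = [[xs[i] * xs[j] for j in range(n)] for i in range(n)]
    for i, j in must_link:
        K[i][j] += w
        K[j][i] += w
    for i, j in cannot_link:
        K[i][j] -= w
        K[j][i] -= w
    return K

def kernel_distance(K, i, cluster):
    """Squared distance from point i to the centroid of `cluster` in feature space."""
    size = len(cluster)
    cross = sum(K[i][j] for j in cluster)
    within = sum(K[j][l] for j in cluster for l in cluster)
    return K[i][i] - 2.0 * cross / size + within / (size * size)

xs = [0.0, 1.0, 4.0, 5.0]
plain = constrained_kernel(xs)
linked = constrained_kernel(xs, must_link=[(1, 2)])

# Distance from point 1 to the cluster {2, 3}, without and with the constraint.
print(kernel_distance(plain, 1, [2, 3]), kernel_distance(linked, 1, [2, 3]))
```

With no constraints the kernel distance reduces to the ordinary squared Euclidean distance to the cluster mean; the must-link constraint pulls point 1 closer to the cluster containing point 2.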
Sentence compression via clustering of dependency graph nodes - NLP-KE 2012 | Ayman El-Kilany
This paper proposes an unsupervised model for sentence compression based on clustering the nodes of a sentence's dependency graph. The model first clusters related nodes into chunks using the Louvain clustering method. It then merges chunks based on linguistic rules to improve coherence. Candidate compressions are generated by removing chunks, and scored based on language models and word importance to select the best compression. An experiment found the proposed method performed better than a recent supervised technique.
This document summarizes a research paper that presents a hybrid approach for detecting and localizing color text in natural scene images. The approach uses both region-based and connected component-based methods. In the preprocessing stage, a text region detector is used to detect text regions and generate candidate text components. A conditional random field model combines unary component properties and binary contextual relationships to filter non-text components. Finally, neighboring text components are grouped into text lines or words using a learning-based energy minimization method. The paper is evaluated on a natural scene image dataset and shows improvements over existing methods.
This document describes a novel graph embedding procedure based on simplicial complexes for graph classification tasks. Simplicial complexes are mathematical objects that can capture multi-way relationships in data beyond pairwise relationships. The proposed approach uses simplicial complexes to extract meaningful substructures from graphs, clusters these substructures to form an alphabet, and then embeds each graph as a symbolic histogram over the alphabet. This moves the problem into a metric space where standard machine learning algorithms can be applied. The approach is tested on 30 graph classification benchmarks and two protein analysis applications to demonstrate its effectiveness.
A survey on methods and applications of meta-learning with GNNs | Shreya Goyal
This survey paper provides a comprehensive review of works that combine graph neural networks (GNNs) and meta-learning, together with a summary of methods and applications in these categories. The application of meta-learning to GNNs is a growing and exciting field; many graph problems will benefit immensely from the combination of the two approaches.
IRJET- An Analysis of Recent Advancements on the Dependency Parser | IRJET Journal
This document summarizes recent advancements in dependency parsers. It discusses how dependency parsers have been used to parse languages with free word order like Hindi and analyze source code from various programming languages. Several studies are highlighted that have used dependency parsers to extract semantic relationships, identify errors in automatic speech recognition, incorporate long-distance dependencies, and address feature sparseness issues. Dependency parsers have been shown to outperform other models for tasks like topic detection and can parse biomedical text, though both Link Grammar and Connexor Machinese Syntax parsers were found to have limitations for the biomedical domain.
International Journal of Computer Science, Engineering and Information Techno... | IJCSEIT Journal
In proteomics, as more data are added, computational methods need to become more efficient. The parts of a molecular sequence that are functionally more important to the molecule are more resistant to change. Comparative approaches are used to ensure the reliability of sequence alignment. The problem of multiple sequence alignment is a proposition of evolutionary history: for each column in the alignment, the explicit homologous correspondence of each individual sequence position is established. The present work elaborates the different pairwise sequence alignment methods, but these methods are only used for aligning a limited number of short sequences. A new method is introduced for aligning sequences based on local alignment with consensus sequences. Triticum (wheat) varieties are loaded from the NCBI databank. Phylogenetic trees are constructed for partitions of the dataset, and a single new tree is constructed from the previously generated trees using an advanced pruning technique. Closely related sequences are then extracted by applying threshold conditions, and an optimal sequence alignment is obtained by using shift operations in both directions.
A Low Rank Mechanism to Detect and Achieve Partially Completed Image Tags | IRJET Journal
1. The document proposes a low-rank mechanism to detect and complete partially tagged images by approximating a global nonlinear model with local linear models using locality sensitivity and low-rank factorization.
2. It describes searching images based on category, keywords, or non-similar images and re-ranking images based on user likes/dislikes to increase the rank of more viewed images.
3. The proposed method is evaluated on a dataset showing its effectiveness over previous approaches through improved accuracy.
Improved wolf algorithm on document images detection using optimum mean techn... | journalBEEI
Detecting text from handwriting in historical documents provides high-level features for the challenging problem of handwriting recognition. Such handwriting often contains noise, faint or incomplete strokes, strokes with gaps, and competing lines when embedded in a table or form, making it unsuitable for local line-following algorithms or associated binarization schemes. This paper presents a method based on an optimum threshold value, named the Optimum Mean method. The Wolf method fails to detect thin text in non-uniform input images; the proposed method overcomes this problem by suggesting a maximum threshold value using the optimum mean. Based on the calculation, the proposed method obtained a higher F-measure (74.53) and PSNR (14.77) and a lower NRM (0.11) than the Wolf method. In conclusion, the proposed method is successful and effective in solving the Wolf method's problem, producing a high-quality output image.
GPCODON ALIGNMENT: A GLOBAL PAIRWISE CODON BASED SEQUENCE ALIGNMENT APPROACH | ijdms
The alignment of two DNA sequences is a basic step in the analysis of biological data. Sequencing a long DNA sequence is one of the most interesting problems in bioinformatics. Several techniques have been developed to solve this sequence alignment problem, such as dynamic programming and heuristic algorithms. In this paper, we introduce GPCodon alignment, a pairwise DNA-DNA method for global sequence alignment that improves the accuracy of pairwise sequence alignment. We use a new scoring matrix, called the empirical codon substitution matrix, to produce the final alignment. Using this matrix in our technique enabled the discovery of new relationships between sequences that could not be discovered using traditional matrices. In addition, we present experimental results that show the performance of the proposed technique over eleven datasets with an average length of 2967 bps. We compared the efficiency and accuracy of our technique against a comparable tool called “Pairwise Align Codons” [1].
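For readers unfamiliar with the dynamic programming approach mentioned above, here is a minimal sketch of global pairwise alignment scoring in the style of Needleman-Wunsch. It scores single nucleotides with simple assumed match, mismatch, and gap values; it is not the GPCodon method and does not use the empirical codon substitution matrix, whose entries are not given in the abstract.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Optimal global alignment score of strings a and b (linear gap penalty)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against the empty string costs one gap per character.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap in b
            left = score[i][j - 1] + gap    # gap in a
            score[i][j] = max(diag, up, left)
    return score[-1][-1]

print(global_alignment_score("GATTACA", "GATTACA"))
print(global_alignment_score("ACGT", "ACT"))
```

A codon-based variant would step through the sequences three nucleotides at a time and look up substitution scores in a codon matrix instead of using the fixed `match` and `mismatch` values.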
Graph Algorithm to Find Core Periphery Structures using Mutual K-nearest Neig... | gerogepatton
Core periphery structures exist naturally in many complex networks in the real-world like social, economic, biological and metabolic networks. Most of the existing research efforts focus on the identification of a meso scale structure called community structure. Core periphery structures are another equally important meso scale property in a graph that can help to gain deeper insights about the relationships between different nodes. In this paper, we provide a definition of core periphery structures suitable for weighted graphs. We further score and categorize these relationships into different types based upon the density difference between the core and periphery nodes. Next, we propose an algorithm called CP-MKNN (Core Periphery-Mutual K Nearest Neighbors) to extract core periphery structures from weighted graphs using a heuristic node affinity measure called Mutual K-nearest neighbors (MKNN). Using synthetic and real-world social and biological networks, we illustrate the effectiveness of developed core periphery structures.
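The mutual k-nearest-neighbour relation itself is simple to state: two nodes are mutual neighbours when each appears in the other's k-nearest list. The sketch below illustrates only that relation, not the CP-MKNN algorithm; it assumes edge weights are distances (smaller means closer), and the node names and coordinates are hypothetical.

```python
def mutual_knn_pairs(dist, k):
    """Return the set of node pairs that are in each other's k-nearest neighbours.

    `dist[i][j]` is assumed to be a symmetric distance between nodes i and j.
    """
    def knn(i):
        # Sort other nodes by distance, breaking ties by name for determinism.
        others = sorted((j for j in dist if j != i), key=lambda j: (dist[i][j], j))
        return set(others[:k])

    neighbours = {i: knn(i) for i in dist}
    return {
        (i, j)
        for i in dist
        for j in neighbours[i]
        if i < j and i in neighbours[j]
    }

# Hypothetical nodes placed on a line; distance is the absolute difference.
coords = {"a": 0.0, "b": 1.0, "c": 3.0, "d": 10.0}
dist = {i: {j: abs(coords[i] - coords[j]) for j in coords} for i in coords}

print(sorted(mutual_knn_pairs(dist, 1)))
print(sorted(mutual_knn_pairs(dist, 2)))
```

The outlying node `d` picks neighbours but is never picked in return, so it forms no mutual pair; that asymmetry is what makes the measure useful for separating dense cores from sparse peripheries.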
This document discusses using hidden Markov models (HMMs) for unsupervised learning in hyperspectral image classification. It proposes an HMM-based probability density function classifier that models hyperspectral data using a reduced feature space. The approach uses an unsupervised learning scheme for maximum likelihood parameter estimation, combining both model selection and estimation. This HMM method can accurately model and synthesize approximate observations of true hyperspectral data in a reduced feature space without relying on supervised learning.
This paper presents the design of an experiment to select relevant, non-redundant features (characterization functions) that allow quantitative discrimination among different types of complex networks. Although some researchers have classified real-world networks into a type of complex network using characterization functions, they do not provide enough detailed analysis of those functions to determine whether all of them are necessary for efficient discrimination or which functions discriminate best. Our results show that a reduced number of characterization functions, such as the degree dispersion coefficient, can discriminate efficiently among the types of complex networks treated here.
COMPARISON BETWEEN GENETIC FUZZY METHODOLOGY AND Q-LEARNING FOR COLLABORATIVE... | ijaia
A comparison between two machine learning approaches viz., Genetic Fuzzy Methodology and Q-learning,
is presented in this paper. The approaches are used to model controllers for a set of collaborative robots
that need to work together to bring an object to a target position. The robots are fixed and are attached to
the object through elastic cables. A major constraint considered in this problem is that the robots cannot
communicate with each other. This means that at any instant, each robot has no motion or control
information of the other robots and it can only pull or release its cable based only on the motion states of
the object. This decentralized control problem provides a good example to test the capabilities and
restrictions of these two machine learning approaches. The system is first trained using a set of training
scenarios and then applied to an extensive test set to check the generalization achieved by each method.
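For context, the core of tabular Q-learning is a single update rule applied after each observed transition. The sketch below shows that standard rule only; the state labels, the `pull` and `release` actions, the reward values, and the learning rate and discount factor are all assumptions for illustration, since the abstract does not describe how the paper encodes the object's motion states or rewards.

```python
def q_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q-value.

    q maps (state, action) pairs to values; missing entries count as 0.0.
    """
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]

# Hypothetical two-action robot: it can only pull or release its cable.
actions = ["pull", "release"]
q = {}
first = q_update(q, "s0", "pull", 1.0, "s1", actions)
q[("s1", "release")] = 2.0
second = q_update(q, "s0", "pull", 0.0, "s1", actions)
print(first, second)
```

Because each robot updates its own table from the object's observed state alone, a rule of this form fits the no-communication constraint described above.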
Centrality Prediction in Mobile Social Networks | IJERA Editor
By analyzing evolving centrality roles using time-dependent graphs, researchers may predict future centrality values. This may prove invaluable in designing efficient routing and energy-saving strategies and may have profound implications for evolving social behavior in dynamic social networks. In this paper, we propose a new method to predict centrality values of nodes in a dynamic environment. The proposed method is based on calculating the correlation between current and past measures of centrality for each corresponding node, which is used to form a composite vector representing the given state of centralities. The performance of the proposed method is evaluated through simulated predictions on data sets from real mobile networks. Results indicate that a suitable implementation of the proposed method yields a significantly low prediction error rate.
Optimized Neural Network for Classification of Multispectral Images | IDES Editor
This document summarizes an article that proposes using a multiobjective particle swarm optimization (MOPSO) approach to optimize the structure of an artificial neural network for classifying multispectral satellite images. Specifically, the MOPSO is used to simultaneously select the most discriminative spectral bands from the available options and determine the optimal number of nodes in the hidden layer of the neural network. The MOPSO approach is compared to traditional classifiers like maximum likelihood classification and Euclidean classifiers. The results show that the MOPSO-optimized neural network approach provides superior performance for remote sensing image classification problems.
Molecular docking is a computational method that predicts the preferred orientation of one molecule to another when bound and forming a stable complex. It involves finding the best match between two molecules and can be used for drug design and development by predicting the binding affinity between potential drug candidates and their protein targets. Common molecular docking approaches include shape complementarity, which describes interacting molecules as complementary surfaces, and simulation methods, which simulate the actual docking process and calculate interaction energies between molecules. Popular molecular docking software includes AutoDock, FlexX, and GOLD.
DCT AND DFT BASED BIOMETRIC RECOGNITION AND MULTIMODAL BIOMETRIC SECURITY | IAEME Publication
This research paper discusses the study and analysis of various techniques in the biometric domain. It takes a close look at biometric enhancement techniques and their limitations. This process enables researchers to understand the research contributions in the area of DCT- and DFT-based recognition and security and to locate some crucial limitations of this notable research. The paper summarizes the different research papers applicable to this topic. Biometric recognition and security is an important research subject in the area of image processing.
AI approaches in healthcare - targeting precise and personalized medicine | DayOne
1) AI approaches show promise in precision medicine by mining medical records to design personalized treatments and power virtual healthcare assistants.
2) The document discusses the role of AI in healthcare, predicting it could save $150 billion annually and reduce treatment costs by 50% while improving outcomes by 30-40%.
3) The document then describes several IBM Research projects applying AI to healthcare, including reconstructing molecular networks from literature, integrating data to stratify patients, and using network-based models to predict drug sensitivity and identify biomarkers.
Text documents clustering using modified multi-verse optimizer | IJECEIAES
In this study, a multi-verse optimizer (MVO) is utilised for the text document clustering (TDC) problem. TDC is treated as a discrete optimization problem, and an objective function based on the Euclidean distance is applied as similarity measure. TDC is tackled by the division of the documents into clusters; documents belonging to the same cluster are similar, whereas those belonging to different clusters are dissimilar. MVO, which is a recent metaheuristic optimization algorithm established for continuous optimization problems, can intelligently navigate different areas in the search space and search deeply in each area using a particular learning mechanism. The proposed algorithm is called MVOTDC, and it adopts the convergence behaviour of MVO operators to deal with discrete, rather than continuous, optimization problems. For evaluating MVOTDC, a comprehensive comparative study is conducted on six text document datasets with various numbers of documents and clusters. The quality of the final results is assessed using precision, recall, F-measure, entropy accuracy, and purity measures. Experimental results reveal that the proposed method performs competitively in comparison with state-of-the-art algorithms. Statistical analysis is also conducted and shows that MVOTDC can produce significant results in comparison with three well-established methods.
CONTEXT-AWARE CLUSTERING USING GLOVE AND K-MEANS | ijseajournal
In this paper we propose a novel method to cluster categorical data while retaining their context. Typically, clustering is performed on numerical data. However, it is often useful to cluster categorical data as well, especially when dealing with data in real-world contexts. Several methods exist which can cluster categorical data, but our approach is unique in that we use recent text-processing and machine learning advancements like GloVe and t-SNE to develop a context-aware clustering approach (using pre-trained word embeddings). We encode words or categorical data into numerical, context-aware vectors that we use to cluster the data points using common clustering algorithms like K-means.
This document presents brief descriptions of several digital tools such as Blogger (a blogging service), About.me (an online business card), WordCloud (a tool for creating word clouds), Educaplay (a platform for creating educational activities), Storybird (a social network for illustrated stories), Pixlr (an image-editing application), PicMonkey (a site for adding effects to photos), Prezi (a tool for presentations with photos and music) and SlideShare
Exhibition stand builder in Paris: Groupe Smart Dezign By RZ | Zeroukhi
Groupe Smart Dezign By RZ is an exhibition stand builder located in Paris.
After installing and decorating your stand, the SMART DEZIGN team supports you throughout your exhibition with its best advice. In this way, we guarantee you an excellent result.
The document provides details about an IT professional's objective, qualifications, education, and extensive experience in networking roles for various organizations in Afghanistan, highlighting skills and responsibilities related to designing, implementing, configuring, and troubleshooting networks, servers, security, routing, switching, and more. The professional seeks to further their technical and professional excellence by contributing to consulting projects.
Blogger lets you create and publish online blogs without writing code or installing software. About.me centralizes links and social network profiles under a single URL. Wordle generates word clouds from lists or texts chosen by the user.
1) en world Australia is a recruitment firm that was founded in 1989 as Calibrate Recruitment and became part of the en world group in 2012, specializing in technical and professional recruitment across industries in Australia and Asia-Pacific.
2) Their mission is to create the best recruitment connections to support long-term client and candidate success, with a vision to be the most trusted recruitment partner in the region.
3) They have a network of over 700 employees operating across 7 countries in Asia-Pacific, with a track record of over 30,000 placements and 500,000 active candidates for 4,600 clients.
A survey on methods and applications of meta-learning with GNNsShreya Goyal
This survey paper has provided a comprehensive review of works that are a combination of graph neural networks (GNNs) and meta-learning. They have also provided a thorough review, summary of methods, and applications in these categories. The application of meta-learning to GNNs is a growing and exciting field; many graph problems will benefit immensely from the combination of the two approaches.
IRJET- An Analysis of Recent Advancements on the Dependency ParserIRJET Journal
This document summarizes recent advancements in dependency parsers. It discusses how dependency parsers have been used to parse languages with free word order like Hindi and analyze source code from various programming languages. Several studies are highlighted that have used dependency parsers to extract semantic relationships, identify errors in automatic speech recognition, incorporate long-distance dependencies, and address feature sparseness issues. Dependency parsers have been shown to outperform other models for tasks like topic detection and can parse biomedical text, though both Link Grammar and Connexor Machinese Syntax parsers were found to have limitations for the biomedical domain.
International Journal of Computer Science, Engineering and Information Techno...IJCSEIT Journal
In the field of proteomics because of more data is added, the computational methods need to be more
efficient. The part of molecular sequences is functionally more important to the molecule which is more
resistant to change. To ensure the reliability of sequence alignment, comparative approaches are used. The
problem of multiple sequence alignment is a proposition of evolutionary history. For each column in the
alignment, the explicit homologous correspondence of each individual sequence position is established. The
different pair-wise sequence alignment methods are elaborated in the present work. But these methods are
only used for aligning the limited number of sequences having small sequence length. For aligning
sequences based on the local alignment with consensus sequences, a new method is introduced. From NCBI
databank triticum wheat varieties are loaded. Phylogenetic trees are constructed for divided parts of
dataset. A single new tree is constructed from previous generated trees using advanced pruning technique.
Then, the closely related sequences are extracted by applying threshold conditions and by using shift
operations in the both directions optimal sequence alignment is obtained.
A Low Rank Mechanism to Detect and Achieve Partially Completed Image TagsIRJET Journal
1. The document proposes a low-rank mechanism to detect and complete partially tagged images by approximating a global nonlinear model with local linear models using locality sensitivity and low-rank factorization.
2. It describes searching images based on category, keywords, or non-similar images and re-ranking images based on user likes/dislikes to increase the rank of more viewed images.
3. The proposed method is evaluated on a dataset showing its effectiveness over previous approaches through improved accuracy.
Improved wolf algorithm on document images detection using optimum mean techn...journalBEEI
Detection text from handwriting in historical documents provides high-level features for the challenging problem of handwriting recognition. Such handwriting often contains noise, faint or incomplete strokes, strokes with gaps, and competing lines when embedded in a table or form, making it unsuitable for local line following algorithms or associated binarization schemes. In this paper, a proposed method based on the optimum threshold value and namely as the Optimum Mean method was presented. Besides, Wolf method unsuccessful in order to detect the thin text in the non-uniform input image. However, the proposed method was suggested to overcome the Wolf method problem by suggesting a maximum threshold value using optimum mean. Based on the calculation, the proposed method obtained a higher F-measure (74.53), PSNR (14.77) and lowest NRM (0.11) compared to the Wolf method. In conclusion, the proposed method successful and effective to solve the wolf problem by producing a high-quality output image.
GPCODON ALIGNMENT: A GLOBAL PAIRWISE CODON BASED SEQUENCE ALIGNMENT APPROACHijdms
The alignment of two DNA sequences is a basic step in the analysis of biological data. Sequencing a long
DNA sequence is one of the most interesting problems in bioinformatics. Several techniques have been
developed to solve this sequence alignment problem like dynamic programming and heuristic algorithms.
In this paper, we introduce (GPCodon alignment) a pairwise DNA-DNA method for global sequence
alignment that improves the accuracy of pairwise sequence alignment. We use a new scoring matrix to
produce the final alignment called the empirical codon substitution matrix. Using this matrix in our
technique enabled the discovery of new relationships between sequences that could not be discovered using
traditional matrices. In addition, we present experimental results that show the performance of the
proposed technique over eleven datasets of average length of 2967 bps. We compared the efficiency and
accuracy of our techniques against a comparable tool called “Pairwise Align Codons” [1].
Graph Algorithm to Find Core Periphery Structures using Mutual K-nearest Neig...gerogepatton
Core periphery structures exist naturally in many complex networks in the real-world like social, economic, biological and metabolic networks. Most of the existing research efforts focus on the identification of a meso scale structure called community structure. Core periphery structures are another equally important meso scale property in a graph that can help to gain deeper insights about the relationships between different nodes. In this paper, we provide a definition of core periphery structures suitable for weighted graphs. We further score and categorize these relationships into different types based upon the density difference between the core and periphery nodes. Next, we propose an algorithm called CP-MKNN (Core Periphery-Mutual K Nearest Neighbors) to extract core periphery structures from weighted graphs using a heuristic node affinity measure called Mutual K-nearest neighbors (MKNN). Using synthetic and real-world social and biological networks, we illustrate the effectiveness of developed core periphery structures.
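The mutual k-nearest-neighbor relation mentioned above can be sketched as follows. This shows only the MKNN affinity idea on a weighted graph (two nodes are mutual neighbors when each appears among the other's k heaviest edges); it is not the CP-MKNN algorithm itself, and the function names, the tie-breaking rule, and the toy weights are assumptions made for illustration.

```python
def top_k_neighbors(weights, k):
    """For each node, return the set of its k highest-weight neighbors.

    `weights` maps node -> {neighbor: edge weight}. Ties are broken by
    neighbor name so the result is deterministic (an assumption of this sketch).
    """
    return {
        u: {v for v, _ in sorted(nbrs.items(), key=lambda kv: (-kv[1], kv[0]))[:k]}
        for u, nbrs in weights.items()
    }


def mutual_knn_pairs(weights, k):
    """Return the unordered pairs {u, v} where each node is in the other's top k."""
    top = top_k_neighbors(weights, k)
    return {
        frozenset((u, v))
        for u in top
        for v in top[u]
        if u in top.get(v, set())
    }


# Toy symmetric weighted graph (hypothetical data).
toy = {
    "a": {"b": 0.9, "c": 0.2},
    "b": {"a": 0.9, "c": 0.8, "d": 0.1},
    "c": {"a": 0.2, "b": 0.8, "d": 0.7},
    "d": {"b": 0.1, "c": 0.7},
}
print(mutual_knn_pairs(toy, 2))
```

With k = 1 only the strongest reciprocal tie survives; raising k admits weaker but still reciprocal ties, which is why k acts as a density knob in approaches of this kind.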
This document discusses using hidden Markov models (HMMs) for unsupervised learning in hyperspectral image classification. It proposes an HMM-based probability density function classifier that models hyperspectral data using a reduced feature space. The approach uses an unsupervised learning scheme for maximum likelihood parameter estimation, combining both model selection and estimation. This HMM method can accurately model and synthesize approximate observations of true hyperspectral data in a reduced feature space without relying on supervised learning.
This paper presents the design of an experiment to select relevant, non-redundant features (characterization functions) that allow quantitative discrimination among different types of complex networks. Although some researchers have classified real-world networks into complex-network types using characterization functions, they do not provide enough detailed analysis to determine whether all of the functions are necessary for efficient discrimination, or which functions discriminate best. Our results show that a reduced set of characterization functions, such as the degree dispersion coefficient, can discriminate efficiently among the types of complex networks treated here.
COMPARISON BETWEEN GENETIC FUZZY METHODOLOGY AND Q-LEARNING FOR COLLABORATIVE... ijaia
A comparison between two machine learning approaches viz., Genetic Fuzzy Methodology and Q-learning,
is presented in this paper. The approaches are used to model controllers for a set of collaborative robots
that need to work together to bring an object to a target position. The robots are fixed and are attached to
the object through elastic cables. A major constraint considered in this problem is that the robots cannot
communicate with each other. This means that at any instant, each robot has no motion or control
information of the other robots and it can only pull or release its cable based only on the motion states of
the object. This decentralized control problem provides a good example to test the capabilities and
restrictions of these two machine learning approaches. The system is first trained using a set of training
scenarios and then applied to an extensive test set to check the generalization achieved by each method.
Centrality Prediction in Mobile Social Networks IJERA Editor
By analyzing evolving centrality roles using time-dependent graphs, researchers may predict future centrality values. This may prove invaluable in designing efficient routing and energy-saving strategies and may have profound implications for evolving social behavior in dynamic social networks. In this paper, we propose a new method to predict the centrality values of nodes in a dynamic environment. The proposed method is based on calculating the correlation between the current and past measures of centrality for each corresponding node, which is used to form a composite vector representing the given state of centralities. The performance of the proposed method is evaluated through simulated predictions on data sets from real mobile networks. Results indicate that a suitable implementation of the proposed method yields a low prediction error rate.
Optimized Neural Network for Classification of Multispectral Images IDES Editor
This document summarizes an article that proposes using a multiobjective particle swarm optimization (MOPSO) approach to optimize the structure of an artificial neural network for classifying multispectral satellite images. Specifically, the MOPSO is used to simultaneously select the most discriminative spectral bands from the available options and determine the optimal number of nodes in the hidden layer of the neural network. The MOPSO approach is compared to traditional classifiers like maximum likelihood classification and Euclidean classifiers. The results show that the MOPSO-optimized neural network approach provides superior performance for remote sensing image classification problems.
Molecular docking is a computational method that predicts the preferred orientation of one molecule to another when bound and forming a stable complex. It involves finding the best match between two molecules and can be used for drug design and development by predicting the binding affinity between potential drug candidates and their protein targets. Common molecular docking approaches include shape complementarity, which describes interacting molecules as complementary surfaces, and simulation methods, which simulate the actual docking process and calculate interaction energies between molecules. Popular molecular docking software includes AutoDock, FlexX, and GOLD.
DCT AND DFT BASED BIOMETRIC RECOGNITION AND MULTIMODAL BIOMETRIC SECURITY IAEME Publication
This research paper discusses the study and analysis of various techniques in the biometric domain. It takes a close look at biometric enhancement techniques and their limitations. This should help researchers understand the research contributions in the area of DCT- and DFT-based recognition and security, and locate some crucial limitations of this notable research. The paper summarizes the research papers applicable to this topic. Biometric recognition and security is an important research subject in the area of image processing.
AI approaches in healthcare - targeting precise and personalized medicine DayOne
1) AI approaches show promise in precision medicine by mining medical records to design personalized treatments and power virtual healthcare assistants.
2) A document discusses the role of AI in healthcare, predicting it could save $150 billion annually and reduce treatment costs by 50% while improving outcomes by 30-40%.
3) The document then describes several IBM Research projects applying AI to healthcare, including reconstructing molecular networks from literature, integrating data to stratify patients, and using network-based models to predict drug sensitivity and identify biomarkers.
Text documents clustering using modified multi-verse optimizer IJECEIAES
In this study, a multi-verse optimizer (MVO) is utilised for the text document clustering (TDC) problem. TDC is treated as a discrete optimization problem, and an objective function based on the Euclidean distance is applied as similarity measure. TDC is tackled by the division of the documents into clusters; documents belonging to the same cluster are similar, whereas those belonging to different clusters are dissimilar. MVO, which is a recent metaheuristic optimization algorithm established for continuous optimization problems, can intelligently navigate different areas in the search space and search deeply in each area using a particular learning mechanism. The proposed algorithm is called MVOTDC, and it adopts the convergence behaviour of MVO operators to deal with discrete, rather than continuous, optimization problems. For evaluating MVOTDC, a comprehensive comparative study is conducted on six text document datasets with various numbers of documents and clusters. The quality of the final results is assessed using precision, recall, F-measure, entropy accuracy, and purity measures. Experimental results reveal that the proposed method performs competitively in comparison with state-of-the-art algorithms. Statistical analysis is also conducted and shows that MVOTDC can produce significant results in comparison with three well-established methods.
CONTEXT-AWARE CLUSTERING USING GLOVE AND K-MEANS ijseajournal
ABSTRACT
In this paper we propose a novel method to cluster categorical data while retaining their context. Typically, clustering is performed on numerical data. However, it is often useful to cluster categorical data as well, especially when dealing with data in real-world contexts. Several methods exist which can cluster categorical data, but our approach is unique in that we use recent text-processing and machine learning advancements like GloVe and t-SNE to develop a context-aware clustering approach (using pre-trained word embeddings). We encode words or categorical data into numerical, context-aware vectors that we use to cluster the data points using common clustering algorithms like K-means.
This document presents brief descriptions of several digital tools such as Blogger (a blogging service), About.me (an online business card), WordCloud (a tool for creating word clouds), Educaplay (a platform for creating educational activities), Storybird (a social network for illustrated stories), Pixlr (an image-editing application), PicMonkey (a site for adding effects to photos), Prezi (a tool for presentations with photos and music) and SlideShare
Stand builder in Paris: Groupe Smart Dezign By RZ Zeroukhi
Groupe Smart Dezign By RZ is a stand builder located in Paris
After installing and decorating your stand, the SMART DEZIGN team supports you throughout your exhibition with the best advice. We thereby guarantee you an excellent result.
The document provides details about an IT professional's objective, qualifications, education, and extensive experience in networking roles for various organizations in Afghanistan, highlighting skills and responsibilities related to designing, implementing, configuring, and troubleshooting networks, servers, security, routing, switching, and more. The professional seeks to further their technical and professional excellence by contributing to consulting projects.
Blogger lets you create and publish online blogs without writing code or installing software. About.me centralizes links and social-network profiles under a single URL. Wordle generates word clouds from lists or texts chosen by the user.
1) en world Australia is a recruitment firm that was founded in 1989 as Calibrate Recruitment and became part of the en world group in 2012, specializing in technical and professional recruitment across industries in Australia and Asia-Pacific.
2) Their mission is to create the best recruitment connections to support long-term client and candidate success, with a vision to be the most trusted recruitment partner in the region.
3) They have a network of over 700 employees operating across 7 countries in Asia-Pacific, with a track record of over 30,000 placements and 500,000 active candidates for 4,600 clients.
Raven's Progressive Matrices is a non-verbal intelligence test that measures reasoning and problem-solving ability. It consists of a series of matrices with a missing piece that the subject must complete by analyzing the relationships between the pieces. The test assesses deductive ability (g factor) and can be administered individually or in groups to people of different ages and educational levels.
This document presents an analysis of gene co-expression networks in hepatocellular carcinoma (HCC) using actor-semiotic network modeling. The network was constructed by integrating gene co-expression, microRNA-gene, and protein interaction relationships. Topological analysis identified discrete clusters and emergent groups within the network. A graph signature method was used to analyze node centrality and identify biologically significant nodes. Based on the network and centrality analysis, several hypotheses about HCC pathology are proposed, including the roles of specific genes and microRNAs in processes like cell cycle regulation, angiogenesis, and intracellular trafficking.
Link between replication and cell cycle Shreya Ahuja
DNA replication is precisely controlled to occur once per cell cycle. It is linked to the cell cycle through two principles: 1) Initiation of DNA replication commits the cell to further division as replication must complete before division. 2) Signals ensure each replicon is activated only once per cell cycle. Replication initiates at origins through the assembly of pre-replication complexes containing the MCM helicase, which is activated by CDK and DDK to begin DNA unwinding and replication. The cell cycle is halted in response to DNA damage to allow for repair before replication and division resume.
There are three main levels of control to ensure DNA replication is initiated only once per cell cycle in bacteria:
1. ATP hydrolysis by beta-clamp protein
2. Sequestration of hemimethylated DNA by SeqA protein
3. Titration of DnaA protein levels through its regulatory locus
In eukaryotes, licensing factors like ORC, Cdc6, Cdt1 and MCM proteins bind to origins of replication and license them for a single round of replication. After replication begins, these factors dissociate from origins preventing re-replication. Geminin protein also prevents re-licensing of newly synthesized DNA in G2 phase.
DNA polymerase proofreads
Community management and the use of social networks for businesses... talk given on March 7, 2013 on behalf of the company Media Buzz. For more details, feel free to contact me!
Taking personality into account in the recruitment process Drake International
The hiring decision is an important one, and it is essential to make the right choice the first time. In the current context of globalization and competitiveness, hiring a high-performing person who fits the organization is more crucial than ever to a company's success.
This is why HR professionals must make sure they use the right recruitment and onboarding strategies to manage each stage of the hiring process effectively.
Join this free 30-minute webinar on Thursday, February 26, 2015 to learn more. During the presentation we will discuss:
• The key elements of the recruitment process.
• What fit, hiring criteria and a formalized recruitment process are.
• The recruitment stages in which taking personality into account is a success factor.
• How to attract, identify and retain the right candidate for a position.
Eukaryotic DNA replication occurs in the cell nucleus and involves multiple protein complexes. It begins with the assembly of pre-replication complexes at origins of replication during G1 phase. During S phase, these complexes are activated by cyclin-dependent kinases and Dbf4-dependent kinases to initiate bidirectional replication forks. Leading strand synthesis is continuous while lagging strand occurs discontinuously in short Okazaki fragments. Replication terminates once the replication forks from opposing origins meet.
This document provides a high-level overview of protein-protein interaction networks and graph-theoretic modeling approaches. It discusses how experimental data on protein interactions is incomplete and noisy but still offers opportunities for biological insight. It describes how graph theory can be used to model these networks and compare them, such as by analyzing network properties and frequencies of small subgraph structures. Different random network models are also discussed as ways to understand properties of real biological networks. The goal is to provide a concise summary of this area and comment on possible future challenges through a graph-theoretic lens.
An information-theoretic, all-scales approach to comparing networks Jim Bagrow
My presentation at NetSci 2018 on Portrait Divergence, a new approach to comparing networks that is simple, general-purpose, and easy to interpret.
The preprint: https://arxiv.org/abs/1804.03665
The code: https://github.com/bagrow/portrait-divergence
Java tutorial: Programmatic Access to Molecular Interactions Rafael C. Jimenez
This document provides information about a tutorial on programmatic access to molecular interactions. It discusses setting up the required software including Java, Maven, and a Java IDE. It also describes downloading the course source code and preparing the project. The tutorial will cover topics such as standards for molecular interaction data like PSIMITAB and PSICQUIC, as well as exercises on accessing and working with this data programmatically.
This document provides an annual progress report for the National Resource for Network Biology (NRNB) for the period of May 1, 2011 to April 30, 2012. It summarizes the following:
1) Advances made in developing algorithms to identify network modules and use modules as biomarkers for disease. This includes methods to capture complex logical relationships within modules.
2) Progress on tools to enable new network analysis and visualization capabilities, including a new version of Cytoscape.
3) Growth of collaborations through the NRNB, which have nearly doubled over the past year to around 100 projects.
4) Continued development of the Cytoscape App Store to support the user and developer community.
Poster presented at the Thirty-Second International Joint Conference on Artificial Intelligence, 2023, Macao, SAR. https://doi.org/10.24963/ijcai.2023/554
National Resource for Networks Biology's TR&D Theme 3: Although networks have been very useful for representing molecular interactions and mechanisms, network diagrams do not visually resemble the contents of cells. Rather, the cell involves a multi-scale hierarchy of components – proteins are subunits of protein complexes which, in turn, are parts of pathways, biological processes, organelles, cells, tissues, and so on. In this technology research project, we will pursue methods that move Network Biology towards such hierarchical, multi-scale views of cell structure and function.
Review on Computational Bioinformatics and Molecular Modelling Novel Tool for... ijtsrd
Advances in science and technology have brought remarkable change to the field of drug discovery. Earlier it was very difficult to predict the target for a receptor, but nowadays docking the target protein with a ligand and calculating the binding affinity is an easy and robust task. Docking helps in the virtual screening of drugs along with hit identification. There are two approaches through which docking can be carried out: the shape complementarity approach and the simulation approach. Many procedures are involved in carrying out docking, and they require different software and algorithms. Molecular docking serves as a good platform to screen a large number of ligands and is useful in drug-DNA studies. This review mainly focuses on the general idea of molecular docking and discusses its major applications, the different types of interaction involved, and the types of docking. Rishabh Jain "Review on Computational Bioinformatics and Molecular Modelling: Novel Tool for Drug Discovery" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-3 | Issue-1 , December 2018, URL: http://www.ijtsrd.com/papers/ijtsrd18914.pdf
http://www.ijtsrd.com/pharmacy/pharmacoinformatics/18914/review-on-computational-bioinformatics-and-molecular-modelling-novel-tool-for-drug-discovery/rishabh-jain
GRAPH ALGORITHM TO FIND CORE PERIPHERY STRUCTURES USING MUTUAL K-NEAREST NEIG... ijaia
Core periphery structures exist naturally in many complex networks in the real-world like social,
economic, biological and metabolic networks. Most of the existing research efforts focus on the
identification of a meso scale structure called community structure. Core periphery structures are another
equally important meso scale property in a graph that can help to gain deeper insights about the
relationships between different nodes. In this paper, we provide a definition of core periphery structures
suitable for weighted graphs. We further score and categorize these relationships into different types based
upon the density difference between the core and periphery nodes. Next, we propose an algorithm called
CP-MKNN (Core Periphery-Mutual K Nearest Neighbors) to extract core periphery structures from
weighted graphs using a heuristic node affinity measure called Mutual K-nearest neighbors (MKNN).
Using synthetic and real-world social and biological networks, we illustrate the effectiveness of developed
core periphery structures.
A novel optimized deep learning method for protein-protein prediction in bioi... IJECEIAES
Proteins have been shown to perform critical activities in cellular processes and are required for the organism's existence and proliferation. On complicated protein-protein interaction (PPI) networks, conventional centrality approaches perform poorly. Machine learning algorithms based on enormous amounts of data do not make use of biological information's temporal and spatial dimensions. As a result, we developed a sequence-dependent PPI prediction model using an Aquila and shark noses-based hybrid prediction technique. This model operates in two stages: feature extraction and prediction. The features are acquired using the semantic similarity technique for good results. The acquired features are utilized to predict the PPI using hybrid deep networks: long short-term memory (LSTM) networks and restricted Boltzmann machines (RBMs). The weighting parameters of these neural networks (NNs) were changed using a novel optimization approach, a hybrid of aquila and shark noses (ASN), and the results revealed that our proposed ASN-based PPI prediction is more accurate and efficient than other existing techniques.
A TWO-STAGE HYBRID MODEL BY USING ARTIFICIAL NEURAL NETWORKS AS FEATURE CONST... IJDKP
We propose a two-stage hybrid approach with neural networks as the new feature construction algorithms for bankcard response classifications. The hybrid model uses a very simple neural network structure as the new feature construction tool in the first stage, then the newly created features are used as the additional input variables in logistic regression in the second stage. The model is compared with the traditional one-stage model in credit customer response classification. It is observed that the proposed two-stage model outperforms the one-stage model in terms of accuracy, the area under ROC curve, and KS statistic. By creating new features with the neural network technique, the underlying nonlinear relationships between variables are identified. Furthermore, by using a very simple neural network structure, the model could overcome the drawbacks of neural networks in terms of its long training time, complex topology, and limited interpretability.
A consistent and efficient graphical User Interface Design and Querying Organ... CSCJournals
We propose a software layer called GUEDOS-DB on top of an Object-Relational Database Management System (ORDMS). In this work we apply it to molecular biology, more precisely to complete organelle genomes. We aim to offer biologists the possibility of accessing, in a unified way, information spread among heterogeneous genome databanks. In this paper, the first goal is to provide a visual schema graph through a number of illustrative examples. The human-computer interaction technique adopted in this visual designing and querying makes it very easy for biologists to formulate database queries compared with a linear textual query representation.
This document discusses predicting new friendships in social networks using temporal information. It describes research on predicting new links in social networks over time using supervised learning models trained on temporal features from past network interactions. The researchers used anonymized Facebook data over 28 months to train decision tree and neural network classifiers to predict new relationships, finding models using temporal information performed better than those without it.
Criterion based Two Dimensional Protein Folding Using Extended GA IJCSEIT Journal
In the dynamite field of biological and protein research, the protein fold recognition for long pattern
protein sequences is a great confrontation for many years. With that consideration, this paper contributes
to the protein folding research field and presents a novel procedure for mapping appropriate protein
structure to its correct 2D fold by a concrete model using swarm intelligence. Moreover, the model
incorporates Extended Genetic Algorithm (EGA) with concealed Markov model (CMM) for effectively
folding the protein sequences that are having long chain lengths. The protein sequences are preprocessed,
classified and then, analyzed with some parameters (criterion) such as fitness, similarity and sequence gaps
for optimal formation of protein structures. Fitness correlation is evaluated for the determination of
bonding strength of molecules, thereby involves in efficient fold recognition task. Experimental results have
shown that the proposed method is more adept in 2D protein folding and outperforms the existing
algorithms.
Project report: Investigating the effect of cellular objectives on genome-sca... Jarle Pahr
Report from a half-semester master-level project carried out at the department of biotechnology, Norwegian University of Science and Technology. Describes a MATLAB-based framework for comparing experimental metabolic flux data with model predictions and evaluating objective functions.
New foreground markers for Drosophila cell segmentation using marker-controll... IJECEIAES
Image segmentation consists of partitioning the image into different objects of interest. For a biological image, the segmentation step is important to understand the biological process. However, it is a challenging task due to the presence of different dimensions for cells, intensity inhomogeneity, and clustered cells. The marker-controlled watershed (MCW) is proposed for segmentation, outperforming the classical watershed. Besides, the choice of markers for this algorithm is important and impacts the results. For this work, two foreground markers are proposed: kernels, constructed with the software Fiji and Obj.MPP markers, constructed with the framework Obj.MPP. The new proposed algorithms are compared to the basic MCW. Furthermore, we prove that Obj.MPP markers are better than kernels. Indeed, the Obj.MPP framework takes into account cell properties such as shape, radiometry, and local contrast. Segmentation results, using new markers and illustrated on real Drosophila dataset, confirm the good performance quality in terms of quantitative and qualitative evaluation.
During seizures, different types of communication between different parts of the brain are characterized by many state of the art connectivity measures. We propose to employ a set of undirected (spectral matrix, the inverse of the spectral matrix, coherence, partial coherence, and phase-locking value) and directed features (directed coherence, the partial directed coherence) to detect seizures using a deep neural network. Taking our data as a sequence of ten sub-windows, an optimal deep sequence learning architecture using attention, CNN, BiLstm, and fully connected neural networks is designed to output the detection label and the relevance of the features. The relevance is computed using the weights of the model in the activation values of the receptive fields at a particular layer. The best model resulted in 97.03% accuracy using balanced MIT-BIH data subset. Finally, an analysis of the relevance of the features is reported.
Brain-Computer Interfaces are communication
systems that use brain signals as commands to a device. Despite
being the only means by which severely paralysed people can
interact with the world most effort is focused on improving and
testing algorithms offline, not worrying about their validation in
real life conditions. The Cybathlon’s BCI-race offers a unique
opportunity to apply theory in real life conditions and fills
the gap. We present here a Neural Network architecture for
the 4-way classification paradigm of the BCI-race able to run
in real-time. The procedure to find the architecture and the
combination of mental commands best suited to this architecture
for personalised use is also described. Using spectral power
features and a network with one convolutional layer plus one fully
connected layer, we achieve a performance similar to that in the
literature for 4-way classification and show that, following our
method, we can obtain similar accuracies online and offline,
closing this well-known gap in BCI performance.
The document describes the eBank UK project, which seeks to link e-research data, scholarly communication, and e-learning by building connections from data generated in experiments through publications and into educational resources. It discusses the scholarly knowledge cycle and how eBank UK is addressing the bottleneck of data publication by developing a distributed information architecture with common data standards and ontologies. This will allow an aggregator to harvest metadata from repositories holding experimental data and publications and provide a single access point for discovery across distributed resources through services like search and retrieval.
Proteomics 2012, 12, 1669–1686. DOI 10.1002/pmic.201100454
REVIEW
Visualization of the interactome: What are we looking at?
David C. Y. Fung1, Simone S. Li1, Apurv Goel1, Seok-Hee Hong2 and Marc R. Wilkins1
1 New South Wales Systems Biology Initiative and School of Biotechnology and Biomolecular Sciences, The University of New South Wales, New South Wales, Australia
2 School of Information Technologies, Faculty of Engineering and Information Technologies, The University of Sydney, New South Wales, Australia
Network visualization of the interactome has become routine in systems biology research. Not only does it serve as an illustration of the cellular organization of protein–protein interactions, it also serves as a biological context for gaining insights from high-throughput data. However, the challenges of producing an effective visualization have been great owing to the fact that the scale, biological context and dynamics of any given interactome are too large and complex to be captured by a single visualization. Visualization design therefore requires a pragmatic trade-off between capturing biological concept and being comprehensible. In this review, we focus on the biological interpretation of different network visualizations. We will draw on examples predominantly from our experiences but elaborate them in the context of the broader field. A rich variety of networks will be introduced including interactomes and the complexome in 2D, interactomes in 2.5D and 3D and dynamic networks.
Keywords:
Bioinformatics / Interactome / Network visualization / Protein–protein interactions /
Systems biology / Visual analytics
Received: August 31, 2011
Revised: November 28, 2011
Accepted: December 19, 2011
1 Introduction
Intense interest in understanding the cellular organization of
protein–protein interactions (PPIs) has motivated the large-
scale projects of PPI mapping using a variety of methods
[1–3]. The massive scale of the binary interaction data gener-
ated has enabled the construction of interactomes. It has also
spurred the need for in silico network visualization because it
provides a simplified summary of what is otherwise a lengthy
adjacency list of protein pairs [4]. Network visualization has thus
become an essential part of systems biology, not least as an
essential analytical tool.
Correspondence: Professor Marc Wilkins, New South Wales Sys-
tems Biology Initiative, School of Biotechnology and Biomolec-
ular Sciences, The University of New South Wales, New South
Wales, 2052, Australia
E-mail: m.wilkins@unsw.edu.au
Fax: +612-93851483
Abbreviations: GO, Gene Ontology; PPI, protein–protein
interaction
Ideally, an effective interactome visualization should lever-
age the investigator’s ability to comprehend the collaborative
roles of various proteins in delivering cellular functions. How-
ever, this ideal is challenging to meet. As will be reviewed in
this paper, there is no universal method for visualizing a PPI
network or an interactome. Each method has its strengths
and limitations. The choice of method depends on human
and material factors. The human factors are the investigator’s
analytical objective, expert knowledge of the
research discipline and cognitive capacity to resolve
visualization scale and complexity. The material factors are
screen space and computational tractability of the visualiza-
tion rendering.
In this review, we will not cover the popular tools that have
been used for generating network visualizations, e.g. Cytoscape
[5], VisANT [6], Osprey [7], ProViz [8] and Patika
[9], or the commonly used graphical layouts and their variants,
e.g. circular [7, 10], force-directed [11, 12], hierarchical [13]
and parallel level layouts [14]. These have been reviewed elsewhere
[15]. For a more up-to-date review of visual network analytics
of ‘omics data, we recommend a more recent publication
[16]. We think that it will be of more value to the
© 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim www.proteomics-journal.com
molecular cell biologist (hereafter, the investigator) if our review
focuses on the biological interpretation of different visualized
PPI networks and interactomes. Only graph-based visualization
will be discussed because most investigators first
became familiar with the node-edge representation
of metabolic networks in the orthogonal layout. Therefore,
graph-based visualizations match their mental picture of
PPIs much better than the alternative adjacency matrices.
The remainder of this review is divided into five sections.
In Section 2, we give a formal definition of the PPI network vi-
sualization. In Section 3, we review the various visualizations
of the most basic representation of a PPI network, i.e. undi-
rected networks representing only the physical PPIs with-
out explicitly displaying any a priori biological knowledge. In
Section 4, we give our formal definition of an interactome be-
fore reviewing the various visualizations. These are visualiza-
tions that involve not only physical PPIs but other intracellu-
lar networks, e.g. signal transduction networks. In Section 5,
we review timescale dynamic network visualizations. These
are ones that visualize changes in network topology induced
by the temporal variation in protein abundance. In Section 6,
we review visualizations that explicitly present biological con-
text as part of the interactome. Finally, we will discuss the
general challenges of generating effective visualizations of
interactomes for biological analysis. In the review, we will
draw on examples predominantly from our experiences in
network visualization but elaborate on them in the context
of the broader field. The visualizations discussed in Sections
3.1, 4.2, 4.4, 5.2 and 6.3 were generated by the authors using
GEOMI [17]. The Interactorium mentioned in Section 4.3 is
an OpenGL/C++ application known as SkyRail. It should be
noted that, throughout this review, we do not use the terms
‘PPI network’ and ‘interactome’ interchangeably because we
have defined them differently.
2 Visualization is a model
In the simplest terms, a PPI network can be defined mathe-
matically as a node-edge graph denoted as G(V, E), also known
as the graph-theoretic model [18]. This model contains a com-
bination of nodes V and edges E where each node v represents
a certain protein. Each edge e represents the physical interac-
tion between the populations of two protein species v1 and v2,
i.e. physical PPIs in short. Rather than being an exact replica,
the graph G is only a symbolic form of a real PPI network
that approximates the variety of PPIs known to occur within
a living cell. Network visualization is therefore the process
of mapping the graph-theoretic model of the PPI network to
the visual elements drawn on a screen. In other words, the
node-edge graph is not visualized until its visual representation
is drawn by computation. That is why researchers
in the information visualization field recognize PPI network
visualization as largely a graph drawing problem.
Although the word ‘graph’ has often been used inter-
changeably with ‘network’, they are not the same. A graph
is just a combinatorial model of nodes and edges which, by
itself, does not model the function of a PPI network [19].
On the other hand, a network does because its edges repre-
sent interactions required for the functioning of one or more
pathways. For this reason, the graph-theoretic model of a PPI
network is usually a labelled graph. It has node and edge attributes
attached, i.e. the graph G is a set of V, E, Φ and Ψ,
denoted as G = (V, E, Φ, Ψ) where Φ denotes the set of node
attributes and Ψ denotes the set of edge attributes. The node
attributes can include the protein symbol, expression level
and node colour. The edge attributes can include the inter-
action mode (activation, inactivation, induction, repression
or physical), affinity, weights quantifying a certain statistical
score (e.g. correlation coefficient), or time span [20]. These
data, when retrieved and co-visualized, allow the investigator
to have a better understanding of the network.
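The labelled graph G = (V, E, Φ, Ψ) described above can be sketched as a small data structure. This is only a minimal illustration: the class name, method names and attribute values are assumptions made for the example and do not come from any tool discussed in this review.

```python
from dataclasses import dataclass, field


@dataclass
class LabelledGraph:
    """G = (V, E, Phi, Psi): nodes, edges, node attributes, edge attributes."""
    nodes: set = field(default_factory=set)          # V
    edges: set = field(default_factory=set)          # E, stored as frozenset({v1, v2})
    node_attrs: dict = field(default_factory=dict)   # Phi: node -> attribute dict
    edge_attrs: dict = field(default_factory=dict)   # Psi: edge -> attribute dict

    def add_protein(self, symbol, **attrs):
        self.nodes.add(symbol)
        self.node_attrs.setdefault(symbol, {}).update(attrs)

    def add_interaction(self, v1, v2, **attrs):
        # Make sure both endpoints exist as nodes before adding the edge.
        for v in (v1, v2):
            self.add_protein(v)
        edge = frozenset((v1, v2))   # undirected: order of v1 and v2 is irrelevant
        self.edges.add(edge)
        self.edge_attrs.setdefault(edge, {}).update(attrs)
        return edge


g = LabelledGraph()
# Attribute values below are illustrative placeholders, not measured data.
g.add_protein("RAP1", expression=2.5, colour="red")
g.add_interaction("RAP1", "RIF1", mode="physical", weight=0.9)
```

Storing each undirected edge as a frozenset means that looking up the pair in either order returns the same edge attributes, which mirrors the symmetric nature of a physical PPI.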
A network visualization is composed of glyphs,
colour hues, lines or arcs and their layout on screen. It is
meant to serve as a visual analysis tool for the investigator to
interact with, not just a graphical representation of PPI data.
As such, it can be used for visualizing the complex spatial or
the spatio-temporal interactions between proteins. Those features
raise the investigator’s curiosity about the general properties
of an interactome, such as asymmetric self-organization
[21], modularity [22] and fault tolerance due to functional
degeneracy [23]. Asymmetric self-organization refers to the
heterogeneous preferential attachment among proteins lead-
ing to the uneven distribution of hubs within a network [24].
Modularity refers to the property by which a network can be
subdivided into interconnected subnetworks, each serving
a certain biological function [25,26]. Functional degeneracy
refers to the overlapping functions of different modules,
which compensate for the lost function when one of the
modules fails [23]. Not least, a PPI network visualization can
form the basis of an interactome which involves (sub)model
integration [27]. This submodel can be a signal transduction
network or a metabolic network. As will be discussed in Section 4,
the integrated visualization of the physical PPI network with its
corresponding signalling network or its underlying gene regulatory
network is much more informative than the PPI
network alone.
Since a visualization is only a model rather than an exact
replica of a real interactome, there is a need to prioritize the
network properties for visualization purposes. In other words,
what should be the focus of a visualization? Network theorists
in complex systems research call this the centrality
of a model [27]. In simpler terms, what is the pre-conceived
biological concept that the model is trying to represent? It is
the major consideration in deciding on a visualization design
especially when it is intended for visual analytics [28]. Visual
analytics is the science of analytical reasoning supported by
highly interactive visual interfaces. It shares the same tenet
as data or information visualization, i.e. using an interactive
visualization or a system of visualizations that allow the in-
vestigator to gain insight into the information hidden in the
large-scale data [29]. As the investigator’s analytical objective
changes, so does the biological concept represented by the visualization.
3 Visualization on the cellular scale
3.1 Undirected network visualization
The most common visualization of a PPI network is an undi-
rected network in the force-directed layout [11]. The yeast
PPI network shown in Fig. 1A is a classic example. Each
protein is visualized as a coloured spherical node and the
physical interaction between two proteins is visualized as a
solid line representing an edge. This visualization provides
a static view of the known physical PPIs in the unicellular
Saccharomyces cerevisiae, devoid of interaction dynamics. It
represents all PPIs as undirected edges giving the investiga-
tor the false impression that each protein interacts with its
partners constitutively and collectively. For this reason, one
can argue that the undirected network is the coarsest approx-
imation of its real-life counterpart. It should be noted that
while the visualization exposes the pairwise interaction be-
tween two proteins, it does not mean that there is only one
molecule of a given protein interacting with its partners. In
effect, it ignores the stoichiometry of the protein complexes.
Rather, it is meant to be a concise summary of the interaction
between the populations of two proteins of interest.
Although it is only a coarse approximation, the undirected
network visualization does provide insights into the global
properties of the yeast PPI network which would otherwise
be obscured in the original binary interaction data set [30].
The most obvious one, as shown in Fig. 1A, is the heterogeneous
connectivity of proteins. Some have more incident
edges than others, which means that some proteins are more frequently
engaged in PPIs than others. Hence, a conspicuously
higher number of incident edges indicates
higher usage within the network. In the yeast
PPI network, the number of interaction partners can vary
substantially from one protein to another.
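The heterogeneous connectivity described above can be quantified by counting the incident edges of each node. The following is a minimal sketch on a hypothetical toy edge list (the node names are invented for illustration); it is not the procedure used to produce Fig. 1A.

```python
from collections import Counter


def degrees(edges):
    """Count interaction partners (incident edges) per protein in an undirected edge list."""
    partners = {}
    for a, b in edges:
        partners.setdefault(a, set()).add(b)
        partners.setdefault(b, set()).add(a)
    return {protein: len(p) for protein, p in partners.items()}


# Hypothetical toy network: one hub-and-spoke formation plus one triangle.
edges = [("H", "a"), ("H", "b"), ("H", "c"), ("H", "d"),
         ("x", "y"), ("y", "z"), ("x", "z"), ("d", "x")]
deg = degrees(edges)
distribution = Counter(deg.values())   # degree -> number of proteins with that degree
hub = max(deg, key=deg.get)            # protein with the most interaction partners
```

Even in this small example the degree distribution is uneven: most nodes have one or two partners while a single node has four.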
Another obvious feature of the undirected PPI network is
that some parts of the network are denser, giving the entire
network a heterogeneous topology (Fig. 1A). The denser parts
contain multiple subnetworks called cliques which are en-
riched in protein complexes [20]. They often represent large
protein complexes, e.g. DNA polymerase or a proteasome.
The subunits within each complex have been denoted as party
hubs [26], which are more highly connected to one another
within the complex than outside it. The sparser parts of the
undirected PPI network contain some hub-and-spoke formations
where each centre node is connected to multiple nodes
that do not necessarily interact with one another. The cen-
tre node has been denoted as a date hub protein which has
been proposed to interact with its partners more dynamically
than party hubs [26]. Their disparity in interaction dynamics
has recently been explained in terms of differences in their
3D structures. Date hubs have one or two interaction inter-
faces whereas party hubs have three or more [31]. Proteins
serving as connectors between multiple complexes have been
denoted as bottleneck proteins [32]. It has been suggested that
bottleneck proteins can be classified into hub-bottleneck and
non-hub bottleneck proteins.
Trying to visually identify date and party hubs is not without
its caveats. The investigator may notice that it is easier
to visually distinguish between date hubs and party hubs in
a large-scale PPI network than to distinguish between date
hubs and different types of bottleneck proteins. One example
is highlighted in Fig. 1A (dark blue boxed inset) where the dis-
tinctly sized protein node can be recognized as a party hub,
a date hub and also a hub-bottleneck protein. Even harder
to distinguish are the date hub and the non-hub bottleneck
protein. That is because both appear in a hub-and-spoke for-
mation in the force-directed layout. An example is also high-
lighted in Fig. 1A (green boxed inset). Because of its higher
edge density, a clique often draws the investigator’s atten-
tion more than the hub-and-spoke formation. If the yeast PPI
network is reduced to the size seen in Fig. 1B, the investi-
gator should find it easier to distinguish between the date
hub PCNA, and the hub-bottleneck protein CDC6 (Fig. 1B).
Here lies the limitation of undirected network visualization
in the force-directed layout. Sampling scale and/or bias can
affect the investigator’s perception of the local topology of
any protein within the wider PPI network, and may lead to
a very subjective interpretation.
3.2 PPI network visualization with topological
emphasis
The limitations seen with undirected network visualization
are largely due to its lack of biological context. An alternative
design adaptable to PPI network visualization is the between-
ness fast layout (BFL) which uses the biological relevance of
the shortest path betweenness centrality as a layout optimiza-
tion criterion [33]. The metric of betweenness centrality was
originally used for measuring the number of shortest paths
going through a certain node [34]. The design criterion of
BFL was based on the proposition that betweenness central-
ity is a useful predictor of essential proteins [32]. In the yeast
PPI network, nodes with the highest betweenness centrality
have been found to be bottleneck proteins that serve as con-
nectors between two or more complexes. It has been further
suggested that complexes interacting via bottleneck proteins
are in fact functional modules [35]. A recent study has suggested
that the biological context of betweenness centrality is
also applicable to Caenorhabditis elegans [36], thus increasing
the confidence that betweenness centrality is a universally
applicable metric for all species.
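To make the metric concrete, shortest-path betweenness centrality for an unweighted, undirected network can be computed with Brandes' algorithm. The sketch below is a generic implementation of that metric only, not the BFL layout algorithm, and the toy network (two triangles standing in for complexes, joined through a node named B) is hypothetical.

```python
from collections import deque


def betweenness(adj):
    """Shortest-path betweenness for an unweighted, undirected graph (Brandes' algorithm)."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma) to every node.
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:               # w seen for the first time
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:    # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest nodes back towards s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted once from each endpoint.
    return {v: c / 2 for v, c in score.items()}


# Hypothetical example: two triangles joined through the connector node B.
adj = {
    "a1": {"a2", "a3"}, "a2": {"a1", "a3"}, "a3": {"a1", "a2", "B"},
    "B": {"a3", "c1"},
    "c1": {"B", "c2", "c3"}, "c2": {"c1", "c3"}, "c3": {"c1", "c2"},
}
bc = betweenness(adj)
```

In this toy network the connector B has only two interaction partners yet obtains the highest score, which illustrates why a bottleneck need not be a hub.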
The BFL algorithm optimizes the positioning of high
betweenness nodes as the first priority, followed by node
density, edge length and edge crossing minimization. An
example of a murine gene regulatory network visualization
generated by this method is shown in Fig. 2A. The
visualization effectively highlights the multi-scalar nature
of the gene regulatory network by rendering the nodes of
low betweenness as smaller in size than their high betweenness
counterparts. The effect is a reduction in the drawing
area occupied by the hub-and-spoke or radial formation
of the date hubs with their interaction partners, but with
much longer edges than those in the force-directed layout
(Fig. 2B).
Figure 1. Visualization of the yeast PPI network in the force-directed
layout generated with the use of GEOMI. (A) The largest connected
component of the yeast PPI network, representing 1256 proteins and
1803 interactions. The network was generated using yeast two-hybrid
data compiled by Bertin et al. [78]. An example of a clique is
highlighted in red and bound by the dark blue box. The larger node may
represent a party hub, date hub or hub-bottleneck protein. An example
of a date hub is highlighted in the green box. (B) Visualization of the
DNA replication (GO:0006260) PPI network, representing 55 proteins and
83 interactions [96]. Inset: PCNA hub protein (indicated by the green
arrow) and its interactions.
The BFL network visualization requires a prior understanding
of the biological context of betweenness centrality,
which may not be widely known among investigators. Some
may find it more intuitive to relate node degree to lethality.
Hence, multi-plane or concentric spherical layouts that
stratify the PPI network by node degree ranges could be usable
alternatives for them [37]. An equally useful alternative
is to highlight network motifs in the PPI network [38]. This
has been applied to gene regulatory networks and should also
be applicable to PPI networks. The investigator should note
that in an undirected PPI network, network motifs repre-
sent the probable PPIs that give rise to protein complexes
[30].
4 Visualization of an interactome
4.1 Definition of an interactome
Figure 2. Visualization of the mouse gene regulatory network
using the betweenness fast layout algorithm [33]. (A) The size
of each blue coloured node corresponds to the magnitude of its
shortest path betweenness centrality score. Inset: Close-up view
of the mouse gene regulatory network. (B) The same network
drawn using the force-directed layout.
In reality, a cellular interactome contains not only physical
PPIs but also those in other interaction modes, e.g. signalling
interaction, transcriptional regulatory interaction and
metabolic reaction [19, 20]. Because the latter are initiator-effector
relationships in which the initiator acts on its effector(s),
they can be represented by directed edges and visualized
as solid arrows. An interactome should therefore be
modelled mathematically as a node-edge graph containing
both directed and undirected edges [30], i.e. a semi-directed
network [39].
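A semi-directed network of this kind can be sketched as a structure that keeps undirected and directed edges in separate sets. The class, its method names and the abstract node names below are assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class SemiDirectedGraph:
    """Holds undirected edges (physical PPIs) and directed edges (initiator -> effector)."""
    undirected: set = field(default_factory=set)   # frozenset({a, b})
    directed: set = field(default_factory=set)     # (initiator, effector)

    def add_ppi(self, a, b):
        self.undirected.add(frozenset((a, b)))

    def add_regulation(self, initiator, effector):
        self.directed.add((initiator, effector))

    def neighbours(self, node):
        """Nodes reachable in one step: PPI partners plus effectors of `node`."""
        out = {effector for initiator, effector in self.directed if initiator == node}
        for edge in self.undirected:
            if node in edge:
                out |= edge - {node}
        return out


# Hypothetical edges with abstract names; not taken from any database.
g = SemiDirectedGraph()
g.add_ppi("P1", "P2")
g.add_regulation("K1", "P1")   # e.g. a kinase acting on its substrate
```

Keeping the two edge types apart makes the asymmetry explicit: K1 reaches P1 along the directed edge, but P1 does not reach K1.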
4.2 Overlapping network visualization
In practice, the visualization of multiple interaction types
within the same network is cognitively challenging to com-
prehend unless the investigator can mentally decompose the
interactome into layers of heterogeneous networks. Figure
3 shows how such a mental picture can be captured effec-
tively with the use of overlapping network visualization in
the parallel plane layout [40,41]. The interaction data for the
TGFβ signal transduction network were sourced from Cui et
al. [42] and those for the nuclear PPI network were sourced from
the BioGRID [43] and ECHO [44] databases. The visualization not
only represents the interactions in the human TGFβ-activated
signalling network and in the nuclear PPI network, but also
the PPIs that participate in both. Each network is constrained
to a 2D plane. The planes are stacked along the z-axis,
hence the name parallel plane layout [40]. The oblique
view shows the mapping between the signalling and the nuclear
PPI networks (Fig. 3B). The signalling network is
drawn in a grid layout with fixed coordinates assigned
to each node, whereas the nuclear PPI network is drawn
in the force-directed layout. Nodes in the signalling network
which share identical Gene Symbols with their correspond-
ing nodes in the PPI network are drawn on the middle plane
which forms the ‘overlap’ network. The edges within the over-
lap network are derived from both the signalling and the PPI
networks. Contrasting colour coding has been used to create
a visual segregation of the different networks. In order to re-
duce the scale of the nuclear PPI network, only the largest
connected component has been drawn [41].
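The construction of the overlap plane can be sketched as a set intersection on Gene Symbols followed by the selection of edges whose endpoints are both shared. This is a minimal illustration under assumed inputs (plain edge lists with invented node names); it is not the GEOMI implementation.

```python
def overlap_layer(signalling_edges, ppi_edges):
    """Return the shared nodes, the edges among them from either network,
    and the inter-plane links that join copies of the same protein."""
    sig_nodes = {n for edge in signalling_edges for n in edge}
    ppi_nodes = {n for edge in ppi_edges for n in edge}
    shared = sig_nodes & ppi_nodes   # identical Gene Symbols in both networks
    edges = [("signalling", a, b) for a, b in signalling_edges
             if a in shared and b in shared]
    edges += [("physical", a, b) for a, b in ppi_edges
              if a in shared and b in shared]
    inter_plane = [(n, n) for n in sorted(shared)]
    return shared, edges, inter_plane


# Toy edge lists with invented node names, for illustration only.
signalling = [("S1", "G1"), ("G1", "G2"), ("S1", "S2")]
ppi = [("G1", "G2"), ("G2", "N1"), ("N1", "N2")]
shared, edges, inter_plane = overlap_layer(signalling, ppi)
```

Tagging each overlap edge with its network of origin preserves the information needed for the contrasting colour coding of the different interaction types.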
The top and oblique views show the investigator how
the phosphorylation signal diffuses extensively from the signalling
network to the nuclear PPI network via the
highly connected hubs, e.g. ATM, CDK4, FOXO1A, HDAC1
and EP300 (Fig. 3A) [41]. The overlap layer reveals the proteins
commonly represented in the TGFβ signalling and nuclear
PPI networks, e.g. EP300, RB1 and HDAC1 (Fig. 3B). From
this, one can see how the parallel plane layout highlights
the physical connection between TGFβ signalling and the
nuclear PPI networks using inter-plane edges while expos-
ing their difference in interaction types [40]. Furthermore, it
decomposes a large semi-directed network into two smaller
and more comprehensible networks [41]. Although the exact
dynamics of TGFβ-regulated PPIs have not been explicitly
shown in the nuclear PPI network, the functional dependency
implied by the visualization would prompt the investigator to
construct preliminary hypotheses worthy of further investigation.
Overlapping network visualization does have its limitations.
The first is the dimensional increase of the drawing
from 2D to 2.5D. 2.5D is a representation in which
graph drawing is constrained to the first two dimensions
with the third dimension being used for a different purpose
[45]. The investigator may find it challenging to navigate
through the visualization using a mouse because
mapping its 2D movement to motion in 3D space is not
straightforward [46]. The second is the growth in visual
complexity as the network size on each plane increases.
We found that large PPI networks of over 1000 nodes
and 2000 edges are poor choices for visualization of this kind.
The sheer amount of edge cluttering and node occlusion can
make the visualization unreadable. The key design criterion
lies in restricting the size of the network drawn on the top
layer.
Figure 3. Visualization of the human signalling-nuclear PPI overlapping
network using the parallel plane layout [40] generated
with the use of GEOMI. (A) Top view. This view exposes the TGFβ
signalling layer with a semi-transparent view towards the overlap
layer, and the nuclear PPI network layer underneath. The direction
of each arrow represents the direction of a phosphorylation reaction.
The nodes representing ATM, CDK4, FOXO1A, HDAC1 and
EP300 in the TGFβ signalling layer are indicated by red arrows.
(B) Oblique view. The TGFβ signalling is drawn on the top plane
and stacked over the nuclear PPI network on the bottom plane.
The signalling proteins on the top plane are represented by blue-coloured
nodes and the signalling interactions are represented
by blue arrows. On the bottom plane, proteins and their physical
interactions are represented as green-coloured nodes and
edges, respectively. The network in the middle plane represents
the overlap between the signalling and the nuclear PPI networks.
Red-coloured nodes in this plane represent proteins common to
the two networks. The blue lines represent the signalling interactions,
and the green lines represent physical PPIs. Nodes that
represent the same proteins are connected by yellow edges;
EP300, RB1 and HDAC1 are indicated by red arrows.
Figure 4. Yeast DNA-binding protein Rap1 and its interaction
partners in the nucleus, generated with the use of the Interactorium
[47]. Protein nodes are represented by circles in bright red;
those with structural data are highlighted by a small green cross
next to their gene name. PPIs are represented as the light pink
solid lines; the thickness of the lines correlates to the evidence
score of the interaction. The two crystal structures of the Rap1-DNA
complex, 3CZ6 (left) and 1IGN (right), are sourced from the
PDB [48]. They are shown in the ribbon and string conformation.
4.3 Interactive 3D visualization
Although the 2.5D overlapping network can effectively capture
the functional relationship between heterogeneous
networks, it does not capture the multi-scalar physical modularity
of an interactome. An interactome comprises numerous subnetworks
localized in a highly organized set of compartments.
A compartment can be a protein complex, an organelle
or any subcellular localization. One tool that enables the
investigator to navigate in between an overview and a de-
tailed view of an interactome, and trigger on-the-fly protein
data display on demand is the Interactorium [47]. The entire
visualization resembles a video game application that pro-
vides smooth multi-scale navigation, including fly-through
(zooming in and out) and fly-over, throughout the network,
and provides automatic centering on any selected protein.
Any given protein, along with its known interactions, can
be viewed in the context of a virtual cell, a virtual organelle
or a protein complex. The latter two are represented by dif-
ferent geometric shapes highlighted by object radiance (Fig.
4). The multi-scale visualization extends to the level of pro-
tein 3D structures. On-the-fly display of 3D structures can be
triggered by the mouse pointer action over the green cross
next to the Gene Symbol (Fig. 4).
Figure 4 shows a view focusing on the S. cerevisiae DNA-
binding protein Rap1 with three of its five interaction part-
ners. Rap1 is shown to be part of a single protein complex
with Rif1, Rif2 and Gcr1, since all four nodes are bound by the
red dashed circular node. One can further deduce that this
complex is located in the nucleus, since the red dashed circular
node is itself bound within a large spherical node with
a contoured surface representing the cell nucleus. Figure 4
also shows how relevant structural information of a selected
protein can be visualized. Where multiple 3D structures exist
for a particular protein, these can be compared side by side.
Figure 4 shows the two PDB entries of Rap1 with 3CZ6 on
the left and 1IGN on the right [48]. Structure 3CZ6 shows the
structure of the C-terminus of Rap1 [49] whereas 1IGN mod-
els the interaction of Rap1 with telomeric DNA sequences
[50].
Interactive visualization is particularly useful in exploratory
analyses whose primary aim is hypothesis generation
[51]. With large networks, this can be cognitively taxing
due to the overwhelming amount of information present,
not to mention the complexity of the visualization. Network
filtering and selective information display remain indispensable
for alleviating the cognitive burden, and hence for easing the
steep learning curve involved in exploring complex networks.
The Interactorium provides various filtering criteria
for scale reduction [47]. The investigator can filter networks
by cellular localization, quality and/or quantity of evidence,
or membership in protein families or protein complexes, or
a combination of these. It should be noted that network fil-
tering will lead to a loss of information. Therefore, caution
on the investigator’s part is required when analyzing the fil-
tered network at multiple scales. The closest 2D alternative
that we know of is Patika which uses compound graphs. It
has been used for visualizing selected pathways instead of a
global interactome [9].
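The kind of filtering described above, by cellular localization and by strength of evidence, can be sketched as follows. The function, its parameters, the annotations and the scores are all hypothetical; this does not reproduce the Interactorium's filtering interface.

```python
def filter_network(edges, node_info, localization=None, min_evidence=0.0):
    """Keep edges whose evidence score passes the threshold and whose
    endpoints both match the requested localization (if one is given)."""
    def keep_node(n):
        if localization is None:
            return True
        return node_info.get(n, {}).get("localization") == localization

    return [(a, b, score) for a, b, score in edges
            if score >= min_evidence and keep_node(a) and keep_node(b)]


# Hypothetical annotations and evidence scores, for illustration only.
node_info = {
    "P1": {"localization": "nucleus"},
    "P2": {"localization": "nucleus"},
    "P3": {"localization": "cytoplasm"},
}
edges = [("P1", "P2", 0.9), ("P1", "P3", 0.8), ("P2", "P3", 0.2)]
nuclear = filter_network(edges, node_info, localization="nucleus")
confident = filter_network(edges, node_info, min_evidence=0.5)
dropped = len(edges) - len(nuclear)   # interactions no longer visible after filtering
```

Reporting the number of dropped interactions alongside the filtered view is one simple way of keeping the information loss in front of the investigator.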
4.4 Complexome network visualization
Recent advances in detecting PPIs have enabled the systematic
identification of protein complexes [52–55], which
place proteins in a cellular context. Initial complexome representations
were difficult to interpret and could not accommodate
proteome-wide studies [56–58]. With the progression
of protein interaction studies to higher organisms that have
over 2000 proteins, the need for scalable methods of network
visualization is becoming increasingly clear.
Figure 5. Visualization of the yeast complexome [62] generated
with the use of GEOMI. (A) Towards complexome visualization. In
PPI networks, a highly interconnected group of proteins is often
thought to denote a single protein complex, which may or may
not be biologically correct. In this example, the network actually
represents two complexes (shown by the orange and turquoise
circles) that share one protein. This parity relationship can be
represented as a pair of nodes, representing two complexes, linked
by an edge. (B) Visualization of the yeast complexome representing
398 complexes and 992 parity relationships using the
force-directed layout. Parity relationships involving only core and
module protein subunits are displayed.
Since many processes inside the cell are orchestrated by
protein complexes, a shift in visualization from the protein
molecular level to that of the complex offers an alternative and
meaningful representation that increases our understanding
of cellular function. Unlike in interactomes, a node in the complexome
network represents a protein complex (a unique set
of proteins), but what should an edge represent here? A consensus
method for intuitively linking protein complexes together
has yet to be established. This is due to the fact that
proteins within a complex may not necessarily have a direct
interaction with all other subunits. Network representations
to date have connected protein complexes using: (i) binary
interaction data from other studies (such as yeast two-hybrid)
[59, 60] or (ii) the presence of common subunits [61, 62]
(Fig. 5A). Although biologically intelligible, these methods
have their respective limitations. Insufficient overlap between
current PPI and protein complex data sets [63] means that the
resulting networks may lack cohesion. Furthermore, interaction
detection methods are often biased towards proteins with
particular physicochemical properties. Consequently, resulting
networks often contain highly connected parts, representing
proteins with many binary interactions that are present
in multiple complexes, which may or may not be biologically
relevant. One alternative complex-centric visualization
involves connecting complexes that share protein subunits.
This, in effect, is a parity network. The term ‘parity’ originally
referred to the quality of sameness or equivalence in message
transmission from node to node through a computer network
[64]. Using this type of network to represent the complexome
can still be visually complex and incomprehensible. The high
degree of subunit sharing between complexes imparts a high
edge density to the network visualization, giving rise to a large
number of edge-over-edge and edge-over-node crossings.
These issues can be resolved by limiting the number of con-
nections drawn using an arbitrary threshold [59–61], e.g. if
the complexes share at least two common protein subunits,
or if they contain subunits with at least two known binary
PPIs. This, however, may result in the loss of biologically rel-
evant connections in the network. Yet another alternative is
to present only those protein subunits that are strongly as-
sociated with the complex (deemed ‘cores’ or ‘modules’) and
exclude those that are not (‘attachments’) [54]. A recent study
demonstrated that a comprehensible and biologically accu-
rate complexome network representation can be achieved by
using shared core or module proteins to build inter-complex
connections [62] (Fig. 5B).
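The thresholding and core-only strategies described above can be
sketched in a few lines of Python. This is a minimal illustration
rather than the method of any cited study: the function names
parity_network and restrict_to_core, the complex names and the
protein identifiers are all hypothetical, and each complex is
assumed to be given simply as a set of subunit identifiers.

```python
from itertools import combinations


def parity_network(complexes, min_shared=1):
    """Return {(complex_a, complex_b): shared_subunits} for every pair of
    complexes that shares at least `min_shared` protein subunits."""
    edges = {}
    for a, b in combinations(sorted(complexes), 2):
        shared = complexes[a] & complexes[b]
        if len(shared) >= min_shared:
            edges[(a, b)] = shared
    return edges


def restrict_to_core(complexes, attachments):
    """Drop 'attachment' subunits so that only core/module proteins remain."""
    return {name: members - attachments.get(name, set())
            for name, members in complexes.items()}


# Hypothetical toy complexome (illustrative identifiers, not real data).
complexes = {
    "C1": {"p1", "p2", "p3", "p4"},
    "C2": {"p3", "p4", "p5"},
    "C3": {"p4", "p6"},
}
# Hypothetical attachment proteins per complex.
attachments = {"C1": {"p4"}, "C2": {"p4"}}

all_edges = parity_network(complexes)                   # every shared subunit counts
strict_edges = parity_network(complexes, min_shared=2)  # arbitrary threshold of two
core_edges = parity_network(restrict_to_core(complexes, attachments))
```

With this toy input the unfiltered parity network has three edges.
The threshold of two shared subunits keeps only the C1-C2 edge, and
the core-only variant also keeps only C1-C2, but now on the basis of
the single shared core protein p3.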
5 Dynamic interactome visualization
5.1 Purpose
The visualizations discussed so far can only provide a static
view of detectable PPIs. They can depict the combinatorial
complexity of a given interactome but not its temporal dy-
namics due to fluctuations in protein abundance. It is highly
unlikely that every PPI will be constitutively active through-
out cellular life in the face of external challenges. Rather,
PPIs are orchestrated in a temporal order. For this reason,
visualization of dynamic PPI networks is urgently needed
[65]. Time-course data sets have been generated with the use
of high-throughput microarray technology, providing a data
source of sufficient scale for the construction of dynamic net-
works. Hence, gene expression rather than protein expression
data have been commonly used as a proxy for protein abun-
dance. What is being visualized in a dynamic network is the
temporal fluctuation in transcript abundance being superim-
posed on a PPI network.
The visualization of temporal networks can provide insight
into the dynamic processes of the living cell [66,67]. The
topological changes (often known as phase transitions) [68] can
inform the investigator of the functional ordering of
subnetworks through time, thereby tracking the pathway
dependencies of a developing phenotype or disease progression. In
the immediate future, we would expect dynamic networks to
have great potential applications in areas where the evolution
of cell lineages is being studied [65], e.g. cell transformation
in malignancy [69], lineage commitment of pluripotent stem
cells [70], dose–response studies or in case-control studies
where time-course measurements have been made [71, 72].
It is noteworthy that the visualization methods introduced in
this section are applicable to any comparative studies involv-
ing multiple organisms, tissues or cell types, e.g. correlated
gene expression dynamics between mouse tissues [73].
There are two ways of visualizing time-course dynamics.
One approach is to apply the 2.5D visualization in the parallel
plane layout by stacking the network drawings at different
time points together, thereby displaying the gradual
transition from one topology to another through time (Fig. 6).
The advantage of this approach is that all topological changes
throughout the time course are displayed explicitly in a single
drawing, but it shares the limitations of 2.5D visualizations
(Fig. 3). To the best of our knowledge, application of this
approach has been limited to metabolic pathway dynamics [74].
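The stacking idea behind the parallel plane layout can be made
concrete with a short sketch. It assumes, as a simplification, that
one shared 2D layout is reused at every time point and that the
z-coordinate only encodes the position of the time point in the
series. The function name stack_layout, the parameter layer_gap and
the coordinates are hypothetical.

```python
def stack_layout(positions, time_points, layer_gap=1.0):
    """Give every (time, node) pair a 3D coordinate: x and y come from the
    shared 2D layout, z encodes the index of the time point."""
    return {
        (t, node): (x, y, i * layer_gap)
        for i, t in enumerate(time_points)
        for node, (x, y) in positions.items()
    }


# Hypothetical 2D layout of three nodes, drawn at days 0, 2 and 4.
positions = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (0.5, 1.0)}
layers = stack_layout(positions, time_points=[0, 2, 4], layer_gap=2.0)
```

Because x and y never change between layers, any difference the
investigator sees between two planes reflects a change in the data
rather than a change in the drawing.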
5.2 Animated network visualization
The other approach is to dynamically render the relevant
nodes and edges of the PPI network at different time points,
and then visualize the time course as an animation [75]. Nodes
and edges are hidden or shown by changing their opacity,
without changing node positioning, thus maintaining the
investigator’s initial mental picture built in the very first
time frame. This type of animation, known as a static flip book
[76], allows the investigator to direct attention to the
subnetworks that exhibit topological changes. An example is
shown in Fig. 7, exhibiting the dynamics of DNA mismatch
proteins in cell cycle progression based on published
time-course data [77]. The underlying network is a published
yeast PPI network [78] drawn in a force-directed layout.
The temporal sequence in Fig. 7A shows the expression
dynamics of the DNA mismatch proteins as node colour
changes only. The re-colouring of nodes is triggered only
upon a non-zero change in expression value from one time
point to another. Visual inspection of Fig. 7A would help the
investigator to identify the four similarly expressed proteins,
i.e. Msh2, Msh6, Pms1 and Pol30, which interact with their
statically expressed partners. The temporal sequence also ex-
poses the fluctuation of the above proteins during cell cycle
progression. At the 30 min time point, all four proteins are
highly expressed followed by a markedly decreased expres-
sion at the 45 min time point. The three proteins then regain
their expression levels at the 75 min time point. Therefore the
decrease in expression occurs at the G2/M phase of the cell
cycle. The most obvious limitation of this visualization lies
in its use of chromatic scaling to represent expression
dynamics, because human perception is more adept at detecting
dimensional scaling, e.g. node size, node shape and edge
length [79]. An alternative design could apply dynamic scaling
to node size in a dual colour mode throughout the temporal
sequence.
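The two encodings contrasted here, re-colouring on a non-zero
change versus scaling node size, can be sketched as follows. This
is an illustration only and not the GEOMI implementation: the
function names, the size range and the expression values are
hypothetical, while the red/green convention follows the caption of
Fig. 7.

```python
def colour_changes(series):
    """For one protein's expression series, return the colour applied at each
    time point after the first: 'red' for an increase, 'green' for a decrease,
    None when the value is unchanged (no re-colouring is triggered)."""
    colours = []
    for previous, current in zip(series, series[1:]):
        delta = current - previous
        colours.append("red" if delta > 0 else "green" if delta < 0 else None)
    return colours


def node_sizes(series, min_size=10.0, max_size=40.0):
    """Alternative encoding: scale node size linearly with expression."""
    low, high = min(series), max(series)
    span = (high - low) or 1.0  # a flat series maps every value to min_size
    return [min_size + (v - low) / span * (max_size - min_size) for v in series]


# Hypothetical normalised expression values, not the data of [77].
msh2 = [0.2, 0.9, 0.3, 0.3, 0.8]
msh2_colours = colour_changes(msh2)
msh2_sizes = node_sizes(msh2)
```

The colour sequence only reports the direction of each change,
whereas the size sequence also conveys its magnitude, which is the
motivation for the alternative design suggested above.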
Figure 6. 2.5D network visualization of the glycolytic-Krebs
cycle pathway dynamics in Hordeum vulgare during seed
development over a period of 20 days [45]. The visualization
was generated with the use of Wilmascope [112]. Pathway
drawings at successive time points are stacked sequentially
along the z-axis with each pathway being drawn on the x-y
plane. Each level represents a 2-day interval starting from day
zero. Each disc-shaped node represents a metabolite on the
glycolytic pathway schema (right); its size is scaled to the
empirical quantity of that metabolite. The flux dynamics
between fructose-1,6-bisphosphate and the 3-phosphoglycerate
through time is represented by small magenta arrows between the
three cylinders (indicated by the green arrow). In the schema
(right), each metabolite is presented by a rectangular node
coloured according to its pathway membership. Red = glycolysis;
green = Krebs cycle; blue = amino acid biosynthesis.
The same time course can also be visualized as an animation
[75]. This gives a much clearer impression of the disruption of
the DNA mismatch recognition protein complex represented by
this subnetwork throughout the cell cycle (Fig. 7B). Nodes and
edges are visually hidden by manipulating opacity in the
animation using a user-selectable threshold. This functionality
maps to the investigator’s assumption that a certain
interaction may be disrupted if the participating proteins are
expressed below a certain level. The nodes representing Msh2,
Msh6, Pms1 and Pol30 are hidden based on the assumption that
their underexpression will eliminate PPIs among themselves and
with their interaction partners. It is known that Msh2 and Msh6
are subunits of the MutSα complex and that Pms1 interacts with
Mlh1 to form the MutLα complex [80]. Pol30 is known to interact
with the MutSα and MutLα complexes, and acts as the docking
site for subunits required for DNA replication and repair [81].
The animated sequence suggests that the DNA mismatch repair
function is downregulated during the G2/M phase. This deduction
complements the current understanding that DNA mismatch repair
recognition proteins are upregulated during the preceding S
phase for efficient mismatch recognition and repair [82].
The above example demonstrated how the animation of
network changes can elicit biological insight. It allows the
investigator to identify the dynamics of different parts of the
interactome in cell cycle progression. One will then be able
to deduce the regulatory mechanism underlying the PPI dy-
namics observed.
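The threshold-based hiding used in the static flip book can be
sketched as below. The sketch assumes that a node is drawn only
when its expression value reaches the threshold and that an edge is
drawn only when both of its endpoints are drawn. The function name
flip_book and the expression values are hypothetical; the threshold
of 0.45 is the one quoted in the caption of Fig. 7.

```python
def flip_book(edges, expression, threshold):
    """Return one frame per time point. A node is visible when its expression
    is at or above `threshold`; an edge is visible only when both endpoints
    are visible. Node positions are never recomputed between frames."""
    n_frames = len(next(iter(expression.values())))
    frames = []
    for t in range(n_frames):
        visible = {p for p, series in expression.items() if series[t] >= threshold}
        shown_edges = [(a, b) for a, b in edges if a in visible and b in visible]
        frames.append({"nodes": visible, "edges": shown_edges})
    return frames


edges = [("Msh2", "Msh6"), ("Pms1", "Mlh1"), ("Msh2", "Pol30")]
expression = {  # hypothetical normalised values for three time points
    "Msh2": [0.9, 0.2, 0.8],
    "Msh6": [0.8, 0.3, 0.7],
    "Pms1": [0.7, 0.1, 0.6],
    "Mlh1": [0.6, 0.6, 0.6],
    "Pol30": [0.9, 0.4, 0.9],
}
frames = flip_book(edges, expression, threshold=0.45)
```

In the middle frame only the statically expressed Mlh1 remains
visible and all three edges disappear, which mirrors the kind of
subnetwork disruption that the animation is meant to expose.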
5.3 Dynamic focus + context visualization
A recently published tool for dynamic network visualization
is TVNViewer [83]. It does not offer any novel network layouts
but its functionality allows the spatio-temporal dynamics of a
PPI network to be visualized as a multidimensional model. The
design concept is closely related to the idea of using a
meta-network as the first dimension and the underlying physical
network as the second dimension in an attempt to better expose
the dynamic yet complex functional dependencies among proteins.
The same concept has been applied previously to the modelling
of socio-technological complex systems for interests in
national defense and security [84]. The meta-network here is
the Gene Ontology (GO) [85] informational network containing GO
nodes and meta-edges. Each GO node encompasses a set of
proteins annotated with the node-specific GO term. The GO term
can be a member of the Biological Process, Molecular Function
or Cellular Component category. Each meta-edge, visualized as a
solid curve, represents a meta-interaction. This is an
abstraction of the interactome where the actual PPIs are not
explicitly visualized but the ‘meta-interactions’ are, such
that two GO nodes are considered to be interacting if and only
if they share the same set of PPIs (see [27] for the original
definition) [86]. Since many proteins are annotated with
multiple GO terms, the relationship between a PPI and a GO
meta-interaction is an m:n relationship. In other words, a
given GO meta-interaction is an abstraction of multiple PPIs
and a given PPI may be abstracted by multiple GO
meta-interactions.
Figure 7. Visualizing the dynamic gene expression of DNA mismatch recognition proteins during the cell cycle [75] generated with the use
of GEOMI. (A) Rendering using changes in node colour. Red = upregulation; green = down-regulation. (B) Animated dynamics by real-time
rendering. Note that the nodes representing MSH2, MSH6, PMS1 and POL30, along with their interactions, are hidden throughout the
visualization as their expression levels are below the preset threshold of 0.45.
Figure 8. TVNViewer dynamic network visualization [83]. Temporal changes in interactions between the hypothetical COMPLEX_2424
(indicated by the green arrow) and various GO nodes through the cell cycle phases.
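The m:n relationship between PPIs and GO meta-interactions can be
illustrated with a small sketch. It rests on a simplifying
assumption that is ours and not necessarily the definition used by
TVNViewer or in [27]: two distinct GO terms are linked whenever at
least one PPI has one partner annotated with the first term and the
other partner annotated with the second. The function name
meta_interactions, the protein names and the GO identifiers are
placeholders.

```python
from collections import defaultdict


def meta_interactions(ppis, annotations):
    """Map each unordered pair of distinct GO terms to the set of PPIs that
    it abstracts (simplified linking rule, see the assumption above)."""
    meta = defaultdict(set)
    for a, b in ppis:
        for term_a in annotations.get(a, ()):
            for term_b in annotations.get(b, ()):
                if term_a != term_b:
                    meta[frozenset((term_a, term_b))].add((a, b))
    return dict(meta)


# Placeholder proteins and GO terms; P1 carries two annotations.
ppis = [("P1", "P2"), ("P3", "P2")]
annotations = {"P1": {"GO:A", "GO:B"}, "P2": {"GO:C"}, "P3": {"GO:A"}}
meta = meta_interactions(ppis, annotations)
```

Here the meta-interaction between GO:A and GO:C abstracts two PPIs,
while the single PPI between P1 and P2 is abstracted by two
meta-interactions, so neither side of the mapping is unique.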
From the perspective of information visualization, TVN-
Viewer is a tool for providing Focus + Context visualization
in which the GO informational network provides the biologi-
cal context around the protein nodes, i.e. the focus, selected by
the investigator [87]. Figure 8 provides an example generated
from a synthetic cell cycle data set [83]. It shows the result of
using the mouse pointer action to achieve details on demand
by exposing protein members hidden within a GO node. The
transformed network has the selected protein nodes and their
GO counterparts arranged in a two-level circular layout, with
the protein nodes positioned along the circumference of the
outer level and the GO nodes positioned on the inner level.
The temporal sequence shows the stage-specific
dynamic interactions among different GO nodes through the
five phases of the cell cycle. Temporal changes in interactions
among the hypothetical COMPLEX_2424 and the various GO
nodes are displayed as the dynamic rendering of edge opacity.
In this way, the context of any selected focus, whether a single
protein node or a PPI subnetwork, is never lost.
Generally speaking, the biggest barrier to the usability of
dynamic network visualization is the erosion of the
investigator’s mental picture [88]. The longer the animated
sequence or the larger the network, the more challenging it is
to preserve the investigator’s original view. This is even more
pronounced with dynamic Focus + Context visualization because
of its higher visual complexity than the mere interactome in a
semi-directed network. The other limitation lies in the use of
transcriptome data as a proxy for protein abundance. While
transcript abundance correlates with protein abundance in a
broad sense, the impact of degradation, translational control
and post-translational modifications on individual proteins
cannot be ignored. Recent work demonstrated
that the correlation between transcriptomic and proteomic
expression variation is either weak [89, 90] or is highly con-
ditional [91]. Therefore, the temporal dynamics visualized
do not necessarily reflect the real-time PPI dynamics. The
increased availability of proteome-scale time course protein
abundance data in the future (e.g. [92]) will help address this
issue.
6 Contextual visualization
6.1 Purpose
From the viewpoint of visual analytics, a visualization that
exposes one or more types of PPIs is merely displaying a set
of interaction data in a graphical form. It does not explicitly
communicate the consensus knowledge about a given PPI
network. The investigator will need either to rely on his/her
own knowledge and access to public resources to construct
hypotheses, or to trigger certain mouse pointer events to
retrieve the hidden information if available. To better assist
the investigator in understanding its biological relevance,
a PPI network needs to be co-visualized with some kind of
biological context, e.g. membership in pathways and/or
subcellular localization. To date, GO categories, especially
the GO Process and GO Component categories, are commonly used
as a proxy for biological context [93]. As will be shown
in the following sections, co-visualization of PPIs and GO
annotations provides added benefits for understanding the
modular nature of the PPI network.
6.2 Colour-coded visualization
While the Focus + Context visualization shown in Section
5.3 is one way of imparting biological context on a PPI
network, there are alternative approaches. One of these is to
colour code the GO Process or GO Component terms mappable to
individual proteins. If both partners of a pairwise
interaction share the same context, the corresponding nodes
and the edge are painted in the same colour (Fig. 9). This
method is very useful for highlighting functional homo- or
heterogeneity among protein complexes (or subnetworks) when GO
Process is being used as the context (Fig. 9A). It is equally
useful for highlighting the intracellular localization of
certain interacting complexes (Fig. 9B) [94]. Although
apparently easy to implement, the challenge is to select an
informative subset of ontology terms out of the GO hierarchy
for the purpose of contextual mapping. Since human vision
cannot easily discern a broad colour spectrum, the set of
applicable GO terms is selected from the top hierarchical
levels of GO Slim. This leads to the underutilization of the GO
hierarchy within a visualization, and hence other approaches,
e.g. the Focus + Context visualization using TVNViewer
discussed in Section 5.3, are needed.
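The colouring rule for nodes and edges can be written down
directly. The sketch below assumes one context term per protein and
a fallback colour for edges whose partners differ in context. The
function name colour_network, the fallback colour and the protein
names are hypothetical; the two palette entries follow the legend
of Fig. 9A.

```python
def colour_network(edges, context, palette, default="grey"):
    """Colour each node by its context term; colour an edge like its
    endpoints only when both partners share the same context."""
    node_colour = {n: palette.get(term, default) for n, term in context.items()}
    edge_colour = {}
    for a, b in edges:
        same = context.get(a) is not None and context.get(a) == context.get(b)
        edge_colour[(a, b)] = node_colour[a] if same else default
    return node_colour, edge_colour


# Placeholder proteins annotated with GO Slim process terms.
context = {"P1": "transcription", "P2": "transcription", "P3": "translation"}
palette = {"transcription": "orange", "translation": "purple"}
edges = [("P1", "P2"), ("P2", "P3")]
node_colour, edge_colour = colour_network(edges, context, palette)
```

The edge between P1 and P2 inherits orange because both partners are
annotated with transcription, whereas the edge between P2 and P3
stays grey and thereby marks a functionally heterogeneous
interaction.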
Figure 9. Contextual visualization of the yeast PPI network [57]
generated with the use of GEOMI. (A) The Gene Ontology Slim
biological process annotation of each protein node is represented
by colour. Orange = transcription; yellow = transport; light green
= DNA metabolic process; green = protein modification process;
teal = RNA metabolic process; cyan = conjugation; light blue =
cytoskeleton organization; blue = cytokinesis; navy blue = gener-
ation of precursor metabolites and energy; purple = translation;
magenta = other; red = process unknown. (B) The subcellular
localization [94] of each protein node is represented by colour.
Orange = cytoplasm; yellow = nucleus; light green = bud neck;
green = nucleolus; cyan = spindle pole; light blue = endoplasmic
reticulum; blue = nuclear periphery; navy blue = actin; purple =
mitochondrion; magenta = other; red = localization unknown.
Figure 10. Visualization of the human DNA replication PPI
network in the clustered circular layout [96] generated with
the use of GEOMI. Inset (a): Close-up view of the chromatin
(GO:0000785) and alpha DNA polymerase:primase (GO:0005658)
complex clusters. MCM2,3,7 and their edges are highlighted in
red. Inset (b): Close-up view of the perinuclear region of
cytoplasm (GO:0048471) cluster. SET, HMGB2 and their
intra-cluster edge are highlighted in red. Inset (c): Close-up
view which encompasses the DNA replication factor C complex
(GO:0005663), intracellular (GO:0005622) and chromatin assembly
complex (GO:0005678) clusters. PCNA, CHAF1A and their
inter-cluster edge are highlighted in red.
6.3 Clustered network visualization
Another method is to exploit the nested modularity of the PPI
network, meaning that any functional module, e.g. a pathway
or a biological process, can be further subdivided into
multiple modules according to the physical localization of its
member proteins [95]. This type of organization can be
captured by the context-specific clustered network
visualization [96].
Fig. 10 shows the visualization for the DNA replication
PPI network in the clustered circular layout. Each disc-shaped
cluster node represents a subcellular region, e.g. GO:0048471
perinuclear region; or an organelle, e.g. GO:0005634 nucleus;
or a protein complex, e.g. GO:0005663 DNA replication fac-
tor C. A protein known to localize in multiple subcellular
regions or organelles is represented as multiple nodes in dif-
ferent clusters. Because every node has been assigned a fixed
coordinate, the layout is highly reproducible on repeated
rendering. This is an advantage over the force-directed layout
since the investigator does not need to cognitively re-adapt to
a new layout. The strength of the clustered circular layout
lies in its ability to capture three types of biologically
relevant PPIs [96]. The first is the PPI(s) between two protein
complexes. The second type is the PPI(s) between a subunit of a
protein complex and other proteins localized in an organelle.
The third type is the PPI(s) that can occur in multiple
organelles or subcellular regions.
An example of the first type can be seen in Fig. 10 inset
(c). PCNA and its partners RFC2,3,4,5 are localized in the
cluster node labelled ‘GO:0005663 DNA replication factor C
complex’. The other two mutually interacting partners
of PCNA, i.e. CHAF1A,1B, are localized in the cluster node
labelled ‘GO:0005678 chromatin assembly complex’. The in-
teraction between the two complexes is represented by the
inter-cluster edge between PCNA and CHAF1A, suggesting
that PCNA is more likely to be a hub-bottleneck protein than
a date hub.
Fig. 10 inset (a) gives an example of the second type of PPI.
It shows that MCM3 is bound within the cluster node labelled
‘GO:0005658 α DNA polymerase:primase complex’, whereas
MCM2,7 and ORC2L are bound within the cluster node labelled
‘GO:0000785 chromatin’. The interactions among these proteins,
as represented by the inter-cluster edges, imply that the α DNA
polymerase:primase complex localizes with the chromatin. They
also imply that MCM2,7 and ORC2L are chromatin-bound. These
deductions map to the current knowledge on the molecular
structure of the pre-replication complex, in which MCM2,3,7 are
subunits that impart DNA helicase activity [97] and ORC2L is
one of the subunits that recognize the origin site of DNA
replication [98].
The whole network shown in Fig. 10 with insets (a) and
(b) together gives an example of the third type of PPI. The
protein node HMGB2 shown in inset (a) is bound within
the ‘GO:0000785 chromatin’ cluster node whereas in inset
(b), it is shown to be bound in the cluster node labelled
‘GO:0048471 perinuclear region of cytoplasm’. In the latter
cluster, HMGB2 is linked to the protein node SET indicating
that they are interaction partners. The pair is also duplicated
in the cluster node labelled ‘nucleus’, strongly suggesting
that the SET-HMGB2 dimer coexists in both the nucleus and
the perinuclear region of cytoplasm. Furthermore, the clus-
ter membership of HMGB2 suggests that it is the subunit
that interacts with the chromatin. Both deductions have been
verified experimentally [99, 100]. It has been found that the
SET-HMGB2 dimer is part of the larger SET complex sus-
pected to have the function of nucleosome assembly. It may
have been assembled in the perinuclear region of the endo-
plasmic reticulum for regular transport into the nucleus [99].
The biggest limitation of the clustered circular layout is its
less compact drawing as compared to the force-directed lay-
out. It took 148 protein nodes, 13 cluster nodes and 153 edges
to represent the DNA replication PPI network. Yet the net-
work itself contains only 55 proteins and 83 interactions. The
substantially inflated network size seen is caused by node re-
dundancy and the positioning of protein nodes being limited
to the circumference of the cluster node. The other limitation
is the loss of the original topology seen in the force-directed
layout, making the identification of date and party hubs more
challenging [96], but this is compensated by a more effective
exposure of bottleneck proteins.
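Both the reproducibility and the node redundancy of the clustered
circular layout follow from how coordinates are assigned, which the
sketch below makes explicit. It is an illustration under our own
assumptions and not the algorithm of [96]: cluster centres are
placed evenly on one large circle, member proteins are placed evenly
on the circumference of their cluster, and the function name and
radii are hypothetical. The cluster memberships are the ones
described for Fig. 10 insets (a) and (b).

```python
import math


def clustered_circular_layout(clusters, big_radius=10.0, small_radius=2.0):
    """Place cluster centres evenly on a large circle and each cluster's
    proteins evenly on the circumference of its own small circle. A protein
    found in several clusters gets one node per cluster, keyed by
    (cluster, protein)."""
    coords = {}
    names = sorted(clusters)  # sorting makes the layout reproducible
    for i, name in enumerate(names):
        theta = 2 * math.pi * i / len(names)
        cx, cy = big_radius * math.cos(theta), big_radius * math.sin(theta)
        members = sorted(clusters[name])
        for j, protein in enumerate(members):
            phi = 2 * math.pi * j / len(members)
            coords[(name, protein)] = (cx + small_radius * math.cos(phi),
                                       cy + small_radius * math.sin(phi))
    return coords


clusters = {
    "GO:0000785 chromatin": {"MCM2", "MCM7", "ORC2L", "HMGB2"},
    "GO:0048471 perinuclear region of cytoplasm": {"SET", "HMGB2"},
}
coords = clustered_circular_layout(clusters)
drawn_nodes = len(coords)                                # 6 nodes are drawn
distinct_proteins = len({p for _, p in coords})          # for 5 proteins
```

HMGB2 is drawn twice, once per cluster, which is the same mechanism
that inflates the 55 proteins of the DNA replication network to 148
protein nodes. Calling the function again on the same input returns
identical coordinates.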
7 Challenges in interactome visualization
7.1 Biological fidelity
As researchers in information visualization rightly pointed
out, any visualization is only as good as the data that one pro-
vides to it [101]. The biological fidelity of a visualized interac-
tome is very much affected by technical and representational
artefacts. Both reduce the reliability of the visualization. Tech-
nical artefacts come from the false positives and negatives
generated by the experimental techniques. Representational
artefacts come from the underlying graph-theoretic model
and the layout of the visualization. The inclusion of false pos-
itive PPIs introduces noise to the visualized network in the
form of extraneous edges whereas false negative PPIs can dis-
tort global topology by underestimating protein connectivity.
Technical artefacts are introduced during the detection
of PPIs. False positives and/or false negatives come from
a variety of sources, i.e. the analytical technology employed,
the experimental design, laboratory conditions and the
operator’s competence in sample handling. The tandem affinity
purification-mass spectrometry technique, for example, tends to
be biased towards the detection of high-affinity PPIs [102]. Since
this technique requires the in vitro processing of protein ex-
tract, it is especially prone to operator error in sample han-
dling. The other commonly used detection method, yeast
two-hybrid assay, detects PPIs occurring in vivo but can un-
derdetect membrane-associated PPIs and those dependent
on post-translational modification [102]. There is the percep-
tion that large-scale curation of low-throughput experiments
should give a more reliable interactome, but a comparative
study showed that human PPIs collected from single low-
throughput studies are of poorer quality when compared to
high-quality data sets produced by stringent yeast two-hybrid
and PPI assays [103].
The biggest source of representational artefacts comes
from the graph-theoretic model used. It has been shown ex-
perimentally, e.g. by affinity purification, that protein com-
plexes are not exclusively dimeric. Yet the graph model de-
composes the m:n interaction stoichiometry, common among
PPIs, to a 1:1 relationship [104], regardless of whether the
PPIs are truly dimeric or not. This results in a failure to
capture the multi-scalar nature of PPIs in complexomes and
similarly the complexome interactions in the global network.
Hence, there is the argument that PPIs may be best
represented by hypergraphs. A hypergraph, denoted as H(V, E),
consists of a set of nodes V and a set of hyperedges E. Every
hyperedge e represents the physical interaction among the
populations of k proteins v_1 to v_k [104]. However, the
widespread use of graphs instead of hypergraphs stems from the
computational intractability and higher visual complexity of
the latter as the network size increases. Hence, the dilemma
of choosing between biological fidelity and comprehensibility
now confronts both the investigator and visualization
researchers.
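The information lost when a hyperedge is decomposed into binary
edges can be demonstrated with a short sketch. It uses a full
pairwise (clique) expansion as one possible decomposition, which is
our choice for illustration; other conventions exist. The function
name clique_expansion and the protein labels are hypothetical.

```python
from itertools import combinations


def clique_expansion(hyperedges):
    """Decompose each hyperedge (a set of k proteins) into all k*(k-1)/2
    pairwise edges, yielding an ordinary graph."""
    edges = set()
    for members in hyperedges:
        edges.update(frozenset(pair) for pair in combinations(sorted(members), 2))
    return edges


# One trimeric complex versus three independent dimers.
one_trimer = [frozenset({"v1", "v2", "v3"})]
three_dimers = [frozenset({"v1", "v2"}),
                frozenset({"v2", "v3"}),
                frozenset({"v1", "v3"})]

graph_from_trimer = clique_expansion(one_trimer)
graph_from_dimers = clique_expansion(three_dimers)
indistinguishable = graph_from_trimer == graph_from_dimers
```

Both inputs produce the same triangle of binary edges, so the graph
model can no longer tell a single three-protein complex apart from
three separate dimeric interactions, whereas the hypergraph keeps
them distinct.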
7.2 Conceptual focus
For any interactome visualization to become a useful
analytical tool, there is a need to match the investigator’s
knowledge precept with the biological relationships exposed by
the visualization. One contentious issue is whether the primary
focus of any visualization should be the interactome or the
complexome. If it is the former, the focus will be protein
connectivity in the network. The assumption will be that the
functioning of the interactome relies on how frequently each
individual protein engages in PPIs. One can even argue that
preferential attachment explains the biological relevance of
the interactome rather than functional or physical modularity.
If it is the latter, the focus will be on protein complex
connectivity or even core-module complex-to-attachment protein
connectivity. The assumption will then be that the functioning
of an interactome relies largely on how frequently each protein
complex engages either in multi-complex interactions or in the
dynamic transaction of subunits. This issue can only be
resolved once current representations of the complexome are
tested in the community, and their utility and relevance
established.
From an even broader perspective, the current understanding
of network biology is mostly derived from the yeast
interactome. It is not clear whether one can interpret a
mammalian interactome visualization just like its yeast
counterpart. These are conceptual problems yet to be resolved.
In the face of conceptual uncertainty, the issue at heart is
which conceptual model the visualization researcher should
present to the investigator. A possible solution is to
implement knowledge tracking as a functionality of a
visualization tool [105].
7.3 The curse of scale
Even for the single-cell eukaryote S. cerevisiae, the
interactome has been estimated to contain approximately 5000
proteins with 20 000 physical interactions [3], forming a
probable 800 complexes [53]. The human PPI network is even
bigger, with an estimated 22 500 proteins involved in a
possible 130 000 physical interactions [103]. Measured by the
number of interactions, this is more than six times the size of
the yeast PPI network (130 000/20 000 = 6.5). Visualization on
such a big scale will only increase the investigator’s
cognitive burden and stall his/her effort towards extracting
biological insight, let alone generating testable hypotheses.
It can be expected that iterative scale reduction, through the
use of network filtering and abstraction [58,106], will
continue to occupy an important place in interactome
visualization. Computational methods such as pathway-based
enrichment analysis [107] can be used as the preceding step
to provide guidance on network exploration.
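One simple form of iterative scale reduction is to raise a
confidence cut-off until the drawing fits a node budget. The sketch
below assumes that every interaction carries a numeric score and
that the investigator chooses the candidate thresholds and the
budget. The function names, scores and thresholds are hypothetical
and are not taken from [58,106].

```python
def filter_by_score(scored_edges, min_score):
    """Keep edges whose score reaches `min_score`; return them together with
    the set of nodes that still take part in at least one kept edge."""
    kept = [(a, b) for a, b, score in scored_edges if score >= min_score]
    nodes = {n for edge in kept for n in edge}
    return kept, nodes


def reduce_until(scored_edges, max_nodes, thresholds):
    """Try thresholds in increasing order and stop at the first one whose
    filtered network has at most `max_nodes` nodes."""
    for t in sorted(thresholds):
        edges, nodes = filter_by_score(scored_edges, t)
        if len(nodes) <= max_nodes:
            return t, edges
    return None, []  # no candidate threshold met the budget


# Hypothetical scored interactions.
scored = [("A", "B", 0.9), ("B", "C", 0.5), ("C", "D", 0.2), ("D", "E", 0.95)]
chosen, reduced = reduce_until(scored, max_nodes=4, thresholds=[0.0, 0.4, 0.8])
```

In this toy case the two lower thresholds still leave five nodes, so
the procedure settles on 0.8 and keeps the two highest-scoring
interactions. As with the arbitrary thresholds discussed for
complexome networks, such filtering may discard biologically
relevant connections.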
8 Visual analytics is the future
This review shows that no single visualization can faithfully
represent the biological context, the scale and the dynamics of
an interactome, partly because data sets are incomplete and
partly because an interactome is probably multi-scale
[108]. Therefore, the visual analysis of a single visualization
can only provide limited biological insight. The need for a vi-
sual analytical framework is becoming increasingly pressing
to advance systems biology research. The ideal framework
will not only need to provide multiple visualizations of an
interactome but also provide the statistical confidence [109],
evidence score [47] or the quality assessment [103] of every
PPI represented. The framework should facilitate hypothe-
sis construction by enabling the collaborative use of multiple
visualizations.
The biggest challenge yet to be met is to find a set of
usability heuristics which can serve the broad range of
research interests among investigators. Usability heuristics
are the common design features that make a visualization
effective, as derived from past experiences [110]. By
effectiveness, we refer to the ability of the visualization to
amplify the investigator’s cognition with the aim of enhancing
one’s analytical capability. If available, the heuristics can
serve as a guide for designing a usable visual analytical
framework. Although usability heuristics have been proposed by
information visualization researchers [111], it is not clear
which of them will be most suitable for systems biology
applications.
As visual analytics becomes an integral part of systems
biology research, the need for user-participatory design be-
comes increasingly important. Information visualization re-
searchers need to know the investigator’s requirements. In
return, the investigator needs the expertise from the informa-
tion visualization community to design visualization(s) that
can capture the biological knowledge (or concept) of interest.
The hurdle, however, is that biology is a knowledge-intensive
field of science. It is difficult for an expert in information
visualization to grasp the contextual richness of an interac-
tome in a limited time frame. It is equally challenging for
an investigator to understand the biological relevance of a
visualized interactome. Therefore, the availability of interdis-
ciplinary experts able to bridge the two communities will be
critical to their open collaboration. One way to foster interdis-
ciplinary collaboration is to include information visualization
as part of the bioinformatician’s training. For the moment,
the development of information visualization for systems
biology research will need extensive experimentation. This
field is still very much in its infancy and opportunities for
further growth abound.
The research reviewed in this paper was supported by the Aus-
tralian Research Council Linkage Grant Scheme, the New South
Wales Office for Scientific and Medical Research, the EIF Super
Science Scheme, the University of Sydney and the University of
New South Wales.
The authors have declared no conflict of interest.
9 References
[1] Stelzl, U., Worm, U., Lalowski, M., Haenig, C. et al., A
human protein-protein interaction network: a resource for
annotating the proteome. Cell 2005, 122, 957–968.
[2] Rual, J. F., Venkatesan, K., Hao, T., Hirozane-Kishikawa, T.
et al., Towards a proteome-scale map of the human
protein-protein interaction network. Nature 2005, 437,
1173–1178.
[3] Yu, H., Braun, P., Yildirim, M. A., Lemmens, I. et al.,
High-quality binary protein interaction map of the yeast
interactome network. Science 2008, 322, 104–110.
[4] Merico, D., Gfeller, D., Bader, G. D., How to visually
interpret biological data using networks? Nat. Biotech. 2009,
27, 921–924.
[5] Cline, M., Smoot, M., Cerami, E., Kuchinsky, A. et al.,
Integration of biological networks and gene expression data
using Cytoscape. Nat. Protoc. 2007, 2, 2366–2382.
[6] Hu, Z., Hung, J.-H., Wang, Y., Chang, Y.-C. et al., VisANT
3.5: multi-scale network visualization, analysis and inference
based on the gene ontology. Nucleic Acids Res. 2009, 37,
W115–W121.
[7] Breitkreutz, B. J., Stark, C., Tyers, M., Osprey: a network
visualization system. Genome Biol. 2003, 4, R22.
[8] Iragne, F., Nikolski, M., Mathieu, B., Auber, D. et al.,
ProViz: protein interaction visualization and exploration.
Bioinformatics 2005, 21, 272–274.
[9] Dogrusöz, U., Erson, E. Z., Giral, E., Demir, E. et al.,
PATIKAweb: a Web interface for analyzing biological
pathways through advanced querying and visualization.
Bioinformatics 2006, 22, 374–375.
[10] Lee, R. E., Megeney, L. A., The yeast kinome displays scale
free topology with functional hub clusters. BMC Bioinformatics
2005, 6, 271.
[11] Fruchterman, T. M. J., Rheingold, E. M., Graph drawing by
force-directed placement. Software Pract. Exper. 1991, 21,
1129–1164.
[12] Eades, P., A heuristic for graph drawing. Congressus Nu-
merantium 1984, 42, 142–160.
[13] Sugiyama, K., Tagawa, S., Toda, M., Methods for visual un-
derstanding of hierarchical system structures. IEEE Trans.
Syst. Man. Cybernetics. 1981, 11, 109–125.
[14] Barsky, A., Gardy, J. L., Hancock, R. E. W., Munzner, T.,
Cerebral: a Cytoscape plugin for layout of and interaction
with biological networks using subcellular localization an-
notation. Bioinformatics 2007, 23, 1040–1042.
[15] Suderman, M., Hallett, M., Tools for visually exploring bi-
ological networks. Bioinformatics 2007, 23, 2651–2659.
[16] Gehlenborg, N., O’Donoghue, S. I., Baliga, N. S., Goes-
mann, A. et al., Visualization of omics data for systems
biology. Nat. Methods 2010, 7, S56–S68.
[17] Ahmed, A., Dwyer, T., Forster, M., Xu, K. et al., GEOMI:
geometry for maximum insight. Lect. Notes Comput. Sci.
2006, 3843, 468–479.
[18] Pavlopoulos, G. A., Secrier, M., Moschopoulos, C. N.,
Soldatos, T. G. et al., Using graph theory to analyze bi-
ological networks. BioData Min. 2011, 4, 10.
[19] Emmert-Streib, F., Dehmer, M., Networks for systems biol-
ogy: conceptual connection of data and function. IET Syst.
Biol. 2011, 5, 185–207.
[20] Christensen, C., Thakar, J., Albert, R., Systems-level in-
sights in cellular regulation: inferring, analyzing, and mod-
eling intracellular networks. IET Syst. Biol. 2007, 1, 61–77.
[21] Barabási, A. L., Oltvai, Z. N., Network biology: understanding
the cell’s functional organization. Nat. Rev. Gen. 2004,
5, 101–114.
[22] Rives, A. W., Galitski, T., Modular organization of cellu-
lar network. Proc. Natl. Acad. Sci. USA 2003, 100, 1128–
1133.
[23] Whitacre, J. M., Bender, A., Networked buffering: a basic
mechanism for distributed robustness in complex adap-
tive system. Theor. Biol. Med. Model 2010, 7, 20.
[24] Jeong, H., Mason, S. P., Barabási, A. L., Oltvai, Z. N., Lethality
and centrality in protein networks. Nature 2001, 411,
41–42.
[25] Tornow, S., Mewes, H. W., Functional modules by relating
protein interaction networks and gene expression. Nucleic
Acids Res. 2003, 31, 6283–6289.
[26] Han, J. D., Bertin, N., Hao, T., Goldberg, D. S. et al., Evi-
dence for dynamically organized modularity in the yeast
protein-protein interaction network. Nature 2004, 430, 88–
95.
[27] Patel, M. I., Nagl, S., The Role of Model Integration in Com-
plex Systems Modeling: An Example from Cancer Biology,
Springer, Berlin, 2010, pp. 64–83.
[28] Saffer, J. D., Burnett, V. L., Chen, G., van der Spek, P., Visual
analytics in the pharmaceutical industry. IEEE Comput.
Graph. Appl. 2004, 24, 10–15.
[29] van Wijk, J. J., Guest editor’s introduction: special section
on IEEE symposium on visual analytics science and tech-
nology. IEEE Trans. Visual. Comput. Graphics 2011, 17,
555–556.
[30] Pržulj, N., Protein-protein interactions: making sense of
networks via graph-theoretic modeling. Bioessays 2010,
33, 115–123.
[31] Kim, P. M., Lu, L. J., Xia, Y., Gerstein, M. B., Relating three-
dimensional structures to protein networks provide evo-
lutionary insights. Science 2006, 314, 1938–1941.
[32] Yu, H., Kim, P. M., Sprecher, E., Trifonov, V. et al., The im-
portance of bottlenecks in protein networks: correlation
with gene essentiality and expression dynamics. PLoS
Comput. Biol. 2007, 3, e59.
[33] Hashimoto, T. B., Nagasaki, M., Kojima, K., Miyano, S.,
BFL: a node and edge betweenness based fast layout algo-
rithm for large-scale network. BMC Bioinformatics 2009,
10, 19.
[34] Freeman, L. C., Borgatti, S. P., White, D. R., Centrality in val-
ued graphs: a measure of betweenness based on network
flow. Soc. Networks 1991, 13, 141–154.
[35] Valente, A. N., Cusick, M. E., Yeast protein interactome
topology provides framework for co-ordinated functional-
ity. Nucleic Acids Res. 2006, 34, 2812–2819.
[36] Zou, L., Sriswasdi, S., Ross, B., Missiuro, P. V. et al., Sys-
tematic analysis of pleiotropy in C. elegans early embryo-
genesis. PLoS Comput. Biol. 2008, 4, e1000003.
[37] Ahmed, A., Dwyer, T., Hong, S.-H., Murray, C. et al.,
Visualization and analysis of large and complex scale-free
networks. Proc. Eurographics – IEEE VGTC Symp. Visualization
2005, 1–8.
[38] Huang, W., Murray, C., Shen, X., Song, L. et al., Visualisa-
tion and analysis of network motifs. Proc. 9th Intl. Conf.
Info. Vis. 2005, 697–702.
[39] Meyers, L. A., Newman, M. E. J., Pourbohloul, B., Predict-
ing epidemics on directed contact networks. J. Theor. Biol.
2006, 240, 400–418.
[40] Fung, D. C. Y., Hong, S.-H., Koschützki, D., Schreiber, F.
et al., 2.5D visualization of overlapping biological networks.
J. Integr. Bioinform. 2008, 5, 90.
[41] Fung, D. C. Y., Hong, S.-H., Koschützki, D., Schreiber, F.
et al., Visual analysis of overlapping biological network.
Proc. 13th Intl. Conf. Info. Vis. 2009, 337–342.
[42] Cui, Q., Ma, Y., Jaramillo, M., Bari, H. et al., A map of
human cancer signaling. Mol. Syst. Biol. 2007, 3, 152.
[43] Breitkreutz, B. J., Stark, C., Reguly, T., Boucher, L. et al.,
The BioGRID Interaction database: 2008 update. Nucleic
Acids Res. (database issue) 2008, 36, D637–640.
[44] Hsu, C.-N., Lai, J.-M., Liu, C.-H., Tseng, H.-H. et al.,
Detection of the inferred interaction network in hepatocellular
carcinoma from ECHO (Encyclopedia of hepatocellular
carcinoma genes online). BMC Bioinformatics 2007, 8, 66.
[45] Brandes, U., Dwyer, T., Schreiber, F., Visualizing related
metabolic pathways in two and a half dimensions. Lect.
Notes Comput. Sci. 2004, 2912, 111–122.
[46] Tory, M., Möller, T., Human factors in visualization
research. IEEE Trans. Visual. Comput. Graphics 2004, 10,
72–84.
[47] Widjaja, Y. Y., Pang, C. N. I., Li, S. S., Wilkins, M. R. et al.,
The Interactorium: visualising proteins, complexes and in-
teraction networks in a virtual 3D cell. Proteomics 2009, 9,
5309–5315.
[48] Berman, H. M., Westbrook, J., Feng, Z., Gilliland, G. et al.,
Protein data bank. Nucleic Acids Res. 2000, 28, 235–242.
[49] Feeser, E. A., Wolberger, C., Structural and functional stud-
ies of the Rap1 C-terminus reveal novel separation-of-
function mutants. J. Mol. Biol. 2008, 380, 520–531.
[50] Konig, P., Giraldo, R., Chapman, L., Rhodes, D., DNA-
binding domain of Rap1 in complex with telomeric DNA
site. Cell 1996, 85, 125–136.
[51] Kelder, T., Conklin, B. R., Evelo, C. T., Pico, A. R., Finding the
right questions: exploratory pathway analysis to enhance
biological discovery in large datasets. PLoS Biol. 2010, 8,
e1000472.
[52] Ho, Y., Gruhler, A., Heilbut, A., Bader, G. D. et al., System-
atic identification of protein complexes in Saccharomyces
cerevisiae by mass spectrometry. Nature 2002, 415, 180–
183.
[53] Gavin, A. C., Bosche, M., Krause, R., Boesche, M. et al.,
Functional organization of the yeast proteome by system-
atic analysis of protein complexes. Nature 2002, 415, 141–
147.
[54] Gavin, A. C., Aloy, P., Grandi, P., Krause, R. et al., Proteome
survey reveals modularity of the yeast cell machinery. Na-
ture 2006, 440, 631–636.
[55] Krogan, N. J., Cagney, G., Yu, H., Zhong, G. et al., Global
landscape of protein complexes in the yeast Saccha-
romyces cerevisiae. Nature 2006, 440, 637–643.
[56] Krause, R., von Mering, C., Bork, P., Dandekar, T., Shared
components of protein complexes – versatile building
blocks or biochemical artefacts? BioEssays 2004, 26, 1333–
1343.
[57] Ho, E., Webber, R., Wilkins, M. R., Interactive three-
dimensional visualization and contextual analysis of pro-
tein interaction networks. J. Proteome Res. 2008, 7, 104–
112.
[58] Hu, Z., Mellor, J., Wu, J., Kanehisa, M. et al., Towards
zoomable multidimensional maps of the cell. Nat. Biotech-
nol. 2007, 25, 547–554.
[59] Benschop, J. J., Brabers, N., van Leenen, D., Bakker, L. V.
et al., A consensus of core protein complex compositions
for Saccharomyces cerevisiae. Mol. Cell Proteomics 2010,
38, 916–928.
[60] Wang, H., Kakaradov, B., Collins, S. R., Karotki, L. et al.,
A complex-based reconstruction of the Saccharomyces
cerevisiae interactome. Mol. Cell Proteomics 2009, 8,
1361–1381.
[61] Hart, G. T., Lee, I., Marcotte, E. R., A high-accuracy con-
sensus map of yeast protein complexes reveals modular
nature of gene essentiality. BMC Bioinformatics 2007, 8,
236.
[62] Li, S. S., Xu, K., Wilkins, M. R., Visualization and analysis
of the complexome network of Saccharomyces cerevisiae.
J. Proteome Res. 2011, 10, 4744–4756.
[63] Fasolo, J., Sboner, A., Sun, M. G., Yu, H. et al., Diverse
protein kinase interactions identified by protein microar-
rays reveal novel connections between cellular processes.
Genes Dev. 2011, 25, 767–778.
[64] Karris, S. T., Networks: Design and Management, Orchard
Publications, Fremont, 2002, pp. 2–5.
[65] Przytycka, T. M., Singh, M., Slonim, D. K., Toward the dy-
namic interactome: it’s about time. Brief Bioinform. 2010,
11, 15–29.
[66] Komurov, K., White, M., Revealing static and dynamic
modular architecture of the eukaryotic protein interaction
network. Mol. Syst. Biol. 2007, 3, 110.
[67] de Lichtenberg, U., Jensen, L. J., Brunak, S., Bork, P., Dy-
namic complex formation during the yeast cell cycle. Sci-
ence 2005, 307, 724–727.
[68] Bianconi, G., Barabási, A. L., Bose-Einstein condensation
in complex networks. Phys. Rev. Lett. 2001, 86, 5632–5635.
[69] Edelman, E. J., Guinney, J., Chi, J.-T., Febbo, P. G. et al.,
Modeling cancer progression via pathway dependencies.
PLoS Comput. Biol. 2008, 4, e28.
[70] Kirouac, D. C., Ito, C., Csaszar, E., Roch, A. et al., Dynamic
interaction networks in a hierarchically organized tissue.
Mol. Syst. Biol. 2010, 6, 417.
[71] Dupuy, D., Bertin, N., Hidalgo, C. A., Venkatesan, K. et
al., Genome-scale analysis of in vivo spatiotemporal promoter
activity in Caenorhabditis elegans. Nat. Biotechnol.
2007, 25, 663–668.
[72] Jiao, Y., Tausta, S. L., Gandotra, N., Sun, N. et al., A
transcriptome atlas of rice cell types uncovers cellular,
functional and developmental hierarchies. Nat. Genet. 2009,
41, 258–263.
[73] Keller, M. P., Choi, Y., Wang, P., Davis, D. B. et al., A gene ex-
pression network model of type 2 diabetes links cell cycle
regulation in islets with diabetes susceptibility. Genome
Res. 2008, 18, 706–716.
[74] Dwyer, T., Rolletschek, H., Schreiber, F., Representing ex-
perimental biological data in metabolic networks. Proc.
2nd Asia-Pacific Bioinform. Conf. 2004, 29, 13–20.
[75] Goel, A., Li, S. S., Wilkins, M. R., Four-dimensional visu-
alization and analysis of protein-protein interaction net-
works. Proteomics 2011, 11, 1–11.
[76] Moody, J., McFarland, D., Bender-deMoll, S., Dynamic
network visualization. Am. J. Sociol. 2005, 110, 1206–1241.
[77] Spellman, P. T., Sherlock, G., Zhang, M. Q., Iyer, V. R. et
al., Comprehensive identification of cell cycle-regulated
genes of the yeast Saccharomyces cerevisiae by microar-
ray hybridization. Mol. Biol. Cell 1998, 9, 3273–3297.
[78] Bertin, N., Simonis, N., Dupuy, D., Cusick, M. E. et al.,
Confirmation of organized modularity in the yeast inter-
actome. PLoS Biol. 2007, 5, e153.
[79] Tufte, E. R., The Visual Display of Quantitative Information,
2nd Edn., Graphics Press LLC, Cheshire 2001.
[80] Kadyrov, F. A., Holmes, S. F., Arana, M. E., Lukianov, O. A.
et al., Saccharomyces cerevisiae MutLalpha is a mismatch
repair endonuclease. J. Biol. Chem. 2007, 282, 37181–
37190.
[81] Marti, T. M., Kunz, C., Fleck, O., DNA mismatch repair and
mutation avoidance pathways. J. Cell Physiol. 2002, 191,
28–41.
[82] Kunkel, T. A., Erie, D. A., DNA mismatch repair. Ann. Rev.
Biochem. 2005, 74, 681–710.
[83] Curtis, R. E., Yuen, A., Song, L., Goyal, A. et al., TVNViewer:
an interactive visualization tool for exploring networks
that change over time or space. Bioinformatics 2011, 27,
1880–1881.
[84] Pestov, I., Verga, S., Dynamical networks as a tool for sys-
tem analysis and exploration. Proc. IEEE Symp. Comput.
Intell. Security and Defense Appl. 2009 (CISDA 2009), pa-
per no. 05356527.
[85] Gene Ontology Consortium, The Gene Ontology (GO)
project in 2006. Nucleic Acids Res. (database issue) 2006,
34, D322–326.
[86] Dotan-Cohen, D., Letovsky, S., Melkman, A. A., Kasif, S.,
Biological process linkage networks. PLoS One 2009, 4,
e5313.
[87] Card, S. K., MacKinlay, J. D., Shneiderman, B., Card, M.,
Readings in Information Visualization: Using Vision to
Think, Morgan Kaufmann Publishers, San Francisco 1999,
pp. 1–32.
[88] Branke, J., Dynamic graph drawing. Lect. Notes Comput.
Sci. 2001, 2025, 228–246.
[89] Lu, P., Vogel, C., Wang, R., Yao, X. et al., Absolute pro-
tein expression profiling estimates the relative contribu-
tions of transcriptional and translational regulation. Nat.
Biotechnol. 2007, 25, 117–124.
[90] Raj, A., Peskin, C. S., Tranchina, C. S., Vargas, D. Y. et
al., Stochastic mRNA synthesis in mammalian cells. PLoS
Biol. 2006, 4, e309.
[91] Lee, M. V., Topper, S. E., Hubler, S. L., Hose, J. et al., A
dynamic model of proteome changes reveals new roles
for transcript alteration in yeast. Mol. Syst. Biol. 2011, 7,
514.
[92] Ghaemmaghami, S., Huh, W.-K., Bower, K., Howson, R.
W. et al., Global analysis of protein expression in yeast.
Nature 2003, 425, 737–741.
[93] Rachlin, J., Cohen, D. D., Cantor, C., Kasif, S., Biological
context networks: a mosaic view of the interactome. Mol.
Syst. Biol. 2006, 2, 66.
[94] Huh, W. K., Falvo, J. V., Gerke, L. C., Carroll, A. S. et al.,
Global analysis of protein localization in budding yeast.
Nature 2003, 425, 686–691.
[95] Alon, U., Network motifs in developmental, signal
transduction, and the neuronal networks, in: An In-
troduction to Systems Biology: Design Principles of
Biological Circuits, Chapman & Hall/CRC Mathemati-
cal and Computational Biology Series 2007, pp. 97–
134.
[96] Fung, D. C. Y., Wilkins, M. R., Hart, D., Hong, S.-H., Using
clustered circular layout as an informative method for
visualizing protein-protein interaction network. Proteomics
2010, 10, 2723–2727.
[97] DaFonseca, C. J., Shu, F., Zhang, J. J., Identification of two
residues in MCM5 critical for the assembly of the MCM
complexes and Stat1-mediated transcription activation in
response to IFN-γ. Proc. Natl. Acad. Sci. USA 2001, 98,
3034–3039.
[98] Gonzalez, M. A., Tachibana, K. K., Laskey, R. A., Coleman,
N., Control of DNA replication and its potential clinical
exploitation. Nat. Rev. Cancer 2005, 5, 135–141.
[99] Fan, Z., Beresford, P. J., Zhang, D., Lieberman, J., HMG2 in-
teracts with the nucleosome assembly protein SET and is
a target of the cytotoxic T-lymphocyte protease granzyme
A. Mol. Cell. Biol. 2002, 22, 2810–2820.
[100] Bustin, M., Regulation of DNA-dependent activities by the
functional motifs of the high-mobility-group chromoso-
mal proteins. Mol. Cell. Biol. 1999, 19, 5237–5246.
[101] Amar, R. A., Stasko, J. T., Knowledge precepts for design
and evaluation of information visualization. IEEE Trans.
Visual. Comput. Graphics 2005, 11, 432–442.
[102] Brückner, A., Polge, C., Lentze, N., Auerbach, D. et al., Yeast
two-hybrid, a powerful tool for systems biology. Int. J.
Mol. Sci. 2009, 10, 2763–2788.
[103] Venkatesan, K., Rual, J.-F., Vazquez, A., Stelzl, U. et al.,
An empirical framework for binary interactome mapping.
Nat. Methods 2009, 6, 83–90.
[104] Klamt, S., Haus, U.-U., Theis, F., Hypergraphs and cellular
networks. PLoS Comput. Biol. 2009, 5, e1000385.
[105] Tipney, H. J., Schuyler, R. P., Hunter, L., Consistent visual-
izations of changing knowledge. Summit Translat. Bioin-
form. 2009, 2009, 129–132.
[106] Praneenararat, T., Takagi, T., Iwasaki, W., Interactive, mul-
tiscale navigation of large and complicated biological net-
works. Bioinformatics 2011, 27, 1121–1127.
[107] Kelder, T., Conklin, B. R., Evelo, C. T., Pico, A. R., Finding the
right questions: Exploratory pathway analysis to enhance
biological discovery in large datasets. PLoS Biol. 2010, 8,
e1000472.
[108] Hase, T., Tanaka, H., Suzuki, Y., Nakagawa, S. et al., Struc-
ture of protein interaction networks and their implications
on drug design. PLoS Comput. Biol. 2009, 5, e1000550.
[109] Braun, P., Tasan, M., Dreze, M., Barrios-Rodiles, M. et al.,
An experimentally derived confidence score for binary
protein-protein interactions. Nat. Methods 2009, 6, 91–97.
[110] Nielsen, J., Heuristic Evaluation, in: Nielsen, J., Mack, R.
L. (Eds.), Usability Inspection Methods, John Wiley and
Sons Inc., New York 1994, pp. 25–62.
[111] Zuk, T., Schlesier, L., Neumann, P., Hancock, M. S. et al.,
Heuristics for information visualization evaluation. Proc.
2006 AVI workshop on beyond time and errors; novel eval-
uation methods for information visualization, Association
for Computing Machinery, New York 2006.
[112] Ahmed, A., Dwyer, T., Murray, C., Song, L. et al., Info-
Vis 2004 Contest: Wilmascope graph visualization. Proc.
IEEE Symposium on Information Visualization 2004 (In-
foVis 2004), IEEE Computer Society, Los Alamitos 2004,
p. r4.