Bio-inspired Computing and its
Application in Cheminformatics
By
Abdelazim Galal Hussien
Demonstrator at Faculty of Science, Fayoum University
Supervision by
Professor Mohamed Amin
Faculty of Science
Menofiya University
Professor Aboul Ella Hassanien
Faculty of Computers and Information
Cairo University
Agenda
Cheminformatics
• Introduction.
• Representation.
• Molecular descriptors.
Bio-Inspired Computing
• Problems
• Algorithms
• Ant Colony Optimization
Bio-Inspired Computing and
Cheminformatics
• Classification
• Clustering
• Feature Selection
Application
• Drug Discovery
• Drug Design
Cheminformatics
Chemoinformatics is concerned with the
application of computational methods to
tackle chemical problems, with particular
emphasis on the manipulation of chemical
structural information.
The term was introduced in the late 1990s.
There is not even universal agreement on
the correct spelling:
Cheminformatics.
Chemical informatics.
Chemiinformatics.
Chemoinformatics.
Cheminformatics
• Cheminformatics is the use of computer and
informational techniques applied to a range of
problems in the field of Chemistry.
• Cheminformatics strategies are useful in drug
discovery and other efforts where large numbers
of compounds are being evaluated for specific
properties.
• Cheminformatics is a multidisciplinary
science: it combines Chemistry, Biology,
Mathematics, Biochemistry, Statistics and
informatics.
Problems in Cheminformatics
• Storing data generated through experiments or from molecular
simulation.
• Retrieval of chemical structures from chemical databases (software libraries).
• Prediction of physical, chemical and biological properties of chemical
compounds.
• Elucidation of the structure of a compound based on spectroscopic
data.
• Structure, substructure, similarity and diversity searching in
chemical databases.
• Docking - Interaction between two macromolecules.
• Drug Discovery
• Molecular Science, Materials Science, Food Science (nutraceuticals),
Atmospheric chemistry, Polymer chemistry, Textile Industry,
Combinatorial organic synthesis (COS).
Problem Statement
Representation of Chemical Structures
• Chemical structures are usually stored
in a computer as molecular graphs.
Graph theory is a well-established area
of mathematics that has found
application not just in chemistry but in
many other areas, such as computer
science.
nodes = atoms
edges = bonds
The nodes and edges may have
properties associated with them.
SMILES
Connection Table
Connection Table
The simplest type of connection table consists of two sections:
A) List of the atomic numbers of the atoms in the molecule
B) List of the bonds, specified as pairs of bonded atoms.
Hydrogen atoms may be implied, in which case the connection
table is hydrogen suppressed.
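As a minimal sketch of the two-section layout described above, a hydrogen-suppressed connection table can be held as two Python lists. The choice of ethanol as the molecule, the variable names and the helper `neighbours` are illustrative assumptions, not taken from the slides.

```python
# Hydrogen-suppressed connection table for ethanol (illustrative example).
# Section A: atomic numbers of the heavy atoms, indexed 0..2 (C, C, O).
atomic_numbers = [6, 6, 8]

# Section B: bonds as pairs of atom indices (both are single bonds here).
bonds = [(0, 1), (1, 2)]


def neighbours(atom_index, bonds):
    """Return the indices of the atoms bonded to atom_index."""
    result = []
    for a, b in bonds:
        if a == atom_index:
            result.append(b)
        elif b == atom_index:
            result.append(a)
    return result


print(neighbours(1, bonds))  # atoms bonded to the central carbon
```

Bond orders could be stored as a third element of each pair; they are left out here to keep the table in its simplest form.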
SMILES
• SMILES stands for Simplified Molecular
Input Line Entry Specification.
• In SMILES, atoms are represented by
their atomic symbol.
• Upper case symbols are used for
aliphatic atoms and lower case for
aromatic atoms.
• Double bonds are written using “=”
and triple bonds using “#”
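The SMILES conventions listed above can be illustrated with a small sketch that counts aliphatic atoms, aromatic atoms and explicit double or triple bonds in a string. It is an assumption-laden toy, not a SMILES parser: it only recognises a handful of unbracketed organic atoms, and the function name `summarise_smiles` is hypothetical.

```python
import re

# Tokens for a small subset of SMILES: two-letter halogens first, then
# single-letter aliphatic (upper case) and aromatic (lower case) atoms.
ATOM_PATTERN = re.compile(r"Cl|Br|[BCNOPSFI]|[bcnops]")


def summarise_smiles(smiles):
    """Count atoms and explicit bond symbols in a simple SMILES string."""
    atoms = ATOM_PATTERN.findall(smiles)
    return {
        "aliphatic_atoms": sum(1 for a in atoms if a[0].isupper()),
        "aromatic_atoms": sum(1 for a in atoms if a[0].islower()),
        "double_bonds": smiles.count("="),
        "triple_bonds": smiles.count("#"),
    }


print(summarise_smiles("CC(=O)O"))   # acetic acid
print(summarise_smiles("c1ccccc1"))  # benzene, written with aromatic atoms
print(summarise_smiles("C#N"))       # hydrogen cyanide
```

Bracket atoms, charges and stereochemistry are deliberately ignored; a real toolkit would be needed for anything beyond illustration.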
Morgan algorithm
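• There may be many different ways to construct the connection table or
the SMILES string for a given molecule; the Morgan algorithm helps to derive a
unique (canonical) ordering of the atoms.
• Initially, each atom is assigned a connectivity value equal to the number of
connected atoms. In the second and subsequent iterations, a new
connectivity value is calculated as the sum of the connectivity values of the
atom's neighbours.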
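The iteration can be sketched as follows. The sketch assumes the usual formulation of the Morgan algorithm: each new value is the sum of the neighbours' current values, and iteration stops once the number of distinct values no longer increases. The function name and the dictionary-based adjacency format are illustrative choices.

```python
def morgan_connectivity(adjacency):
    """Return extended connectivity values for each atom.

    adjacency maps an atom index to the list of its bonded neighbours.
    """
    # Iteration 1: the connectivity value is the number of connected atoms.
    values = {atom: len(nbrs) for atom, nbrs in adjacency.items()}
    n_classes = len(set(values.values()))
    while True:
        # Later iterations: sum the current values of the neighbours.
        new_values = {atom: sum(values[n] for n in nbrs)
                      for atom, nbrs in adjacency.items()}
        new_classes = len(set(new_values.values()))
        if new_classes <= n_classes:
            return values  # no further discrimination was gained
        values, n_classes = new_values, new_classes


# A chain of five heavy atoms (n-pentane, hydrogen suppressed).
pentane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(morgan_connectivity(pentane))
```

For the five-atom chain the first pass only separates terminal from internal atoms; the second pass also distinguishes the central atom, which is the extra discrimination the iteration is meant to provide.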
Screening Methods
• Molecule screens are often
implemented using binary string
representations, called bitstrings,
of the molecules and of the query
substructure. Bitstrings consist of a
sequence of “0”s and “1”s. They
are the “natural currency” of
computers and so can be
compared and manipulated very
rapidly, especially if held in the
computer’s memory. A “1” in a
bitstring usually indicates the
presence of a particular structural
feature and a “0” its absence.
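A minimal sketch of such a screen, assuming each bitstring is stored as a Python integer: a molecule can only contain the query substructure if every bit set in the query is also set in the molecule. The four feature positions used here are hypothetical and chosen only for the example.

```python
# Hypothetical feature positions (bit 0 is the right-most bit):
#   bit 0 = contains a ring, bit 1 = contains oxygen,
#   bit 2 = contains nitrogen, bit 3 = contains a C=O group.


def passes_screen(molecule_bits, query_bits):
    """True if every feature bit set in the query is also set in the molecule."""
    return (molecule_bits & query_bits) == query_bits


molecule = 0b1011        # ring, oxygen and C=O present; no nitrogen
query_ring_o = 0b0011    # query needs a ring and an oxygen
query_nitrogen = 0b0100  # query needs a nitrogen

print(passes_screen(molecule, query_ring_o))    # molecule stays a candidate
print(passes_screen(molecule, query_nitrogen))  # molecule is screened out
```

Passing the screen only means the molecule remains a candidate; a full graph-based substructure match is still needed to confirm the hit.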
Structure Searching
• Graph theoretic methods can be used to perform substructure searching,
which is equivalent to determining whether one graph is entirely
contained within another, a problem known as subgraph isomorphism.
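To make the idea concrete, the following is a brute-force sketch that tries every assignment of query atoms to target atoms and checks that atom symbols and bonds are preserved. It ignores bond orders and is exponential in cost, so it is an illustration of the problem rather than a practical search method; the function name and data layout are assumptions.

```python
from itertools import permutations


def has_substructure(query_atoms, query_bonds, target_atoms, target_bonds):
    """Brute-force test: can the query graph be embedded in the target graph?"""
    target_edges = {frozenset(bond) for bond in target_bonds}
    n_query = len(query_atoms)
    # Try every injective assignment of query atoms to target atoms.
    for mapping in permutations(range(len(target_atoms)), n_query):
        if any(query_atoms[q] != target_atoms[mapping[q]]
               for q in range(n_query)):
            continue  # atom symbols do not match
        if all(frozenset((mapping[a], mapping[b])) in target_edges
               for a, b in query_bonds):
            return True  # every query bond maps onto a target bond
    return False


# Ethanol as the target (hydrogen suppressed): C-C-O.
ethanol_atoms = ["C", "C", "O"]
ethanol_bonds = [(0, 1), (1, 2)]

print(has_substructure(["C", "O"], [(0, 1)], ethanol_atoms, ethanol_bonds))
print(has_substructure(["C", "N"], [(0, 1)], ethanol_atoms, ethanol_bonds))
```

The factorial growth in the number of assignments is one reason a fast pre-screen is normally applied before any graph matching.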
Molecular Descriptors
• The manipulation and analysis of chemical structural information is made possible through
the use of molecular descriptors.
• These are numerical values that characterise properties of molecules.
• A molecular descriptor is the final result of a logical and mathematical procedure which
transforms chemical information encoded within a symbolic representation of a molecule
into a useful number, or the result of some standardized experiment.
• Examples:
• The descriptors fall into four classes:
– Topological.
– Geometrical.
– Electronic.
– Hybrid or 3D descriptors.
Molecular Descriptors
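As an illustration of how a descriptor turns a molecular graph into a number, the sketch below computes the Wiener index, a classic topological descriptor defined as the sum of shortest-path distances (counted in bonds) between all pairs of heavy atoms. The Wiener index is chosen here only as an example and is not named on the slides; the function assumes a connected, hydrogen-suppressed graph.

```python
from collections import deque


def wiener_index(n_atoms, bonds):
    """Sum of shortest-path lengths (in bonds) over all pairs of atoms."""
    adjacency = {i: [] for i in range(n_atoms)}
    for a, b in bonds:
        adjacency[a].append(b)
        adjacency[b].append(a)

    total = 0
    for start in range(n_atoms):
        # Breadth-first search gives topological distances from `start`.
        distance = {start: 0}
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for nxt in adjacency[current]:
                if nxt not in distance:
                    distance[nxt] = distance[current] + 1
                    queue.append(nxt)
        total += sum(distance.values())
    return total // 2  # each pair was counted from both ends


n_butane = [(0, 1), (1, 2), (2, 3)]   # linear chain of four carbons
isobutane = [(0, 1), (0, 2), (0, 3)]  # central carbon with three branches

print(wiener_index(4, n_butane))
print(wiener_index(4, isobutane))
```

The two isomers have the same atoms but different values, which shows how a purely topological descriptor can capture branching.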
Computational Models
• Most molecular discoveries today are the result of an iterative, three-phase
cycle of design, synthesis and test. Analysis of the results from one
iteration provides information and knowledge that enables the next cycle
to be initiated and further improvements to be achieved.
• A common feature of this analysis stage is the construction of some form
of model which enables the observed activity or properties to be related
to the molecular structure.
• Examples:
– Quantitative Structure-Activity Relationships (QSARs)
– Quantitative Structure-Property Relationships (QSPRs)
Quantitative Structure-Activity Relationships
(QSARs)
QSAR is a mathematical relationship between a biological activity of a
molecular system and its geometric and chemical characteristics.
A general formula for a quantitative structure-activity relationship
(QSAR) can be given by the following:
activity = f (molecular or fragmental properties)
QSAR attempts to find a consistent relationship between biological activity
and molecular properties, so that these “rules” can be used to evaluate
the activity of new compounds.
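One of the simplest possible choices for f is a straight line fitted by least squares to a single descriptor. The sketch below assumes that form; the descriptor (a logP-like value), the numbers and the function name are made up purely for illustration and are not results from the thesis.

```python
def fit_linear_qsar(descriptor_values, activities):
    """Least-squares fit of activity = slope * descriptor + intercept."""
    n = len(descriptor_values)
    mean_x = sum(descriptor_values) / n
    mean_y = sum(activities) / n
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(descriptor_values, activities))
    sxx = sum((x - mean_x) ** 2 for x in descriptor_values)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up training data for illustration only: one descriptor value and
# one measured activity per compound.
log_p = [1.0, 2.0, 3.0, 4.0]
activity = [2.5, 4.5, 6.5, 8.5]

slope, intercept = fit_linear_qsar(log_p, activity)
predicted = slope * 2.5 + intercept  # estimated activity of a new compound
print(slope, intercept, predicted)
```

Real QSAR models typically use many descriptors and need validation on held-out compounds; the single-variable fit only shows the "fit on known compounds, predict for new ones" pattern.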
Quantitative Structure-Activity
Relationships (QSARs) (Cont.)
[Diagram: compounds + biological activity → QSAR → new compounds with improved biological activity]
Bio-Inspired Computing
Identifying the best solution becomes
increasingly difficult, if not impossible,
because the space of candidate solutions
is very large and dynamic and the
computations are complex. Often, the
optimal solution of such an NP-hard
problem is a point in an n-dimensional
hyperspace, and identifying it is
computationally very expensive or
even infeasible in limited time.
Bio-Inspired Computing
• Computing inspired by biology is a field of study
based on the social behavior of animals, insects and other
living organisms; it also covers connectionism and
emergence.
• Bio-inspired computing uses computers to model nature, and
the study of nature to improve the usage of computers.
[Diagram: Biological computation, Artificial Intelligence, Bio-inspired computing]
Bio-Inspired Algorithms
Motivation
– Dealing with problems that are too complex
  • They cannot be solved by human-designed solutions
  • No complete mathematical model is available
– Similar problems already exist in nature
  • Adaptation
  • Self-organization
  • Communication
  • Optimization
Bio-inspired computing Methods:
Some areas of bio-inspired computing are:
• neural networks
• genetic algorithm
• particle swarm
• ant colony optimization
• artificial bee colony
• bacterial foraging
• cuckoo search
• firefly algorithm
• shuffled frog leaping
• bat algorithm
• flower pollination
• artificial plant optimization
Swarm Intelligence
• SI-based algorithms belong to a wider class of algorithms, called
bio-inspired algorithms.
• In other words: SI-based ⊂ bio-inspired ⊂ nature-inspired.
Swarm Intelligence
• Population of simple agents
• Decentralized
• Self-Organized
• No or local communication
• Examples:
– Ant/bee colonies
– Bird flocking
– Fish schooling
Ant Colony Optimization
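• ACO mimics the foraging behavior of
social ants.
• Ants primarily use pheromone as
a chemical messenger.
• The pheromone concentration can be
considered an indicator of the
quality of solutions to a problem of
interest.
• The movement of an ant is
controlled by pheromone, which
evaporates over time.
• The probability that an ant at a
particular node i chooses the
route from node i to node j is
commonly given by
$$p_{ij} = \frac{\tau_{ij}^{\alpha}\,\eta_{ij}^{\beta}}{\sum_{k \in N_i} \tau_{ik}^{\alpha}\,\eta_{ik}^{\beta}}$$
where τ_ij is the pheromone concentration on the route, η_ij its desirability, N_i the set of nodes reachable from i, and α, β > 0 are influence parameters.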
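A minimal sketch of this route choice, assuming the commonly used rule in which the probability of a route is proportional to its pheromone level raised to a power alpha times its desirability raised to a power beta, normalised over the candidate routes. The function names, the default exponents and the simple multiplicative evaporation step are assumptions for illustration.

```python
def transition_probabilities(pheromone, desirability, alpha=1.0, beta=1.0):
    """Probability of moving from the current node to each candidate node.

    pheromone[j] and desirability[j] describe the route to candidate j.
    """
    weights = [(tau ** alpha) * (eta ** beta)
               for tau, eta in zip(pheromone, desirability)]
    total = sum(weights)
    return [w / total for w in weights]


def evaporate(pheromone, rate):
    """Reduce every pheromone value by the given evaporation rate (0..1)."""
    return [(1.0 - rate) * tau for tau in pheromone]


# Two candidate routes with equal pheromone; the second is three times
# as desirable (for example, three times shorter).
probabilities = transition_probabilities([1.0, 1.0], [1.0, 3.0])
print(probabilities)
print(evaporate([2.0, 4.0], 0.5))
```

In a full ACO loop, ants would sample routes from these probabilities, deposit pheromone on the routes they used, and evaporation would gradually erase routes that stop being reinforced.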
Bio-Inspired Computing in Cheminformatics
Bio-inspired computing has many applications in the field of cheminformatics:
– Classification: a general process related to categorization, in which
molecules are differentiated and understood.
– Clustering: the task of grouping a set of objects in such a way that
objects in the same group (called a cluster)
are more similar to each other than to those in other groups (clusters).
– Feature selection: a process that chooses an optimal subset of
features according to a certain criterion.
Classification
• In machine learning and statistics, classification is the problem of
identifying to which of a set of categories (sub-populations) a
new observation belongs, on the basis of a training set of data containing
observations (or instances) whose category membership is known.
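As a small illustration of this definition, a nearest-neighbour classifier assigns a new observation the category of the closest observation in the training set. The descriptor vectors, the "active"/"inactive" labels and the function name below are hypothetical; nearest neighbour is used only because it is short, not because the slides prescribe it.

```python
import math


def nearest_neighbour_class(training_set, new_observation):
    """Return the label of the training observation closest to the new one.

    training_set is a list of (descriptor_vector, label) pairs.
    """
    closest_vector, closest_label = min(
        training_set,
        key=lambda pair: math.dist(pair[0], new_observation),
    )
    return closest_label


# Hypothetical training set: two descriptors per molecule, known category.
training_set = [
    ((1.0, 0.2), "inactive"),
    ((1.2, 0.1), "inactive"),
    ((3.5, 2.0), "active"),
    ((3.8, 2.4), "active"),
]

print(nearest_neighbour_class(training_set, (3.0, 1.8)))
print(nearest_neighbour_class(training_set, (1.1, 0.3)))
```

Euclidean distance on raw descriptor values is an assumption; in practice descriptors are usually scaled first so that no single one dominates the distance.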
Clustering
• Clustering is the process of partitioning
a usually large dataset into groups (or
clusters), according to a similarity (or
dissimilarity) measure.
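• If we assume that we have a dataset
X = {x1, x2, x3, ..., xN}, which
consists of all the data that we want to
place into clusters, then we define a
clustering of X into m clusters C1, ..., Cm
in such a way that the following
conditions apply:
– Ci ≠ ∅ for i = 1, ..., m (no cluster is empty)
– C1 ∪ C2 ∪ ... ∪ Cm = X (every data point is assigned)
– Ci ∩ Cj = ∅ for i ≠ j (clusters do not overlap)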
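Assuming the usual conditions for a hard partition (no cluster is empty, the clusters together cover the whole dataset, and no data point belongs to two clusters), a proposed clustering can be checked with a short sketch like the one below. The function name and the molecule identifiers are illustrative.

```python
def is_valid_clustering(dataset, clusters):
    """Check the three partition conditions for a hard clustering.

    dataset is an iterable of items; clusters is a list of sets of items.
    """
    assigned = set().union(*clusters)
    non_empty = all(len(cluster) > 0 for cluster in clusters)
    covers_dataset = assigned == set(dataset)
    # Disjoint clusters: no element may be counted twice.
    disjoint = sum(len(cluster) for cluster in clusters) == len(assigned)
    return non_empty and covers_dataset and disjoint


molecules = ["m1", "m2", "m3", "m4"]

print(is_valid_clustering(molecules, [{"m1", "m2"}, {"m3", "m4"}]))
print(is_valid_clustering(molecules, [{"m1", "m2"}, {"m2", "m3", "m4"}]))
```

The check says nothing about the quality of the clusters; that is the role of the similarity (or dissimilarity) measure used to build them.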
Feature Selection
• Why do we need feature selection (FS)?
– To improve performance (in terms of speed, predictive power,
simplicity of the model).
– To visualize the data for model selection.
– To reduce dimensionality and remove noise.
• Perspectives:
– Searching for the best subset of features.
– Criteria for evaluating different subsets.
– Principles for selecting, adding, removing or changing features
during the search.
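The three perspectives (a search strategy, an evaluation criterion, and a rule for adding features) can be combined in a greedy forward selection, sketched below. The descriptor names, the usefulness numbers and the redundancy penalty in `toy_score` are invented for the example; any real criterion, such as cross-validated model accuracy, could be passed in instead.

```python
def forward_selection(features, score, max_features):
    """Greedy forward search: repeatedly add the feature that helps most."""
    selected = []
    best_score = score(selected)
    while len(selected) < max_features:
        candidates = [f for f in features if f not in selected]
        if not candidates:
            break
        best_candidate = max(candidates, key=lambda f: score(selected + [f]))
        candidate_score = score(selected + [best_candidate])
        if candidate_score <= best_score:
            break  # no remaining feature improves the criterion
        selected.append(best_candidate)
        best_score = candidate_score
    return selected


# Hypothetical evaluation criterion: individual usefulness of each
# descriptor minus a penalty when two redundant descriptors are combined.
USEFULNESS = {"logP": 0.6, "MW": 0.5, "HBD": 0.1}


def toy_score(subset):
    total = sum(USEFULNESS[f] for f in subset)
    if "logP" in subset and "MW" in subset:
        total -= 0.45  # assumed redundancy between these two descriptors
    return total


print(forward_selection(["logP", "MW", "HBD"], toy_score, 2))
```

In this toy setting the second-best individual descriptor is skipped because it is redundant with the first, which is the kind of interaction a subset-based criterion is meant to capture.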
Application (I) Drug Design
• Drug design, often referred
to as rational drug design or
simply rational design, is the
inventive process of finding
new medications based on
the knowledge of a biological
target.
• The drug is most commonly
an organic small molecule
that activates or inhibits the
function of a biomolecule
such as a protein.
Application (II) Drug Discovery
Cheminformatics and Bioinformatics
in Drug Design
Literature Review
• Joerg Kurt Wegner, Aaron Sterling, Rajarshi Guha and Andreas Bender, in their
survey “Cheminformatics”, give a comprehensive introduction to the
field of cheminformatics. Roberto Todeschini and Viviana Consonni, in
their book on molecular descriptors, collect a very large number of descriptors; all
new descriptors, QSAR approaches and chemometric strategies proposed
since 2000 have been included in this handbook.
• Aboul Ella Hassanien and Eid Elamry introduce “Swarm Intelligence Methods
and Concepts”.
Literature Review
• Gerald M. Maggiora and Veerabahu Shanmugasundaram, in
“Molecular Similarity Measures”, survey methods for measuring the
similarity between two graphs and address maximum subgraph
matching.
• Arpan Kumar Kar presents a review of bio-inspired computing.
Thesis Statement
Title:
Bio-inspired Computing and its Application in Cheminformatics
Aim:
• Cluster molecules using spectral clustering.
• Find the similarity between molecules.
References
1. Andrew R. Leach and Valerie J. Gillet, “An Introduction to Chemoinformatics”, Springer, 2007.
2. Roberto Todeschini and Viviana Consonni, “Molecular Descriptors for Chemoinformatics”, Wiley-VCH, 2009.
3. Christina Chrysouli and Anastasios Tefas, “Spectral clustering and semi-supervised learning using evolving similarity graphs”, Applied Soft Computing.
4. U. von Luxburg, “A tutorial on spectral clustering”, Statistics and Computing 17 (4) (2007): 395–416.
5. R. Dutt and A. K. Madan, “Predicting biological activity: Computational approach using novel distance based molecular descriptors”, Computers in Biology and Medicine, 2012.
6. X. S. Yang, Z. Cui, R. Xiao, A. H. Gandomi and M. Karamanoglu (eds.), “Swarm Intelligence and Bio-Inspired Computation: Theory and Applications”, Newnes, 2013.
7. Arpan Kumar Kar, “Bio inspired computing – A review of algorithms and scope of applications”, Expert Systems with Applications 59 (2016): 20–32.
8. Frank Emmert-Streib, Matthias Dehmer and Yongtang Shi, “Fifty years of graph matching, network alignment and network comparison”, Information Sciences 346–347 (2016): 180–197.
9. Abiola Oduguwa, Ashutosh Tiwari, Rajkumar Roy and Conrad Bessant, “An overview of soft computing techniques used in the drug discovery process”, in Applied Soft Computing Technologies: The Challenge of Complexity, pp. 465–480, Springer Berlin Heidelberg, 2006.
10. G. M. Maggiora and V. Shanmugasundaram, “Molecular similarity measures”, in Chemoinformatics: Concepts, Methods, and Tools for Drug Discovery, pp. 1–50, 2004.
Questions

Editor's Notes

  • #8 Note that hydrogen atoms are often omitted. The nodes and edges may have properties associated with them. For example, the atomic number or atom type may be associated with each node and the bond order with each edge. A graph represents the topology of a molecule only, that is, the way the nodes (or atoms) are connected.
  • #9 It can be represented by a graph or adjacency matrix
  • #13 It can be represented by a graph or adjacency matrix
  • #14 Perhaps the simplest descriptors are based on simple counts of features such as hydrogen bond donors, hydrogen bond acceptors, ring systems (including aromatic rings), rotatable bonds and molecular weight. Many of these features can be defined as substructures or molecular fragments and so their frequency of occurrence can be readily calculated from a 2D connection table using the techniques developed for substructure search. Many different molecular descriptors have been described and used for a wide variety of purposes. They vary in the complexity of the information they encode and in the time required to calculate them. In general, the computational requirements increase with the level of discrimination that is achieved. For example, the molecular weight does not convey much about a molecule’s properties but it is very rapid to compute. By contrast, descriptors that are based on quantum mechanics may provide accurate representations of properties, but they are much more time consuming to compute. Some descriptors have an experimental counterpart (e.g. the octanol–water partition coefficient)