1. KEGG
– Kyoto Encyclopedia of Genes and Genomes
– Integrated database for pathways, chemical reactions,
genomes, expression, and more
– Data representation with graphs
• Comparison between pathways and other data
• Path computation in pathways
2. KEGG
Kyoto Encyclopedia of Genes and Genomes
• Integrated database of biological systems information
for the post-genomic era
– Genomes, genes, and pathways of completely sequenced
organisms
– Functional annotation for each gene by comparative
genomics
– Pathway reconstruction based on the annotation
– A system for computing and comparing biological networks
from molecular interaction data
• Graph representation and application of graph algorithms
– http://www.genome.ad.jp/kegg/
3. Databases in KEGG
Information on relations between molecules
• Pathway
• Genomes
• Genes
• Expression
• Chemicals and their reactions
• Orthologs
• Sequence similarity
4. GENES and PATHWAY
• GENES: ~400,000 genes from over 100 organisms
– Parsing GenBank and EMBL for completely sequenced genomes
– Parsing LocusLink and RefSeq for model organisms such as
human and mouse
– KEGG annotates the function of each gene based on sequence
similarity
• PATHWAY: over 100 maps
– Metabolic pathways, regulatory pathways and protein
complexes
– Manually drawn and classified
– Information collected from various textbooks, literature, and
web pages
6. Graph Representation of Metabolic Pathways and
Chemical Compounds
• Metabolic Pathways
– Image maps and position of each object on them
– Graph 1
• Node: chemical compounds, Link: enzymatic reactions
– Graph 2
• Node: enzymes, Link: neighborhood relations of enzymes on
pathway maps
• Chemical Compounds
– Graph 1
• Node: atoms, Link: bonds between atoms
– Graph 2 for carbohydrates
• Node: sugars, Link: glycosylation bonds
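
The two metabolic pathway graphs above can be sketched in a few lines of Python. This is only an illustration, not KEGG's own data model or code: the reaction triples, the function names, the choice to treat reactions as directed, and the rule that two enzymes are "neighbors" when one produces the compound the other consumes are all assumptions made for the example. The same adjacency structure could also hold atoms and bonds, or sugars and glycosylation bonds.

```python
from collections import deque

# Hypothetical toy data: (substrate, product, enzyme) triples.
# Real KEGG entries use their own identifiers; these names are illustrative only.
REACTIONS = [
    ("glucose", "glucose-6-phosphate", "hexokinase"),
    ("glucose-6-phosphate", "fructose-6-phosphate", "glucose-6-phosphate isomerase"),
    ("fructose-6-phosphate", "fructose-1,6-bisphosphate", "phosphofructokinase"),
]


def compound_graph(reactions):
    """Graph 1: nodes are compounds, links are enzymatic reactions (directed here)."""
    graph = {}
    for substrate, product, enzyme in reactions:
        graph.setdefault(substrate, []).append((product, enzyme))
        graph.setdefault(product, [])
    return graph


def enzyme_graph(reactions):
    """Graph 2: nodes are enzymes, linked when they are adjacent on the pathway.

    Assumption: adjacency means one enzyme's product is the other's substrate.
    """
    graph = {enzyme: set() for _, _, enzyme in reactions}
    for _, product, first in reactions:
        for substrate, _, second in reactions:
            if product == substrate and first != second:
                graph[first].add(second)
                graph[second].add(first)
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search over Graph 1; returns a list of compounds or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbor, _enzyme in graph.get(path[-1], []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


if __name__ == "__main__":
    g1 = compound_graph(REACTIONS)
    print(shortest_path(g1, "glucose", "fructose-1,6-bisphosphate"))
    print(enzyme_graph(REACTIONS))
```

Breadth-first search is used here only because it is the simplest way to find a path with the fewest reaction steps; the slides do not say which path algorithm KEGG applies.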