This thesis is motivated by a case study in SSME, in which we digitize and integrate the cultural assets of TCM and provide Web-based knowledge services to medical experts. The central goal is to turn cultural assets accumulated over China's long history into knowledge services that contribute to modern biomedicine. In our view, the essence of a knowledge service is cross-domain collaboration in knowledge discovery on the Web of data. Although the Service-Oriented Architecture enables interactions between Web agents, and the Semantic Web provides a framework for knowledge representation and integration, the feasibility and benefits of Web-based collaborative knowledge discovery need further investigation. We propose a methodology named Semantic Graph Mining (SGM), which uses the semantic graph model to integrate graph mining and ontology reasoning in order to better analyze biomedical complex networks (an important KDD problem). Potential SGM methods include Web resource ranking, semantic association discovery, frequent subgraph mining, and clustering. The effectiveness of these methods is investigated in use cases such as TCM semantic search, TCM formulae analysis, and drug-interaction analysis.
Semantic Web Development for Traditional Chinese Medicine
1. Semantic Web Development for Traditional Chinese Medicine
Tong Yu, Zhejiang University, China
July 15, 2008, IAAI, Chicago, IL
2. Outline
• Overview of TCM Semantic Web
• Ontology Engineering and Reuse
• Semantic Mapping and Integration
• Semantic Query, Search, and Navigation
• Semantic Graph Mining
• Summary
3. The Semantic Web: “A Giant Graph of Things”
• Based on the Internet and the Web
• Formal Semantics
o Use URIs as names for things
o Make RDF information about things available through HTTP URIs
o Use RDF statements for semantic links between things
• Global network of databases
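The principles on this slide can be illustrated with a minimal Python sketch that models RDF statements as plain (subject, predicate, object) tuples of URIs. The `http://example.org/tcm/` namespace, the resource names (`HerbA`, `SymptomB`), the `treats` predicate, and the helper functions are all hypothetical, chosen only for illustration; they are not the vocabulary of the TCM platform, and a real application would normally use an RDF library.

```python
# Hypothetical namespace for illustration; not the actual TCM ontology URIs.
EX = "http://example.org/tcm/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Each RDF statement is a (subject, predicate, object) triple of URIs.
triples = [
    (EX + "HerbA", RDF_TYPE, EX + "Herb"),
    (EX + "HerbA", EX + "treats", EX + "SymptomB"),
    (EX + "SymptomB", RDF_TYPE, EX + "Symptom"),
]


def outgoing_links(resource, statements):
    """Return (predicate, object) pairs of statements whose subject is `resource`."""
    return [(p, o) for s, p, o in statements if s == resource]


def incoming_links(resource, statements):
    """Return (subject, predicate) pairs of statements whose object is `resource`."""
    return [(s, p) for s, p, o in statements if o == resource]
```

Because every name is a URI, statements published by different databases about the same resource can be merged into one graph simply by concatenating their triple lists.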
7. TCM Ontology Platform
• Domain Categorization
o The current TCM ontology contains 15 major categories for each sub-domain.
• Ontology Structure
o A typing system as a concept hierarchy
o A semantic network defining the associations between concepts
• Scale
o More than 20,000 classes and 100,000 instances defined in the current ontology
• Access Control
o A layered privilege mechanism that classifies users as reader, editor, checker, or administrator.
• Service
o Web APIs for ontology-based applications.
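One way to picture the layered privilege mechanism is the sketch below. It assumes the four roles form a strict hierarchy in the order listed on the slide, and the action names and the `REQUIRED_ROLE` mapping are hypothetical; the slide does not specify what each role may do.

```python
from enum import IntEnum


class Role(IntEnum):
    # Role names come from the slide; treating them as a strict
    # hierarchy in this order is an assumption.
    READER = 1
    EDITOR = 2
    CHECKER = 3
    ADMINISTRATOR = 4


# Hypothetical mapping from an action to the lowest role allowed to perform it.
REQUIRED_ROLE = {
    "view": Role.READER,
    "edit": Role.EDITOR,
    "approve": Role.CHECKER,
    "manage_users": Role.ADMINISTRATOR,
}


def is_allowed(role, action):
    """Return True if `role` is at or above the layer required for `action`."""
    return role >= REQUIRED_ROLE[action]
```

With this layering, each higher role inherits everything the lower roles can do, so only a single threshold needs to be stored per action.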
14. Semantic Graph Mining
• We envision intelligent agents that work on the Semantic Web of structured data and help their users solve problems. Such agents can
o discover important Web resources
o discern latent semantic associations
o interpret interesting graph patterns.
• Existing methods of data mining, especially graph mining, can be adopted to implement these intelligent agents.
• We propose a methodology, called Semantic Graph Mining (SGM), for building agents that discover knowledge on the Semantic Web.
17. The Process for Analyzing the Network of Herbs
– Data Modeling
– Data Transformation & Integration
– Entity Disambiguation
– Interaction Identification
– Network Mapping
– Network Analysis
18. Semantic Graph Resource Importance
• The in-degree centrality C_I of a resource is the weighted sum of the statements that have the resource as object; the out-degree centrality is the weighted sum of the statements that have the resource as subject.
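A minimal sketch of this definition follows. The slide does not say how statement weights are assigned, so the sketch assumes one weight per predicate with a default of 1.0; the function name and parameters are hypothetical.

```python
def degree_centrality(resource, statements, weight=None):
    """Return (in_degree, out_degree) centrality of `resource`.

    statements: iterable of (subject, predicate, object) triples.
    weight: optional dict mapping a predicate to its weight
            (assumption: weights are per predicate, default 1.0).
    """
    weight = weight or {}
    statements = list(statements)
    # In-degree: weighted sum of statements with the resource as object.
    c_in = sum(weight.get(p, 1.0) for s, p, o in statements if o == resource)
    # Out-degree: weighted sum of statements with the resource as subject.
    c_out = sum(weight.get(p, 1.0) for s, p, o in statements if s == resource)
    return c_in, c_out
```

Setting all weights to 1.0 reduces both measures to a plain count of incoming and outgoing statements.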
19. Semantic Graph Resource Importance
• The Closeness Centrality of a resource r is defined as the inverse of the sum of the distances from r to all other resources.
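The definition can be sketched as below, under several assumptions not stated on the slide: distance is the number of hops on the shortest path, statement direction is ignored, and a resource that cannot reach every other resource gets a score of 0.0. The function name and the edge-list input are hypothetical.

```python
from collections import deque


def closeness_centrality(r, edges):
    """Inverse of the sum of hop distances from `r` to all other resources.

    edges: iterable of (u, v) pairs, treated as undirected links (assumption).
    Returns 0.0 if `r` is unknown, isolated, or cannot reach every resource.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    if r not in adj:
        return 0.0
    # Breadth-first search gives shortest hop distances from r.
    dist = {r: 0}
    queue = deque([r])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    if len(dist) < len(adj):  # some resource is unreachable from r
        return 0.0
    total = sum(dist.values())
    return 1.0 / total if total > 0 else 0.0
```

On a chain a, b, c, d, the inner resource b has distances 1, 1, and 2 to the others, so its closeness is 1/4, higher than the 1/6 of the end resource a.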
20. Semantic Graph Resource Importance
• The Betweenness Centrality of a resource r is defined as the fraction of shortest paths in the graph that pass through r.
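The slide leaves the exact normalization open, so the sketch below takes one common reading: for every connected pair of other resources, take the share of their shortest paths that run through r, then average over those pairs. It also assumes hop-count distances and ignores statement direction. The function names are hypothetical.

```python
from collections import deque
from itertools import combinations


def _bfs(adj, source):
    """Return (dist, sigma): hop distance and number of shortest paths from source."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                # Every shortest path to u extends to a shortest path to v.
                sigma[v] += sigma[u]
    return dist, sigma


def betweenness_centrality(r, edges):
    """Average share of shortest paths between other resources that pass through r."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    if r not in adj:
        return 0.0
    info = {n: _bfs(adj, n) for n in adj}
    dist_r, sigma_r = info[r]
    others = [n for n in adj if n != r]
    total, pairs = 0.0, 0
    for s, t in combinations(others, 2):
        dist_s, sigma_s = info[s]
        if t not in dist_s:
            continue  # s and t are not connected
        pairs += 1
        # r lies on a shortest s-t path iff d(s, r) + d(r, t) == d(s, t).
        if s in dist_r and t in dist_r and dist_r[s] + dist_r[t] == dist_s[t]:
            total += sigma_r[s] * sigma_r[t] / sigma_s[t]
    return total / pairs if pairs else 0.0
```

This pairwise version is quadratic in the number of resources and is meant only to make the definition concrete; large graphs would call for a more efficient algorithm.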
25. Conclusion
• We took the first systematic approach to leveraging progress in Biomedical Informatics to address the modernization of TCM.
• Domain experts evaluated the platform’s major technical features as original and productive for drug safety and efficacy analysis.
• This case study demonstrates the Semantic Web’s advantages in the representation, integration, and discovery of knowledge with complex domain models.
• The work contributes to the preservation and modernization of TCM as intangible cultural heritage.
26. Reference
• TCM Ontology Engineering and Reuse
o Y. Mao et al. Dynamic Sub-Ontology Evolution for Traditional Chinese Medicine Web Ontology. Journal of Biomedical Informatics, 2008 (in progress).
o Y. Mao et al. Sub-Ontology Based Resource Management for Web-based e-Learning. IEEE Transactions on Knowledge and Data Engineering, 2008 (in progress).
• Data Mapping and Integration
o Zhao-hui Wu, Hua-jun Chen. Semantic Grid: Model, Methodology, and Applications (monograph). Co-published by Zhejiang University Press and Springer-Verlag GmbH, 2008.
o Huajun Chen et al. RDF/RDFS-based Relational Database Integration. ICDE 2006.
o Huajun Chen et al. From Legacy Relational Databases to the Semantic Web: an In-Use Application for Traditional Chinese Medicine. ISWC 2006.
27. Reference
• Semantic Search, Query, and Navigation
o Huajun Chen et al. Towards semantic e-science for traditional Chinese medicine. BMC Bioinformatics, 8(Suppl 3):56, 2007.
• Knowledge Discovery for TCM
o Yi Feng et al. Knowledge discovery in traditional Chinese medicine: State of the art and perspectives. AI in Medicine, 38(3):219-236, 2006.
o Xuezhong Zhou et al. Integrative mining of traditional Chinese medicine literature and MEDLINE for functional gene networks. AI in Medicine, 41:87-104, 2007.
• Semantic Graph Mining
o Tong Yu et al. Semantic Graph Mining for Biomedical Complex Network Analysis. WWW ’08 Workshops: HCLS.
o Huajun Chen et al. Semantic Graph Mining for Biomedical Complex Network Analysis. Briefings in Bioinformatics (in progress).