
Sasa Nesic - PhD Dissertation Defense

  1. INFORMATION MANAGEMENT. Semantic Document Architecture for Desktop Data Integration and Management. PhD Dissertation Defense, Saša Nešić, November 30, 2010.
  2. Motivation
     - Semantic Web, Semantic Desktop
       - Ontologies
       - Resource Description Framework (RDF)
       - SPARQL query language
     - Semantic Documents
  3. Semantic Documents
     Semantic documents are composite information resources composed of data/information units that are:
     - uniquely identified by globally unique URIs,
     - semantically annotated by concepts from domain ontologies,
     - interlinked with other data/information units via explicit semantic links.
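The three properties listed on this slide can be pictured as a small data structure. The following is a minimal sketch, not the SDM itself: the class name `DocumentUnit`, its fields, and the `example.org` URIs are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple
from uuid import uuid4


def new_unit_uri() -> str:
    # Hypothetical URI scheme; example.org is a placeholder namespace.
    return f"http://example.org/du/{uuid4()}"


@dataclass
class DocumentUnit:
    """One data/information unit of a semantic document (illustrative shape only)."""
    content: str
    # Property 1: a globally unique URI identifies the unit.
    uri: str = field(default_factory=new_unit_uri)
    # Property 2: URIs of domain-ontology concepts that annotate the unit.
    annotations: Set[str] = field(default_factory=set)
    # Property 3: explicit semantic links as (relation URI, target unit URI) pairs.
    links: List[Tuple[str, str]] = field(default_factory=list)

    def link_to(self, relation: str, target: "DocumentUnit") -> None:
        """Record an explicit semantic link from this unit to another unit."""
        self.links.append((relation, target.uri))


definition = DocumentUnit("A design pattern is a reusable solution to a recurring problem.")
example = DocumentUnit("The Observer pattern notifies dependents of state changes.")
definition.annotations.add("http://example.org/onto#DesignPattern")
example.link_to("http://example.org/rel#illustrates", definition)
```

Keeping links as URI pairs rather than object references mirrors the idea that units stay addressable on their own, independent of the document that contains them.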
  4. Thesis Statement
     "Semantic documents integrate desktop data into a unified desktop information space and enable desktop data to be integrated into a unified information space of social communities."
     Improving the Effectiveness and Efficiency of Desktop Users
  5. Outline: Motivation, Semantic Document Model - SDM, Semantic Document Architecture - SDArch, Prototype, Thesis Validation, Conclusions
  6. Semantic Document Model (Annals of Information Systems '09)
     - Core Part
       - document unit types
       - structural relationships
       - identification
       - binary content linking
     - Annotation Part
       - annotation types
       - annotation interface
     - Semantic-Linking Part
       - semantic linking interface
     - Change-Tracking Part
       - types of doc. unit changes
       - change-tracking interface
  7. Machine-Processable and Human-Readable instances of SDM
     - MP document representation
       - Unique and permanent instance
       - HTTP dereferenceable URIs
       - RDF data format
     - HR document representation
       - Temporal document instances
       - Rendered from the MP instance
       - Existing document formats
  8. Outline: Motivation, Semantic Document Model - SDM, Semantic Document Architecture - SDArch, Prototype, Thesis Validation, Conclusions
  9. Semantic Document Architecture - SDArch (Annals of Information Systems '09)
  10. Semantic Document Authoring, Search, and Navigation (SEKE '10)
     - Semantic document authoring service
       - Concept Exploration Algorithm
         - Objective: conceptualization of DU semantics
         - Input: document unit; domain ontology(ies)
         - Output: concept vector; concept weight vector
         - Features: lexical expansion of concept labels; syntactic concept matching; semantic concept matching; measuring concept relevance
     - Semantic document search and navigation service
       - Search Algorithm
         - Objective: search for semantic document units (DUs)
         - Input: a free-text keyword query
         - Output: a ranked list of semantic DUs
         - Features: forming semantic query; executing SCA for each DU against CI; measuring similarity between … and …
       - Search Personalization Algorithm
         - Objective: personalization of semantic doc. search
         - Input: list of retrieved semantic DUs; list of user preferences
         - Output: re-ranked list of semantic DUs
         - Features: extracting semantic query; weighting schema for each user preference; ranking DUs based on calculated weights
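The slide describes document units by concept weight vectors and a search step that measures similarity and returns a ranked list. The following is a rough sketch of that kind of ranking, not the algorithm from the thesis: the similarity measure on the slide is lost in extraction, so cosine similarity is an assumption here, and the unit identifiers, concept labels, and weights are invented for illustration.

```python
import math
from typing import Dict, List, Tuple

ConceptVector = Dict[str, float]  # concept label -> weight (hypothetical representation)


def cosine_similarity(a: ConceptVector, b: ConceptVector) -> float:
    """Cosine of the angle between two sparse concept weight vectors."""
    dot = sum(weight * b[concept] for concept, weight in a.items() if concept in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def rank_units(query: ConceptVector, units: Dict[str, ConceptVector]) -> List[Tuple[str, float]]:
    """Return (unit id, score) pairs sorted from most to least similar to the query."""
    scored = [(unit_id, cosine_similarity(query, vector)) for unit_id, vector in units.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Invented document units and weights, for illustration only.
units = {
    "du:1": {"Mammal": 0.9, "Carnivore": 0.4},
    "du:2": {"Metal": 0.8, "Alloy": 0.6},
    "du:3": {"Mammal": 0.3, "Habitat": 0.7},
}
query = {"Mammal": 1.0}
ranking = rank_units(query, units)
for unit_id, score in ranking:
    print(f"{unit_id}: {score:.3f}")
```

A personalization step in the same spirit could re-sort `ranking` after adjusting each score with per-preference weights.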
  11. Semantic Document Sharing (ESEC/FSE - SoSEA '09)
     - SDArch social network
     - Publishing only RDFs
     - Capturing social-context annotations
     - Contributing to:
       - Linked Open Data Cloud
       - Web of Linked Data
       - Semantic Web
  12. Outline: Motivation, Semantic Document Model - SDM, Semantic Document Architecture - SDArch, Prototype, Thesis Validation, Conclusions
  13. SDArch Prototype
     - Objectives:
       - Validation of SDArch and SDM
       - Enabling experimental evaluation
       - Enabling usability evaluation
     - Source Code Organization:
       - Number of services: 5
       - Number of .NET assemblies: 15
       - Number of .NET namespaces: 14
     - Implementation:
       - Semantic Document Repository
         - Sesame 2 RDF repository
         - SemWeb C# Library
         - MySQL DB-backed persistent RDF storage
         - SPARQL query support
         - Full-text query support (Lucene)
       - Services
         - WCF Framework
       - Tools
         - MS Office Add-Ins
  14. SemanticDoc - MS Office Add-Ins (ICWE '08)
  15. Outline: Motivation, Semantic Document Model - SDM, Semantic Document Architecture - SDArch, Prototype, Thesis Validation, Conclusions
  16. Thesis Validation
     - Q1: How do semantic documents improve information finding and retrieval in semantically integrated document collections?
       1. Experimental evaluation of Information Retrieval in Semantic Documents
     - Q2: How do semantic documents facilitate desktop users in completing tasks that draw data from both a personal desktop and social communities?
       2. Usability evaluation of SDArch Services and Tools
  17. Experimental Evaluation of Information Retrieval in Semantic Documents (SEMAPRO '10)
     - Objectives:
       - Measuring effectiveness of the semantic document search
       - Measuring effectiveness of the semantic document annotation (indexing)
     - Compared approaches:
       - Concept-Based Indexing and Search - Simple Syntactic Matching
       - Concept-Based Indexing and Search - Lexically Expanded Syntactic Matching
       - Full Text Indexing and Search (Lucene)
       - Semantic Document Indexing and Search
  18. Test Collections
     - Mammals of the World
       - MAMO Ontology: OWL + SKOS; Finnish National Museum; ~5000 domain concepts
       - Document Set: Wikipedia - List of Mammals; 150 articles; 2130 semantic document units
       - Query Set: 5 queries related to Mammals
     - Metals and Alloys
       - Metals Ontology: OWL + SKOS; Key-To-Metals, Zurich; ~1800 domain concepts
       - Document Set: Key-To-Metals records; 240 Word documents; 3312 semantic document units
       - Query Set: 5 queries related to Metals and Alloys
  19. Measuring Effectiveness of the Semantic Document Annotation (Indexing)

     Test collection 1: Mammals of the World
     Approach                                    # of syn. matches  # of sem. matches  Weight of syn. matches  Weight of sem. matches
     CB - simple syntactic matching              1524               -                  2.56                    -
     CB - lexically expanded syntactic matching  3182               -                  3.62                    -
     Semantic document indexing and annotation   3182               2437               3.62                    2.96

     Test collection 2: Metals and Alloys
     Approach                                    # of syn. matches  # of sem. matches  Weight of syn. matches  Weight of sem. matches
     CB - simple syntactic matching              2153               -                  1.73                    -
     CB - lexically expanded syntactic matching  2879               -                  2.43                    -
     Semantic document indexing and annotation   2879               1024               2.43                    2.14
  20. Measuring Effectiveness of the Semantic Document Search. Test collections: Mammals of the World; Metals and Alloys.
  21. Usability Evaluation (ICALT '10)
     - Evaluation hypothesis: "Using SDArch results in a more effective, efficient, and satisfactory user experience when authoring, exploring (i.e., searching and navigating) and utilizing documents in carrying out daily tasks."
     - Usability evaluation criteria:
       - User Effectiveness
       - User Efficiency
       - User Satisfaction
  22. Case Study: Authoring of Course Material
     - Participants - SDArch Social Network
       - University of Lugano, Switzerland - 7 participants
       - Simon Fraser University, Canada - 7 participants
       - Athabasca University, Canada - 2 participants
       - University of Belgrade, Serbia - 2 participants
     - Document Collection
       - "Software Design Patterns" - 70 PowerPoint and Word documents
     - Evaluation Session
       - Task-Based Usability Test
       - Follow-up questionnaires
  23. Usability Test
     Use cases:
       i. Setting Up the User Profile and the Social Network Properties
       ii. Authoring and Publishing Semantic Documents
       iii. Searching and Navigating across Semantic Documents

     Task  Task objective              Slide
     1     Design patterns definition  1
     2     Example 1 - definition      2
     3     Example 1 - illustration    2
     4     Example 2 - definition      3
     5     Example 2 - illustration    3
  24. Evaluation Methods and Metrics

     Evaluation Criteria  Evaluation Method                 Evaluation Metric
     1. Effectiveness     Objective - Quantitative Measure  Task Success Rates
     2. Efficiency        Objective - Quantitative Measure  Task Completion Times
                          Objective - Quantitative Measure  Number of Mouse Clicks
                          Objective - Quantitative Measure  Number of Window Switches
     3. Satisfaction      Subjective - Questionnaire        5-level Likert scale
  25. 1. User Effectiveness
     Metric: Task success rate

           Conventional System            SDArch System
     Task  Successful Completions  %      Successful Completions  %
     1     18                      100    18                      100
     2     17                      94.44  18                      100
     3     15                      83.33  17                      94.44
     4     17                      94.44  18                      100
     5     14                      77.77  16                      88.88
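The percentages on this slide follow from dividing the successful completions by the number of participants. A small sketch of that arithmetic is given below, assuming all 18 participants listed in the case study (7 + 7 + 2 + 2) attempted every task; the function name `success_rate` is made up for this example.

```python
PARTICIPANTS = 18  # 7 + 7 + 2 + 2 participants listed in the case study


def success_rate(successes: int, participants: int = PARTICIPANTS) -> float:
    """Percentage of participants who completed the task successfully."""
    return 100.0 * successes / participants


# Successful completions for tasks 1-5, as reported on the slide.
conventional = [18, 17, 15, 17, 14]
sdarch = [18, 18, 17, 18, 16]

for task, (c, s) in enumerate(zip(conventional, sdarch), start=1):
    print(f"Task {task}: conventional {success_rate(c):.2f}%, SDArch {success_rate(s):.2f}%")
```

Formatting with `.2f` rounds to two decimals, so the last digit can differ from the slide, whose values for task 5 appear to be truncated rather than rounded.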
  26. 2. User Efficiency
     Metrics: Task execution time; Number of mouse clicks; Number of window switches

     T-test results (three p-values per task):
     Task  p-value       p-value       p-value
     1     1.6 × 10^-12  0.00071       0.00004
     2     1.22 × 10^-7  0.00011       0.0041
     3     6.91 × 10^-8  9.17 × 10^-6  0.00016
     4     3.67 × 10^-7  0.00034       0.00009
     5     4.82 × 10^-10 2.6 × 10^-6   0.00004

     If p < 0.05, the results are statistically significant.
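To make the significance check concrete, here is a hedged sketch of how a p-value of this kind could be computed. It assumes a paired t-test (each participant measured on both systems), which the slides do not state, and the timing samples are invented; none of the numbers below come from the study. The p-value is obtained by numerically integrating the Student t density, so only the standard library is needed.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(first, second):
    """Return (t, degrees of freedom) for paired samples of equal length."""
    diffs = [x - y for x, y in zip(first, second)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / standard_error, n - 1


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p_value(t, df, steps=10_000):
    """Two-sided p-value via Simpson's rule on [0, |t|]; steps must be even."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # probability mass between 0 and |t|
    return max(0.0, 1.0 - 2.0 * area)


# Invented task execution times in seconds for six hypothetical participants.
conventional_times = [312, 298, 345, 330, 301, 322]
sdarch_times = [201, 215, 230, 198, 220, 207]

t_value, dof = paired_t_statistic(conventional_times, sdarch_times)
p_value = two_sided_p_value(t_value, dof)
print(f"t = {t_value:.2f}, df = {dof}, significant at 0.05: {p_value < 0.05}")
```

Numerical integration loses precision for extremely small p-values, so a statistics library would be the better choice when values as small as those on the slide need to be reported.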
  27. 3. User Satisfaction
     Metric: 5-level Likert scale (from "Strongly disagree" to "Strongly agree")

     Internal consistency (reliability) test:
     Dimension             Cronbach's α
     Usefulness            0.85
     Ease-of-Use           0.78
     Ease-of-Learning      0.92
     Overall Satisfaction  0.83

     Recommended α values > 0.75
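Cronbach's α is computed from the variances of the individual questionnaire items and the variance of the summed scores: α = k/(k-1) · (1 - Σ item variances / variance of totals), where k is the number of items. The sketch below applies that standard formula to invented Likert responses; the data and the function name `cronbach_alpha` are illustrative and unrelated to the study's questionnaires.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a table with one row per respondent and one column per item."""
    k = len(responses[0])  # number of questionnaire items
    item_variances = [variance(column) for column in zip(*responses)]
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Invented 5-level Likert answers: five respondents, three items of one dimension.
responses = [
    [5, 4, 5],
    [4, 4, 4],
    [3, 3, 4],
    [5, 5, 5],
    [2, 3, 2],
]
alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.3f}, above recommended 0.75: {alpha > 0.75}")
```

Items that rise and fall together across respondents push the variance of the totals up relative to the item variances, which is what drives α toward 1.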
  28. Outline: Motivation, Semantic Document Model - SDM, Semantic Document Architecture - SDArch, Prototype, Thesis Validation, Conclusions
  29. Conclusions
     - Main contributions:
       - Introducing the Semantic Document Model - SDM
       - Designing the Semantic Document Architecture - SDArch
       - Providing the SDArch Prototype Implementation
       - Experimental and Usability evaluations
     - Future directions:
       - Document units versioning
       - Document units privacy and security
       - Decentralized storage of shared semantic documents
  30. Publications
     - Journals:
       - S. Nešić, "Semantic Document Model to Enhance Data and Knowledge Interoperability," Annals of Information Systems - Special Issue on Semantic Web & Web 2.0, Springer US, pp. 135-160, 2009.
     - Conferences:
       - S. Nešić, F. Crestani, D. Gašević, M. Jazayeri, "Search and Navigation in Semantically Integrated Document Collections," 4th International Conference on Advances in Semantic Processing - SEMAPRO, pp. 123-129, Firenze, Italy, 2010.
       - S. Nešić, D. Gašević, M. Jazayeri, "Semantic Document Architecture for Desktop Data Integration and Management," The 22nd International Conference on Software Engineering and Knowledge Engineering - SEKE, pp. 73-78, San Francisco, USA, 2010.
       - S. Nešić, D. Gašević, M. Jazayeri, M. Landoni, "Using Semantic Documents and Social Networking in Authoring Course Material: An Empirical Study," 10th IEEE International Conference on Advanced Learning Technologies - ICALT, pp. 666-670, Sousse, Tunisia, 2010. (Best paper award)
       - S. Nešić, F. Crestani, D. Gašević, M. Jazayeri, "Concept-Based Semantic Annotation, Indexing and Retrieval of Office-Like Document Units," 9th RIAO Conference, pp. 234-237, Paris, France, 2010.
       - S. Nešić, D. Gašević, M. Jazayeri, "Extending MS Office for sharing Document Content Units over the Semantic Web," 8th International Conference on Web Engineering - ICWE, pp. 350-353, Yorktown Heights, New York, USA, 2008.
       - S. Nešić, D. Gašević, M. Jazayeri, "Semantic Document Management for Collaborative Learning Object Authoring," 8th IEEE International Conference on Advanced Learning Technologies - ICALT, pp. 751-755, Santander, Spain, 2008.
       - S. Nešić, D. Gašević, M. Jazayeri, "An ontology-based framework for author-learning content interaction," 6th International Conference on Web-based Education - WBE, Chamonix, France, 2007.
       - S. Nešić, D. Gašević, M. Jazayeri, "An Ontology-Based Framework for Authoring Assisted by Recommendation," 7th IEEE International Conference on Advanced Learning Technologies - ICALT, pp. 227-231, Niigata, Japan, 2007.
       - S. Nešić, J. Jovanović, D. Gašević, M. Jazayeri, "Ontology-Based Content Model for Scalable Content Reuse," 4th ACM SIGART International Conference on Knowledge Capture - K-CAP, pp. 195-198, Whistler, Canada, 2007.
     - Workshops:
       - S. Nešić, M. Jazayeri, F. Lelli, S. Nešić, "Towards Efficient Document Content Sharing in Social Networks," 2nd Workshop on Social Software Engineering and Applications, co-located with ESEC/FSE, pp. 1-8, Amsterdam, Netherlands, 2009.
