Semantic Wiki, Great Candidate for Knowledge Acquisition


This is a high-level pitch deck for knowledge acquisition (KA), complementing the textual part. We had already decided that we need low-level, textual-entailment-based KA, while the high-level part, which involves more human computation, was partially ignored at the time of the presentation. This deck introduces the social semantic web and shows how it can help with our KA tasks.

Published in: Technology, Education



  • 1. From Text and Data to Knowledge: Via Semantic Wikis
      The Social Semantic Web in the Small
      Jesse Wang
  • 2. The Bottleneck of AI is Knowledge Acquisition
      [Diagram labels: Human Intelligence, Computer Intelligence]
  • 4. Connecting both Information and People
      [Timeline diagram. Axes: Connections between people, Connections between Information]
      [Era labels: PC Era, Web 1.0, Web 2.0, Web 3.0, Web 4.0]
      [Date ranges: 1980 - 1990, 1990 - 2000, 2000 - 2010, 2010 - 2020, 2020 - 2030]
      [Other labels: The PC, The Internet, The Web, Social Web, Semantic Web, Intelligent Web, Web OS]
      [Technologies shown: Email, Social Networking, Groupware, Javascript, Weblogs, Databases, File Systems, HTTP, Keyword Search, USENET, Wikis, Websites, Directory Portals, RSS, Widgets, PC’s, Office 2.0, XML, RDF, SPARQL, AJAX, FTP, IRC, SOAP, Mashups, File Servers, Social Media Sharing, Lightweight Collaboration, ATOM, Semantic Search, Semantic Databases, Distributed Search, Intelligent personal agents, Java, SaaS, Flash, OWL, HTML, SGML, SQL, Gopher, P2P, Windows, MacOS, SWRL, OpenID, BBS, MMO’s, VR]
  • 5. At Multiple Levels of Understanding
      – Signal entity (Words)
      – Signal form (Syntax)
      – Signal semantics (Concepts)
      – Categories (taxonomy)
      – Statements
      – Models
      – Decision-making
  • 6. HOW DO WE CAPTURE ALL? At least, the semantics?
  • 7. Two Paths for Semantics (>>KB Construction)
      “Bottom-Up”
      – Add semantic metadata to pages and databases all over the Web
        • Alternatively train models to extract above info (machine-assisted)
      – Every website becomes semantic
        • except for those not tagged, trained, or errors
      “Top-Down”
      – Experts build models and rules for semantics
      – Create services that provide this as an overlay to non-semantic Web
      – Every website becomes semantic
        • except for those not covered
      -- Alex Iskold
  • 8. Five Approaches to Semantics
      – Tagging
      – Statistics
      – Linguistics
      – Semantic Web
      – Artificial Intelligence
  • 9. The Tagging Approach
      Pros
      – Easy for users to add and read tags
      – Tags are just strings
      – No algorithms or ontologies to deal with
      – No technology to learn
      Cons
      – Easy for users to add and read tags
      – Tags are just strings
      – No algorithms or ontologies to deal with
      – No technology to learn
      Examples: Technorati, Flickr, Wikipedia, YouTube
  • 10. The Statistical Approach
      Pros:
      – Pure mathematical algorithms
      – Massively scalable with good training data
      – Language independent
      Cons:
      – No understanding of the content
      – Hard to craft good queries
      – Best for finding really popular things – not good at finding needles in haystacks
      – Limited by data (esp. quality training data)
      – Not great for sparse structured data with strong inherent semantics
      Examples: Google, Lucene, Autonomy, Farecast (Bing Travel)
  • 11. The Linguistic Approach
      Pros:
      – Almost-true language understanding
      – Extract knowledge from text
      – Best for search for particular facts or relationships
      – More precise queries
      Cons:
      – Computationally intensive
      – Difficult to scale
      – Lots of special case and other errors
      – Language-dependent
      Examples: Powerset, Hakia, Inxight, Attensity, and others…
  • 12. The Semantic Web Approach
      Pros:
      – More precise queries
      – Smarter apps with less work
      – Not as computationally intensive
      – Share & link data between apps
      – Works for both unstructured and structured data
      Cons:
      – Lack of tools
      – Difficult to scale
      – Who makes all the metadata?
      Examples: Radar Networks, DBpedia Project, Metaweb (Freebase)
  • 13. The Artificial Intelligence Approach
      Pros:
      – Smart in narrow domains
      – Answer questions intelligently
      – Reasoning and learning
      Cons:
      – Computationally intensive
      – Difficult to scale
      – Extremely hard to program
      – Does not work well outside of narrow domains
      – Training takes a lot of work
      Examples: Cycorp, AURA (Project Halo)
  • 14. The Approaches Compared
      [Diagram. Axes: Make the software smarter, Make the Data Smarter. Points: Statistics, Linguistics, Semantic Web, A.I., Tagging]
  • 15. In Practice
      [Diagram labels: Tagging, Semantic Web, Statistics, Linguistics, Artificial intelligence]
  • 16. From Tagging to AI
      [Diagram labels: Data Structure, Intelligence]
  • 17. The Semantic Web is a Key Enabler
      – Moves the “intelligence” out of applications, into the data
      – Data needs special structures and becomes self-describing; the meaning of data becomes part of the data
      – Apps can become smarter with less work, because the data carries knowledge about what it is and how to use it
      – Data can be shared and linked more easily
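One way to picture "self-describing" data is as subject/predicate/object statements that any application can read without app-specific logic. The sketch below is only an illustration under that assumption: the `Triple` type, the sample facts, and the `describe` and `merge` helpers are hypothetical names, not part of the deck or of any Semantic Web library.

```python
from typing import Iterable, NamedTuple


class Triple(NamedTuple):
    """One statement: the data carries its own meaning via the predicate."""
    subject: str
    predicate: str
    obj: str


# Hypothetical facts held by two separate applications.
APP_A = [
    Triple("Berlin", "type", "City"),
    Triple("Berlin", "capital_of", "Germany"),
]
APP_B = [
    Triple("Berlin", "capital_of", "Germany"),
    Triple("Germany", "type", "Country"),
]


def merge(*sources: Iterable[Triple]) -> set:
    """Link data from several apps: identical statements collapse into one."""
    merged = set()
    for facts in sources:
        merged.update(facts)
    return merged


def describe(entity: str, facts: Iterable[Triple]) -> dict:
    """Return everything the data says about an entity."""
    return {t.predicate: t.obj for t in facts if t.subject == entity}
```

Because both applications use the same statement shape, `describe("Berlin", merge(APP_A, APP_B))` works on the combined data with no extra integration code.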
  • 18. The Semantic Web = Open Database Layer for the Web
      [Diagram labels: User Profiles, Web Content, Data Records, Apps & Services, Ads & Listings, Open Data Mappings, Open Data Records, Open Rules, Open Ontologies, Open Query Interfaces]
  • 19. And The Web IS the Database!
      [Diagram labels: Application A, Application B]
  • 22. In Every Part or Layer of the Semantic Web, We Need
  • 23. Now a Complete Web
  • 24. Crowd Wisdom To Best Map Human Knowledge for Human
  • 25. Clear Semantics for Machine to Understand Knowledge
  • 26. Semantic Wikis: the Social Semantic Web in Action!
      [Diagram label: Semantic Wikis]
  • 27. What is a Wiki?
      A Key Feature of Wikis is [...]
      This distinguishes wikis from other publication tools
  • 28. Consensus in Wikis Comes from Collaboration
      – ~17 edits/page on average in Wikipedia (with high variance)
      – Wikipedia’s Neutral Point of View Convention
      – Users follow customs and conventions to engage with articles effectively
  • 29. Software Support Makes Wikis Successful
      – Trivial to edit by anyone
      – Tracking of all changes, one-step rollback
      – Every article has a “Talk” page for discussion
      – Notification facility allows anyone to “watch” an article
      – Sufficient security on pages, logins can be required
      – A hierarchy of administrators, gardeners, and editors
      – Software Bots recognize certain kinds of vandalism and auto-revert, or recognize articles that need work, and flag them for editors
  • 30. Success of Wikis
      [Chart: actual number of articles (thick blue line) compared with a Gompertz model that leads eventually to a maximum of about 4.4 million articles (thin green line)]
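For readers unfamiliar with the Gompertz model named on this slide, it is commonly written as N(t) = K · exp(−b · exp(−c · t)), where K is the ceiling the curve approaches. The sketch below uses the slide's ceiling of about 4.4 million articles; the shape parameters `b` and `c` and the time unit are illustrative assumptions, not the values fitted in the original chart.

```python
import math


def gompertz(t: float, k: float = 4.4e6, b: float = 5.0, c: float = 0.3) -> float:
    """Gompertz growth curve: slow start, rapid middle, saturation at k.

    k is the asymptotic maximum (about 4.4 million, from the slide).
    b and c are made-up shape parameters chosen only for illustration.
    """
    return k * math.exp(-b * math.exp(-c * t))


if __name__ == "__main__":
    # Print the modelled article count at a few hypothetical time steps.
    for t in (0, 5, 10, 20, 40):
        print(t, round(gompertz(t)))
```

The curve never exceeds `k`, which is what "leads eventually to a maximum" means in the caption.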
  • 31. Summary: What Wiki Is Really About
      – Quick and Easy – No download
      – Layered Community Authoring
      – Interlinked Hierarchical Content
      – Revision Control
      – Notification
  • 32. What is a Semantic Wiki
      – A wiki that has an underlying model of the knowledge described in its pages.
      – To allow users to make their knowledge explicit and formal
      – Semantic Web Compatible
      [Diagram label: Semantic Wiki]
  • 33. Combining Human Knowledge and Data Structures
      [Diagram labels: Wikis for Metadata, Metadata for Wikis]
  • 34. Basics of Semantic Wikis
      Still a wiki, with regular wiki features
      – E.g. Category/Tags, Namespaces, Title, Versioning, ...
      Typed Content
      – E.g. Page/Card, Date, Number, URL/Email, String, …
      Typed Links
      – E.g. “capital_of”, “contains”, “born_in”…
      Querying Interface Support
      – E.g. “[[Category:Person]] [[Age::<30]]”
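To make the query example on this slide concrete, here is a toy evaluator for conditions written like “[[Category:Person]] [[Age::<30]]”. It is a minimal sketch, not the query engine of any real semantic wiki: the `PAGES` data, the `matches` and `ask` functions, and the supported operators (`<`, `>`, and exact match) are all assumptions made for illustration.

```python
import re

# Hypothetical wiki pages: categories plus typed properties and typed links.
PAGES = {
    "Alice": {"categories": {"Person"}, "props": {"Age": 28, "born_in": "Paris"}},
    "Bob": {"categories": {"Person"}, "props": {"Age": 41, "born_in": "Rome"}},
    "Paris": {"categories": {"City"}, "props": {"capital_of": "France"}},
}

# Each condition is the text between one pair of double square brackets.
CONDITION = re.compile(r"\[\[(.+?)\]\]")


def matches(page: dict, condition: str) -> bool:
    """Check one condition: a category test or a property test."""
    if condition.startswith("Category:"):
        return condition[len("Category:"):] in page["categories"]
    name, _, value = condition.partition("::")
    if name not in page["props"]:
        return False
    actual = page["props"][name]
    if value.startswith("<"):
        return actual < float(value[1:])
    if value.startswith(">"):
        return actual > float(value[1:])
    return str(actual) == value


def ask(query: str, pages: dict = PAGES) -> list:
    """Return titles of pages that satisfy every condition in the query."""
    conditions = CONDITION.findall(query)
    return sorted(
        title
        for title, page in pages.items()
        if all(matches(page, c) for c in conditions)
    )
```

With the sample data, `ask("[[Category:Person]] [[Age::<30]]")` selects only the page whose typed `Age` value is below 30, which is the kind of structured question plain wiki text cannot answer directly.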
  • 35. Advanced Semantic Wiki Features
      – Semantic forms or templates
      – Auto-completion based on semantics
      – Powerful visualizations based on semantics/structures/types
      – Rules and reasoning support
      – Advanced search and queries (faceted search, SPARQL, etc.)
      – Semantic notifications (personalized information filtering)
      – Import and Export of Semantic Data
      – Data Integration: identification, disambiguation, merging, trust, security/privacy, …
  • 36. Characteristics of Semantic Wikis
  • 37. What is the Promise of Semantic Wikis?
      – Semantic Wikis facilitate Consensus over Data (Knowledge)
      – Combine low-expressivity data authorship with the best features of traditional wikis
      – User-governed, user-maintained, user-defined
      – Easy to use as an extension of text authoring
  • 38. One Key Helpful Feature of Semantic Wikis
      Semantic Wikis are “Schema-Last”
      Databases require DBAs and schema design; Semantic Wikis develop and maintain the schema in the wiki
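The “schema-last” idea can be sketched as a store that accepts annotations first and derives its schema from whatever users have entered, rather than requiring a designed schema up front. The `SchemaLastStore` class and its method names below are hypothetical, chosen only to illustrate the contrast with a conventional database.

```python
from collections import defaultdict


class SchemaLastStore:
    """Toy store where the schema emerges from the data instead of preceding it."""

    def __init__(self) -> None:
        self.pages = defaultdict(dict)   # page title -> {property: value}
        self.schema = defaultdict(set)   # property -> type names seen so far

    def annotate(self, page: str, prop: str, value) -> None:
        """Record a property on a page; no prior schema declaration is needed."""
        self.pages[page][prop] = value
        self.schema[prop].add(type(value).__name__)

    def conflicts(self) -> dict:
        """Properties used with more than one type, for the community to reconcile."""
        return {p: types for p, types in self.schema.items() if len(types) > 1}


store = SchemaLastStore()
store.annotate("Alice", "Age", 28)
store.annotate("Alice", "born_in", "Paris")
store.annotate("Bob", "Age", "forty-one")  # inconsistent use, surfaced later
```

Here `store.conflicts()` reports that `Age` has been used as both a number and a string, which is the kind of inconsistency editors would then resolve inside the wiki instead of through a DBA.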
  • 39. Great Candidate for Knowledge Acquisition
      – Combining both unstructured and semi-structured data
      – High connectivity on both information and social dimensions
      – Collaboration with sophisticated software support
      – Expected low-cost for crowd-sourcing
      – Evolving category and template systems
      But…
  • 40. BUT – Plain Wikis Are Not Good Enough for Deep Knowledge Acquisition
      Knowledge is represented MOSTLY in unstructured and semi-structured ways
      • Plain text
      • Templates
      • Infoboxes
      • Tables
      • Section headers
      • Links
      • References
      • Redirects
      • …
  • 41. Software/Feature Enhancements Are Needed
      – Quick and easy way to view and edit schema
      – Machine assistance (NLP, Auto-suggest…)
      – Better visualizations with structured data
      – More user layers for better KB construction
      – Better targeted (semantic) notifications
  • 42. Best Bet for Knowledge Acquisition?
      K.A. is the well-known Artificial Intelligence Problem
      – AI authoring is too expensive, too slow, not scalable
      Three Possible Solutions
      – Automatic Machine Parsing (e.g. NELL, ReVerb)
        • Quality (depth) not good enough for textbook sentences
        • Error rates are too high
        • Still need humans in the loop for training data
      – Crowd Sourced Authoring (e.g. AMT)
        • Biology and Knowledge Engineering expertise is difficult to get
        • Mechanical Turk uses individuals, but the Knowledge Entry tasks appear to require coordination, judgment, discussion, and working together
      – Social Authoring and Crowdsourcing with Intelligent Software Assistance
        • Wikipedia showed this could work for text
        • Semantic Wiki software R&D to make it work for more structured knowledge
  • 43. With All These Features…
      – Effective Knowledge acquisition via Semantic Wikis
      – Combine the strength of human and machines
      – Connecting Human and Machines
      – High Quality while low cost
  • 44. Conclusion: To Bridge Machine and Human Intelligence
  • 45. To Dive Into Social Semantic Web
  • 46. THANK YOU!
      Credits: some slides are originally from the following people, with little or no modifications:
      – Nova Spivack
      – Denny Vrandecic
      – Mark Greaves
      – Bao Jie