Msra talk smw+apps


A tech talk to Mi

Published in: Technology, Education

  1. 1. Semantic Wikis and Applications: Social Semantic Web in Action. Jesse Wang, 2011.12.09, Tech Talk at Microsoft Research Asia
  2. 2. About Me: Jesse Wang 王嘉欣
  3. 3. Who is Vulcan?
  4. 4. What does Vulcan do?
  5. 5. Paul Allen | Idea Man
  6. 6. It all began with an idea…
  7. 7. Now the Idea Continues as Project Halo
  8. 8. Project Halo’s Focus Areas • AURA: Automated User-Centered Reasoning and Acquisition System – A textbook you can talk to • SILK: Semantic Inference with Large Knowledge-base – Non-monotonic rule system / RIF • Semantic MediaWiki + SMW+ – Knowledge authoring with SMEs • Plus other related semantic technologies and commercial efforts
  9. 9. Crowdsourcing for Better Knowledge Acquisition
  10. 10. Success of Wikis
  11. 11. A Key Feature of Wiki: This distinguishes wikis from other publication tools
  12. 12. Consensus in Wikis Comes from • Collaboration – ~17 edits/page on average in Wikipedia (with high variance) – Wikipedia’s Neutral Point of View • Convention – Users follow customs and conventions to engage with articles effectively
  13. 13. Software Support Makes Wikis Successful • Trivial to edit by anyone • Tracking of all changes, one-step rollback • Every article has a “Talk” page for discussion • Notification facility allows anyone to “watch” an article • Sufficient security on pages, logins can be required • A hierarchy of administrators, gardeners, and editors • Software bots recognize certain kinds of vandalism and auto-revert, or recognize articles that need work and flag them for editors
  14. 14. How about Deep Info? Wikipedia has articles about… • … all cities, with info on their populations, locations, skyscrapers, etc. • … all German cars, with engine size, acceleration data… Can you find: skyscrapers with 50+ floors and built after 2000 in Shanghai (or in Chinese cities with 1,000,000+ people)? Or German (Porsche) cars that accelerate from 0-100 km/h in 5 seconds?
  15. 15. Can Search Solve the Problem?
  16. 16. How Wikipedia Answers – List! cars_by_acceleration
  17. 17. Going Deeper
  18. 18. Deeper…
  19. 19. And Deeper…
  20. 20. And Now…
  21. 21. To Get the Answer
  22. 22. Look into List in Wikipedia
  23. 23. Editing Standard Wiki Article – Static List
  24. 24. Static List, Tables, …, Not Useable Enough
  25. 25. To Find More Info • All Porsche vehicles made in Germany that accelerate from 0-100 km/h in less than 4 seconds • Sci-Fi movies made after year 2000 that cost less than $10M and gross more than $30M • A map showing where all Mercedes-Benz vehicles are manufactured • All skyscrapers in China (Japan, Thailand,…) of 50 (40/60/70) floors or more, and built in year 2000 (2001/2002) and after, sorted by built year, floors…, grouped by cities, regions… • And many more
  26. 26. What is a Semantic Wiki • A wiki that has an underlying model of the knowledge described in its pages. • To allow users to make their knowledge explicit and formal • Semantic Web Compatible Semantic Wiki
  27. 27. Two Perspectives: Wikis for Metadata; Metadata for Wikis
  28. 28. Characteristics of Semantic Wikis
  29. 29. Basics of Semantic Wikis • Still a wiki, with regular wiki features – Category/Tags, Namespaces, Title, Versioning, ... • Typed Content (built-ins + user created, e.g. categories) – Page/Card, Date, Number, URL/Email, String, … • Typed Links (e.g. properties) – “capital_of”, “contains”, “born_in”… • Querying Interface Support – E.g. “[[Category:Member]] [[Age::<30]]” (in SMW)
  30. 30. What is the Promise of Semantic Wikis? • Semantic Wikis facilitate Consensus over Data • Combine low-expressivity data authorship with the best features of traditional wikis • User-governed, user-maintained, user-defined • Easy to use as an extension of text authoring
  31. 31. One Key Helpful Feature of Semantic Wikis: Semantic Wikis are “Schema-Last”. Databases require DBAs and schema design; Semantic Wikis develop and maintain the schema in the wiki
  32. 32. List of Semantic Wikis • AceWiki • ArtificialMemory • Wagn - Ruby on Rails-based • KiWi – Knowledge in a Wiki • Knoodl – Semantic Collaboration tool and application platform • Metaweb - the software that powers Freebase • OntoWiki • OpenRecord • PhpWiki • Semantic MediaWiki - an extension to MediaWiki that turns it into a semantic wiki • Swirrl - a spreadsheet-based semantic wiki application • TaOPis - has a semantic wiki subsystem based on Frame logic • TikiWiki CMS/Groupware integrates Semantic links as a core feature • zAgile Wikidsmart - semantically enables Confluence
  33. 33. Short History of Semantic MediaWiki (SMW) • Born at AIFB – Typed links and types and more – Export articles as RDF – Maximally flexible for the wiki user • SMW 0.1 released by AIFB in Sept 2005 – Parser/storage support for typed links – [[type::link | label]] – FactBox for semantic relations at end of article – Special:SearchSemantic, with basic auto-completion for link types – Simple query language (“ask”) • Vulcan kicks off Halo Extensions to SMW project in August 2007 • SMW 1.0 released by AIFB in Dec 2007, Ontoprise releases Halo Extension 1.0 in parallel – “Property” instead of “Relation” and “Attribute” – Many new datatypes/special pages/UI features
  34. 34. Overview of Semantic MediaWiki (SMW) • Open source (GPL) – Well documented, active user forum • Active development – Commercial support (SMW+) available • World-wide community – International Conferences • Next SMWCon on 4/25-27, 2012 in Carlsbad, CA • Very stable core, various extensions
  35. 35. Semantic MediaWiki (SMW) Markup Syntax: Tsinghua is a university located in [[Has location::Beijing]], with [[Has population::27000|about 27 thousand]] students. In page "Property:Has location": [[Has type::Page]]. In page "Property:Has population": [[Has type::Number]].
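The annotation syntax on the slide above can be read as subject-property-value triples. The following is only an illustrative Python sketch of that idea, not SMW's actual parser (SMW itself is a PHP extension); the regular expression and the function names `extract_annotations` and `render_text` are assumptions made for this example.

```python
import re

# Matches [[Property::value]] and [[Property::value|label]].
# Simplified on purpose: nested links and multi-value syntax are not handled.
ANNOTATION = re.compile(r"\[\[([^\[\]:|]+)::([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")


def extract_annotations(page_title, wikitext):
    """Return (subject, property, value) triples found in the wikitext."""
    return [
        (page_title, m.group(1).strip(), m.group(2).strip())
        for m in ANNOTATION.finditer(wikitext)
    ]


def render_text(wikitext):
    """Return reader-facing text: the label if one is given, else the value."""
    def visible(m):
        return m.group(3) if m.group(3) is not None else m.group(2)
    return ANNOTATION.sub(visible, wikitext)


text = ("Tsinghua is a university located in [[Has location::Beijing]], "
        "with [[Has population::27000|about 27 thousand]] students.")
triples = extract_annotations("Tsinghua", text)
rendered = render_text(text)
```

Here `triples` holds the two facts stated on the page, while `rendered` is the sentence a reader would see, which is the sense in which the annotation is "an extension of text authoring".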
  36. 36. Special Properties • “Has type” is a pre-defined “special” property for metadata – Example: [[Has type::String]] • “Allows value” is another special property – [[Allows value::Low]], – [[Allows value::Medium]], – [[Allows value::High]] • In Halo Extensions, there is domain and range support – RDFS expressivity – Semantic Gardening extension also supports “Cardinality”
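To make the effect of “Has type” and “Allows value” concrete, here is a minimal Python sketch of how declared types and allowed values could be checked against an annotation. It is not SMW code: the `PROPERTY_SCHEMA` table, the property name "Has priority", and the choice to accept undeclared properties unchecked are all assumptions for illustration.

```python
# Hypothetical property declarations, mirroring "Has type" and "Allows value".
PROPERTY_SCHEMA = {
    "Has priority": {"type": "String", "allowed": {"Low", "Medium", "High"}},
    "Has population": {"type": "Number", "allowed": None},
}


def validate(prop, raw_value):
    """Return (ok, message) for a single annotation value."""
    decl = PROPERTY_SCHEMA.get(prop)
    if decl is None:
        # Assumption of this sketch: undeclared properties are not checked.
        return True, "undeclared property, accepted as-is"
    if decl["type"] == "Number":
        try:
            float(raw_value.replace(",", ""))
        except ValueError:
            return False, f"{raw_value!r} is not a number"
    if decl["allowed"] is not None and raw_value not in decl["allowed"]:
        return False, f"{raw_value!r} is not an allowed value"
    return True, "ok"
```

For example, `validate("Has priority", "Urgent")` is rejected because "Urgent" is not among the declared values, while `validate("Has population", "2,200,000")` passes the number check.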
  37. 37. Define Classes • Beijing is a city in [[Has country::China]], with population [[Has population::2,200,000]]. [[Category:Cities]] • Categories are used to define classes because they are better for class inheritance. • The Jin Mao Tower (金茂大厦) is an 88-story landmark supertall skyscraper in … [[Categories: 1998 architecture | Skyscrapers in Shanghai | Hotels in Shanghai | Skyscrapers over 350 meters | Visitor attractions in Shanghai | Landmarks in Shanghai | Skidmore, Owings and Merrill buildings]] • Category:Skyscrapers in China • Category:Skyscrapers by country
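The class-inheritance point above can be sketched as a walk up the category hierarchy: a page tagged with a category also belongs to every ancestor category. The Python below is an illustration only; the `PARENTS` mapping is an assumption (the slide lists the categories but does not spell out the exact parent links), and `all_classes` is a hypothetical helper name.

```python
# Child category -> parent categories. The specific links are assumed here.
PARENTS = {
    "Skyscrapers in Shanghai": ["Skyscrapers in China"],
    "Skyscrapers in China": ["Skyscrapers by country"],
}


def all_classes(direct_categories, parents=PARENTS):
    """Return the direct categories plus every inherited ancestor category."""
    seen = set()
    stack = list(direct_categories)
    while stack:
        category = stack.pop()
        if category in seen:
            continue  # also guards against cycles in the hierarchy
        seen.add(category)
        stack.extend(parents.get(category, []))
    return seen


jin_mao_classes = all_classes(["Skyscrapers in Shanghai", "Hotels in Shanghai"])
```

Under these assumed links, a query for "Skyscrapers in China" would also find the Jin Mao Tower page even though the page itself only carries the Shanghai-level category.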
  38. 38. Database-style Query over Wiki Data • Example: Skyscrapers in China taller than 50 stories, built before 2000 • ASK/SPARQL query target: {{#ask: [[Category:Skyscrapers]] [[Located in::China]] [[Floor count::>50]] [[Year built::<2000]] … }}
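What the #ask query on the slide above expresses can be sketched as a filter over annotated pages. This Python example is a simplified model, not SMW's query engine: the pages "Tower A" to "Tower D" and their values are made-up placeholders, the `ask` and `matches` functions are hypothetical, and comparisons are strict here, whereas SMW's handling of the boundary value depends on its configuration.

```python
import operator

# Made-up placeholder pages; these are not real buildings or real data.
PAGES = {
    "Tower A": {"categories": {"Skyscrapers"}, "Located in": "China",
                "Floor count": 88, "Year built": 1998},
    "Tower B": {"categories": {"Skyscrapers"}, "Located in": "China",
                "Floor count": 60, "Year built": 2005},
    "Tower C": {"categories": {"Skyscrapers"}, "Located in": "Japan",
                "Floor count": 70, "Year built": 1995},
    "Tower D": {"categories": {"Skyscrapers"}, "Located in": "China",
                "Floor count": 40, "Year built": 1990},
}

OPS = {"=": operator.eq, ">": operator.gt, "<": operator.lt}


def matches(page, category, conditions):
    """True if the page is in the category and satisfies every condition."""
    if category not in page["categories"]:
        return False
    return all(
        prop in page and OPS[op](page[prop], expected)
        for prop, op, expected in conditions
    )


def ask(pages, category, conditions):
    """Return the sorted titles of all matching pages."""
    return sorted(
        title for title, page in pages.items()
        if matches(page, category, conditions)
    )


result = ask(PAGES, "Skyscrapers", [
    ("Located in", "=", "China"),
    ("Floor count", ">", 50),
    ("Year built", "<", 2000),
])
```

Changing a threshold in the condition list changes the answer without editing any page, which is the contrast with the static lists shown earlier in the talk.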
  39. 39. SMW Extensions – Help Build Great Things • Data I/O: Halo Extensions, Semantic Forms, Semantic Notification, … • Query and Browsing: Semantic Toolbar, Semantic Drilldown, Enhanced Retrieval, Search… • Visualization: Semantic Result Printers, Tree View, Exhibit, Flash charts… • Other useful extensions: HaloACL, Deployment, Triplestore Connector, Simple Rules…; Semantic WikiTags and Subversion Integration extensions; upcoming Linked Data Extension, with R2R and SILK from F.U. Berlin
  40. 40. Simple Example: Semantic Sci-Fi Movie Wiki • Demo
  41. 41. Example: Ultrapedia – Semantic Wikipedia • Ultrapedia: An SMW demo built to explore general knowledge acquisition in a wiki • Wikipedia merged with the power of a database • Help Readers and Writers Be More Productive
  42. 42. Standard View of the Wiki Data
  43. 43. Dynamic View of the Acceleration Data
  44. 44. Graph View of the Acceleration Data
  45. 45. Dynamic Mapping and Charting
  46. 46. Information Discovery via Visualization
  47. 47. Video: Semantic Wikis for A New Problem (axes: Increasing technical complexity → ← Increasing User Participation) • Social tag-based characterization – Keyword search over tag data – Inconsistent semantics – Easy to engineer • Algorithm-based object characterization – Database-style search – Consistent semantics – Extremely difficult to engineer • Semantic Entertainment Wiki: social database-style characterization – Database search + wiki text search – Semantic consistency via wiki mechanisms – Easy to engineer
  48. 48. Semantic Seahawks Football Wiki
  49. 49. Semantic Entertainment: Query Result • Highlight Reel • Commercial Look/Feel • Play-by-play video search • Highlight reel generation • Search on crowd-defined patterns (“touchdowns with big hits”) • Tree-based navigation widget • Very favorable economics
  50. 50. The Inspiration • We started with a • We could have an
  51. 51. Case Study and Demo: Project Management with SMW+ • Automatically populate tables • Just the data you want, • At the level you want • Calendars and timelines • Workflows • Personal menus • Form-oriented inputs • Notifications via email/RSS • MS Office integration • SVN integration
  52. 52. Vulcan Project Management Wiki (Story)
  53. 53. Vulcan Project Management Wiki (Task)
  54. 54. Vulcan Project Management Wiki (Visualizations)
  55. 55. Screenshot of a Sprint page Data automatically generated via template queries on page
  56. 56. Requirements for Wiki “Developers” • One need not – Write code like a hardcore programmer – Design or set up an RDBMS, or make frequent schema changes – Possess the knowledge of a senior system admin • Instead, one needs to – Configure the wiki with the desired extensions – Design and evolve the data model (schema) – Design content: customize templates, forms, styles, skin, etc.
  57. 57. Effectiveness of SMW as a Platform Choice • Packaged Software: ☺ Very quick to obtain; ☹ Hard to customize; ☹ Expensive; examples: Microsoft Project, Version One, Microsoft SharePoint • SMW + Extensions: ☺ Still quick to program; ☺ Easy to customize; ☺ Low-moderate cost; examples: Vulcan Project Wiki, B.L.S., RPI map • Custom Development: ☹ Slow to develop; ☺ Extremely flexible; ☹ High cost to develop and maintain; examples: .NET Framework, J2EE, …, Ruby on Rails
  58. 58. SMW :: Powerful Tools and Contents • Semantic MediaWiki and related extensions have more potential power
  59. 59. Need Release :: The Power • Be used by more people • Content in more places • Accessible via more applications • Enhanced with more semantics
  60. 60. Need :: Workflow Integration + Usability Enhancements • Infrequent wiki users frequently forget where the wiki pages are located • Search is a break from the current workflow • Search results can be noisy or irrelevant • Usability: – Wiki/Template/SF markup syntax is not extremely hard, but hard enough to turn off many users – Locating and consuming info in SMW is just not easy enough; something better is needed • Why don’t we leverage the Microsoft Office suite?
  61. 61. Microsoft Office :: The Most Popular Productivity Suite • 500m users worldwide • >90% market share • Users live in the “suite” • Outlook always open • Potential for SMW
  62. 62. MICROSOFT OFFICE CONNECTOR :: How It Works • Leverage Microsoft Office Add-ins technology • Bring SMW info to Office applications on-demand • API for semantic data I/O • Utilize semantics to improve relevance • Smart actions for semantic properties
  63. 63. Backstage :: Semantic Wiki Object Model • Wiki validation • Authentication • To get the categories – And descriptions • To get the article titles • To get the semantic properties • To get page info • Get all forms-related info • Edit and save page w/ form • Change a property • Set form of a page • Create form templates • To upload into the wiki
  64. 64. Microsoft Office Connector Smart Connections • Consume relevant, targeted information – With the tools you are already familiar with – In the context – better relevance and productivity – In place – no search overhead to break workflow – In real time – data from wiki is live – Automatically – linking to wiki • Let you contribute to Wiki – Without knowing where the content is – Without learning wiki/template syntax
  65. 65. Openness of SMW as a Platform
  66. 66. Semantic MediaWiki Enables Collaboration • Create and Manage Real Knowledge • Build Social Semantic Web Applications • In an Efficient and Cost-Effective Way
  67. 67. Acknowledgement
  68. 68. (End of Slides) Backups start here
  69. 69. Case Study: Battle-space Luminary System • Discover when New Information represents a change in understanding of entities – Discovery of explicit entity links, implicit relationships • Large Volumes of Data in various formats – Unstructured news articles – Tactical Reports, Field Intelligence – Structured Database Information • Use Wiki Pages to represent current knowledge about an entity – “what we know” • Domain Ontology to represent domain of information – “what we want to know” • Issue Alerts when Significant Events occur – New information according to category – Changing information on topics of interest – Need to send information to various devices – cell phones, email, etc.
  70. 70. System Design • Wiki Configuration – Semantic MediaWiki: Large developer community, active development, open source. Wikipedia uses MediaWiki, so scalability and performance are important. – Semantic Results Format: Provides various rich media displays of semantic information, including graphs, timelines, maps – Semantic Forms: Provides convenient user interface for entering semantic data into wiki, avoiding cumbersome wikitext – Semantic Notifications: Enables sending of notifications when results of semantic query change. • Domain Ontology – Created OWL Ontology for Terrorism • Semantic Parsing, Extraction, Reasoning – Java Process using various Open-Source Toolkits – Rapid plugin of new technologies – Multiple Data Sources supported
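The notification idea in the system design above (send an alert when the results of a semantic query change) can be sketched as a comparison of two result snapshots. This Python example illustrates the concept only and does not reflect how the Semantic Notifications extension is implemented; `diff_results`, `notify_if_changed`, and the `send` callback are hypothetical names.

```python
def diff_results(previous, current):
    """Compare two query result snapshots (iterables of page titles)."""
    previous, current = set(previous), set(current)
    return {
        "added": sorted(current - previous),
        "removed": sorted(previous - current),
    }


def notify_if_changed(previous, current, send):
    """Call send(changes) only when the result set actually changed."""
    changes = diff_results(previous, current)
    if changes["added"] or changes["removed"]:
        send(changes)
        return True
    return False


# Usage: collect notifications in a list instead of sending email or SMS.
outbox = []
changed = notify_if_changed(["Entity A", "Entity B"],
                            ["Entity B", "Entity C"],
                            outbox.append)
```

In a real deployment the `send` callback would hand the change set to whatever delivery channel is configured; a list is used here so the sketch stays self-contained.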
  71. 71. Sample Content Page
  72. 72. Wiki Content Design • Use Templates to Ensure Consistent Look-and-Feel – Templates Correspond to Ontology Classes – Fields within Templates correspond to Properties within Ontology – Rich Content Visualizations derived in consistent way • Hierarchical Categories match Class Hierarchy within Ontology – Ensures Validity for Properties – Category included on each Template page to ensure consistency • Forms Provide ability for users to enter data directly into wiki without knowing Wiki Text – Each form corresponds to a Template – Fields within forms correspond to the fields/properties within the Template – GUI can include auto-completion – Created Page immediately linked semantically to rest of Wiki
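The form-to-template correspondence described above can be sketched as a function that turns submitted form fields into a MediaWiki template call, so the user never types wikitext. This is a hedged Python illustration, not the Semantic Forms implementation; the template name "Person", its fields, and the `render_template_call` helper are hypothetical.

```python
def render_template_call(template, fields):
    """Build a template call with one |name=value line per form field."""
    lines = ["{{" + template]
    lines += [f"|{name}={value}" for name, value in fields.items()]
    lines.append("}}")
    return "\n".join(lines)


# Hypothetical form submission for a hypothetical "Person" template.
page_text = render_template_call(
    "Person",
    {"Name": "Example Person", "Affiliation": "Example Organization"},
)
```

Because the template itself would carry the property annotations and the category, a page created this way is linked semantically to the rest of the wiki as soon as it is saved.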
  73. 73. Sample Visualizations
  74. 74. Wikipedia for Porsches (Acceleration Data Example) • Information Need: All Porsche models that accelerate 0-100 km/h in under 5, 6, and 7 seconds
  75. 75. More Porsche Acceleration Data in Wikipedia
  76. 76. Ultrapedia Main Page
  77. 77. Semantics for Improved Wiki Navigation • Tree View Control • Abstract/Summary quick preview
  78. 78. The Porsche 996 Acceleration Table In Ultrapedia
  79. 79. Same Table as a Query
  80. 80. Dynamically-Generated Tables for Queries: Which Porsches accelerate fast? • Information Need: All Porsche models that accelerate 0-100 km/h in under 5, 6, and 7 seconds
  81. 81. Graph Views of the Acceleration Data
  82. 82. External Data via a Live eBay Query
  83. 83. Linking to External eBay Data
  84. 84. Wiki Articles as Data: Photos in Mercedes-Benz E-class W212 Gallery Section
  85. 85. Timelines from Data: Volkswagen Production Timeline View
  86. 86. Dynamic Mapping and Charting
  87. 87. Editing Wiki Data In Place