
Ontopia Code Camp



A presentation of the Ontopia product from the Ontopia Code Camp at TMRA 2009.

Published in: Education, Technology


  1. 1. Ontopia Code Camp<br />TMRA 2009-11-11<br />Lars Marius Garshol & Geir Ove Grønmo<br />
  2. 2. Agenda<br />About you<br />who are you?<br />what do you want from the code camp?<br />About Ontopia<br />The product<br />The future<br />Participating in the project<br />Writing some code!<br />
  3. 3. Some background<br />About Ontopia<br />
  4. 4. Brief history<br />1999-2000<br />private hobby project for Geir Ove<br />2000-2009<br />commercial software sold by Ontopia AS<br />lots of international customers in diverse fields<br />2009-<br />open source project<br />
  5. 5. The project<br />Open source hosted at Google Code<br />Contributors<br />Lars Marius Garshol, Bouvet<br />Geir Ove Grønmo, Bouvet<br />Thomas Neidhart, SpaceApps<br />Lars Heuer, Semagia<br />Hannes Niederhausen, TMLab<br />Stig Lau, Bouvet<br />Baard H. Rehn-Johansen, Bouvet<br />Peter-Paul Kruijssen, Morpheus<br />Quintin Siebers, Morpheus<br />
  6. 6. Current activity (toward 5.1)<br />tolog updates<br />added by LMG<br />Various fixes and optimizations<br />by everyone<br />Toma implementation (in sandbox)<br />by Thomas<br />TMQL implementation (in sandbox)?<br />by Sven Krosse<br />
  7. 7. Architecture and modules<br />The product<br />
  8. 8. The big picture<br />[Diagram labels: Engine; Data integration; CMS integration; DB2TM; XML2TM; Taxon. import; Auto-class.; Escenic; OKP; Other CMSs; Portlet support; Ontopoly; Web service; several “A.N.other” placeholders]<br />
  9. 9. The engine<br />Core API<br />TMAPI 2.0 support<br />Import/export<br />RDF conversion<br />TMSync<br />Fulltext search<br />Event API<br />tolog query language<br />tolog update language<br />Engine<br />
  10. 10. Query Engine<br />Implementation of Ontopia’s tolog language (based on Prolog and SQL)<br />Allows powerful queries on the topic map data structure<br />Simplifies application development and improves performance<br />Example:<br />select $B, count($A) from<br />instance-of($B, city),<br />{ premiere($A : opera, $B : place) |<br /> premiere($A : opera, $C : place),<br /> located-in($C : containee, $B : container) }<br />order by $A desc?<br />Returns every B and the corresponding number of A's, where B is a city AND EITHER B is the place where A was premiered OR the place where A was premiered is located in B, in decreasing order of A<br />TMSync<br />Configurable module for synchronizing one TM against another<br />define subset of source TM to sync (using tolog)<br />define subset of target TM to sync (using tolog)<br />the module handles the rest<br />Can also be used with non-TM sources<br />create a non-updating conversion from the source to some TM format<br />then use TMSync to sync against the converted TM instead of directly against the source<br />
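To make the semantics of the tolog example on the Query Engine slide concrete, here is a minimal plain-Java sketch that evaluates the same logic over made-up sample data. It does not use the Ontopia API or the tolog engine; the class `OperaCountSketch`, its method, and the two maps are hypothetical stand-ins for the `premiere` and `located-in` associations, and each place is assumed to have at most one container.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

final class OperaCountSketch {

    // Returns (city, number of operas) pairs, highest count first.
    // An opera counts for city B if it premiered in B itself, or at a place
    // located in B: the two branches of the OR clause in the tolog query.
    static List<Map.Entry<String, Integer>> countPremieres(
            Set<String> cities,
            Map<String, String> premierePlace,   // opera -> place of premiere
            Map<String, String> locatedIn) {     // place -> containing place
        Map<String, Set<String>> operasByCity = new TreeMap<String, Set<String>>();
        for (Map.Entry<String, String> e : premierePlace.entrySet()) {
            String opera = e.getKey();
            String place = e.getValue();
            if (cities.contains(place)) {
                operasByCity.computeIfAbsent(place, k -> new HashSet<String>()).add(opera);
            }
            String container = locatedIn.get(place);
            if (container != null && cities.contains(container)) {
                operasByCity.computeIfAbsent(container, k -> new HashSet<String>()).add(opera);
            }
        }
        List<Map.Entry<String, Integer>> rows = new ArrayList<Map.Entry<String, Integer>>();
        for (Map.Entry<String, Set<String>> e : operasByCity.entrySet()) {
            rows.add(new AbstractMap.SimpleEntry<String, Integer>(e.getKey(), e.getValue().size()));
        }
        // Stable sort by count, descending; ties keep the TreeMap's alphabetical order.
        rows.sort((a, b) -> b.getValue().compareTo(a.getValue()));
        return rows;
    }

    public static void main(String[] args) {
        // Illustrative sample data only.
        Set<String> cities = new HashSet<String>();
        cities.add("Milano");
        cities.add("Roma");
        Map<String, String> premiere = new HashMap<String, String>();
        premiere.put("Otello", "Teatro alla Scala");
        premiere.put("Falstaff", "Teatro alla Scala");
        premiere.put("Nabucco", "Milano");
        premiere.put("Tosca", "Teatro Costanzi");
        Map<String, String> locatedIn = new HashMap<String, String>();
        locatedIn.put("Teatro alla Scala", "Milano");
        locatedIn.put("Teatro Costanzi", "Roma");
        for (Map.Entry<String, Integer> row : countPremieres(cities, premiere, locatedIn)) {
            System.out.println(row.getKey() + " " + row.getValue());
        }
    }
}
```

As in the query, a city with no matching premiere does not appear in the result at all, because the premiere clause must bind an opera for a row to be produced.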
  11. 11. How TMSync works<br />Define which part of the target topic map you want,<br />Define which part of the source topic map it is the master for, and<br />The algorithm does the rest<br />
  12. 12. If the source is not a topic map<br />TMSync<br />convert.xslt<br />Simply do a normal one-time conversion<br />let TMSync do the update for you<br />In other words, TMSync reduces the update problem to a conversion problem<br />source.xml<br />
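The idea behind TMSync can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions, not the TMSync implementation: a `Map` stands in for a topic map (keys for topic identities, values for their characteristics), the already-selected source subset is passed in directly, and a `Predicate` plays the role of the tolog query that selects the part of the target the source is master for. All names are hypothetical.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.function.Predicate;

final class SyncSketch {

    // Makes the in-scope part of 'target' equal to 'sourceSubset',
    // leaving everything outside the scope untouched.
    static void sync(Map<String, String> sourceSubset,
                     Map<String, String> target,
                     Predicate<String> inTargetScope) {
        // 1. Remove in-scope target entries that the source no longer has.
        Iterator<String> it = target.keySet().iterator();
        while (it.hasNext()) {
            String key = it.next();
            if (inTargetScope.test(key) && !sourceSubset.containsKey(key)) {
                it.remove();
            }
        }
        // 2. Add entries that are new and overwrite entries that changed.
        target.putAll(sourceSubset);
    }

    public static void main(String[] args) {
        Map<String, String> source = new HashMap<String, String>();
        source.put("org1", "Ontopia");
        source.put("org2", "Bouvet AS");

        Map<String, String> target = new HashMap<String, String>();
        target.put("org2", "Bouvet");       // changed in the source
        target.put("org3", "Old Org");      // gone from the source
        target.put("person1", "Lars");      // out of scope, must survive

        sync(source, target, key -> key.startsWith("org"));
        System.out.println(target);
    }
}
```

Because the sync only compares the two selected subsets, any one-time conversion that produces the source side (for example from XML) is enough to keep the target up to date, which is the point the slide makes.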
  13. 13. The City of Bergen use case<br />Service<br />Unit<br />Person<br />LOS<br />City of Bergen<br />LOS<br />
  14. 14. The backends<br />In-memory<br />no persistent storage<br />thread-safe<br />no setup<br />RDBMS<br />transactions<br />persistent<br />thread-safe<br />uses caching<br />clustering<br />Remote<br />uses web service<br />read-only<br />unofficial<br />Engine<br />Memory<br />RDBMS<br />Remote<br />
  15. 15. RDBMS Backend<br />Allows the Engine to use topic maps stored in a relational database<br />Based on a generic topic map schema<br />Necessary when working with very large topic maps<br />Transparent to applications<br />Features<br />Automatically loads data when needed<br />Caches frequently used data<br />Full support for RDBMS transactions<br />Supports tolog-to-SQL compilation<br />Statistical reports for performance tuning<br />Platform support<br />Oracle, MySQL, PostgreSQL, MS SQL Server<br />Test suite available for verifying compatibility with other JDBC-enabled RDBMSes<br />
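The slide does not say how the RDBMS backend loads and caches data, so the following is only a generic sketch of the two listed features, "loads data when needed" and "caches frequently used data", using a least-recently-used cache built on `java.util.LinkedHashMap`. The class name `LoadingCacheSketch`, the loader function, and the `loads` counter are assumptions made for illustration and are not part of Ontopia.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

final class LoadingCacheSketch<K, V> {

    private final Function<K, V> loader;   // stands in for a database read
    private final Map<K, V> cache;
    int loads = 0;                          // number of times the loader ran

    LoadingCacheSketch(final int capacity, Function<K, V> loader) {
        this.loader = loader;
        // accessOrder = true makes iteration order least-recently-used first,
        // so the "eldest" entry is the one to evict.
        this.cache = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity;
            }
        };
    }

    // Returns the cached value, loading it on first use or after eviction.
    V get(K key) {
        V value = cache.get(key);
        if (value == null) {
            value = loader.apply(key);
            loads++;
            cache.put(key, value);
        }
        return value;
    }

    public static void main(String[] args) {
        LoadingCacheSketch<Integer, String> topics =
                new LoadingCacheSketch<Integer, String>(2, id -> "topic-" + id);
        topics.get(1);
        topics.get(2);
        topics.get(1);   // served from the cache
        topics.get(3);   // evicts id 2, the least recently used entry
        System.out.println("loader calls: " + topics.loads);
    }
}
```

The calling code only ever asks for a value by key, which is one way to read the slide's claim that the backend is "transparent to applications".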
  16. 16. DB2TM<br />Upconversion to TMs<br />from RDBMS via JDBC<br />or from CSV<br />Uses XML mapping<br />can call out to Java<br />Supports sync<br />either full rescan<br />or change table<br />[Diagram labels: TMRAP, Nav, DB2TM, Classify, Engine, Memory, RDBMS, Remote]<br />
  17. 17. DB2TM example<br />Ontopia<br />+<br />=<br />United Nations<br />Bouvet<br />&lt;relation name=&quot;organizations.csv&quot; columns=&quot;id name url&quot;&gt;<br /> &lt;topic type=&quot;ex:organization&quot;&gt;<br /> &lt;item-identifier&gt;#org${id}&lt;/item-identifier&gt;<br /> &lt;topic-name&gt;${name}&lt;/topic-name&gt;<br /> &lt;occurrence type=&quot;ex:homepage&quot;&gt;${url}&lt;/occurrence&gt;<br /> &lt;/topic&gt;<br />&lt;/relation&gt;<br />
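The mapping above fills `${column}` placeholders from each row of `organizations.csv`. The plain-Java sketch below illustrates that substitution step only. It is not DB2TM code: `TemplateSketch`, `expand`, and `toRow` are hypothetical names, the sample row is made up, and the CSV handling is deliberately naive (no quoting or escaping).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TemplateSketch {

    // Matches placeholders of the form ${columnName}.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{(\\w+)\\}");

    // Replaces every ${column} in the template with that column's value.
    static String expand(String template, Map<String, String> row) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = row.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException("No column named " + m.group(1));
            }
            // quoteReplacement keeps '$' and '\' in the value literal.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Pairs the declared column names with the cells of one CSV line.
    static Map<String, String> toRow(String[] columns, String csvLine) {
        String[] cells = csvLine.split(",", -1);
        Map<String, String> row = new LinkedHashMap<String, String>();
        for (int i = 0; i < columns.length; i++) {
            row.put(columns[i], i < cells.length ? cells[i] : "");
        }
        return row;
    }

    public static void main(String[] args) {
        String[] columns = {"id", "name", "url"};
        Map<String, String> row = toRow(columns, "1,Ontopia,http://www.ontopia.net");
        System.out.println(expand("#org${id}", row));   // item identifier
        System.out.println(expand("${name}", row));     // topic name
        System.out.println(expand("${url}", row));      // homepage occurrence
    }
}
```

Each row therefore yields one topic whose identifier, name, and occurrence come from the three expanded templates in the mapping.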
  18. 18. TMRAP<br />Web service interface<br />via SOAP<br />via plain HTTP<br />Requests<br />get-topic<br />get-topic-page<br />get-tolog<br />delete-topic<br />...<br />[Diagram labels: TMRAP, Nav, DB2TM, Classify, Engine, Memory, RDBMS, Remote]<br />
  19. 19. Navigator framework<br />Servlet-based API<br />manage topic maps<br />load/scan/delete/create<br />JSP tag library<br />XSLT-like<br />based on tolog<br />JSTL integration<br />[Diagram labels: TMRAP, Nav, DB2TM, Classify, Engine, Memory, RDBMS, Remote]<br />
  20. 20. Ontopia Navigator Framework<br />Java API for interacting with TM repository<br />JSP tag library<br />based on tolog<br />kind of like XSLT in JSP with tolog instead of XPath<br />has JSTL integration<br />Undocumented parts<br />web presentation components<br />some wrapped as JSP tags<br />want to build proper portlets from them<br />
  22. 22. Navigator tag library example<br />&lt;%-- assume variable &apos;composer&apos; is already set --%&gt;<br />&lt;p&gt;&lt;b&gt;Operas:&lt;/b&gt;&lt;br/&gt;<br />&lt;tolog:foreach query=&quot;composed-by(%composer% : composer, $OPERA : opera), { premiere-date($OPERA, $DATE) }?&quot;&gt;<br /> &lt;li&gt; &lt;a href=&quot;opera.jsp?id=&lt;tolog:id var=&quot;OPERA&quot;/&gt;&quot;<br /> &gt;&lt;tolog:out var=&quot;OPERA&quot;/&gt;&lt;/a&gt;<br /> &lt;tolog:if var=&quot;DATE&quot;&gt; &lt;tolog:out var=&quot;DATE&quot;/&gt; &lt;/tolog:if&gt; &lt;/li&gt;<br />&lt;/tolog:foreach&gt;&lt;/p&gt;<br />
  23. 23. Elmer Preview<br />
  27. 27. Automated classification<br />Undocumented<br />experimental<br />Extracts text<br />autodetects format<br />Word, PDF, XML, HTML<br />Processes text<br />detects language<br />stemming, stop-words<br />Extracts keywords<br />ranked by importance<br />uses existing topics<br />supports compound terms<br />[Diagram labels: TMRAP, Nav, DB2TM, Classify, Engine, Memory, RDBMS, Remote]<br />
  28. 28. Example of keyword extraction<br />topic maps 1.0<br />metadata 0.57<br />subject-based class. 0.42<br />Core metadata 0.42<br />faceted classification 0.34<br />taxonomy 0.22<br />monolingual thesauri 0.19<br />controlled vocabulary 0.19<br />Dublin Core 0.16<br />thesauri 0.16<br />Dublin 0.15<br />keywords 0.15<br />
  29. 29. Example #2<br />Automated classification 1.0 5<br />Topic Maps 0.51 14<br />XSLT 0.38 11<br />compound keywords 0.29 2<br />keywords 0.26 20<br />Lars 0.23 1<br />Marius 0.23 1<br />Garshol 0.22 1<br />...<br />
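The ranked lists above give each term a score relative to the top term (1.0). The slides do not describe how the classifier computes these scores, so the sketch below swaps in a much simpler technique: plain term frequency with a stop-word filter, normalised so that the most frequent term scores 1.0. It leaves out language detection, stemming, compound terms, and the use of existing topics, and `KeywordSketch` and its method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

final class KeywordSketch {

    // Returns term -> score, best first; the top term always scores 1.0.
    static Map<String, Double> score(String text, Set<String> stopWords) {
        // Count how often each non-stop-word token occurs.
        Map<String, Integer> counts = new HashMap<String, Integer>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (token.isEmpty() || stopWords.contains(token)) {
                continue;
            }
            counts.merge(token, 1, Integer::sum);
        }
        // Rank by count (descending), then alphabetically for ties.
        List<Map.Entry<String, Integer>> ranked =
                new ArrayList<Map.Entry<String, Integer>>(counts.entrySet());
        ranked.sort((a, b) -> {
            int byCount = b.getValue().compareTo(a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        // Normalise against the highest count.
        int max = ranked.isEmpty() ? 1 : ranked.get(0).getValue();
        Map<String, Double> scores = new LinkedHashMap<String, Double>();
        for (Map.Entry<String, Integer> e : ranked) {
            scores.put(e.getKey(), e.getValue() / (double) max);
        }
        return scores;
    }

    public static void main(String[] args) {
        String text = "Topic maps and metadata: topic maps, metadata, and topic maps.";
        Map<String, Double> scores = score(text, Collections.singleton("and"));
        for (Map.Entry<String, Double> e : scores.entrySet()) {
            System.out.printf(Locale.ROOT, "%s %.2f%n", e.getKey(), e.getValue());
        }
    }
}
```

A frequency-only ranking like this treats "topic" and "maps" as separate terms, which is exactly the weakness that compound-term support in the real classifier addresses.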
  30. 30. So how could this be used?<br />To help users classify new documents in a CMS interface<br />suggest appropriate keywords, screened by user before approval<br />Automate classification of incoming documents<br />this means lower quality, but also lower cost<br />Get an overview of interesting terms in a document corpus<br />classify all documents, extract the most interesting terms<br />this can be used as the starting point for building an ontology<br />(keyword extraction only)<br />
  31. 31. Example user interface<br />The user creates an article<br />this screen then used to add keywords<br />user adjusts the proposals from the classifier<br />
  32. 32. Vizigator<br />Graphical visualization<br />VizDesktop<br />Swing app to configure<br />filter/style/...<br />Vizlet<br />Java applet for web<br />uses configuration<br />loads via TMRAP<br />uses “Remote” backend<br />[Diagram labels: Viz, Ontopoly, TMRAP, Nav, DB2TM, Classify, Engine, Memory, RDBMS, Remote]<br />
  33. 33. The Vizigator<br />Graphical visualization of Topic Maps<br />Two parts<br />VizDesktop: Swing desktop app for configuration<br />Vizlet: Java applet for web deployment<br />Configuration stored in XTM file<br />
  34. 34. Without configuration<br />
  35. 35. With configuration<br />
  36. 36. The Vizigator<br />The Vizigator uses TMRAP<br />the Vizlet runs in the browser (on the client)<br />a fragment of the topic map is downloaded from the server<br />the fragment is grown as needed<br />Server<br />TMRAP<br />
  37. 37. Ontopoly<br />Generic editor<br />web-based, AJAX<br />meta-ontology in TM<br />Ontology designer<br />create types and fields<br />control user interface<br />build views<br />incremental dev<br />Instance editor<br />guided by ontology<br />[Diagram labels: Viz, Ontopoly, TMRAP, Nav, DB2TM, Classify, Engine, Memory, RDBMS, Remote]<br />
  38. 38. Ontopoly<br />A generic Topic Maps editor, in two parts<br />ontology editor: used to create the ontology and schema<br />instance editor: used to enter instances based on ontology<br />Built with the Web Editor Framework<br />works with both XTM files and topic maps stored in RDBMS backend<br />supports access control to administrative functions, ontology, and instance editors<br />existing topic maps can be imported<br />parts of the ontology can be marked as read-only, or hidden<br />
  40. 40. Typical deployment<br />Viewing<br />application<br />Engine<br />Users<br />DB<br />Backend<br />Ontopoly<br />Frameworks<br />Editors<br />DB<br />TMRAP<br />DB2TM<br />HTTP<br />DB<br />External application<br />Application server<br />
  41. 41. CMS integration<br />The best way to add content functionality to Ontopia<br />the world doesn’t need another CMS<br />better to reuse those which already exist<br />So far two integrations exist<br />Escenic<br />OfficeNet Knowledge Portal<br />more are being worked on<br />
  42. 42. Implementation<br />A CMS event listener<br />the listener creates topics for new CMS articles, folders, etc<br />the mapping is basically the design of the ontology used by this listener<br />Presentation integration<br />it must be possible to list all topics attached to an article<br />conversely, it must be possible to list all articles attached to a topic<br />how close the integration needs to be here will vary, as will the difficulty of the integration<br />User interface integration<br />it needs to be possible to attach topics to an article from within the normal CMS user interface<br />this can be quite tricky<br />Search integration<br />the Topic Maps search needs to also search content in the CMS<br />can be achieved by writing a tolog plug-in<br />
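A minimal sketch of the first two integration points, assuming nothing about any real CMS or about Ontopia's API: an event listener that creates a topic for each new article, plus the two lookups needed for presentation (topics attached to an article, and articles attached to a topic). The class `CmsIntegrationSketch` and all its methods are hypothetical, and in-memory maps stand in for the topic map and its "is about" associations.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class CmsIntegrationSketch {

    // article id -> title; stands in for the article topics in the topic map
    private final Map<String, String> articleTopics = new LinkedHashMap<String, String>();
    // the "is about" association, indexed in both directions
    private final Map<String, Set<String>> subjectsByArticle = new HashMap<String, Set<String>>();
    private final Map<String, Set<String>> articlesBySubject = new HashMap<String, Set<String>>();

    // Event listener callback: the CMS reports a newly created article.
    void onArticleCreated(String articleId, String title) {
        articleTopics.put(articleId, title);
    }

    // User interface integration: attach a subject topic to an article.
    void attach(String articleId, String subject) {
        if (!articleTopics.containsKey(articleId)) {
            throw new IllegalArgumentException("Unknown article: " + articleId);
        }
        subjectsByArticle.computeIfAbsent(articleId, k -> new TreeSet<String>()).add(subject);
        articlesBySubject.computeIfAbsent(subject, k -> new TreeSet<String>()).add(articleId);
    }

    // Presentation integration: all topics attached to an article.
    Set<String> subjectsOf(String articleId) {
        Set<String> found = subjectsByArticle.get(articleId);
        return found != null ? found : Collections.<String>emptySet();
    }

    // Presentation integration: all articles attached to a topic.
    Set<String> articlesAbout(String subject) {
        Set<String> found = articlesBySubject.get(subject);
        return found != null ? found : Collections.<String>emptySet();
    }

    public static void main(String[] args) {
        CmsIntegrationSketch integration = new CmsIntegrationSketch();
        integration.onArticleCreated("a1", "New city council appointed");
        integration.attach("a1", "Elections");
        System.out.println(integration.subjectsOf("a1"));
        System.out.println(integration.articlesAbout("Elections"));
    }
}
```

Which fields the listener copies from the CMS (title, author, workflow state, and so on) is the mapping decision the slide refers to as "the design of the ontology".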
  43. 43. Articles as topics<br />is about<br />Elections<br />New city council appointed<br />Goal: associate articles with topics<br />mainly to say what they are about<br />typically also want to include other metadata<br />Need to create topics for the articles to do this<br />in fact, a general CMS-to-TM mapping is needed<br />must decide what metadata and structures to include<br />
  44. 44. Mapping issues<br />Article topics<br />what topic type to use?<br />title becomes name? (do you know the title?)<br />include author? include last modified? include workflow state?<br />should all articles be mapped?<br />Folders/directories/sections/...<br />should these be mapped, too?<br />one topic type for all folders/.../.../...?<br />if so, use associations to connect articles to folders<br />use associations to reproduce hierarchical folder structure<br />Multimedia objects<br />should these be included?<br />what topic type? what name? ...<br />
  45. 45. Two styles of mappings<br />Articles as articles<br />Topic represents only the article<br />Topic type is some subclass of “article”<br />“Is about” association connects article into topic map<br />Fields are presentational<br />title, abstract, body<br />Articles as concepts<br />Topic represents some real-world subject (like a person)<br />article is just the default content about that subject<br />Type is the type of the subject (person)<br />Semantic associations to the rest of the topic map<br />works in department, has competence, ...<br />Fields can be semantic<br />name, phone no, email, ...<br />
  46. 46. Article as article<br />Article about building of a new school<br />Is about association to “Primary schools”<br />Topic type is “article”<br />
  47. 47. Article as concept<br />Article about a sports hall<br />Article really represents the hall<br />Topic type is “Location”<br />Associations to<br />city borough<br />events in the location<br />category “Sports”<br />
  54. 54. Two projects<br />
  55. 55. The project<br />A new citizen’s portal for the city administration<br />strategic decision to make portal main interface for interaction with citizens<br />as many services as possible are to be moved online<br />Big project<br />started in late 2004, to continue at least into 2008<br />~5 million Euro spent by launch date<br />1.7 million Euro budgeted for 2007<br />Topic Maps development is a fraction of this (less than 25%)<br />Many companies involved<br />Bouvet/Ontopia<br />Avenir<br />KPMG<br />Karabin<br />Escenic<br />
  56. 56. Simplified original ontology<br />Service catalog<br />Escenic (CMS)<br />LOS<br />Form<br />Article<br />nearly<br />everything<br />Category<br />Service<br />Subject<br />Department<br />Borough<br />External<br />resource<br />Employee<br />Payroll++<br />
  57. 57. Data flow<br />Ontopoly<br />Ontopia<br />Escenic<br />LOS<br />Integration<br />TMSync<br />DB2TM<br />Fellesdata<br />Payroll<br />(Agresso)<br />Dexter/Extens<br />Service<br />Catalog<br />
  58. 58. Conceptual architecture<br />Data<br />sources<br />Oracle Portal<br />Application<br />Ontopia<br />Escenic<br />Oracle Database<br />
  59. 59. The portal<br />
  60. 60. Technical architecture<br />
  61. 61. NRK/Skole<br />Norwegian National Broadcasting (NRK)<br />media resources from the archives<br />published for use in schools<br />integrated with the National Curriculum<br />In production<br />delayed by copyright wrangling<br />Technologies<br />OKS<br />Polopoly CMS<br />MySQL database<br />Resin application server<br />
  62. 62. Curriculum-based browsing (1)<br />Curriculum<br />Social studies<br />High school<br />
  63. 63. Curriculum-based browsing (2)<br />Gender roles<br />
  64. 64. Curriculum-based browsing (3)<br />Feminist movement in the 70s and 80s<br />Changes to the family in the 70s<br />The prime minister’s husband<br />Children choosing careers<br />Gay partnerships in 1993<br />
  65. 65. One video (prime minister’s husband)<br />Metadata<br />Subject<br />Person<br />Related<br />resources<br />Description<br />
  66. 66. Conceptual architecture<br />Polopoly<br />HTTP<br />Ontopia<br />MediaDB<br />Grep<br />DB2TM<br />TMSync<br />RDBMS backend<br />MySQL<br />Editors<br />
  67. 67. Implementation<br />Domain model in Java<br />Plain old Java objects built on<br />Ontopia’s Java API<br />tolog<br />JSP for presentation<br />using JSTL on top of the domain model<br />Subversion for the source code<br />Maven2 to build and deploy<br />Unit tests<br />
  68. 68. What we’d like to see<br />The future<br />
  69. 69. The big picture<br />[Diagram labels: Engine; Data integration; CMS integration; DB2TM; XML2TM; Taxon. import; Auto-class.; Escenic; OKP; Other CMSs; Portlet support; Ontopoly; Web service; several “A.N.other” placeholders]<br />
  70. 70. CMS integrations<br />The more of these, the better<br />Candidate CMSs<br />Liferay (being worked on at Bouvet)<br />Alfresco (might be started soon)<br />Magnolia<br />Inspera (possible project here)<br />JSR-170 Java Content Repository<br />CMIS (OASIS web service standard)<br />
  71. 71. Portlet toolkit<br />Subversion contains a number of “portlets”<br />basically, Java objects doing presentation tasks<br />some have JSP wrappers as well<br />Examples<br />display tree view<br />list of topics filterable by facets<br />show related topics<br />get-topic-page via TMRAP component<br />Not ready for prime-time yet<br />undocumented<br />incomplete<br />
  72. 72. Ontopoly plug-ins<br />Plugins for getting more data from externals<br />TMSync import plugin<br />DB2TM plugin<br /> plugin<br />adapted RDF2TM plugin<br />classify plugin<br />...<br />Plugins for ontology fragments<br />menu editor, for example<br />
  73. 73. TMCL<br />Now implementable<br />We’d like to see<br />an object model for TMCL (supporting changes)<br />a validator based on the object model<br />Ontopoly import/export from TMCL (initially)<br />refactor Ontopoly API to make it more portable<br />Ontopoly ported to use TMCL natively (eventually)<br />
  74. 74. Things we’d like to remove<br />OSL support<br />Ontopia Schema Language<br />Web editor framework<br />unfortunately, still used by some major customers<br />Fulltext search<br />the old APIs for this are not really of any use<br />
  75. 75. Management interface<br />Import topic maps (to file or RDBMS)<br />
  76. 76. What do you think?<br />Suggestions?<br />Questions?<br />Plans?<br />Ideas?<br />
  77. 77. Setting up the developer environment<br />Getting started<br />
  78. 78. If you are using Ontopia...<br />...simply download the zip, then<br />unzip,<br />set the classpath,<br />start the server, ...<br />...and you’re good to go<br />
  79. 79. If you are developing Ontopia...<br />You must have<br />Java 1.5 (not 1.6 or 1.7 or ...)<br />Ant 1.6 (or later)<br />Ivy 2.0 (or later)<br />Subversion<br />Then<br />check out the source from Subversion<br />svn checkout ontopia-read-only<br />ant bootstrap<br />ant dist.jar.ontopia<br />ant test<br />ant dist.ontopia<br />
  80. 80. Beware<br />This is fun, because<br />you can play around with anything you want<br />e.g., my build has a faster TopicIF.getRolesByType<br />you can track changes as they happen in svn<br />However, you’re on your own<br />if it fails, it’s kind of hard to say why<br />maybe it’s your changes, maybe not<br />For production use, official releases are best<br />
  81. 81. Participating etc<br />The project<br />
  82. 82. Our goal<br />To provide the best toolkit for building Topic Maps-based applications<br />We want it to be<br />actively maintained,<br />bug-free,<br />scalable,<br />easy to use,<br />well documented,<br />stable,<br />reliable<br />
  83. 83. Our philosophy<br />We want Ontopia to provide as much useful more-or-less generic functionality as possible<br />New contributions are generally welcome as long as<br />they meet the quality requirements, and<br />they don’t cause problems for others<br />
  84. 84. The sandbox<br />There’s a lot of Ontopia-related code which does not meet those requirements<br />some of it can be very useful,<br />someone may pick it up and improve it<br />The sandbox is for these pieces<br />some are in Ontopia’s Subversion repository,<br />others are maintained externally<br />To be “promoted” into Ontopia a module needs<br />an active maintainer,<br />to be generally useful, and<br />to meet certain quality requirements<br />
  85. 85. Communications<br />Join the mailing list(s)!<br /><br /><br />Google Code page<br /><br />note the “updates” feed!<br />Blog<br /><br />Twitter<br /><br />
  86. 86. Committers<br />These are the people who run the project<br />they can actually commit to Subversion<br />they can vote on decisions to be made etc<br />Everyone else can<br />use the software as much as they want,<br />report and comment on issues,<br />discuss on the mailing list, and<br />submit patches for inclusion<br />
  87. 87. How to become a committer<br />Participate in the project!<br />that is, get involved first<br />let people get to know you, show some commitment<br />Once you’ve gotten some way into the project you can ask to become a committer<br />best if you have provided some patches first<br />Unless you’re going to commit changes there’s no need to be a committer<br />
  88. 88. Finding a task to work on<br />Report bugs!<br />they exist; if you find any, please report them<br />Look at the open issues<br />there is always testing/discussion to be done<br />Look for issues marked “newbie”<br /><br />Look at what’s in the sandbox<br />most of these modules need work<br />Scratch an itch<br />if there’s something you want fixed/changed/added...<br />
  89. 89. How to fix a bug<br />First figure out why you think it fails<br />Then write a test case<br />based on your assumption<br />make sure the test case fails (test before you fix)<br />Then fix the bug<br />follow the coding guidelines (see wiki)<br />Then run the test suite<br />verify that you’ve fixed the bug<br />verify that you haven’t broken anything<br />Then submit the patch<br />
  90. 90. The test suite<br />Lots of *.test packages in the source tree<br />3148 test cases as of right now<br />test data in ontopia/src/test-data<br />some tests are generators based on files<br />some of the test files come from<br />Run with<br />ant test<br />java net.ontopia.test.TestRunner src/test-data/config/tests.xml test-group<br />
  91. 91. Source tree structure<br />net.ontopia.<br />utils various utilities<br />test various test support code<br />infoset LocatorIF code + cruft<br />persistence OR-mapper for RDBMS backend<br />product cruft<br />xml various XML-related utilities<br />topicmaps next slides<br />
  92. 92. Source tree structure<br />net.ontopia.topicmaps.<br />core core engine API<br />impl engine backends + utils<br />utils utilities (see next slide)<br />cmdlineutils command-line tools<br />entry TM repository<br />nav + nav2 navigator framework<br />query tolog engine<br />viz<br />classify <br />db2tm<br />webed cruft<br />
  93. 93. Source tree structure<br />net.ontopia.topicmaps.utils<br />* various utility classes<br />ltm LTM reader and writer<br />ctm CTM reader<br />rdf RDF converter (both ways)<br />tmrap TMRAP implementation<br />
  94. 94. Let’s write some code!<br />
  95. 95. The engine<br />The core API corresponds closely to the TMDM<br />TopicMapIF, TopicIF, TopicNameIF, ...<br />Compile with<br />ant init compile.ontopia<br />.class files go into ontopia/build/classes<br />ant dist.ontopia.jar # makes a jar<br />
  96. 96. The importers<br />Main class implements TopicMapReaderIF<br />usually, this lets you set up configuration, etc<br />then uses other classes to do the real work<br />XTM importers<br />use an XML parser<br />main work done in XTM(2)ContentHandler<br />some extra code for validation and format detection<br />CTM/LTM importers<br />use Antlr-based parsers<br />real code in ctm.g/ltm.g<br />All importers work via the core API<br />
  97. 97. Fixing a real bug<br />There is a failing test case in the TM/XML importer<br />So let’s fix that right now...<br />
  98. 98. Find an issue in the issue tracker<br />(Picking one with “Newbie” might be good, <br />but isn’t necessary)<br />Get set up<br />check out the source code<br />build the code<br />run the test suite<br />Then dig in<br />we’ll help you with any questions you have<br />At the end, submit a patch to the issue tracker<br />remember to use the test suite!<br />