Semantic web, python, construction industry


Sheets and notes for my 2005 Europython (Sweden) talk on the python/zope RDF application I made.


  1. Context. Setting for this talk: I assume you’re either a researcher or someone interested in the semantic web, and that you’re interested in the fun to be had with python and plone in that regard. If you’re a researcher, you might be a bit strapped for time. You’re supposed to be doing actual research, not sitting in Emacs typing in code. Another issue might be the quality of your prototypes. Making them look good takes a lot of effort, and so do error checking and the like. What you can take with you out of this talk is perhaps some ideas on using plone for your demos. Hey, maybe you’ll just make a UML model and have your work done for you by the archgenxml code generator! If you’re attracted to the semantic web, you might be either attracted or shocked by the simple way in which I’m using it. At the least, you might say I advocate using plone in a more data-oriented way instead of the common content-oriented way.
  2. Enough about you all, now about me :-) I’ve worked for five years now as a researcher in construction information. Mostly information exchange in building projects, a bit semantic web based. Now, the construction industry has much the same problems as the software industry. Almost the same picture was hanging in the coffee room at the university, and I was one of the very few programmers, so there it was a joke on the construction industry. Likewise, the construction industry is often held up as a shining light to which the programmer must look. Ah, no, actually. How much effort goes into a building’s specification? The drawing takes a lot of time, but you’ve got 36 hours for good written documentation of the non-drawn things. And actually following the instructions? Getting everything delivered on time? Large infrastructural projects normally go way over budget. Partly because the politicians want things done differently, but is that any different from clients of the software industry? So you see. Well, automatic couplings between the drawing and the cost estimation. Would be logical, don’t you think? Ouch, effectively 95% of the computer drawings are just lines. For the cost estimation program, it’s almost the same as a scanned-in hand-made drawing. Those written specifications? Text. Pure text. Perhaps a classification system. The problem is that there are no large players. The top 5 companies in the Netherlands combined have about 5% market share. All those small little companies you find in every town… When Boeing or Airbus wants their suppliers to use a certain system, it happens. If the #1 construction player says it, it doesn’t happen. Too costly, as #2 wants something else and #3 also.
  3. Semantic web - simply. “Semantic web” sounds very Artificial Intelligence like. Well, python borrowed some things from Lisp, so that’s not so out of place at this conference :-) I myself am treating the semantic web in a very simple way: on a document or application level, make your data available as XML or RDF, downloadable via http. Perhaps password protected if needed. Now, how bad can that be? In plone, it is often just a case of adding another page template. If you look at the screen you see a part of my application. A template that creates such a page takes some work: getting everything in the right place with the right icons in front of it and so on. But exporting it as a bit of XML takes about a quarter of the time. Just select the items, print the right tag and slap the title into the tag. That’s it. One week after doing that I got an email from a fellow researcher at a different institute saying he’d made a java desktop client to view the same data. WHAT?! Yes, that’s pretty easy to do: just download an xml file and you’ve got all the data you need. As a researcher, this ease of accessing your data might mean the difference between seeing your research used by others and having it linger in your papers and on just your computer. One thing that’s more core semantic web that I’d like to advertise here is the use of ontologies. An ontology is basically a set of terms or definitions that can be used by other applications, defined in a computer-readable way. Every single item in the ontology has a unique ID, which is a URL. I’ll talk about that ID mechanism later, it’s really a great mechanism to get your data out of an isolated position. Back to the ontology. If application 1 says “this object of mine is an Overpass according to that ontology” and application 2 says the same about an object they’ve got stored, they’re compatible. If application 1 is the design app you’re working in and application 2 is a cost estimation app, you can suddenly get a good cost estimation for your overpass construction project.
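To make the "select the items, print the right tag, slap the title into the tag" step concrete, here is a minimal sketch in plain Python. It is not the plone page template from the talk: the item list, the tag names and the example.org URLs are all made up for illustration. It also shows how little a client on the other side needs to do to read the data back.

```python
import xml.etree.ElementTree as ET

# Hypothetical items, standing in for whatever the application has selected.
# The IDs are made-up example.org URLs, not the real ontology.
items = [
    {"id": "http://example.org/ontology#Overpass", "title": "Overpass"},
    {"id": "http://example.org/ontology#Bridge", "title": "Bridge"},
]


def export_items(items):
    """Return the items as an XML string: one <item> tag per item."""
    root = ET.Element("items")
    for item in items:
        element = ET.SubElement(root, "item", {"id": item["id"]})
        element.text = item["title"]
    return ET.tostring(root, encoding="unicode")


def read_titles(xml_text):
    """What a client program might do: parse the file and read the titles."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.findall("item")]


xml_text = export_items(items)
titles = read_titles(xml_text)
```

Serving `xml_text` over http is then all another program needs in order to use the same data.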
  4. Plone. The previous screen you saw was the plone collaborative ontology editor I made. What you see here is the UML model of the application. I’ll explain the reasoning behind it a bit. First things first: this was used as the input for the ArchGenXML code generator. An older version, so things would’ve been a bit cleaner if I did it again from scratch. Anyway. [FU/TS explanation] [draw a hamburger diagram on the board] Now back to the time-strapped researcher’s neck-saving measure: ArchGenXML generates a complete archetypes-based plone product out of this. The entire directory structure, the .py files for the classes, empty templates ready to fill in for the extra screens you defined on your classes. Containment/aggregation is mapped to folder structures, with a possibility to restrict the allowed types inside the folder. All automatically. You’re wasting time if you’re doing it all by hand at the moment!
  5. Rdflib. XML is much more well-known, but RDF is also quite handy. What you see on the screen is pretty much the basic principle of RDF. You’ve got sets of information (files) with classes within them, all identified by a URL. Classes in other files can refer to any URL, just like in the normal web: you can hyperlink to any file. With RDF every piece of information gets a URL, so every bit of information is suddenly referenceable. The fun part with RDF is that those links have a URL too. So you can have a different URL for “is father of” and for “goes to that conference”. All of RDF eventually boils down to subject/verb/object, so “I” “am presenting at” “europython”. Can’t be any simpler. You can make your own vocabulary! And everybody can join in! For this, I had to integrate the python rdflib (which is pretty nice) with zope (which I love). But zope’s database has extensionclasses which bite python2.2’s new style classes (which are used by rdflib). So I had to manually mutilate the code. Not pretty. Not a pretty result. But… I could load RDF files into zope. The new Zemantic project does the same, but in a much cleaner way. So don’t ask for my product! Bug them.
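The subject/verb/object idea can be sketched without any library at all. The snippet below deliberately does not use rdflib's API; it just stores triples as plain Python tuples, and every URL in it is a made-up example.org placeholder. The point is only that the link ("is presenting at") is itself a URL, so different kinds of links stay distinguishable.

```python
# All URLs are hypothetical placeholders, used only for illustration.
ME = "http://example.org/people#me"
EUROPYTHON = "http://example.org/events#europython"
MY_KID = "http://example.org/people#kid"

# The "verbs" are URLs too, so each kind of link has its own identity.
IS_PRESENTING_AT = "http://example.org/vocab#isPresentingAt"
IS_FATHER_OF = "http://example.org/vocab#isFatherOf"

# A tiny "graph": a set of (subject, verb, object) triples.
triples = {
    (ME, IS_PRESENTING_AT, EUROPYTHON),
    (ME, IS_FATHER_OF, MY_KID),
}


def objects(triples, subject, verb):
    """Return every object linked to the subject through the given verb."""
    return sorted(o for s, v, o in triples if s == subject and v == verb)


conferences = objects(triples, ME, IS_PRESENTING_AT)
```

A real RDF library adds parsing, serialisation and namespaces on top of this, but the underlying data model is no more complicated than the set of tuples above.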
  6. Conclusions. Plone is good for a researcher, especially with archgenxml code generation. You get a good user interface which is collaborative in nature, with relatively low effort. On the screen you see a collaborative ontology editor in the lower left corner, a PloneMall powered catalog application in the top left and a project creation/administration tool on the right. The lower left one kept me busy for three months but the other two took me a week each from start to finish. All three applications export or import http-provided data. Just serve or download a file and your program gets much more interesting. Catalog and “object tree” (a project management application) both use the ontology and therefore can understand each other. So… Make your application data-friendly and export your stuff.
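Why sharing an ontology makes two applications "understand each other" can be shown with a small sketch. Everything here is assumed for illustration: the ontology URLs, the object names and the cost figures are invented, and the real applications exchange this data over http rather than through in-memory dictionaries.

```python
# Hypothetical ontology IDs (made-up example.org URLs).
OVERPASS = "http://example.org/ontology#Overpass"
TUNNEL = "http://example.org/ontology#Tunnel"

# Application 1, a design tool: its objects are typed with ontology IDs.
design_objects = [
    {"name": "Overpass north", "type": OVERPASS},
    {"name": "Service tunnel", "type": TUNNEL},
]

# Application 2, a cost estimator: costs are keyed by the same IDs.
# The figure is invented purely for the example.
unit_costs = {OVERPASS: 1500000}


def estimate(design_objects, unit_costs):
    """Attach a cost to every design object whose ontology ID the estimator knows."""
    return {
        obj["name"]: unit_costs[obj["type"]]
        for obj in design_objects
        if obj["type"] in unit_costs
    }


estimates = estimate(design_objects, unit_costs)
```

Neither application needs to know anything about the other; agreeing on the ID for "Overpass" is enough to join their data.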
  7. Stuff. Reinout van Rees. This research was done at Delft University of Technology; since 1 April I'm working at Zest software, where I'm trying to make medium-sized organisations happy with a shiny plone website. A couple of things not mentioned which might interest you, so if you want, ask a question: python for data conversion, importance of project automation (also when writing 200 page dissertations), data openness.