Drupal, Calais
& the Semantic Web
Prepared by Frank Febbraro, CTO & Presented by Jeff Walpole, CEO
Phase2 OpenPublish Presentation SF SemWeb Meetup, April 28, 2009


Presentation on the new OpenPublish platform - built on Drupal 6 and OpenCalais - by Jeff Walpole, CEO, and Frank Febbraro, CTO, Phase2 Technology.

2 Comments
  • Hi there -- your best bet for more detailed info is the OpenPublish site: http://www.opensourceopenminds.com/openpublish

    You can also ask specific questions of the OpenPublish architect, Frank Febbraro, via Twitter at @Febbraro: http://twitter.com/febbraro

    If you just want to use the Calais modules for Drupal, you can find them here: http://drupal.org/project/opencalais

    Frank created these as well.
  • Do you have any notes or a video from this presentation?

    It's difficult to get enough information from the slides, especially regarding Topic Hubs.

    I can't really find any step-by-step guide to using this Drupal module.

  • Developing quite a few great SemWeb modules too. Arto is a maniac
  • Calais provides the semantic engine for OpenPublish. It gives us the context to the world outside of our site. So let's talk about how Calais and Drupal work together.
  • Has anyone used Calais? This represents the core of our discussion. The Calais module sits at the epicenter of this collection of modules. It provides the API and the integration with nodes, including auto-tagging of your nodes. The other modules we developed sit on top of the Calais data to drive the power of the metadata into your site and to your users.
  • As I said, Calais is an auto-tagger. It’s really just a taxonomy integration. Calais Terms are like the fraternal twin of Taxonomy. We wanted to make use of taxonomy for the added benefits.
  • How is it configured? Calais is configured per content type.
  • Saving is where the magic happens.
  • Use the relevance threshold to limit the amount of noise. You can also blacklist terms, substitute terms, hook into the process, etc.
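To make the threshold and blacklist idea concrete, here is a minimal Python sketch. It is not the module's code (the module is Drupal/PHP); the function name `filter_terms`, the 0-to-1 score scale, and the sample scores are all assumptions made up for illustration.

```python
from typing import Dict, Iterable, List


def filter_terms(scored_terms: Dict[str, float],
                 threshold: float,
                 blacklist: Iterable[str] = ()) -> List[str]:
    """Keep terms scoring at or above the threshold that are not blacklisted.

    Hypothetical helper: the score scale and matching rules are assumptions.
    """
    blocked = {term.lower() for term in blacklist}
    kept = [
        term for term, score in scored_terms.items()
        if score >= threshold and term.lower() not in blocked
    ]
    # Most relevant first; ties broken alphabetically for stable output.
    return sorted(kept, key=lambda term: (-scored_terms[term], term))


# Made-up relevance scores for terms suggested for one node.
suggested = {
    "Reuters": 0.86,
    "ClearForest Ltd.": 0.71,
    "Israel": 0.31,
    "New York": 0.22,
}
print(filter_terms(suggested, threshold=0.3, blacklist=["Israel"]))
```

Raising the threshold trims low-confidence suggestions; the blacklist removes terms you never want, regardless of score.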
  • Autodiscovery links allow bots, browsers, readers, etc. to find content in other formats related to the current page. Seen here are a few other related content formats; the application/rdf+xml link points to the related Calais RDF document in XML form.
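For a sense of how a client might consume autodiscovery links, here is a small Python sketch that uses only the standard library. The sample page and its `href` paths are hypothetical placeholders, not the URLs the module actually emits.

```python
from html.parser import HTMLParser
from typing import List, Tuple


class AlternateLinkParser(HTMLParser):
    """Collects (type, href) pairs from <link rel="alternate"> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.alternates: List[Tuple[str, str]] = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if "alternate" in rels and attr.get("href"):
            self.alternates.append((attr.get("type") or "", attr["href"]))


def find_alternates(html: str) -> List[Tuple[str, str]]:
    parser = AlternateLinkParser()
    parser.feed(html)
    return parser.alternates


# Hypothetical page head; the paths are placeholders.
PAGE = """
<html><head>
  <link rel="alternate" type="application/rss+xml" href="/rss.xml" />
  <link rel="alternate" type="application/rdf+xml" href="/example/node/1/rdf" />
  <link rel="stylesheet" href="/style.css" />
</head><body></body></html>
"""

rdf_links = [href for mime, href in find_alternates(PAGE)
             if mime == "application/rdf+xml"]
print(rdf_links)
```

A crawler or reader would then fetch the RDF link to get the machine-readable version of the page.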
  • RDF is great for representing data, but awful for your eyes. That is why semwebbers all wear glasses. This is the #1 comment I have received. RDFa is a method for embedding RDF data into XHTML documents. GRDDL can be used to transform it into RDF. We did not tackle RDFa in Calais YET, because this is an area that is being worked on and integrated into D7 (at the theme layer), and that work has already begun. Might be a nice back-port though.
  • RDF can turn you into stone.
  • More Like This is a collection of modules consisting of a core “framework” module that provides a plugin architecture, allowing modules to provide related content, on or off site.
  • Start with a More Like This Thumbprint (Terms). This is the thumbprint of a node, the terms that you feel most accurately represent the essence of your node content. In here you will select or enter terms, or have Calais prefill. Calais returns a relevancy score, we can use that to prefill these automatically.
  • Configure the relevance score that a term must have to be automagically applied.
  • When viewing a node, it now provides other relevant on site nodes matched based on taxonomy.
  • It also does off site searching, seen here using Yahoo’s BOSS, Build your Own Search Service.
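The on-site matching can be pictured with a short Python sketch. The notes only say nodes are "matched based on taxonomy", so ranking by the number of shared terms is an assumption, and `more_like_this` and the sample nodes are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def more_like_this(thumbprint: Set[str],
                   nodes: Dict[str, Set[str]],
                   limit: int = 5) -> List[Tuple[str, int]]:
    """Rank other nodes by how many thumbprint terms they share.

    Assumption: overlap count is the score; the real module may differ.
    """
    scored = [(title, len(thumbprint & terms)) for title, terms in nodes.items()]
    matches = [(title, shared) for title, shared in scored if shared > 0]
    # Highest overlap first; ties broken by title for stable output.
    matches.sort(key=lambda pair: (-pair[1], pair[0]))
    return matches[:limit]


# Made-up nodes and their taxonomy terms.
other_nodes = {
    "Honda cuts output": {"Automotive", "Japan"},
    "Toyota recalls sedans": {"Toyota", "Automotive", "Recall"},
    "Reuters buys ClearForest": {"Reuters", "M&A"},
}
current_thumbprint = {"Toyota", "Automotive", "Japan"}
print(more_like_this(current_thumbprint, other_nodes))
```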
  • Topic Hubs are site pages that aggregate content based on inclusion in taxonomy expressions.
  • Here is where you can build your expressions. You can broaden or narrow the scope based on the expression you create. Simply put, all nodes, comments, etc. that match this expression will be present in your Topic Hub.
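One way to picture a contextual expression is as a small boolean tree over taxonomy terms. The Python sketch below is purely illustrative: the slides do not describe the expression format, so the tuple syntax and the `matches` helper are assumptions. It does show why AND narrows a hub and OR broadens it.

```python
from typing import Set, Tuple, Union

# Hypothetical expression format: a term name, or ("and" | "or", sub-expressions...).
Expr = Union[str, Tuple]


def matches(expr: Expr, terms: Set[str]) -> bool:
    """Return True if a node tagged with `terms` satisfies the expression."""
    if isinstance(expr, str):
        return expr in terms
    op, *args = expr
    if op == "and":
        return all(matches(arg, terms) for arg in args)
    if op == "or":
        return any(matches(arg, terms) for arg in args)
    raise ValueError(f"unknown operator: {op!r}")


# "Toyota" narrowed to stories that also mention Japan or the United States.
hub_expression = ("and", "Toyota", ("or", "Japan", "United States"))

print(matches(hub_expression, {"Toyota", "Japan"}))
print(matches(hub_expression, {"Toyota"}))
```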
  • There are a variety of plugins, or you can define your own.
  • This shows how the various plugins present the content on your site that matches your contextual expression.
  • The map provides some nice features, showing your content based on geographical terms: cities, states, or countries.
  • They are just panels so add whatever you want. Node content, views, blocks, define your own. What makes the TopicHub plugins unique is that they respond to the context of your Hub, using the expression.
  • Linked Data refers to the linking of RDF datasets across the Semantic Web. The Sony referenced over here is the same Sony talked about over there. This has been a huge goal of the Semantic Web for quite some time, and it is finally alive.
  • This diagram shows the Linked Data world. New datasets are being released all the time, and this diagram is already obsolete, as the Calais Linked Data is not in it.
  • Again, RDF is ugly.
  • DBPedia human-readable data.
  • Calais has disambiguated these geographical terms and provided lat/lon for us.
  • But the Calais Linked Data URIs allow much more.
  • Here we are showing additional data retrieved from DBPedia
  • Article about Toyota having a rough go at it. Who would have thought a car company would be in financial trouble in this day and age!?!?!
  • This grabs the most relevant company from Calais and if it is disambiguated, looks up data on DBPedia.
  • This is a view of the Taxonomy Term edit screen. The Calais Term for Toyota has the following Linked Data URI.
  • With that URI, we grab the RDF from Calais for the disambiguated company. The returned RDF document has a link to the DBPedia resource that is “the same as” this resource.
  • With that resource URI, we create a SPARQL query to get data from DBPedia via its SPARQL endpoint. (An endpoint is just a fancy name for a web service that responds to SPARQL queries.)
  • We then render the resultant data into HTML. Easy as Pie.
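The last few steps can be sketched in Python with the standard library only. This is a rough illustration, not the module's implementation: the sample RDF document, its `example.org` subject URI, and the helper names are assumptions, and the real Calais RDF may be shaped differently. Sending the query to the DBPedia endpoint and rendering the results are left out.

```python
import xml.etree.ElementTree as ET
from typing import Optional

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL_NS = "http://www.w3.org/2002/07/owl#"


def find_dbpedia_same_as(rdf_xml: str) -> Optional[str]:
    """Return the first owl:sameAs target that points at DBPedia, if any."""
    root = ET.fromstring(rdf_xml)
    for elem in root.iter(f"{{{OWL_NS}}}sameAs"):
        target = elem.get(f"{{{RDF_NS}}}resource", "")
        if target.startswith("http://dbpedia.org/resource/"):
            return target
    return None


def build_sparql(resource_uri: str, limit: int = 50) -> str:
    """Build a query for all properties and values of one resource."""
    if any(ch in resource_uri for ch in "<> \n"):
        raise ValueError("resource URI contains characters unsafe for SPARQL")
    return (
        "SELECT ?property ?value WHERE { "
        f"<{resource_uri}> ?property ?value . "
        f"}} LIMIT {limit}"
    )


# Hypothetical RDF/XML standing in for the document returned by Calais.
SAMPLE_RDF = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <rdf:Description rdf:about="http://example.org/calais/company/toyota">
    <owl:sameAs rdf:resource="http://dbpedia.org/resource/Toyota"/>
  </rdf:Description>
</rdf:RDF>"""

dbpedia_uri = find_dbpedia_same_as(SAMPLE_RDF)
if dbpedia_uri is not None:
    print(build_sparql(dbpedia_uri))
```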
  • Marmoset recognizes search bots (configurable), sends your page to Calais, and injects microformats into the body of your page that crawlers such as Yahoo SearchMonkey can comprehend. So what does this pyramid bring us to?
  • OpenPublish is a Drupal semantic publishing platform. It consists of Drupal, an install profile, and a number of modules that we have combined to provide a great starting point for publishers, filled with best practices from our experience. There is nothing you could not build yourself, but we have combined things you would likely want, to save you a few (or a few hundred) hours. Save a newspaper.
  • Go and download it, install it, kick the tires. Let us know your thoughts. We love feedback.
  • We will be showing people how to install and configure OpenPublish and the Calais Collection modules. Work through issues, give feedback, provide ideas.

    1. Drupal, Calais & the Semantic Web. Prepared by Frank Febbraro, CTO & Presented by Jeff Walpole, CEO
    2. Introductions (and sizing each other up)
       • Raise your hand if you are a… Technologist? Journalist? SemHead?
       • Raise your hand if you use or have used… Drupal? Calais API?
       • Let's play word association… Linked data, RDF, SPARQL, GRDDL
    3. Publishing tech Phase2 is working on
       • CMS frameworks
       • Drupal & Java Development
       • Taxonomy solutions
       • Geo-tagging & Mapping
       • Charting & Graphing Data
       • Semantic Web integration
       • Open Data/APIs
       • Topic Hubs
       • Publishing workflow
       • Feed Syndication
       • Buzz and topic trend monitoring
       • Community collaboration sites
       • Multi-site & virtual site CMS architecture
       • An open source CMS installation specifically for publishers, called OpenPublish
    4. Drupal: We heart Drupal
    5. Why We use Drupal for CMS
       • Performance/Reliability: Dozens of major publishers turn to Drupal and tens of thousands of high traffic sites because it is an enterprise class platform.
       • Ease/Expense of Implementation: As one of the leading shops developing for this platform, we can be as efficient as anyone and this platform is our preferred technology.
       • Evolving Technology Extensibility: You need something modular/extensible that allows you to add new features easily and we know this is possible with Drupal.
       • Easier Modular Enhancements: Drupal's architecture is modular and integrates well without requiring customization to core components that would make them difficult to maintain.
       • P2 Expertise: Our entire development staff of 12+ developers can support you on Drupal and we are known as one of the top firms in the country.
       • Large Community Support: You need a community that is active, robust, responsive and growing. We are involved in the Drupal community and have an ear to the ground on features and changes that would affect your site.
       • Easy Staff Training: The Drupal CMS is intuitive and we are well versed in training others to use it. To support training, there are numerous videos, online tutorials, local classes and even books on how it works.
       • Decreased Support Costs: Publishers find they can do a lot more themselves and when they do need help, the time is a fraction of what a proprietary CMS would cost for similar changes.
    6. Drupal Semantic Modules: rdf, rdf cck, foaf, relations, sparql, sioc, calais collection. http://www.youtube.com/watch?v=r4WgTRIRoa0
    7. The Calais API: connect. everything.
    8. How does Calais work?
       1. Categorizes and metatags the people, places, companies, facts and events in your content to make it ‘machine-readable,’ and returns that metadata to you.
       2. Makes connections between the entities in your content and related data in Wikipedia, GeoNames, the IMDB, Shopping.com and more.
       3. Empowers you to share your metadata with search engines, news aggregators, ‘related stories’ applications and others in the content ecosystem.
    9. What Would that Look Like (in code)?

       Markup:
       <Topic>M&A</Topic>
       <Acquisition offset="494" length="130">
         <Company_Acquirer>Reuters</Company_Acquirer>
         <Company_Acquired>ClearForest Ltd.</Company_Acquired>
         <Status>Planned</Status>
       </Acquisition>
       <Company>Reuters</Company>
       <Company>ClearForest Ltd.</Company>

       Source text:
       Reuters Announced the Acquisition of ClearForest
       New York - April 30, 2007
       Reuters, the global information company, has entered into an agreement to acquire all of the outstanding shares of ClearForest Ltd., a privately held provider of Text Analytics solutions, whose tagging platform and analytical products allow clients to derive precise business information from huge amounts of textual content. ClearForest has received sufficient shareholder approval to complete the transaction, which is expected to close in approximately 30 days, subject to customary closing conditions. The financial terms were not disclosed. Reuters plans to retain and continue to work with the existing management team and their highly skilled workforces in the US and Israel. It also plans to continue to support existing products and customers. Reuters believes that search will be a pivotal element to the future of how financial information is sourced and consumed. As part of its drive into this space, Reuters has created a new strategic group and appointed Gerry Campbell, who will oversee the integration of ClearForest and drive this innovation.

       Markup:
       <Product>Text Analytic Solution</Product>
       <Company>ClearForest Ltd.</Company>
       <Company>Reuters</Company>
       <Country>United States</Country>
       <Country>Israel</Country>
       <Company>Reuters</Company>
       <Person>Gerry Campbell</Person>
       <ManagementChange offset="2789" length="92">
         <Person>Gerry Campbell</Person>
         <Company>Reuters</Company>
         <Action>Enters</Action>
       </ManagementChange>
    10. Drupal, Calais for Drupal: Calais Collection
    11. What does Calais for Drupal Look Like?
        • suggest terms, allowing full user control of the tagging (think of del.icio.us recommending tags).
    12. Calais Terms or Taxonomy Terms?
    13. configure per node type
    14. save. magic happens.
    15. Too much information? Limit it.
    16. autodiscovery, what’s that mean?
    17. RDF for your nodes
    18. LOOK AWAY!!!!!!!
    19. Drupal, Calais for Drupal, More Like This, Topic Hubs, Geo: More Like This
    20. automatically prefill from Calais
    21. configure threshold
    22. relevant on-site content to browse
    23. relevant news from the web
    24. Drupal, Calais for Drupal, More Like This, Topic Hubs, Geo: self organizing content
    25. create a contextual expression
    26. configure plugins (or define your own)
    27. tell your story
    28. show content in various contexts
    29. they are panels, so rearrange
    30. Drupal, Calais for Drupal, Linked Data, More Like This, Topic Hubs, Geo: Linked Data
    31. Linked Data Datasets
    32. Linked Data: it’s all about the URIs
        • Drupal: http://dbpedia.org/resource/Drupal
        • Washington DC: http://d.opencalais.com/er/geo/city/ralg-geo1/f497898f-2b9b-7cda-ec7b-85d896acbe3e
        • Calais linked data for humans
    33. Calais linked data RDF
    34. hello dbpedia (for geeks)
    35. Calais geo config
    36. on a map, but wait, there’s more
    37. dbpedia data (or other sources)
    38. semantic company data
    39. company data from dbpedia
    40. Calais URI for Toyota
    41. Get the next link to follow
    42. SPARQL query to get the data from DBPedia
    43. render it to html, voilà
    44. Drupal, Calais for Drupal, Linked Data, More Like This, Topic Hubs, Geo, Marmoset: Marmoset, microformats for search agents
    45. The Big Picture – OpenPublish: Drupal, Calais for Drupal, Linked Data, More Like This, Topic Hubs, Geo, Marmoset
    46. http://opensourceopenminds.com/openpublish
    47. Enough Talk - let's see a demo…
    48. Q&A