Drupal and the Semantic Web: from RDF to Whitehouse.gov - SemTech2010

As we usher in an era of open data, in which organizations of all types are taking a cue from the administration's focus on transparency, accountability and efficiency by making raw data available to the public in platform-independent formats, the web CMS is rapidly becoming a useful place to showcase semantic web standards.

Meanwhile, this trend is converging with a dramatic rise in the popularity of Drupal, the most semantic-web-friendly open source CMS, which features RDF as part of its core architecture. In October 2009, Whitehouse.gov, the official site of the President and the flagship site for the administration, relaunched on Drupal and featured some nods to semantic web technology, adding RDFa content and making heavy use of taxonomies to drive content and search.

Published in: Technology
  • Introduce myself
    Explain Phase2
    Discuss the fact that Frank is not present
    New agenda: less tech/more use case driven - no examples

  • Explain how we are/were a development shop building custom
    Found OSS CMS in 2004 and Drupal in 2005


  • Explain the history and uses of OpenCalais
    How it works on the admin side
    How configuration can be controlled
  • What are the other modules built and developed around it
    How an API provides the engine through which we can develop new features
  • ~ 12K downloads
    ~ 2,400 active sites
    ~ 20%
  • Developing quite a few great SemWeb modules too. Arto is a maniac
    The RDF CCK module allows site administrators to map each content type, node title, node body and CCK field to an RDF term (class or property).
  • Drupal 7 takes RDF as a central part of the architecture. New modules are coming that will do even more

    Drupal 7 RDF module maintainer: Stéphane "scor" Corlosquet
    Drupal 7 RDF contributor and evangelist extraordinaire: Lin Clark
    Code contributors:
    Mark Birbeck
    Alex Bronstein
    John Breslin
    Benjamin Doherty
    Stefan Freudenberg
    Rolf Guescini
    Daniel F. Kudwien
    Florian Lorétan
    Frédéric Marand
    Benjamin Melançon
    John Morahan

  • So how is this being used to fuel the open gov movement and why?
  • This is how citizens see the concepts behind open gov
  • Not everyone sees gov2.0 and opengov the same way; some have interpreted them more from a data/technologist's perspective. The good news is that Drupal is equally suited to address these needs.
  • Enabling the public to have a two way conversation with the government
    Be pro-active in publishing to the web
    Collect needs/ideas from citizens
    Improve citizen services online
    Be more open with information, data and policy decision making
  • On December 8, 2009, the Obama Administration released the Open Government Directive (OGD) memo


  • But OG is not just about open source or even open data
  • 47 data sets May 2009
    270K+ data sets a year later in June 2010
    unlocking data unlocks opportunities
    public knowledge
    core mission
    economic opportunity

  • OG is doing something very important in that it is creating innovation. AppsforAmerica and Code for America are great examples. There are now lots of ways that developers can get engaged in helping govt.


  • Kieran’s list has 17, but we know there are many more. The list is likely to double this year just based upon current inquiries.

  • The REAL Overview:
    - More than scaling a website: it was scaling the delivery of Drupal websites.
    - Cover project details, the site itself, go over the launch, infrastructure, and what we've been doing since
    - Why replace? They only had a website before, but when it was over, we had provided them a platform to build on and to tap into (and now participate in) our vast community of creative problem solvers

  • Disclaimer.
    - Due to NDAs etc., I cannot go into great detail about things.
    - Thrilled that I can talk about it though

  • Why Drupal? (it rawks!!)
    - New Media was a champion of Open Source and Drupal for whitehouse.gov.
    - The team had a very clear vision of what they wanted: detailed control to tell the human interest side of the Presidency. Drupal provided that.
    - New functionality and improved administrative capabilities and a platform to extend.

  • What makes this platform we built great?
    - Great design
    - Drupal 6
    - Performance patches
    - Lots of contrib modules
    - Custom features and integrations

  • This is a rather typical architectural approach for some of the larger Drupal-based sites.
  • Key Functionality: Apache Solr search w/ Faceting.
    - Big benefit here and a massive improvement over the original.
  • Over a quarter of a million visitor records exposed. Released monthly. Bulk import, staging, and cutover via Drush

  • Key Functionality: Media Browser.
    - Custom Solr Search integration
    - Categorical filtering of media objects.
    - AJAX enabled categorical browsing.
    - Fallback HTML version for 508 compliance.
  • Building on that, we overhauled the handling of multimedia to take all the guesswork out. A strict process for content entry leads to far more consistent usage and rendering of imagery and media. This also leads to better 508 compliance, as content input and referencing is strictly controlled. Node Embed is now released to the public.
  • An HTML5 version of the site was implemented. One of the great features of that is that it can now display video on my iPad.
  • Key Functionality: Tight integration with Akamai Cache Control Utility. Clears the cache automatically on content updates, and also allows any individual page to be cleared from a button on that page. This is a more flexible utility to clear any URL directly from the CMS.
  • Launch: No DNS delays, etc. We were locked into the launch 4 hours prior, so it was like clicking up the track of a roller coaster waiting to go over the top. Crazy. At each hour leading to launch we were checking the status of servers/functionality & monitoring performance. Then at 1pm exactly the firehose was turned on.

    My desktop monitoring each web & database server the day of launch. I was looking at top, watching replication, database connections, number of apache processes, free memory, etc.

  • New user functionality
    More opengov responsiveness
    Great data use
    More RDF???

    How does this apply & what does the future hold?
    - This site sets a new bar for how large scale Drupal can be deployed.
    - Security, Process Review, and Scalability
    - Processes are not all Drupal based, but the process is key
    - As Drupal moves up market this will become more and more important
    - These orgs are ready for us, but we need to be ready for them

  • How is it being developed?
    From our work with related open government efforts, we’ve developed a framework and process for implementing sites that are compliant and forward-thinking about OGD.
    Why?
    Because open technology can only be used to accomplish OGD goals if it’s done correctly, responsibly, and with minimal burden on agencies.
    Who will use it?
    Government agency technology reps required to comply with the OGD.
    To accomplish what?
    Immediate help with compliance, but also proactive commitment to open government shared through open technology




  • Jeff:
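
The Akamai cache-control behavior described in the notes above (automatic clearing on content updates, plus a per-page clear button) can be sketched roughly as follows. This is a minimal illustration in Python, not the actual Drupal akamai module: `CachePurger`, `send_purge`, and the example paths are all hypothetical names, and a real integration would call the CDN's purge service rather than append to a list.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class CachePurger:
    """Hypothetical stand-in for a CDN cache-control integration."""

    # Assumed callback that would send one purge request to the CDN.
    send_purge: Callable[[str], None]
    # Record of every path purged so far, in order.
    purged: List[str] = field(default_factory=list)

    def purge_url(self, path: str) -> None:
        """Clear a single path, as the per-page button would."""
        self.send_purge(path)
        self.purged.append(path)

    def on_content_update(self, path: str, related_paths: List[str]) -> None:
        """Clear the updated page and any listing pages that show it."""
        seen = set()
        for p in [path, *related_paths]:
            if p not in seen:  # avoid purging the same path twice
                seen.add(p)
                self.purge_url(p)


# Usage: collect purge requests in a list instead of contacting a CDN.
requests_sent: List[str] = []
purger = CachePurger(send_purge=requests_sent.append)
purger.on_content_update("/blog/example-post", ["/blog", "/"])
purger.purge_url("/about")  # manual clear of one page
print(requests_sent)
```

Keeping the purge call behind a callback means the same logic serves both the automatic path (content saved) and the manual path (button pressed), which matches the two uses described in the notes.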
  • Drupal and the Semantic Web: from RDF to Whitehouse.gov - SemTech2010

    1. 1. Drupal and the Semantic Web: from RDF to WHITEHOUSE.GOV Jeff Walpole, CEO
    2. 2. What’s Being Covered Why Drupal Rocks the Semantic Web Why the OpenGov movement needs Drupal and the semantic web Deep dive case study on WhiteHouse.gov The case for a semantic enabled distribution of Drupal for open government
    3. 3. Why We Use Drupal? Technology Extensibility Easy Modular Enhancements Out of the box Web 2.0 Semantic Web Friendly Performance/Reliability Ease of Implementation THE COMMUNITY!
    4. 4. Why Drupal Works for Semantics Open Source Modular Architecture System of Nodes and relationships Great technical vision for the future Touches the right type of sites (publishing, government, etc.) Allows common users to become semantic publishers
    5. 5. Integrating OpenCalais
    6. 6. Calais Collection of Modules Calais Geo More Like This Topic Hubs Linked Data
    7. 7. OpenPublish - Publishers’ Distro http://openpublishapp.com
    8. 8. Drupal 6 Semantic Modules rdf, rdf cck, foaf, relations, sparql, RDF SPARQL Proxy, sioc, opencalais http://drupal.org/project/rdfcck
    9. 9. RDF in Drupal 7 Core This is how RDFa goes mainstream Drupal 7 site content is published as RDFa When enabled, marks up some attributes by default including: Node Titles, User information, taxonomy, comment information, image information, etc. Create your own mappings from custom fields (CCK) Lots of new modules will use this functionality and blow out semantic capabilities in the next release. http://semantic-drupal.com/
    10. 10. RDF in Drupal 7 Tutorial Tomorrow “How to Build Linked Data Sites with Drupal 7 and RDFa” Stéphane Corlosquet, Lin Clark, Axel Polleres, Alexandre Passant Franciscan C 8:30 AM - 3:00 PM
    11. 11. What’s Being Covered Why Drupal Rocks the Semantic Web Why the OpenGov movement needs Drupal and the semantic web Deep dive case study on WhiteHouse.gov The case for a semantic enabled distribution of Drupal for open government
    12. 12. Open Government
    13. 13. Citizens' view of what OG is... Source: planspark Got this by dumping the full (and slightly cleaned-up) text of Rebooting America -- Ideas for Redesigning American Democracy for the Internet Age into the Wordle tag cloud generator and returning the top 80 tags
    14. 14. Techies' view of what OG is... Source: digiphile Got this by putting the agenda for Transparency Camp into Wordle
    15. 15. While they work out the details, we have many of the technical answers here to get started with Drupal
    16. 16. Technical Requirements of OGD www.agency.gov/open Use of “modern technology” / best practices Open data sets Published Open Government Plan FOIA Plan and Information Mechanisms for public feedback and input Downloadable/machine readable copies of virtually everything
    17. 17. OGD showed how hard web 2.0 thinking is for government; web 3.0 might actually be easier...
    18. 18. 8 Steps to Publishing Public Data 1. Complete: All public data is made available. Public data is data that is not subject to valid privacy, security or privilege limitations. 2. Primary: Data is as collected at the source, with the highest possible level of granularity, not in aggregate or modified forms. 3. Timely: Data is made available as quickly as necessary to preserve the value of the data. 4. Accessible: Data is available to the widest range of users for the widest range of purposes. 5. Machine processable: Data is reasonably structured to allow automated processing. 6. Non-discriminatory: Data is available to anyone, with no requirement of registration. 7. Non-proprietary: Data is available in a format over which no entity has exclusive control. 8. License-free: Data is not subject to any copyright, patent, trademark or trade secret regulation. Reasonable privacy, security and privilege restrictions may be allowed. Source: Open Government Working Group Meeting in Sebastopol, CA, October 22, 2007
    19. 19. Open Government Needs the Semantic Web Structured Data Linked Open Data Visualizations Mashups Dataset metrics Semantic Archives
    20. 20. Innovation is Happening OGI is creating innovation we have not seen before by the Feds on the web (well since they invented it at least)
    21. 21. Typical Gov Tech Stack: WCMS; Policy Stakeholders (OPA, OCIO, etc.); Information Assurance (IT, Security, Data Quality); Enterprise Architecture / Standards; Agency Warehouse / Legacy Systems; Reporting Systems
    22. 22. An Open Gov Tech Stack: Data Visualizations / Mashups; Data Directories / Linked Open Data; Data APIs; WCMS; Collaboration / Social Media Tools; Policy Stakeholders (OPA, OCIO, etc.); Information Assurance (IT, Security, Data Quality); Enterprise Architecture / Standards; Agency Warehouse / Legacy Systems; Reporting Systems
    23. 23. Who is Using Drupal in OpenGov?
    24. 24. What’s Being Covered Why Drupal Rocks the Semantic Web Why the OpenGov movement needs Drupal and the semantic web Deep dive case study on WhiteHouse.gov The case for a semantic enabled distribution of Drupal for open government
    25. 25. http://www.flickr.com/photos/a_ninjamonkey/4042006778/
    26. 26. Why The White House Chose Drupal Championed from within EOP Robust Core & Contrib Functionality Allowed full control of platform Open & Transparent Ability to easily integrate new tech (like semweb)
    27. 27. Why The White House Chose Drupal Championed from within EOP Robust Core & Contrib Functionality Allowed full control of platform Open & Transparent Ability to easily integrate new tech (like semweb)
    28. 28. Ingredients Great Design Drupal 6 Performance patches Lots of contrib modules Custom developed (and contributed modules) Custom features & integration http://whitehouse.gov/tech
    29. 29. Tiers cdn web cache search database monitor puppet Exact specifications undisclosed.
    30. 30. http://drupal.org/project/node_embed
    31. 31. RDFa
    32. 32. http://drupal.org/project/akamai
    33. 33. Launch Saturday, October 24 2009, 1pm
    34. 34. Numbers 1 million+ page views/day 100s of thousands of unique visitors/day 10s of millions page views / month 100k+ peak concurrent live streams 15k+ Contact/Email submissions/day Exact figures undisclosed.
    35. 35. Next...
    36. 36. What’s Being Covered Why Drupal Rocks the Semantic Web Why the OpenGov movement needs Drupal and the semantic web Deep dive case study on WhiteHouse.gov The case for a semantic enabled distribution of Drupal for open government
    37. 37. OpenGov Distribution Helps tackle OG needs using Drupal Government best practices Regulatory compliance Introduces semweb concepts Meets security requirements Allows for rapid site development
    38. 38. OpenPublic Distribution: Data Directories (Data.gov) / Linked Open Data; Features Server (Apps.gov); Government Extensions; APIs; Themes; Contrib Modules; Custom Modules (NEW); Content Types / Views / CCK; Default Configurations; D7 Core
    39. 39. Q&A http://phase2technology.com jeff@phase2technology.com twitter.com/JeffWalpole
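
As a rough illustration of slide 9 (Drupal 7 marking up node fields as RDFa, with mappings for custom fields), the sketch below renders a node's fields with RDFa attributes. It is written in Python purely for illustration and is not Drupal's implementation: `FIELD_TO_PROPERTY`, `render_rdfa`, and the sample node are assumptions, and the particular field-to-term mapping is invented, although `sioc:Post` and the `dc:` terms are real vocabulary terms.

```python
from html import escape
from typing import Dict

# Hypothetical mapping from field name to RDF property (as a CURIE).
# A site administrator would define something like this per content type.
FIELD_TO_PROPERTY: Dict[str, str] = {
    "title": "dc:title",
    "author": "dc:creator",
    "created": "dc:date",
}


def render_rdfa(rdf_class: str, about: str, fields: Dict[str, str]) -> str:
    """Render fields as HTML, adding RDFa attributes for mapped fields."""
    lines = [f'<div about="{escape(about)}" typeof="{escape(rdf_class)}">']
    for name, value in fields.items():
        prop = FIELD_TO_PROPERTY.get(name)
        if prop is None:
            # Unmapped fields are still displayed, just without semantics.
            lines.append(f"  <span>{escape(value)}</span>")
        else:
            lines.append(f'  <span property="{prop}">{escape(value)}</span>')
    lines.append("</div>")
    return "\n".join(lines)


# Usage with a made-up node.
markup = render_rdfa(
    "sioc:Post",
    "/node/1",
    {"title": "Hello & welcome", "author": "jeff", "teaser": "Unmapped text"},
)
print(markup)
```

The point of the mapping table is that authors keep filling in ordinary fields while the markup layer attaches the RDF terms, which is how common users can become semantic publishers without touching RDF directly.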
