EthicShare.org (Mostly Solr)

Transcript

  • 1. Twin Cities Drupal Users Group - October 22, 2008 - EthicShare: Solr + Drupal Under the Hood Tour
  • 2. EthicShare?
    • Who: University of Minnesota's Center for Bioethics, the University of Minnesota Libraries, and the University of Minnesota Department of Computer Science and Engineering
      • EthicShare’s pilot implementation builds on a recent planning phase that was a collaboration with the University of Virginia, Georgetown University, Indiana University-Bloomington, and Indiana University-Purdue University, Indianapolis.
    • What: A sustainable aggregation of bioethics research and a platform for scholarship
    • When: Pilot Phase runs from January 2008 - June 2009
    • How: Funded by the Andrew W. Mellon Foundation
  • 3. The Platform
    • Drupal
      • Community Development Framework
    • Solr
      • Faceted Search Appliance
  • 4. The Process
  • 5.  
  • 6. Solr
    • Origin: Created by CNET and released January 2006
      • Became an Apache Software Foundation project shortly thereafter
    • Builds on the Lucene Search Engine Library
      • Comes with Lucene’s search syntax and features
    • Provides simple HTTP/XML API
    • Strongly typed field definitions
    • Noteworthy Implementations: Netflix, CNET Reviews, GameSpot, Digg
        • More: http://wiki.apache.org/solr/PublicServers
  • 7. Behind the Scenes - Indexing
    • HTTP/XML API
    • http://localhost:8983/solr/update
    • http://localhost:8983/solr/select
    • Indexing = POSTing XML Records to /update
    • Commands: <add>, <delete>, <commit/>, <optimize/>
      • <add>
      • <doc>
      • <field name="nid">101</field>
      • <field name="vid">2</field>
      • <field name="title">Solr Search is Simply Great</field>
      • <field name="body">Solr and Drupal are like PB And J</field>
      • <field name="changed">1224707462</field>
      • <field name="tid">4</field>
      • <field name="name">libsys</field>
      • <field name="uid">10297</field>
      • </doc>
      • </add>
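
The indexing step on this slide can be sketched in Python. This is an illustration only, not code from the talk: the helper names `build_add_xml` and `post_update` are hypothetical, and the URL assumes the local Solr instance shown on the slide is running.

```python
import urllib.request
from xml.etree import ElementTree as ET

# URL taken from the slide; assumes Solr is running locally.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"


def build_add_xml(doc):
    """Serialize a dict of field values into a Solr <add><doc> message."""
    add = ET.Element("add")
    doc_el = ET.SubElement(add, "doc")
    for name, value in doc.items():
        field = ET.SubElement(doc_el, "field", name=name)
        field.text = str(value)
    return ET.tostring(add, encoding="unicode")


def post_update(xml_body, url=SOLR_UPDATE_URL):
    """POST one XML command (<add>, <delete>, <commit/>, <optimize/>) to /update."""
    request = urllib.request.Request(
        url,
        data=xml_body.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


# A subset of the fields from the slide's example record.
record = {"nid": 101, "vid": 2, "title": "Solr Search is Simply Great"}
add_xml = build_add_xml(record)

# To index for real (requires a running Solr instance):
#   post_update(add_xml)
#   post_update("<commit/>")
```

Building the message with an XML library rather than string concatenation means special characters in titles and bodies are escaped automatically. Added documents become searchable only after a `<commit/>` is posted.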
  • 8. Behind the Scenes - Searching
    • Get Contents of …/select URL: cURL, file_get_contents($url)…
    • ApacheSolr makes use of a Solr PHP Client Abstraction Layer
      • http://wiki.apache.org/solr/SolPHP
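
Searching is the mirror image of indexing: fetch a /select URL and read the XML that comes back. The Python sketch below stands in for the PHP `file_get_contents($url)` call and is an illustration only. The helper names are hypothetical, the sample response is hand-written in the general shape of Solr's XML output, and multi-valued fields are not handled.

```python
from urllib.parse import urlencode
from xml.etree import ElementTree as ET

# URL taken from the earlier slide; assumes Solr is running locally.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_select_url(query, rows=10, base=SOLR_SELECT_URL):
    """Build a /select URL for a Lucene-syntax query string."""
    return base + "?" + urlencode({"q": query, "rows": rows})


def parse_results(xml_text):
    """Return (numFound, docs) from an XML response, one dict per <doc>."""
    root = ET.fromstring(xml_text)
    result = root.find("result")
    docs = [
        {field.get("name"): field.text for field in doc}
        for doc in result.findall("doc")
    ]
    return int(result.get("numFound")), docs


# Hand-written sample for illustration; a real response also carries a header.
SAMPLE_RESPONSE = """<response>
  <result name="response" numFound="1" start="0">
    <doc>
      <int name="nid">101</int>
      <str name="title">Solr Search is Simply Great</str>
    </doc>
  </result>
</response>"""

url = build_select_url("title:solr")
found, docs = parse_results(SAMPLE_RESPONSE)

# Against a live server the response text would come from:
#   urllib.request.urlopen(url).read().decode("utf-8")
```

All values come back as strings here; a client layer such as the one the ApacheSolr module uses would normally convert them to typed values.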
  • 9. Setup - Solr Directory Layout
    • Tomcat Files:
      • …/tomcat/webapps/solr_ethicshare.war (cp solr.war from example dir)
      • …/tomcat/conf/Catalina/localhost/solr_ethicshare.xml
    • solr_ethicshare.xml - Tell Tomcat About Solr
      • <Context docBase="solr_ethicshare.war" debug="0" crossContext="true">
      • <Environment name="solr/home" type="java.lang.String" value="/usr/local/solr_home/ethicshare" override="true"/>
      • </Context>
  • 10. Solr Schema - Fields and Types
    • Starter schema:
      • ../drupaldir/sites/all/modules/apachesolr/schema.xml
    • <types> ex:
      • string=solr.StrField
      • boolean=solr.BoolField
    • <fields>
      • <field name="title" type="string" indexed="true" stored="true"/>
  • 11. Solr Schema - <type> Analyzers
    • Tokenize on whitespace, then remove any common words (StopFilterFactory)
    • Remove any duplicates (RemoveDuplicatesTokenFilterFactory)
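
The analyzer chain on this slide can be modelled in a few lines of Python. This is a simplified sketch, not Solr's implementation: the stop-word list is hypothetical (Solr reads its list from a configurable file), stop words are matched case-insensitively here as an assumption, and tokens are plain (term, position) pairs.

```python
# Hypothetical stop-word list for illustration.
STOP_WORDS = {"a", "an", "and", "are", "the", "of"}


def whitespace_tokenize(text):
    """Split on whitespace, recording each token's position."""
    return [(term, pos) for pos, term in enumerate(text.split())]


def stop_filter(tokens, stop_words=STOP_WORDS):
    """Drop common words; surviving tokens keep their original positions."""
    return [(term, pos) for term, pos in tokens if term.lower() not in stop_words]


def remove_duplicates(tokens):
    """Drop a token when the same term was already seen at the same position."""
    seen = set()
    unique = []
    for token in tokens:
        if token not in seen:
            seen.add(token)
            unique.append(token)
    return unique


def analyze(text):
    """Run the three steps in the order given on the slide."""
    return remove_duplicates(stop_filter(whitespace_tokenize(text)))


tokens = analyze("Solr and Drupal are like PB And J")
```

In this sketch the duplicate-removal step compares both term and position, so it changes nothing after plain whitespace tokenizing. Duplicates at the same position typically come from other filters, such as stemming or synonym expansion, which are left out here.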
  • 12. Solr Schema - Dynamic Fields
    • <dynamicField name="smfield*" type="string" indexed="true" stored="true" multiValued="true"/>
    • <dynamicField name="tmfield*" type="text" indexed="true" stored="true" multiValued="true"/>
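
A dynamic field lets any field whose name matches a wildcard pattern be indexed without declaring it individually. The Python sketch below illustrates the idea with the two patterns from this slide. It is a simplification that returns the first matching pattern, the function name is hypothetical, and the field names in the usage lines are made up for illustration.

```python
from fnmatch import fnmatchcase

# Patterns and types copied from the two <dynamicField> declarations on the slide.
DYNAMIC_FIELDS = [
    ("smfield*", "string"),
    ("tmfield*", "text"),
]


def resolve_field_type(field_name, dynamic_fields=DYNAMIC_FIELDS):
    """Return the type of the first dynamic field pattern matching field_name."""
    for pattern, field_type in dynamic_fields:
        if fnmatchcase(field_name, pattern):
            return field_type
    return None  # no dynamic field applies


# Hypothetical field names, for illustration only.
author_type = resolve_field_type("smfield_author")
abstract_type = resolve_field_type("tmfield_abstract")
```

With these declarations, a module can send a new `smfield...` or `tmfield...` field to Solr without any change to schema.xml, choosing the prefix according to whether the value should be treated as an exact string or as analyzed text.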
  • 13. Solr Schema - Some Example Options
    • uniqueKey
      • <!-- Field to use to determine and enforce document uniqueness.
      • Unless this field is marked with required="false", it will be a required field
      • -->
        • <uniqueKey>nid</uniqueKey>
    • defaultSearchField
      • <!-- field for the QueryParser to use when an explicit fieldname is absent -->
        • <defaultSearchField>text</defaultSearchField>
    • solrQueryParser
      • <!-- SolrQueryParser configuration: defaultOperator="AND|OR" -->
        • <solrQueryParser defaultOperator="AND"/>
  • 14. ApacheSolr Search Integration Module
    • Core Search Integrated
    • Blocks for facet configuration
    • Schedules Indexing (via core search)
    • Theme Hooks for overriding look and feel
    • CCK Integration
      • hook_apachesolr_cck_field_mappings()
        • Which Fields to Index
        • How to Index them
        • Callback to pre-process fields
        • Whether or Not to Provide a Facet Block
    • Help! We need testers for alpha3!
      • http://drupal.org/project/apachesolr
  • 15. Links
    • Installing Solr + Tomcat
      • http://mikejoconnor.net/content/solr-ubercartorg
    • Google Book Search API
      • http://code.google.com/apis/books/
    • unAPI
      • http://unapi.info/
    • ApacheSolr Search Integration
      • http://drupal.org/project/apachesolr
    • IBM Developer Works - Solr
      • http://www.ibm.com/developerworks/java/library/j-solr1/
    • SolPHP
      • http://wiki.apache.org/solr/SolPHP