EthicShare.org (Mostly Solr)
Presentation Transcript

  • EthicShare: Solr + Drupal Under the Hood Tour
    • Twin Cities Drupal Users Group - October 22, 2008
  • EthicShare?
    • Who: University of Minnesota's Center for Bioethics, the University of Minnesota Libraries, and the University of Minnesota Department of Computer Science and Engineering
      • EthicShare’s pilot implementation builds on a recent planning phase that was a collaboration with the University of Virginia, Georgetown University, Indiana University-Bloomington, and Indiana University-Purdue University, Indianapolis.
    • What: A sustainable aggregation of bioethics research and a platform for scholarship
    • When: Pilot Phase runs from January 2008 - June 2009
    • How: Funded by the Andrew W. Mellon Foundation
  • The Platform
    • Drupal
      • Community Development Framework
    • Solr
      • Faceted Search Appliance
  • The Process
  • Solr
    • Origin: Created by CNET and released January 2006
      • Became an Apache Software Foundation project shortly thereafter
    • Builds on the Lucene Search Engine Library
      • Comes with Lucene’s search syntax and features
    • Provides simple HTTP/XML API
    • Strongly typed field definitions
    • Noteworthy Implementations: Netflix, CNET Reviews, GameSpot, Digg
        • More: http://wiki.apache.org/solr/PublicServers
  • Behind the Scenes - Indexing
    • HTTP/XML API
    • http://localhost:8983/solr/update
    • http://localhost:8983/solr/select
    • Indexing = POSTing XML Records to /update
    • Commands: <add>, <delete>, <commit/>, <optimize/>
      • <add>
      • <doc>
      • <field name="nid">101</field>
      • <field name="vid">2</field>
      • <field name="title">Solr Search is Simply Great</field>
      • <field name="body">Solr and Drupal are like PB And J</field>
      • <field name="changed">1224707462</field>
      • <field name="tid">4</field>
      • <field name="name">libsys</field>
      • <field name="uid">10297</field>
      • </doc>
      • </add>
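
The indexing step above can be sketched in a few lines. This is a minimal illustration, not the ApacheSolr module's own code: it assumes a local Solr at the URL from the slide, and the helper names `build_add_xml` and `build_update_request` are hypothetical. It builds the `<add>` message and prepares the POST without sending it.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Assumption: a local Solr instance at the URL shown on the slide.
UPDATE_URL = "http://localhost:8983/solr/update"


def build_add_xml(doc):
    """Serialize a dict of field name -> value into a Solr <add> message."""
    add = ET.Element("add")
    doc_el = ET.SubElement(add, "doc")
    for name, value in doc.items():
        field = ET.SubElement(doc_el, "field", name=name)
        field.text = str(value)
    return ET.tostring(add, encoding="unicode")


def build_update_request(xml_body, url=UPDATE_URL):
    """Prepare (but do not send) the HTTP POST that carries the XML."""
    return urllib.request.Request(
        url,
        data=xml_body.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


add_xml = build_add_xml(
    {"nid": 101, "vid": 2, "title": "Solr Search is Simply Great"}
)
add_request = build_update_request(add_xml)
# Added documents only become searchable after a commit.
commit_request = build_update_request("<commit/>")
# urllib.request.urlopen(add_request) would send it to a running Solr.
```

Sending `add_request` followed by `commit_request` with `urllib.request.urlopen` would index the record, assuming a Solr server is actually listening on that port.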
  • Behind the Scenes - Searching
    • Fetch the contents of the …/select URL: cURL, file_get_contents($url)…
    • The ApacheSolr module uses a Solr PHP client abstraction layer
      • http://wiki.apache.org/solr/SolPHP
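
Searching is the mirror image of indexing: build a /select URL, fetch it, and read the XML that comes back. The sketch below is in Python rather than the PHP client named above, and the helper names are hypothetical. `q` and `rows` are standard Solr query parameters; `SAMPLE_RESPONSE` is a hand-written stand-in for a real reply so the example runs without a server.

```python
import urllib.parse
import xml.etree.ElementTree as ET

# Assumption: a local Solr instance at the URL shown on the earlier slide.
SELECT_URL = "http://localhost:8983/solr/select"


def build_select_url(query, rows=10, base=SELECT_URL):
    """Return the GET URL for a query; q and rows are Solr parameters."""
    params = {"q": query, "rows": rows}
    return base + "?" + urllib.parse.urlencode(params)


def parse_titles(response_xml):
    """Collect the 'title' value from every <doc> in an XML response."""
    root = ET.fromstring(response_xml)
    titles = []
    for doc in root.iter("doc"):
        for field in doc:
            if field.get("name") == "title":
                titles.append(field.text)
    return titles


# Hand-written sample in the shape of Solr's XML response (not real output).
SAMPLE_RESPONSE = """<response>
  <result name="response" numFound="1" start="0">
    <doc>
      <str name="nid">101</str>
      <str name="title">Solr Search is Simply Great</str>
    </doc>
  </result>
</response>"""

url = build_select_url("title:solr")
titles = parse_titles(SAMPLE_RESPONSE)
```

Against a live server, the body passed to `parse_titles` would come from `urllib.request.urlopen(url).read()` instead of the sample string.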
  • Setup - Solr Directory Layout
    • Tomcat Files:
      • …/tomcat/webapps/solr_ethicshare.war (cp solr.war from example dir)
      • …/tomcat/conf/Catalina/localhost/solr_ethicshare.xml
  • solr_ethicshare.xml - Tell Tomcat About Solr
      • <Context docBase="solr_ethicshare.war" debug="0" crossContext="true">
      • <Environment name="solr/home" type="java.lang.String" value="/usr/local/solr_home/ethicshare" override="true"/>
      • </Context>
  • Solr Schema - Fields and Types
    • Starter schema:
      • ../drupaldir/sites/all/modules/apachesolr/schema.xml
    • <types> ex:
      • string=solr.StrField
      • boolean=solr.BoolField
    • <fields>
      • <field name="title" type="string" indexed="true" stored="true"/>
  • Solr Schema - <type> Analyzers
    • Tokenize on whitespace, then remove any common words (StopFilterFactory)
    • Remove any duplicates (RemoveDuplicatesTokenFilterFactory)
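
To make the analyzer chain concrete, here is a rough Python imitation of the three steps named above. It is a simplification, not Solr's implementation: the stop-word list is a tiny made-up sample (Solr reads its own from a file), and Solr's real duplicate filter is narrower, dropping only repeated tokens that share a position, whereas this sketch drops every repeat.

```python
# Assumption: a tiny illustrative stop-word list, not Solr's actual list.
STOP_WORDS = {"a", "and", "are", "is", "of", "the"}


def whitespace_tokenize(text):
    """Split on runs of whitespace, like a whitespace tokenizer."""
    return text.split()


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Drop common words, comparing case-insensitively."""
    return [t for t in tokens if t.lower() not in stop_words]


def remove_duplicates(tokens):
    """Keep the first occurrence of each token, preserving order."""
    seen = set()
    unique = []
    for token in tokens:
        if token not in seen:
            seen.add(token)
            unique.append(token)
    return unique


def analyze(text):
    """Run the three steps in the order the slide lists them."""
    return remove_duplicates(remove_stop_words(whitespace_tokenize(text)))


tokens = analyze("Solr and Drupal are like PB And J")
```

The order matters: stop words are removed before de-duplication, so the duplicate check only sees tokens that will actually be indexed.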
  • Solr Schema - Dynamic Fields
    • <dynamicField name="smfield*" type="string" indexed="true" stored="true" multiValued="true"/>
    • <dynamicField name="tmfield*" type="text" indexed="true" stored="true" multiValued="true"/>
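
Dynamic fields let a document carry field names the schema never lists explicitly, as long as the name matches a pattern. The sketch below mimics that lookup with glob matching. The two patterns come from the slide; the function `resolve_type` and the example field names are hypothetical, and the rule that explicit fields win over patterns is stated here as an assumption of the sketch.

```python
from fnmatch import fnmatchcase

# The two patterns from the slide, paired with their declared types.
DYNAMIC_FIELDS = [
    ("smfield*", "string"),
    ("tmfield*", "text"),
]


def resolve_type(field_name, explicit_fields=None):
    """Return the type for a field name, or None if nothing matches.

    Assumption of this sketch: explicitly declared fields take
    precedence over dynamic patterns.
    """
    explicit_fields = explicit_fields or {}
    if field_name in explicit_fields:
        return explicit_fields[field_name]
    for pattern, field_type in DYNAMIC_FIELDS:
        if fnmatchcase(field_name, pattern):
            return field_type
    return None


author_type = resolve_type("smfield_author")      # hypothetical field name
abstract_type = resolve_type("tmfield_abstract")  # hypothetical field name
```

With this arrangement a new multi-valued string or text field can be indexed just by naming it with the right prefix, without editing schema.xml.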
  • Solr Schema - Some Example Options
    • uniqueKey
      • <!-- Field to use to determine and enforce document uniqueness.
      • Unless this field is marked with required="false", it will be a required field
      • -->
        • <uniqueKey>nid</uniqueKey>
    • defaultSearchField
      • <!-- field for the QueryParser to use when an explicit fieldname is absent -->
        • <defaultSearchField>text</defaultSearchField>
    • solrQueryParser
      • <!-- SolrQueryParser configuration: defaultOperator="AND|OR" -->
        • <solrQueryParser defaultOperator="AND"/>
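
The effect of defaultSearchField and defaultOperator is easiest to see by spelling out what a bare query turns into. The function below is a deliberately naive illustration, not the Lucene query parser: it only handles whitespace-separated terms and ignores phrases, grouping, and explicit operators. The name `expand_query` is hypothetical.

```python
def expand_query(query, default_field="text", default_operator="AND"):
    """Make the implied field and operator of a simple query explicit."""
    clauses = []
    for term in query.split():
        # A term without "field:" falls back to the default search field.
        if ":" not in term:
            term = f"{default_field}:{term}"
        clauses.append(term)
    # Adjacent terms are joined by the default operator.
    return f" {default_operator} ".join(clauses)


strict = expand_query("solr drupal")
loose = expand_query("title:solr drupal", default_operator="OR")
```

With the schema settings above, a search for `solr drupal` therefore requires both words to appear in the `text` field, whereas a default operator of OR would accept either.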
  • ApacheSolr Search Integration Module
    • Core Search Integrated
    • Blocks for facet configuration
    • Schedules Indexing (via core search)
    • Theme Hooks for overriding look and feel
    • CCK Integration
      • hook_apachesolr_cck_field_mappings()
        • Which Fields to Index
        • How to Index them
        • Callback to pre-process fields
        • Whether or Not to Provide a Facet Block
    • Help! We need testers for alpha3!
      • http://drupal.org/project/apachesolr
  • Links
    • Installing Solr + Tomcat
      • http://mikejoconnor.net/content/solr-ubercartorg
    • Google Book Search API
      • http://code.google.com/apis/books/
    • unAPI
      • http://unapi.info/
    • ApacheSolr Search Integration
      • http://drupal.org/project/apachesolr
    • IBM Developer Works - Solr
      • http://www.ibm.com/developerworks/java/library/j-solr1/
    • SolPHP
      • http://wiki.apache.org/solr/SolPHP