Full-text searching with Marjory

Markus Wolff
                     
What's Marjory?

        A webservice for full-text indexing and searching of documents
        Written in PHP
        Based on Zend Framework
        (Very) roughly comparable to Solr
        BSD-licensed, available on Google Code
                                
How does Marjory work?

        Your application
            Sends document data or location via POST to Marjory
            Sends search terms via GET to Marjory
        Marjory (ReST-based webservice)
            Stores document data in the search engine
            Queries the search engine
            Returns the result to your application in the desired output format (default: XML)
        Search engine (default: Lucene)
            Returns query results to Marjory

Features

        Search engine abstraction
            Use the engine that suits your needs, just write a small adaptor class
            Zend_Search_Lucene built-in by default
        Multiple search catalogs
            Index many sites with one dedicated search server
            Put all documents matching any criteria into separate search indexes to speed up search
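
The slides do not show what an adaptor class looks like, so the following is only a sketch of the idea, written in Python although Marjory itself is PHP. All names here (SearchEngineAdapter, InMemoryAdapter, add_document, search) are invented for illustration and are not Marjory's actual interface. The toy engine keeps one store per catalog, which mirrors the "multiple search catalogs" point above.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class SearchEngineAdapter(ABC):
    """Hypothetical adapter interface; NOT Marjory's real PHP class."""

    @abstractmethod
    def add_document(self, catalog: str, uri: str, fields: dict[str, str]) -> None:
        """Store a document's fields under its unique URI in one catalog."""

    @abstractmethod
    def search(self, catalog: str, query: str) -> list[str]:
        """Return the URIs of matching documents in one catalog."""


class InMemoryAdapter(SearchEngineAdapter):
    """Toy engine: one dict per catalog, naive case-insensitive substring match."""

    def __init__(self) -> None:
        self._catalogs: dict[str, dict[str, dict[str, str]]] = {}

    def add_document(self, catalog, uri, fields):
        # Each catalog is a separate index, created on first use.
        self._catalogs.setdefault(catalog, {})[uri] = dict(fields)

    def search(self, catalog, query):
        needle = query.lower()
        docs = self._catalogs.get(catalog, {})
        return [
            uri
            for uri, fields in docs.items()
            if any(needle in value.lower() for value in fields.values())
        ]
```

A service built this way would only talk to the abstract interface, so swapping the engine means writing one more subclass.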

                                     
More features

        Two ways to index documents:
            Submit an XML snippet containing any content you want to index
            Or just submit a URI (valid PHP stream resource) and let Marjory extract the content from the document
                 HTML supported by default (for now)
                 Add your own document parser class to extract plain text from any other document format (or special markup structures)
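
To give a feel for what a document parser has to do, here is a minimal sketch in Python that reduces an HTML document to plain text. It is not Marjory's parser (which is PHP and not shown in the slides); the class and function names are assumptions made up for this example, and it uses only the standard library's html.parser module.

```python
from html.parser import HTMLParser


class PlainTextExtractor(HTMLParser):
    """Collects visible text and skips the content of <script> and <style>."""

    _SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self._SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self._SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style elements.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def extract_text(html):
    """Hypothetical helper: return the plain text of an HTML string."""
    parser = PlainTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

A parser for another format would offer the same kind of entry point (raw document in, plain text out), which is all the indexer needs.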
                                         
Even more features

        Index documents asynchronously using Dropr as a messaging service
            Dropr: PHP-based durable messaging service
            Example webservice and Dropr client included with Marjory
            Application does not need to wait for document retrieval, parsing and adding to the index
            More info: www.dropr.org
                                   
Latest additions

        Search results as a Dojo.Data compatible JSON data source
        API exposure via JSON-RPC as alternative to XML over ReST (experimental!)
                               
How to add a catalog

        Send a POST request to:
        http://marjory.example.com/rest/catalog/
        Containing this XML snippet:
        <add catalog="MyGloriousCatalog" />
        Et voilà, you've got yourself a new search index
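
As a sketch of how a client might issue this request, the Python below builds the POST with the standard library. The host is the example host from the slide; the Content-Type header and the function name are assumptions, since the slides do not say what Marjory expects. The request is only constructed here, not sent.

```python
import urllib.request
from xml.sax.saxutils import quoteattr

MARJORY_BASE = "http://marjory.example.com/rest"  # example host from the slides


def build_add_catalog_request(catalog):
    """Hypothetical helper: build (but do not send) the add-catalog POST."""
    # quoteattr() escapes the name and wraps it in quotes for use as an attribute.
    body = f"<add catalog={quoteattr(catalog)} />".encode("utf-8")
    return urllib.request.Request(
        f"{MARJORY_BASE}/catalog/",
        data=body,
        method="POST",
        headers={"Content-Type": "text/xml; charset=utf-8"},  # assumed content type
    )


request = build_add_catalog_request("MyGloriousCatalog")

# To send it against a running instance, something like:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```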
    




                               
Adding a document

        Make a POST request to:
        http://marjory.example.com/rest/add/
        Send the document content as XML like this:
    <add catalog="default">
      <doc uri="MyUniqueDocumentId">
          <field name="title">Marjory: Search as a service</field>
          <field name="abstract">
            An epic novel about full-text indexing in an SOA environment
          </field>
          <field name="content">Lorem ipsum dolor sit amet... (to be continued)</field>
      </doc>
    </add>
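
Building this XML by string concatenation breaks as soon as a field contains characters such as & or <. A minimal sketch, assuming a Python client, that lets the standard library handle the escaping (the function name is made up for this example; only the element and attribute names come from the slide):

```python
import xml.etree.ElementTree as ET


def build_add_document_xml(catalog, uri, fields):
    """Hypothetical helper: serialize one document as an <add> snippet."""
    add = ET.Element("add", {"catalog": catalog})
    doc = ET.SubElement(add, "doc", {"uri": uri})
    for name, value in fields.items():
        field = ET.SubElement(doc, "field", {"name": name})
        field.text = value  # escaped automatically on serialization
    return ET.tostring(add, encoding="unicode")


payload = build_add_document_xml(
    "default",
    "MyUniqueDocumentId",
    {"title": "Marjory: Search as a service", "content": "Tom & Jerry <uncut>"},
)
```

The resulting string can be sent as the body of the POST request to the /rest/add/ URL given above.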


                                                
Adding a document, the easy way

        Or, if Marjory should retrieve and parse the document:
    <add catalog="default">
      <doc src="http://my.website.tld/my/document.html" />
    </add>
        If you have many and/or complex documents, better use Dropr to send messages to Marjory
                                
Searching for documents

        Make a GET request including the query terms:
        http://marjory.example.com/rest/select?q=Marjory
        Additional parameters to...
            Limit number of results
            Include only specific fields in response
            Specify a search catalog
                 Default catalog name: "default" - who would have guessed?
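
A small sketch of building the search URL from Python, with proper encoding of the query terms. Only the q parameter is taken from the slide; the slides do not name the parameters for limiting results, selecting fields or choosing a catalog, so the helper just passes any extra keyword arguments through and leaves their names to the caller. The function name is an assumption.

```python
from urllib.parse import urlencode

SELECT_URL = "http://marjory.example.com/rest/select"  # example host from the slides


def build_search_url(query, **extra_params):
    """Hypothetical helper: build the GET URL for a search.

    extra_params is passed through unchanged; check the Marjory
    documentation for the real parameter names before using it.
    """
    params = {"q": query, **extra_params}
    return f"{SELECT_URL}?{urlencode(params)}"


print(build_search_url("Marjory"))
# prints http://marjory.example.com/rest/select?q=Marjory
print(build_search_url("full-text search"))
# prints http://marjory.example.com/rest/select?q=full-text+search
```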


                                         
Search response format

<?xml version="1.0" encoding="UTF-8"?>
<response>
  <responseHeader>
    <status>0</status><QTime>1</QTime>
  </responseHeader>

  <result numFound="2" start="0">
   <doc>
    <str name="id">MA147LL/A</str>
    <str name="name">Apple 60 GB iPod Black</str>
   </doc>
   <doc>
    <str name="id">EN7800GTX/2DHTV/256M</str>
    <str name="name">ASUS Extreme N7800GTX</str>
   </doc>
  </result>
</response>
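
On the client side, a response in this shape is easy to turn into plain data structures. The following is a sketch in Python that assumes exactly the structure shown above (a responseHeader with a status, and a result element containing doc elements with str fields); the function name and the returned dictionary layout are choices made for this example.

```python
import xml.etree.ElementTree as ET

SAMPLE_RESPONSE = """<?xml version="1.0" encoding="UTF-8"?>
<response>
  <responseHeader>
    <status>0</status><QTime>1</QTime>
  </responseHeader>
  <result numFound="2" start="0">
   <doc>
    <str name="id">MA147LL/A</str>
    <str name="name">Apple 60 GB iPod Black</str>
   </doc>
   <doc>
    <str name="id">EN7800GTX/2DHTV/256M</str>
    <str name="name">ASUS Extreme N7800GTX</str>
   </doc>
  </result>
</response>"""


def parse_search_response(xml_text):
    """Hypothetical helper: convert the XML response into a dict."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    result = root.find("result")
    docs = [
        # One dict per <doc>, keyed by each field's name attribute.
        {field.get("name"): field.text for field in doc.findall("str")}
        for doc in result.findall("doc")
    ]
    return {
        "status": int(root.findtext("responseHeader/status")),
        "num_found": int(result.get("numFound")),
        "docs": docs,
    }


parsed = parse_search_response(SAMPLE_RESPONSE)
print(parsed["num_found"], [doc["id"] for doc in parsed["docs"]])
# prints 2 ['MA147LL/A', 'EN7800GTX/2DHTV/256M']
```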


                                             
Looks familiar?

        Blatantly stolen from Solr :-)
        Why reinvent the wheel?
        Makes switching between the two projects easy if need be
        Don't like it? Try JSON-RPC instead.
                                 
Access control

        No access control provided by Marjory
        Use your webserver's authentication and ACL capabilities
        There are currently no plans to add anything built-in, unless someone convinces me otherwise :-)
                               
Things to do

        Fully unit-test the beast
        Add a nice admin GUI (currently in progress)
        Add other engines
        Support more document formats out of the box (PDF likely to be next addition)
        Fine-tuning (how about renaming or removing catalogs, for example?)
                                 
Is it production-ready?

        Yes, and it's already being used on production websites
                               
That's all, folks!

        More information:
            http://code.google.com/p/marjory/
            http://www.dropr.org/
            http://blog.wolff-hamburg.de/
