LIGHTNING TALKS
Powered by Lucene:
IBM Content Analytics with Enterprise Search




Wolfgang Jung



Barcelona, 19th October 2011               © 2011 IBM Corporation



Our agenda in the next 10 minutes
    IBM is committed to Open Source
     – Decade of contribution to the community.

    Adoption of Apache Lucene to IBM Content Analytics
    – The Why, What & examples.

    Demonstration of IBM Content Analytics
    – see the development results live.
               Be enlightened!




IBM is committed to Open Source

    Decade of lineage and contributions to the open source community
      – Apache Hadoop.
          IBM's use of BigIndex for search is mentioned in Chuck Lam's “Hadoop in Action”
      – Apache Derby
      – Apache Geronimo and Jetty
      – Eclipse: Founded by IBM, PMC Board of Directors
      – Apache UIMA: Unstructured Information Management Architecture.
          Developed by IBM, Contributed to Apache
      – Apache Jakarta: Lucene. PMC members
          Significant contributions via IBM Lucene Extension Library (ILEL)
      – Linux ... and more!





Adoption of Apache Lucene
to IBM Content Analytics with Enterprise Search
    UIMA has been in use since the first release of IBM OmniFind in 2005, through the later
    IBM Content Analytics, and continues in today's IBM Content Analytics with
    Enterprise Search
         http://www-01.ibm.com/software/data/content-management/analytics/uima.html


    Why IBM decided to use Lucene
      – The index is a common technology, better improved jointly
      – Lower cost of maintenance
      – Advantage in incremental indexing
      – Extensibility






Adoption of Apache Lucene
to IBM Content Analytics with Enterprise Search
    IBM is a very active contributor. Look for PMC members:
      –Michael McCandless; Shai Erera; Doron Cohen
         http://lucene.apache.org/who.html

    IBM extended Lucene based on our needs. Two examples already
    contributed to the community:
      –Query Parser
      –Facets







Adoption of Apache Lucene
to IBM Content Analytics with Enterprise Search
    On 13th December 2006, IBM and Yahoo! announced IBM OmniFind Yahoo! Edition as a
    “no-cost, entry level enterprise search product developed to help eliminate financial and
    technology barriers to intranet and Web search.”
         http://www-03.ibm.com/press/us/en/pressrelease/20767.wss

    This product used Lucene as its index technology and was fully supported by IBM
      – 45,000+ downloads from the website http://omnifind.ibm.yahoo.net
      – IBM support contracts for clients with “IBM Elite Support for OmniFind Yahoo Edition”
      – Fewer than 15 incidents regarding the index technology


    The technology is seen as a success for IBM






Content Analytics generates new insights and aggregates key
findings gathered from large data volumes in a visualized form

    Sources of information
      – Internal (ECM, files, DBMS, etc.) and external (social, news, etc.)

    Analysed documents with identified concepts
      – Example sentence: "Claus sprained his ankle on the step"

          Text           Part of speech    Concept
          Claus          Noun              Person
          sprained       Verb              Injury
          his ankle      Noun Phrase       Body Part
          on the step    Prep Phrase       Location

      – Extracted concept: Claimant: Soft Tissue Injury

    Automatic visualizing
      – Results of concept evaluation are displayed to the users
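
    The concept identification on this slide can be illustrated with a toy dictionary lookup.
    This is only a sketch under stated assumptions: the dictionary, the function name annotate
    and the matching rule are hypothetical, and the product itself relies on UIMA annotators
    with real linguistic analysis rather than plain word lists.

```python
import re

# Hypothetical word lists for illustration only; a real pipeline would use
# UIMA annotators and linguistic analysis instead of fixed dictionaries.
CONCEPT_DICTIONARY = {
    "Person": ["claus"],
    "Injury": ["sprained"],
    "Body Part": ["ankle"],
    "Location": ["step"],
}


def annotate(text):
    """Return (concept, matched word, start offset) tuples in order of appearance."""
    annotations = []
    for concept, terms in CONCEPT_DICTIONARY.items():
        for term in terms:
            # Whole-word, case-insensitive match of each dictionary term.
            pattern = r"\b" + re.escape(term) + r"\b"
            for match in re.finditer(pattern, text, re.IGNORECASE):
                annotations.append((concept, match.group(0), match.start()))
    return sorted(annotations, key=lambda item: item[2])


annotations = annotate("Claus sprained his ankle on the step")
for concept, word, offset in annotations:
    print(f"{offset:>2}  {concept:<10} {word}")
```

    Each annotation keeps its character offset, so the identified concepts can be stored
    alongside the document text and indexed for later analysis.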








Rapid Insights from Automotive Complaints

    We will use publicly available data from the National Highway Traffic Safety Administration (NHTSA)
    to demonstrate how IBM Content Analytics can be used to identify problems with automobiles.
    NHTSA receives reports about malfunctions, accidents, and other issues with automobiles
    from dealerships, repair facilities, and the general public, and publishes the data at
    http://www.nhtsa.gov. For this demo we have created a collection from the NHTSA “complaints”
    data spanning several years and ending in early 2010. We will show how this and similar data can be
    analyzed to arrive at rapid insights that are not possible when reading through the complaint records manually.
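
    The kind of analysis meant here can be sketched as facet-style counting with drill-down.
    The records below are made up for illustration and are not NHTSA data, the field names
    (year, component, text) are assumptions, and the code is a stand-in for the idea rather
    than how IBM Content Analytics or the Lucene facet module is implemented.

```python
from collections import Counter

# Made-up records for illustration only; these are NOT real NHTSA complaints,
# and the field names are hypothetical.
complaints = [
    {"year": 2008, "component": "brakes", "text": "pedal goes soft when braking"},
    {"year": 2009, "component": "brakes", "text": "grinding noise from front brakes"},
    {"year": 2009, "component": "airbag", "text": "airbag warning light stays on"},
    {"year": 2009, "component": "steering", "text": "steering wheel locks at low speed"},
]


def facet_counts(records, field):
    """Count how many records carry each value of the given field."""
    return Counter(record[field] for record in records)


def drill_down(records, field, value):
    """Keep only the records whose field equals the selected facet value."""
    return [record for record in records if record[field] == value]


by_component = facet_counts(complaints, "component")
brakes_by_year = facet_counts(drill_down(complaints, "component", "brakes"), "year")

print(by_component.most_common())
print(sorted(brakes_by_year.items()))
```

    Counting one facet, selecting a value, and counting a second facet on the remaining
    records is the basic loop an analyst repeats instead of reading each complaint.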







See Content Analytics live!








                                               Be enlightened!



Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
Frank van Harmelen
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
ThousandEyes
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPathCommunity
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
Safe Software
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
Product School
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
Prayukth K V
 


Lightning talk: IBM Content Analytics with Enterprise Search - Wolfgang Jung

  • 1. LIGHTNING TALKS Powered by Lucene: IBM Content Analytics with Enterprise Search Wolfgang Jung Barcelona, 19th October 2011 © 2011 IBM Corporation
  • 2. Our agenda for the next 10 minutes. IBM is committed to Open Source: a decade of contribution to the community. Adoption of Apache Lucene in IBM Content Analytics: the why, the what, and examples. Demonstration of IBM Content Analytics: see the development results live. Be enlightened!
  • 3. IBM is committed to Open Source. A decade of lineage and contributions to the open source community: Apache Hadoop (IBM’s use of BigIndex for search is mentioned in Chuck Lam’s “Hadoop in Action”); Apache Derby; Apache Geronimo and Jetty; Eclipse (founded by IBM, PMC Board of Directors); Apache UIMA, the Unstructured Information Management Architecture (developed by IBM, contributed to Apache); Apache Jakarta: Lucene (PMC members; significant contributions via the IBM Lucene Extension Library, ILEL); Linux ... and more!
  • 4. Adoption of Apache Lucene in IBM Content Analytics with Enterprise Search. UIMA has been in use since the first release of IBM OmniFind in 2005 and later in IBM Content Analytics, continuing into today’s IBM Content Analytics with Enterprise Search. http://www-01.ibm.com/software/data/content-management/analytics/uima.html IBM’s reasons for choosing Lucene: the index is a common technology and easier to improve; lower cost of maintenance; advantage in incremental indexing; extensibility.
  • 5. Adoption of Apache Lucene in IBM Content Analytics with Enterprise Search. IBM is a very active contributor. Look for the PMC members Michael McCandless, Shai Erera, and Doron Cohen at http://lucene.apache.org/who.html IBM extended Lucene based on our needs. Two examples already contributed to the community: Query Parser; Facets.
  • 6. Adoption of Apache Lucene in IBM Content Analytics with Enterprise Search. On 13th December 2006, IBM and Yahoo! announced IBM OmniFind Yahoo! Edition as a “no-cost, entry level enterprise search product developed to help eliminate financial and technology barriers to intranet and Web search.” http://www-03.ibm.com/press/us/en/pressrelease/20767.wss This technology included Lucene as its index technology and had full support by IBM: 45,000+ downloads from the website http://omnifind.ibm.yahoo.net; IBM support contracts for clients with “IBM Elite Support for OmniFind Yahoo Edition”; fewer than 15 incidents regarding index technology. The technology is seen as a success for IBM.
  • 7. Content Analytics generates new insights and aggregates key findings gathered from large data volumes in a visualized form. [Diagram: sources of information, internal (ECM, files, DBMS, etc.) and external (social, news, etc.); analysed documents with identified concepts, shown on the example sentence “Claus sprained his ankle on the step” with part-of-speech labels (Noun, Verb, Noun Phrase, Prep Phrase) and concept labels (Person, Injury, Body Part, Location); extracted concept “Claimant: Soft Tissue Injury”; automatic visualizing, where results of concept evaluation are displayed to the users.]
  • 8. Rapid Insights from Automotive Complaints. We will be using publicly available data from the National Highway Traffic Safety Administration (NHTSA) to demonstrate how IBM Content Analytics can be used to identify problems with automobiles. NHTSA receives various reports about malfunctions, accidents, and other issues with automobiles from dealerships, repair facilities, and the general public. NHTSA publishes the data at http://www.nhtsa.gov. For this demo we have created a collection from the NHTSA “complaints” data spanning several years and ending in early 2010. We will show how this and similar data can be analyzed to arrive at rapid insights not possible by manually reading through the complaint records.
  • 9. See Content Analytics live!
  • 10. See Content Analytics live!
  • 11. Be enlightened!
  • 12. LIGHTNING TALKS Powered by Lucene: IBM Content Analytics with Enterprise Search Wolfgang Jung Barcelona, 19th October 2011 © 2011 IBM Corporation
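
The facet-style analysis mentioned on slides 5 and 8 can be illustrated with a minimal sketch. It is plain Python, not the Lucene facet API and not IBM Content Analytics code; the records, the field names (`make`, `component`, `year`), and the helpers `facet_counts` and `drill_down` are all hypothetical and chosen only to show the idea of counting values per field and then narrowing the collection.

```python
from collections import Counter

# Hypothetical, hand-written records for illustration only (not NHTSA data).
COMPLAINTS = [
    {"make": "MakeA", "component": "brakes", "year": 2008},
    {"make": "MakeA", "component": "brakes", "year": 2009},
    {"make": "MakeA", "component": "airbag", "year": 2009},
    {"make": "MakeB", "component": "brakes", "year": 2009},
    {"make": "MakeB", "component": "steering", "year": 2010},
]


def facet_counts(docs, field):
    """Count how many documents carry each value of `field`."""
    return Counter(doc[field] for doc in docs if field in doc)


def drill_down(docs, **filters):
    """Keep only documents whose fields equal every given filter value."""
    return [
        doc for doc in docs
        if all(doc.get(name) == value for name, value in filters.items())
    ]


if __name__ == "__main__":
    # Which components are reported most often across the whole collection?
    print(facet_counts(COMPLAINTS, "component").most_common())

    # Narrow to one component, then look at the distribution by make.
    brake_complaints = drill_down(COMPLAINTS, component="brakes")
    print(facet_counts(brake_complaints, "make").most_common())
```

A real deployment would compute such counts inside the index rather than by scanning records in memory; the sketch only shows the count-then-drill-down pattern that makes an unexpected concentration of complaints visible without reading every record.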