Open Source Search for the Enterprise
Charlie Hull
Managing Director, Flax
3rd November 2010
OVUM Briefing, Search Across the Enterprise
charlie@flax.co.uk
www.flax.co.uk/blog
+44 (0) 8700 118334
Twitter: @FlaxSearch
Who are Flax?
Search engine specialists with decades of experience
Developers, innovators and strategists
Based in Cambridge, UK
Technology agnostic – but open source exponents
Recently selected as UK Authorized Partner by Lucid Imagination
Customers include Mydeco, NLA, Durrants Ltd, Financial Times, MediaMiser, MySkreen, Accenture, University of Cambridge
Recently asked to present at British Computer Society and Lucene Revolution conferences
What is open source?
“Open-source software (OSS) is computer software that is available in source code form for which the source code and certain other rights normally reserved for copyright holders are provided under a software license that permits users to study, change, and improve the software. […] Some open source software is available within the public domain” (Wikipedia)
Myths about open source
It's the work of amateur developers
If I use open source, I have to open up my software/servers/network to all and sundry
Open source software isn't reliable or scalable
It's free
It's unsupported
Open source search software
Xapian
- The successor to Muscat
- Bayesian probabilistic ranking
- C/C++ with language bindings
- Highly accurate & scalable
Apache Lucene and Solr
- Flexible licensing
- Vector space model
- Java and other languages
- Well known and supported
And more...
Apache Lucene and Solr are trademarks of The Apache Software Foundation
Some examples
Newspaper Licensing Agency – NLA Clipshare
http://www.nla-clipshare.com
20 million newspaper stories
6500 users
Content from every major newspaper (and most regionals)
Used by journalists, clippings agencies, media monitors
Replacing internal systems at major newspapers
One of very few ways to search content from all the papers within hours of publication
Some examples
Financial Times – press cuttings
http://presscuttings.ft.com
Web Service for easy integration
XML source data
Faceted search
Area filters (whole article, body, headline, byline or any combination)
Synonyms, spelling suggestions
Built from scratch in a fortnight
Designed as a prototype, scaled to production use without significant change
Some examples
Durrants Ltd – media monitoring platform
Thousands of client search profiles
Hundreds of thousands of articles per day
Complex publication hierarchy
Established pipeline
Solution
Flexible query language allows for OCR errors, punctuation, fuzzy matching, weighting
Supports features of previous engine
Scalable master-slave architecture
Accuracy improved in some cases from 95% rejected to 95% accepted
Hardware budget 15% of previous system
Some examples
(Unnamed multinational radio suppliers)
Intranet search
12 million documents
Multiple formats – Office, PDF, HTML...
User and group-based security (LDAP)
Faceted search
Users can 'tag' interesting documents – for example to identify a 'reference' version
Open source chosen because of significant cost advantage – commercial solutions uneconomic at this scale
A look at Lucene & Solr
Among the top 15 open source projects
Installations at over 4,000 companies
Downloads have grown nearly 10x over the past three years
Over 7,000 downloads a day
Lucid Imagination
USA based
Employs 9 out of 15 top Lucene committers
Offers training, consulting and up to 24x7 support
Developing value-add software
Flax are UK partners & resellers
Lucid Works Enterprise
Who are Lucid working with?
Some Lucene & Solr numbers
LinkedIn – 30 million users
Internet Archive – a billion indexed pages
Salesforce.com – 8 terabytes of searchable data
Twitter – a billion queries a day
Why open source search?
Flexible, extendable
Powerful & scalable
Lower cost, especially when planning for growth
Commercial support available as necessary
Freedom to innovate
Looking to the future
More and more content including social media
Multiple delivery platforms
Search-powered applications
Cloud computing
More use of entity extraction & sentiment analysis
Search no longer a bolt-on, but a platform for innovation
Open source no longer an outsider, but the obvious choice
Thank you!
Any questions?
charlie@flax.co.uk
www.flax.co.uk/blog
+44 (0) 8700 118334
Twitter: @FlaxSearch

More Related Content

What's hot

Linked Data and OCLC
Linked Data and OCLCLinked Data and OCLC
Linked Data and OCLC
Richard Wallis
 
GraphDB Connectors – Powering Complex SPARQL Queries
GraphDB Connectors – Powering Complex SPARQL QueriesGraphDB Connectors – Powering Complex SPARQL Queries
GraphDB Connectors – Powering Complex SPARQL Queries
Marin Dimitrov
 
A whirlwind tour of graph databases
A whirlwind tour of graph databasesA whirlwind tour of graph databases
A whirlwind tour of graph databases
jexp
 
Fire kit ios (r-baldwin)
Fire kit ios (r-baldwin)Fire kit ios (r-baldwin)
Fire kit ios (r-baldwin)
DevDays
 
On-Demand RDF Graph Databases in the Cloud
On-Demand RDF Graph Databases in the CloudOn-Demand RDF Graph Databases in the Cloud
On-Demand RDF Graph Databases in the Cloud
Marin Dimitrov
 
Schema.org: Where did that come from!
Schema.org: Where did that come from!Schema.org: Where did that come from!
Schema.org: Where did that come from!
Richard Wallis
 
GraphQL - The new "Lingua Franca" for API-Development
GraphQL - The new "Lingua Franca" for API-DevelopmentGraphQL - The new "Lingua Franca" for API-Development
GraphQL - The new "Lingua Franca" for API-Development
jexp
 
Neo4j-Databridge: Enterprise-scale ETL for Neo4j
Neo4j-Databridge: Enterprise-scale ETL for Neo4jNeo4j-Databridge: Enterprise-scale ETL for Neo4j
Neo4j-Databridge: Enterprise-scale ETL for Neo4j
Neo4j
 
Intro to Cypher
Intro to CypherIntro to Cypher
Intro to Cypher
Neo4j
 
Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi
Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFiReal-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi
Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi
Timothy Spann
 
Contextual Computing - Knowledge Graphs & Web of Entities
Contextual Computing - Knowledge Graphs & Web of EntitiesContextual Computing - Knowledge Graphs & Web of Entities
Contextual Computing - Knowledge Graphs & Web of Entities
Richard Wallis
 
Text Analytics & Linked Data Management As-a-Service
Text Analytics & Linked Data Management As-a-ServiceText Analytics & Linked Data Management As-a-Service
Text Analytics & Linked Data Management As-a-Service
Marin Dimitrov
 
Infrastructure crossroads... and the way we walked them in DKPro
Infrastructure crossroads... and the way we walked them in DKProInfrastructure crossroads... and the way we walked them in DKPro
Infrastructure crossroads... and the way we walked them in DKPro
openminted_eu
 
Getting the Most out of Your Translation Memories (TM-Town ProZ Webinar April...
Getting the Most out of Your Translation Memories (TM-Town ProZ Webinar April...Getting the Most out of Your Translation Memories (TM-Town ProZ Webinar April...
Getting the Most out of Your Translation Memories (TM-Town ProZ Webinar April...
Kevin Dias
 
Visual Ontology Modeling for Domain Experts and Business Users with metaphactory
Visual Ontology Modeling for Domain Experts and Business Users with metaphactoryVisual Ontology Modeling for Domain Experts and Business Users with metaphactory
Visual Ontology Modeling for Domain Experts and Business Users with metaphactory
Peter Haase
 
CodeOne 2018 - Microservices in action at the Dutch National Police
CodeOne 2018 - Microservices in action at the Dutch National PoliceCodeOne 2018 - Microservices in action at the Dutch National Police
CodeOne 2018 - Microservices in action at the Dutch National Police
Bert Jan Schrijver
 
Talis Platform: A Linked Data Engine
Talis Platform: A Linked Data EngineTalis Platform: A Linked Data Engine
Talis Platform: A Linked Data Engine
Leigh Dodds
 
Ephedra: efficiently combining RDF data and services using SPARQL federation
Ephedra: efficiently combining RDF data and services using SPARQL federationEphedra: efficiently combining RDF data and services using SPARQL federation
Ephedra: efficiently combining RDF data and services using SPARQL federation
Peter Haase
 
The Kasabi Information Marketplace
The Kasabi Information MarketplaceThe Kasabi Information Marketplace
The Kasabi Information Marketplace
Knud Möller
 
Full Stack Graph in the Cloud
Full Stack Graph in the CloudFull Stack Graph in the Cloud
Full Stack Graph in the Cloud
Neo4j
 

What's hot (20)

Linked Data and OCLC
Linked Data and OCLCLinked Data and OCLC
Linked Data and OCLC
 
GraphDB Connectors – Powering Complex SPARQL Queries
GraphDB Connectors – Powering Complex SPARQL QueriesGraphDB Connectors – Powering Complex SPARQL Queries
GraphDB Connectors – Powering Complex SPARQL Queries
 
A whirlwind tour of graph databases
A whirlwind tour of graph databasesA whirlwind tour of graph databases
A whirlwind tour of graph databases
 
Fire kit ios (r-baldwin)
Fire kit ios (r-baldwin)Fire kit ios (r-baldwin)
Fire kit ios (r-baldwin)
 
On-Demand RDF Graph Databases in the Cloud
On-Demand RDF Graph Databases in the CloudOn-Demand RDF Graph Databases in the Cloud
On-Demand RDF Graph Databases in the Cloud
 
Schema.org: Where did that come from!
Schema.org: Where did that come from!Schema.org: Where did that come from!
Schema.org: Where did that come from!
 
GraphQL - The new "Lingua Franca" for API-Development
GraphQL - The new "Lingua Franca" for API-DevelopmentGraphQL - The new "Lingua Franca" for API-Development
GraphQL - The new "Lingua Franca" for API-Development
 
Neo4j-Databridge: Enterprise-scale ETL for Neo4j
Neo4j-Databridge: Enterprise-scale ETL for Neo4jNeo4j-Databridge: Enterprise-scale ETL for Neo4j
Neo4j-Databridge: Enterprise-scale ETL for Neo4j
 
Intro to Cypher
Intro to CypherIntro to Cypher
Intro to Cypher
 
Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi
Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFiReal-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi
Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi
 
Contextual Computing - Knowledge Graphs & Web of Entities
Contextual Computing - Knowledge Graphs & Web of EntitiesContextual Computing - Knowledge Graphs & Web of Entities
Contextual Computing - Knowledge Graphs & Web of Entities
 
Text Analytics & Linked Data Management As-a-Service
Text Analytics & Linked Data Management As-a-ServiceText Analytics & Linked Data Management As-a-Service
Text Analytics & Linked Data Management As-a-Service
 
Infrastructure crossroads... and the way we walked them in DKPro
Infrastructure crossroads... and the way we walked them in DKProInfrastructure crossroads... and the way we walked them in DKPro
Infrastructure crossroads... and the way we walked them in DKPro
 
Getting the Most out of Your Translation Memories (TM-Town ProZ Webinar April...
Getting the Most out of Your Translation Memories (TM-Town ProZ Webinar April...Getting the Most out of Your Translation Memories (TM-Town ProZ Webinar April...
Getting the Most out of Your Translation Memories (TM-Town ProZ Webinar April...
 
Visual Ontology Modeling for Domain Experts and Business Users with metaphactory
Visual Ontology Modeling for Domain Experts and Business Users with metaphactoryVisual Ontology Modeling for Domain Experts and Business Users with metaphactory
Visual Ontology Modeling for Domain Experts and Business Users with metaphactory
 
CodeOne 2018 - Microservices in action at the Dutch National Police
CodeOne 2018 - Microservices in action at the Dutch National PoliceCodeOne 2018 - Microservices in action at the Dutch National Police
CodeOne 2018 - Microservices in action at the Dutch National Police
 
Talis Platform: A Linked Data Engine
Talis Platform: A Linked Data EngineTalis Platform: A Linked Data Engine
Talis Platform: A Linked Data Engine
 
Ephedra: efficiently combining RDF data and services using SPARQL federation
Ephedra: efficiently combining RDF data and services using SPARQL federationEphedra: efficiently combining RDF data and services using SPARQL federation
Ephedra: efficiently combining RDF data and services using SPARQL federation
 
The Kasabi Information Marketplace
The Kasabi Information MarketplaceThe Kasabi Information Marketplace
The Kasabi Information Marketplace
 
Full Stack Graph in the Cloud
Full Stack Graph in the CloudFull Stack Graph in the Cloud
Full Stack Graph in the Cloud
 

Similar to Flax ovum search-across_the_enterprise

Republica 2014 open-source_in_the_wild
Republica 2014 open-source_in_the_wildRepublica 2014 open-source_in_the_wild
Republica 2014 open-source_in_the_wild
Acquia
 
Workshop slides - Introduction to AtoM and Archivematica
Workshop slides - Introduction to AtoM and ArchivematicaWorkshop slides - Introduction to AtoM and Archivematica
Workshop slides - Introduction to AtoM and Archivematica
Artefactual Systems - Archivematica
 
National Archives of Norway - AtoM and Archivematica intro workshop
National Archives of Norway - AtoM and Archivematica intro workshopNational Archives of Norway - AtoM and Archivematica intro workshop
National Archives of Norway - AtoM and Archivematica intro workshop
Artefactual Systems - AtoM
 
Opensource
OpensourceOpensource
Opensource
digitaldan
 
Open Source Movement
Open Source MovementOpen Source Movement
Open Source Movement
Mesut Yılmaz
 
Why we need oa infrastructure - STM Association Beyond Open Access Seminar
Why we need oa infrastructure - STM Association Beyond Open Access SeminarWhy we need oa infrastructure - STM Association Beyond Open Access Seminar
Why we need oa infrastructure - STM Association Beyond Open Access Seminar
National Information Standards Organization (NISO)
 
Artefactual and Open Source Development
Artefactual and Open Source DevelopmentArtefactual and Open Source Development
Artefactual and Open Source Development
Artefactual Systems - AtoM
 
Why Open Source, I have Microsoft ?
Why Open Source, I have Microsoft ?Why Open Source, I have Microsoft ?
Why Open Source, I have Microsoft ?Neeraj Agarwal
 
Why Open Source, I have Microsoft ?
Why Open Source, I have Microsoft ?Why Open Source, I have Microsoft ?
Why Open Source, I have Microsoft ?
Neeraj Agarwal
 
20080602 Microsoft and Open Source
20080602 Microsoft and Open Source20080602 Microsoft and Open Source
20080602 Microsoft and Open Source
David Chou
 
Open Source Software: A Study
Open Source Software: A StudyOpen Source Software: A Study
Open Source Software: A Study
Iqbal Ahmad Ansari
 
Prospero: A Web-based Document Delivery System
Prospero: A Web-based Document Delivery SystemProspero: A Web-based Document Delivery System
Prospero: A Web-based Document Delivery SystemEric Schnell
 
Koha Presentation at Uttara University
Koha Presentation at Uttara UniversityKoha Presentation at Uttara University
Koha Presentation at Uttara University
Nur Ahammad
 
Opensource development and apache software foundation
Opensource development and apache software foundationOpensource development and apache software foundation
Opensource development and apache software foundationEran Chinthaka Withana
 
Open source 101
Open source 101Open source 101
Open source 101
Tom Rieger
 
Cilip Seminar 6th October - Integrating With Open Source
Cilip Seminar 6th October - Integrating With Open SourceCilip Seminar 6th October - Integrating With Open Source
Cilip Seminar 6th October - Integrating With Open SourceJonathan Field
 
Day 2-presentation
Day 2-presentationDay 2-presentation
Day 2-presentation
Deb Forsten
 
Open source: Making connections by Sunny Pai
Open source: Making connections by Sunny PaiOpen source: Making connections by Sunny Pai
Open source: Making connections by Sunny Pai
Hawaii Library Association
 
Open Source Software R
Open Source Software ROpen Source Software R
Open Source Software R
msimanau7824
 

Similar to Flax ovum search-across_the_enterprise (20)

Republica 2014 open-source_in_the_wild
Republica 2014 open-source_in_the_wildRepublica 2014 open-source_in_the_wild
Republica 2014 open-source_in_the_wild
 
Workshop slides - Introduction to AtoM and Archivematica
Workshop slides - Introduction to AtoM and ArchivematicaWorkshop slides - Introduction to AtoM and Archivematica
Workshop slides - Introduction to AtoM and Archivematica
 
National Archives of Norway - AtoM and Archivematica intro workshop
National Archives of Norway - AtoM and Archivematica intro workshopNational Archives of Norway - AtoM and Archivematica intro workshop
National Archives of Norway - AtoM and Archivematica intro workshop
 
Opensource
OpensourceOpensource
Opensource
 
Open Source Movement
Open Source MovementOpen Source Movement
Open Source Movement
 
Why we need oa infrastructure - STM Association Beyond Open Access Seminar
Why we need oa infrastructure - STM Association Beyond Open Access SeminarWhy we need oa infrastructure - STM Association Beyond Open Access Seminar
Why we need oa infrastructure - STM Association Beyond Open Access Seminar
 
Artefactual and Open Source Development
Artefactual and Open Source DevelopmentArtefactual and Open Source Development
Artefactual and Open Source Development
 
Why Open Source, I have Microsoft ?
Why Open Source, I have Microsoft ?Why Open Source, I have Microsoft ?
Why Open Source, I have Microsoft ?
 
Why Open Source, I have Microsoft ?
Why Open Source, I have Microsoft ?Why Open Source, I have Microsoft ?
Why Open Source, I have Microsoft ?
 
20080602 Microsoft and Open Source
20080602 Microsoft and Open Source20080602 Microsoft and Open Source
20080602 Microsoft and Open Source
 
Open Source & Open Development
Open Source & Open Development Open Source & Open Development
Open Source & Open Development
 
Open Source Software: A Study
Open Source Software: A StudyOpen Source Software: A Study
Open Source Software: A Study
 
Prospero: A Web-based Document Delivery System
Prospero: A Web-based Document Delivery SystemProspero: A Web-based Document Delivery System
Prospero: A Web-based Document Delivery System
 
Koha Presentation at Uttara University
Koha Presentation at Uttara UniversityKoha Presentation at Uttara University
Koha Presentation at Uttara University
 
Opensource development and apache software foundation
Opensource development and apache software foundationOpensource development and apache software foundation
Opensource development and apache software foundation
 
Open source 101
Open source 101Open source 101
Open source 101
 
Cilip Seminar 6th October - Integrating With Open Source
Cilip Seminar 6th October - Integrating With Open SourceCilip Seminar 6th October - Integrating With Open Source
Cilip Seminar 6th October - Integrating With Open Source
 
Day 2-presentation
Day 2-presentationDay 2-presentation
Day 2-presentation
 
Open source: Making connections by Sunny Pai
Open source: Making connections by Sunny PaiOpen source: Making connections by Sunny Pai
Open source: Making connections by Sunny Pai
 
Open Source Software R
Open Source Software ROpen Source Software R
Open Source Software R
 

More from Charlie Hull

Lucene, Solr and java 9 - opportunities and challenges
Lucene, Solr and java 9 - opportunities and challengesLucene, Solr and java 9 - opportunities and challenges
Lucene, Solr and java 9 - opportunities and challenges
Charlie Hull
 
Making sense of big data
Making sense of big dataMaking sense of big data
Making sense of big data
Charlie Hull
 
Search Solutions 2015: Towards a new model of search relevance testing
Search Solutions 2015:  Towards a new model of search relevance testingSearch Solutions 2015:  Towards a new model of search relevance testing
Search Solutions 2015: Towards a new model of search relevance testing
Charlie Hull
 
BioSolr - Searching the stuff of life - Lucene/Solr Revolution 2015
BioSolr - Searching the stuff of life - Lucene/Solr Revolution 2015BioSolr - Searching the stuff of life - Lucene/Solr Revolution 2015
BioSolr - Searching the stuff of life - Lucene/Solr Revolution 2015
Charlie Hull
 
Bio solr building a better search for bioinformatics
Bio solr   building a better search for bioinformaticsBio solr   building a better search for bioinformatics
Bio solr building a better search for bioinformatics
Charlie Hull
 
Solr and Elasticsearch, a performance study
Solr and Elasticsearch, a performance studySolr and Elasticsearch, a performance study
Solr and Elasticsearch, a performance study
Charlie Hull
 

More from Charlie Hull (6)

Lucene, Solr and java 9 - opportunities and challenges
Lucene, Solr and java 9 - opportunities and challengesLucene, Solr and java 9 - opportunities and challenges
Lucene, Solr and java 9 - opportunities and challenges
 
Making sense of big data
Making sense of big dataMaking sense of big data
Making sense of big data
 
Search Solutions 2015: Towards a new model of search relevance testing
Search Solutions 2015:  Towards a new model of search relevance testingSearch Solutions 2015:  Towards a new model of search relevance testing
Search Solutions 2015: Towards a new model of search relevance testing
 
BioSolr - Searching the stuff of life - Lucene/Solr Revolution 2015
BioSolr - Searching the stuff of life - Lucene/Solr Revolution 2015BioSolr - Searching the stuff of life - Lucene/Solr Revolution 2015
BioSolr - Searching the stuff of life - Lucene/Solr Revolution 2015
 
Bio solr building a better search for bioinformatics
Bio solr   building a better search for bioinformaticsBio solr   building a better search for bioinformatics
Bio solr building a better search for bioinformatics
 
Solr and Elasticsearch, a performance study
Solr and Elasticsearch, a performance studySolr and Elasticsearch, a performance study
Solr and Elasticsearch, a performance study
 

Recently uploaded

PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
ControlCase
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
Product School
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Jeffrey Haguewood
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
Product School
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
91mobiles
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Product School
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
Safe Software
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
Generating a custom Ruby SDK for your web service or Rails API using Smithy
Generating a custom Ruby SDK for your web service or Rails API using SmithyGenerating a custom Ruby SDK for your web service or Rails API using Smithy
Generating a custom Ruby SDK for your web service or Rails API using Smithy
g2nightmarescribd
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf

Flax ovum search-across_the_enterprise

  • 1. Open Source Search for the Enterprise Charlie Hull Managing Director, Flax 3rd November 2010 OVUM Briefing, Search Across the Enterprise charlie@flax.co.uk www.flax.co.uk/blog +44 (0) 8700 118334 Twitter: @FlaxSearch
  • 2. Search engine specialists with decades of experience Developers, innovators and strategists Based in Cambridge, UK Technology agnostic – but open source exponents Recently selected as UK Authorized Partner by Lucid Imagination Customers include Mydeco, NLA, Durrants Ltd, Financial Times, MediaMiser, MySkreen, Accenture, University of Cambridge Recently asked to present at British Computer Society and Lucene Revolution conferences Who are Flax?
  • 3. “Open-source software (OSS) is computer software that is available in source code form for which the source code and certain other rights normally reserved for copyright holders are provided under a software license that permits users to study, change, and improve the software. […] Some open source software is available within the public domain” (Wikipedia) What is open source?
  • 4. “Open-source software (OSS) is computer software that is available in source code form for which the source code and certain other rights normally reserved for copyright holders are provided under a software license that permits users to study, change, and improve the software. […] Some open source software is available within the public domain” (Wikipedia) What is open source?
  • 5. It's the work of amateur developers Myths about open source
  • 6. It's the work of amateur developers If I use open source, I have to open up my software/servers/network to all and sundry Myths about open source
  • 7. It's the work of amateur developers If I use open source, I have to open up my software/servers/network to all and sundry Open source software isn't reliable or scalable Myths about open source
  • 8. It's the work of amateur developers If I use open source, I have to open up my software/servers/network to all and sundry Open source software isn't reliable or scalable It's free Myths about open source
  • 9. It's the work of amateur developers If I use open source, I have to open up my software/servers/network to all and sundry Open source software isn't reliable or scalable It's free It's unsupported Myths about open source
  • 10. Open source search software Apache Lucene and Solr: - Flexible licensing - Vector space model - Java and other languages - Well known and supported (Apache Lucene and Solr are trademarks of The Apache Software Foundation)
  • 11. Open source search software Xapian: - The successor to Muscat - Bayesian probabilistic ranking - C/C++ with language bindings - Highly accurate & scalable Apache Lucene and Solr: - Flexible licensing - Vector space model - Java and other languages - Well known and supported (Apache Lucene and Solr are trademarks of The Apache Software Foundation)
  • 12. Open source search software Xapian: - The successor to Muscat - Bayesian probabilistic ranking - C/C++ with language bindings - Highly accurate & scalable Apache Lucene and Solr: - Flexible licensing - Vector space model - Java and other languages - Well known and supported And more... (Apache Lucene and Solr are trademarks of The Apache Software Foundation)
  • 13. Some examples http://www.nla-clipshare.com Newspaper Licensing Agency – NLA Clipshare 20 million newspaper stories 6500 users Content from every major newspaper (and most regionals) Used by journalists, clippings agencies, media monitors Replacing internal systems at major newspapers
  • 14. Some examples http://www.nla-clipshare.com Newspaper Licensing Agency – NLA Clipshare 20 million newspaper stories 6500 users Content from every major newspaper (and most regionals) Used by journalists, clippings agencies, media monitors Replacing internal systems at major newspapers One of very few ways to search content from all the papers within hours of publication
  • 15.
  • 16.
  • 17.
  • 18. Some examples Financial Times – press cuttings Web Service for easy integration XML source data Faceted search Area filters (whole article, body, headline, byline or any combination) Synonyms, spelling suggestions http://presscuttings.ft.com
  • 19. Some examples Financial Times – press cuttings Web Service for easy integration XML source data Faceted search Area filters (whole article, body, headline, byline or any combination) Synonyms, spelling suggestions Built from scratch in a fortnight Designed as a prototype, scaled to production use without significant change http://presscuttings.ft.com
  • 20.
  • 21. Some examples Durrants Ltd. Media monitoring platform Thousands of client search profiles Hundreds of thousands of articles per day Complex publication hierarchy Established pipeline Solution: Flexible query language allows for OCR errors, punctuation, fuzzy matching, weighting Supports features of previous engine Scalable master-slave architecture
  • 22. Some examples Durrants Ltd. Media monitoring platform Thousands of client search profiles Hundreds of thousands of articles per day Complex publication hierarchy Established pipeline Solution: Flexible query language allows for OCR errors, punctuation, fuzzy matching, weighting Supports features of previous engine Scalable master-slave architecture Accuracy improved in some cases from 95% rejected to 95% accepted Hardware budget 15% of previous system
  • 23. Some examples (Unnamed multinational radio suppliers) Intranet search 12 million documents Multiple formats – Office, PDF, HTML... User and group-based security (LDAP) Faceted search Users can 'tag' interesting documents – for example to identify a 'reference' version
  • 24. Some examples (Unnamed multinational radio suppliers) Intranet search 12 million documents Multiple formats – Office, PDF, HTML... User and group-based security (LDAP) Faceted search Users can 'tag' interesting documents – for example to identify a 'reference' version Open source chosen because of significant cost advantage – commercial solutions uneconomic at this scale
  • 25. A look at Lucene & Solr Among the top 15 open source projects Installations at over 4,000 companies Downloads have grown nearly 10x over the past three years Over 7,000 downloads a day.
  • 26. A look at Lucene & Solr Among the top 15 open source projects Installations at over 4,000 companies Downloads have grown nearly 10x over the past three years Over 7,000 downloads a day. Lucid Imagination: USA based Employs 9 out of 15 top Lucene committers Offers training, consulting and up to 24x7 support Developing value-add software
  • 27. A look at Lucene & Solr Among the top 15 open source projects Installations at over 4,000 companies Downloads have grown nearly 10x over the past three years Over 7,000 downloads a day. Lucid Imagination: USA based Employs 9 out of 15 top Lucene committers Offers training, consulting and up to 24x7 support Developing value-add software Flax are UK partners & resellers
  • 29. Who are Lucid working with?
  • 30. Some Lucene & Solr numbers LinkedIn – 30 million users Internet Archive – a billion indexed pages Salesforce.com – 8 terabytes of searchable data Twitter – a billion queries a day
  • 31. Why open source search? Flexible, extendable
  • 32. Why open source search? Flexible, extendable Powerful & scalable
  • 33. Why open source search? Flexible, extendable Powerful & scalable Lower cost, especially when planning for growth
  • 34. Why open source search? Flexible, extendable Powerful & scalable Lower cost, especially when planning for growth Commercial support available as necessary
  • 35. Why open source search? Flexible, extendable Powerful & scalable Lower cost, especially when planning for growth Commercial support available as necessary - Freedom to innovate
  • 36. Looking to the future
  • 37. Looking to the future More and more content including social media
  • 38. Looking to the future More and more content including social media Multiple delivery platforms
  • 39. Looking to the future More and more content including social media Multiple delivery platforms Search-powered applications
  • 40. Looking to the future More and more content including social media Multiple delivery platforms Search-powered applications Cloud computing
  • 41. Looking to the future More and more content including social media Multiple delivery platforms Search-powered applications Cloud computing More use of entity extraction & sentiment analysis
  • 42. Looking to the future More and more content including social media Multiple delivery platforms Search-powered applications Cloud computing More use of entity extraction & sentiment analysis Search no longer a bolt-on, but a platform for innovation
  • 43. Looking to the future More and more content including social media Multiple delivery platforms Search-powered applications Cloud computing More use of entity extraction & sentiment analysis Search no longer a bolt-on, but a platform for innovation Open source no longer an outsider, but the obvious choice
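
Slides 18 and 19 mention faceted search and area filters (whole article, body, headline, byline or any combination) exposed through a web service. The slides do not say which engine or API sits behind that service, so the following is only a minimal sketch of how such a request might be expressed, assuming a Solr backend. The core name `cuttings`, the host, and the field names (`headline`, `body`, `byline`, `publication`, `section`) are all hypothetical; the request parameters (`q`, `defType`, `qf`, `facet`, `facet.field`, `rows`, `wt`) are standard Solr ones.

```python
from urllib.parse import urlencode

# Hypothetical searchable areas; a real schema would define its own fields.
AREAS = {"headline", "body", "byline"}


def build_query_url(base_url, terms, areas, facet_fields, rows=10):
    """Build a Solr select URL that searches only the chosen areas
    and asks for facet counts on the given fields."""
    unknown = set(areas) - AREAS
    if unknown:
        raise ValueError(f"unknown areas: {sorted(unknown)}")
    params = [
        ("q", terms),
        ("defType", "edismax"),       # query parser that honours qf
        ("qf", " ".join(areas)),      # area filter: fields to search
        ("rows", str(rows)),
        ("wt", "json"),
        ("facet", "true"),
    ]
    # One facet.field parameter per field to be counted.
    params += [("facet.field", field) for field in facet_fields]
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical usage: search headlines and bylines, facet by publication and section.
url = build_query_url(
    "http://localhost:8983/solr/cuttings",
    "open source search",
    ["headline", "byline"],
    ["publication", "section"],
)
print(url)
```

Passing the selected areas as `qf` is one way to support "any combination" of fields without a separate query form per area; synonyms and spelling suggestions would need extra server-side configuration that is not shown here.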