Flax ovum search-across_the_enterprise

See some common myths, discover the various open source enterprise search packages available and see some case studies on how open source software has helped organisations build effective search.

Published in: Technology
Transcript

  • 1. Open Source Search for the Enterprise. Charlie Hull, Managing Director, Flax. 3rd November 2010. OVUM Briefing, Search Across the Enterprise. charlie@flax.co.uk; www.flax.co.uk/blog; +44 (0) 8700 118334; Twitter: @FlaxSearch
  • 2. Who are Flax? Search engine specialists with decades of experience. Developers, innovators and strategists. Based in Cambridge, UK. Technology agnostic – but open source exponents. Recently selected as UK Authorized Partner by Lucid Imagination. Customers include Mydeco, NLA, Durrants Ltd, Financial Times, MediaMiser, MySkreen, Accenture, University of Cambridge. Recently asked to present at British Computer Society and Lucene Revolution conferences.
  • 3. What is open source? “Open-source software (OSS) is computer software that is available in source code form for which the source code and certain other rights normally reserved for copyright holders are provided under a software license that permits users to study, change, and improve the software. […] Some open source software is available within the public domain” (Wikipedia)
  • 9. Myths about open source: It's the work of amateur developers. If I use open source, I have to open up my software/servers/network to all and sundry. Open source software isn't reliable or scalable. It's free. It's unsupported.
  • 12. Open source search software. [Apache Lucene/Solr]: Flexible licensing; Vector space model; Java and other languages; Well known and supported. [Xapian]: The successor to Muscat; Bayesian probabilistic ranking; C/C++ with language bindings; Highly accurate & scalable. And more.... Apache Lucene and Solr are trademarks of The Apache Software Foundation.
  • 14. Some examples: Newspaper Licensing Agency – NLA Clipshare (http://www.nla-clipshare.com). 20 million newspaper stories. 6500 users. Content from every major newspaper (and most regionals). Used by journalists, clippings agencies, media monitors. Replacing internal systems at major newspapers. One of very few ways to search content from all the papers within hours of publication.
  • 16. Some examples: Financial Times – press cuttings (http://presscuttings.ft.com). Web Service for easy integration. XML source data. Faceted search. Area filters (whole article, body, headline, byline or any combination). Synonyms, spelling suggestions. Built from scratch in a fortnight. Designed as a prototype, scaled to production use without significant change.
  • 18. Some examples: Durrants Ltd. Media monitoring platform. Thousands of client search profiles. Hundreds of thousands of articles per day. Complex publication hierarchy. Established pipeline. Solution: Flexible query language allows for OCR errors, punctuation, fuzzy matching, weighting. Supports features of previous engine. Scalable master-slave architecture. Accuracy improved in some cases from 95% rejected to 95% accepted. Hardware budget 15% of previous system.
  • 20. Some examples: (Unnamed multinational radio suppliers) Intranet search. 12 million documents. Multiple formats – Office, PDF, HTML... User and group-based security (LDAP). Faceted search. Users can 'tag' interesting documents – for example to identify a 'reference' version. Open source chosen because of significant cost advantage – commercial solutions uneconomic at this scale.
  • 23. A look at Lucene & Solr. Among the top 15 open source projects. Installations at over 4,000 companies. Downloads have grown nearly 10x over the past three years. Over 7,000 downloads a day. [Lucid Imagination]: USA based. Employs 9 out of 15 top Lucene committers. Offers training, consulting and up to 24x7 support. Developing value-add software. Flax are UK partners & resellers.
  • 24. Lucid Works Enterprise
  • 25. Who are Lucid working with?
  • 26. Some Lucene & Solr numbers: LinkedIn – 30 million users. Internet Archive – a billion indexed pages. Salesforce.com – 8 terabytes of searchable data. Twitter – a billion queries a day.
  • 31. Why open source search? Flexible, extendable. Powerful & scalable. Lower cost, especially when planning for growth. Commercial support available as necessary. Freedom to innovate.
  • 39. Looking to the future: More and more content including social media. Multiple delivery platforms. Search-powered applications. Cloud computing. More use of entity extraction & sentiment analysis. Search no longer a bolt-on, but a platform for innovation. Open source no longer an outsider, but the obvious choice.
  • 40. Thank you! Any questions? charlie@flax.co.uk; www.flax.co.uk/blog; +44 (0) 8700 118334; Twitter: @FlaxSearch
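
Illustrative sketch (not from the deck): slide 16 lists "area filters (whole article, body, headline, byline or any combination)" alongside faceted search. The toy Python below shows roughly what those two features mean. The sample articles, the `section` facet field and the function names `search` and `facet_counts` are all assumptions made up for illustration. A real deployment would use an engine's fielded queries and facet support rather than a linear scan, and nothing here reflects how the Financial Times system was actually built.

```python
from collections import Counter

# Hypothetical sample records; the searchable areas mirror those named on slide 16.
ARTICLES = [
    {"headline": "Bank raises rates", "byline": "Jane Smith",
     "body": "The central bank raised interest rates today.", "section": "Markets"},
    {"headline": "Tech shares rally", "byline": "John Brown",
     "body": "Shares in technology firms rose as rates held steady.", "section": "Companies"},
    {"headline": "Rates outlook", "byline": "Jane Smith",
     "body": "Analysts expect no change this quarter.", "section": "Markets"},
]

# Searching every area is the "whole article" case.
AREAS = ("headline", "byline", "body")


def search(term, areas=AREAS, articles=ARTICLES):
    """Return articles where `term` occurs as a whole word in any chosen area."""
    term = term.lower()
    hits = []
    for article in articles:
        words = set()
        for area in areas:
            # Crude tokenizer: lowercase, drop full stops, split on whitespace.
            words.update(article[area].lower().replace(".", " ").split())
        if term in words:
            hits.append(article)
    return hits


def facet_counts(hits, field="section"):
    """Count hits per value of `field`, as a faceted results page would show."""
    return Counter(hit[field] for hit in hits)


if __name__ == "__main__":
    whole_article = search("rates")
    headline_only = search("rates", areas=("headline",))
    print(len(whole_article), len(headline_only))
    print(facet_counts(whole_article))
```

Passing a different `areas` tuple, for example `("headline", "byline")`, gives the "any combination" behaviour the slide mentions. The facet counts are computed over whatever result set the area filter returned.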
