Chicago Solr Meetup - June 10th: This Ain't Your Parents' Search Engine

Published in: Technology

  1. Confidential and Proprietary © Copyright 2013. This Ain’t Your Parents’ Search Engine. Grant Ingersoll, CTO, LucidWorks. Twitter: @gsingers
  2. Search is dead.
  3. Long live search
  4. Search is good for… • Traditional: Fast, fuzzy text matching across a large document collection • De-normalized data - “light” relational • Top N problems - Key-value (n=1) - Recommendations - “Good enough” classification, clustering • Faceting, aggregations, analytical slicing and dicing of data • Spatial, record/event linkage, alerting http://cheezburger.com/5243950080
  5. Foundational Changes in Lucene/Solr 4 • Reduced memory usage • Pluggable codecs/similarity • FS(A|T) • Doc Values (column oriented) • Spatial upgrade • New facets and functions • Cursors (deep paging) • Distributed capabilities • Joins/Grouping
  6. Search + Hadoop • What’s Old is New Again • “Traditional” Use Cases: - Build/Store indexes - https://cwiki.apache.org/confluence/display/solr/Running+Solr+on+HDFS • Enrichment and Signal processing - PageRank, Statistically Interesting Phrases, etc.
  7. LucidWorks + Hadoop • Ingestion Help - Flexible Map-Reduce content ingestion supporting: » Directory of files » CSV, Writable, etc. » LogStash » Build Your Own • Pig Load/Store and UDFs • Hive 2-way support • http://www.lucidworks.com/search-for-hadoop/ - Open source this summer
  8. Diagram labels: LucidWorks SiLK LucidWorks Search JDBC Connector Web/File System Crawl Data Warehouse Hadoop Connectors Clickstream Networking Data Sources Connectors Servers
  9. Search Analytics: Data Ingestion & Visualization. Diagram labels: Solr/Solr Cloud Gateway (Reverse Proxy) Solr Output Writer for LogStash (Http) Search Logs Visualization Configurable Dashboards Hadoop Connector GrokIngestMapper LogStash
  10. LucidWorks Open Source • Logstash for Solr: https://github.com/LucidWorks/solrlogmanager • Banana (Kibana for Solr): https://github.com/LucidWorks/banana • Effortless AWS deployment and monitoring: http://www.github.com/lucidworks/solr-scale-tk • Data Quality Toolkit: https://github.com/LucidWorks/data-quality
  11. Demos
  12. Fly the friendly skies http://www.ibm.com/developerworks/library/j-solr-lucene/index.html
  13. Make $$$ • Leverage time series data and visualization using LucidWorks SiLK • Monitor Social • Traditional Research https://github.com/lucidworks/lws-financial-demo
  14. Cure what ails you
  15. Space-Time Continuum • Leverage Solr’s spatial capabilities to index non-spatial data, such as time ranges - Useful for Open Hours, Shifts, etc. • Query using rectangle intersections - q = shift:"Intersects(0 19 23 365)" https://people.apache.org/~hossman/spatial-for-non-spatial-meetup-20130117/
  16. Signal Processing for Search and Discovery • Signals power modern relevance – Clicks, conversions, sharing, history, signatures • LucidWorks 5 makes it easy to capture and leverage signals – Recommendations, analytics, discovery • Simplifies your data workflow • Simplifies your operational footprint
  17. Solr Powered Signal Processing • Use Case: eCommerce • Data: – Product catalog (~1.2M items) – Click data (~3.9M clicks)
  18. Meta • http://www.lucidworks.com – grant@lucidworks.com – @gsingers • Sales – Steve Drane (based here in Chicago) – steve.drane@lucidworks.com • Lucene/Solr Revolution – Washington DC, Nov 11-14 – http://www.lucenerevolution.org
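
The rectangle query on the Space-Time Continuum slide can be sketched in Python as follows. This is a minimal sketch, not the presenter's code: the helper names `intersects_query` and `select_params` are hypothetical, the `wt=json` parameter is an illustrative choice, and the argument order (minX minY maxX maxY) is an assumption based on the rectangle syntax of Solr 4 spatial fields. Only the field name `shift` and the four numbers come from the slide.

```python
from urllib.parse import parse_qs, urlencode


def intersects_query(field, min_x, min_y, max_x, max_y):
    """Build a Solr spatial Intersects clause for a rectangle.

    Assumption: the rectangle is written as "minX minY maxX maxY".
    """
    return f'{field}:"Intersects({min_x} {min_y} {max_x} {max_y})"'


def select_params(query):
    """URL-encode the query as parameters for a Solr select request.

    Assumption: wt=json is only an illustrative response format.
    """
    return urlencode({"q": query, "wt": "json"})


# Values taken from the slide: field "shift", rectangle 0 19 23 365.
query = intersects_query("shift", 0, 19, 23, 365)
params = select_params(query)

# Decoding the parameters recovers the original query string.
decoded = parse_qs(params)["q"][0]
print(query)
print(decoded == query)
```

The encoded `params` string could be appended to a collection's select URL. How the four numbers map onto start and end times depends on how the time ranges were indexed, which the slide does not spell out; the linked talk covers that encoding.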
