SlideShare a Scribd company logo
O C T O B E R 	
   1 1 -­‐ 1 4 , 	
   2 0 1 6 	
   	
   • 	
   	
   B O S T O N , 	
   M A 	
  
O C T O B E R 	
   1 1 -­‐ 1 4 , 	
   2 0 1 6 	
   	
   • 	
   	
   B O S T O N , 	
   M A 	
  
Searching the Enterprise Data Lake with Solr - Watch us do it!
Paul Nelson – pnelson@searchtechnologies.com
Chief Architect, Search Technologies
THERE	
  WILL	
  BE	
  A	
  DEMO	
  
Stay	
  Tuned!	
  
205+	
  Search	
  Consultants	
  Worldwide	
  
San	
  Diego	
  
San	
  Jose,	
  CR	
  
Cincinna6	
  
Manila,	
  PH	
  
Washington	
  
	
  	
  	
  	
  	
  	
  (HQ)	
  
•  Founded	
  2005	
  
•  Deep	
  search	
  experLse	
  
•  900+	
  customers	
  worldwide	
  
•  Consistent	
  profitability	
  
•  Search	
  engines	
  &	
  Big	
  Data	
  
•  Vendor	
  independent	
  
London,	
  UK	
  
Frankfurt,	
  DE	
  
Prague,	
  CZ	
  
Agenda	
  
•  The	
  Enterprise	
  Data	
  Lake	
  (EDL)	
  
•  Why	
  Search	
  the	
  EDL?	
  
•  The	
  Process	
  
•  How	
  To:	
  	
  Step	
  By	
  Step	
  
•  And	
  then	
  what?	
  
In	
  The	
  Beginning	
  
Applica6on	
  
Computer	
  Users	
  
Database	
  
Dashboards	
  
Reports	
  
Search	
  &	
  
Troubleshoo6ng	
  
Alerts	
  
This	
  Evolved	
  to	
  Data	
  Warehouses	
  
Many	
  Computer	
  Users	
  
Dozens	
  of	
  	
  
Applica6ons	
  Dozens	
  of	
  	
  
Applica6ons	
  Dozens	
  of	
  	
  
Applica6ons	
  Dozens	
  of	
  	
  
Applica6ons	
  Dozens	
  of	
  	
  
Applica6ons	
  Dozens	
  of	
  	
  
Applica6ons	
  
Extract	
  
Transform	
  
Load	
  
Enterprise	
  
Data	
  Warehouse	
  
Dashboards	
  
Reports	
  
Search	
  &	
  
Troubleshoo6ng	
  
Alerts	
  
And	
  Now	
  the	
  Enterprise	
  Data	
  Lake	
  
Many,	
  many,	
  many	
  
	
  Computer	
  Users	
  
Enterprise	
  
Data	
  Lake	
  
Dashboards	
  
Reports	
  
Search	
  &	
  
Troubleshoo6ng	
  
Alerts	
  
Analyze	
  
Hundreds	
  of	
  
Applica6ons	
   Raw	
  Data	
  
And	
  Processed	
  
Data	
  
What’s	
  new	
  about	
  the	
  Data	
  Lake?	
  
•  Ingest	
  RAW	
  DATA	
  
•  Keep	
  it	
  FOREVER	
  
•  Make	
  it	
  ALL	
  AVAILABLE	
  
•  Analyze	
  it	
  ONLY	
  WHEN	
  NEEDED	
  
•  Do	
  it	
  at	
  MASSIVE	
  SCALE	
  
Why	
  the	
  Data	
  Lake?	
  
•  You	
  never	
  know	
  what’s	
  important	
  up	
  front	
  
–  New	
  data	
  mining	
  techniques	
  invented	
  daily	
  
–  Therefore,	
  keep	
  everything	
  
•  There	
  is	
  too	
  much	
  data	
  variety	
  
–  Therefore,	
  only	
  process	
  what	
  you	
  need	
  
•  Save	
  money	
  by	
  not	
  ETL’ing	
  useless	
  stuff	
  
•  There	
  are	
  many	
  different	
  use	
  cases	
  
–  Shared	
  re-­‐use	
  of	
  data	
  by	
  anyone	
  
–  Data	
  is	
  power!	
  Power	
  to	
  the	
  people!	
  
But	
  Now	
  There’s	
  a	
  Problem:	
  
•  10’s	
  of	
  thousands	
  of	
  databases	
  
•  Billions	
  of	
  records	
  
	
  
How	
  to	
  find	
  the	
  data	
  you	
  need?	
  
SO	
  LET’S	
  SEARCH	
  THE	
  DATA	
  LAKE	
  
“People	
  today	
  think	
  search	
  and	
  big	
  
data	
  are	
  separate	
  but	
  in	
  two	
  or	
  three	
  
years,	
  everyone	
  will	
  wonder	
  why	
  we	
  
ever	
  thought	
  that.”	
  
	
  
Doug	
  Cu?ng	
  
Chief	
  Architect,	
  Cloudera	
  
Creator	
  of	
  Lucene	
  &	
  Hadoop	
  
The	
  Process	
  
Ingest	
  
1	
  
Research	
  
the	
  Data	
  
2	
  
Configure	
  Solr	
  
3	
  
Parse	
  &	
  	
  
Index	
  
4	
  
Search	
  &	
  
Analyze	
  
5	
  
Produc6on	
  
6	
  
1.	
  	
  Ingest	
  
HDFS	
  
Load	
  
Data	
  
Hadoop	
  
2.	
  	
  Research	
  the	
  Data	
  
HDFS	
  
Research	
  
Hadoop	
  
3.	
  	
  Configure	
  Solr	
  
HDFS	
  
solrconfig.xml	
  
schema.xml	
  
Hadoop	
  
4.	
  	
  Parse	
  &	
  Index	
  
HDFS	
  
Index	
  Morphlines	
  
Hadoop	
  
5.	
  	
  Search	
  &	
  Analyze	
  
HDFS	
  
Index	
  
Hadoop	
  
Hue	
  Morphlines	
  
6.	
  Move	
  to	
  Produc6on	
  
•  Tes6ng,	
  Quality	
  Control	
  
–  Field	
  processing	
  
–  Search	
  Features	
  
–  Analy6cs	
  
•  Incremental	
  Processing	
  
–  Flume,	
  Spark	
  Streaming,	
  Incremental	
  Batches	
  
•  Workflow	
  /	
  Scheduled	
  Jobs	
  (Oozie)	
  
•  Security	
  Controls	
  
WATCH	
  US	
  DO	
  IT!	
  
Resources	
  
•  HDFS	
  File	
  System	
  Commands	
  
–  hips://hadoop.apache.org/docs/r2.7.3/hadoop-­‐project-­‐dist/hadoop-­‐common/FileSystemShell.html	
  
•  solrctl	
  Reference	
  Guide	
  
–  hips://www.cloudera.com/documenta6on/enterprise/5-­‐7-­‐x/topics/search_solrctl_ref.html	
  	
  
•  Morphlines	
  Reference	
  Guide	
  
–  hip://kitesdk.org/docs/1.1.0/morphlines/morphlines-­‐reference-­‐guide.html	
  
–  hips://github.com/typesafehub/config/blob/master/HOCON.md	
  	
  
•  MapReduce	
  Indexer	
  Tool	
  
–  hips://github.com/cloudera/search/tree/cdh5-­‐1.0.0_5.2.1/search-­‐mr	
  	
  
•  Crunch	
  Indexer	
  
–  hips://github.com/cloudera/search/tree/cdh5-­‐1.0.0_5.2.1/search-­‐crunch	
  	
  
•  Lily	
  HBase	
  Indexer	
  
–  hip://www.cloudera.com/documenta6on/enterprise/latest/topics/search_hbase_batch_indexer.html	
  	
  
What’s	
  Next	
  
•  Explore	
  other	
  analy6c	
  interfaces	
  
–  Banana,	
  Zoom	
  Data	
  
•  Spark	
  
–  Streaming	
  Data	
  
–  Complex	
  Analy6cs	
  à	
  Store	
  results	
  in	
  Solr	
  à	
  More	
  analy6cs!	
  
•  Index	
  Many	
  More	
  Collec6ons	
  
–  Create	
  a	
  Process:	
  	
  Data	
  research	
  à	
  Data	
  Model	
  Design	
  à	
  Implement	
  
•  Self-­‐Service	
  Inges6on	
  
–  Document	
  processes	
  for	
  others	
  to	
  use	
  
–  Templates	
  for	
  inges6on	
  
•  Hire	
  Search	
  Technologies!	
  
QUESTIONS?	
  ANSWERS!	
  
Thank	
  you!	
  
Searching The Enterprise Data Lake With Solr  - Watch Us Do It!: Presented by Paul Nelson, Search Technologies
Searching The Enterprise Data Lake With Solr  - Watch Us Do It!: Presented by Paul Nelson, Search Technologies
Searching The Enterprise Data Lake With Solr  - Watch Us Do It!: Presented by Paul Nelson, Search Technologies
Searching The Enterprise Data Lake With Solr  - Watch Us Do It!: Presented by Paul Nelson, Search Technologies
Searching The Enterprise Data Lake With Solr  - Watch Us Do It!: Presented by Paul Nelson, Search Technologies
Searching The Enterprise Data Lake With Solr  - Watch Us Do It!: Presented by Paul Nelson, Search Technologies
Searching The Enterprise Data Lake With Solr  - Watch Us Do It!: Presented by Paul Nelson, Search Technologies
Searching The Enterprise Data Lake With Solr  - Watch Us Do It!: Presented by Paul Nelson, Search Technologies
Searching The Enterprise Data Lake With Solr  - Watch Us Do It!: Presented by Paul Nelson, Search Technologies
Searching The Enterprise Data Lake With Solr  - Watch Us Do It!: Presented by Paul Nelson, Search Technologies
Searching The Enterprise Data Lake With Solr  - Watch Us Do It!: Presented by Paul Nelson, Search Technologies
Searching The Enterprise Data Lake With Solr  - Watch Us Do It!: Presented by Paul Nelson, Search Technologies
Searching The Enterprise Data Lake With Solr  - Watch Us Do It!: Presented by Paul Nelson, Search Technologies
Searching The Enterprise Data Lake With Solr  - Watch Us Do It!: Presented by Paul Nelson, Search Technologies
Searching The Enterprise Data Lake With Solr  - Watch Us Do It!: Presented by Paul Nelson, Search Technologies
Searching The Enterprise Data Lake With Solr  - Watch Us Do It!: Presented by Paul Nelson, Search Technologies
Searching The Enterprise Data Lake With Solr  - Watch Us Do It!: Presented by Paul Nelson, Search Technologies
Searching The Enterprise Data Lake With Solr  - Watch Us Do It!: Presented by Paul Nelson, Search Technologies

More Related Content

What's hot

SearchHub - How to Spend Your Summer Keeping it Real: Presented by Grant Inge...
SearchHub - How to Spend Your Summer Keeping it Real: Presented by Grant Inge...SearchHub - How to Spend Your Summer Keeping it Real: Presented by Grant Inge...
SearchHub - How to Spend Your Summer Keeping it Real: Presented by Grant Inge...Lucidworks
 
Webinar: Solr & Fusion for Big Data
Webinar: Solr & Fusion for Big DataWebinar: Solr & Fusion for Big Data
Webinar: Solr & Fusion for Big DataLucidworks
 
Building a Large Scale SEO/SEM Application with Apache Solr: Presented by Rah...
Building a Large Scale SEO/SEM Application with Apache Solr: Presented by Rah...Building a Large Scale SEO/SEM Application with Apache Solr: Presented by Rah...
Building a Large Scale SEO/SEM Application with Apache Solr: Presented by Rah...Lucidworks
 
Efficient Scalable Search in a Multi-Tenant Environment: Presented by Harry H...
Efficient Scalable Search in a Multi-Tenant Environment: Presented by Harry H...Efficient Scalable Search in a Multi-Tenant Environment: Presented by Harry H...
Efficient Scalable Search in a Multi-Tenant Environment: Presented by Harry H...Lucidworks
 
Searching for Better Code: Presented by Grant Ingersoll, Lucidworks
Searching for Better Code: Presented by Grant Ingersoll, LucidworksSearching for Better Code: Presented by Grant Ingersoll, Lucidworks
Searching for Better Code: Presented by Grant Ingersoll, LucidworksLucidworks
 
Solr on HDFS - Past, Present, and Future: Presented by Mark Miller, Cloudera
Solr on HDFS - Past, Present, and Future: Presented by Mark Miller, ClouderaSolr on HDFS - Past, Present, and Future: Presented by Mark Miller, Cloudera
Solr on HDFS - Past, Present, and Future: Presented by Mark Miller, ClouderaLucidworks
 
Spark as a Platform to Support Multi-Tenancy and Many Kinds of Data Applicati...
Spark as a Platform to Support Multi-Tenancy and Many Kinds of Data Applicati...Spark as a Platform to Support Multi-Tenancy and Many Kinds of Data Applicati...
Spark as a Platform to Support Multi-Tenancy and Many Kinds of Data Applicati...Spark Summit
 
Solr cloud the 'search first' nosql database extended deep dive
Solr cloud the 'search first' nosql database   extended deep diveSolr cloud the 'search first' nosql database   extended deep dive
Solr cloud the 'search first' nosql database extended deep divelucenerevolution
 
Building a Large Scale SEO/SEM Application with Apache Solr
Building a Large Scale SEO/SEM Application with Apache SolrBuilding a Large Scale SEO/SEM Application with Apache Solr
Building a Large Scale SEO/SEM Application with Apache SolrRahul Jain
 
Building Google-in-a-box: using Apache SolrCloud and Bigtop to index your big...
Building Google-in-a-box: using Apache SolrCloud and Bigtop to index your big...Building Google-in-a-box: using Apache SolrCloud and Bigtop to index your big...
Building Google-in-a-box: using Apache SolrCloud and Bigtop to index your big...rhatr
 
Streaming Analytics with Spark, Kafka, Cassandra and Akka by Helena Edelson
Streaming Analytics with Spark, Kafka, Cassandra and Akka by Helena EdelsonStreaming Analytics with Spark, Kafka, Cassandra and Akka by Helena Edelson
Streaming Analytics with Spark, Kafka, Cassandra and Akka by Helena EdelsonSpark Summit
 
Centralized log-management-with-elastic-stack
Centralized log-management-with-elastic-stackCentralized log-management-with-elastic-stack
Centralized log-management-with-elastic-stackRich Lee
 
Case study of Rujhaan.com (A social news app )
Case study of Rujhaan.com (A social news app )Case study of Rujhaan.com (A social news app )
Case study of Rujhaan.com (A social news app )Rahul Jain
 
Securing Spark Applications by Kostas Sakellis and Marcelo Vanzin
Securing Spark Applications by Kostas Sakellis and Marcelo VanzinSecuring Spark Applications by Kostas Sakellis and Marcelo Vanzin
Securing Spark Applications by Kostas Sakellis and Marcelo VanzinSpark Summit
 
Stream All Things—Patterns of Modern Data Integration with Gwen Shapira
Stream All Things—Patterns of Modern Data Integration with Gwen ShapiraStream All Things—Patterns of Modern Data Integration with Gwen Shapira
Stream All Things—Patterns of Modern Data Integration with Gwen ShapiraDatabricks
 
Solr + Hadoop = Big Data Search
Solr + Hadoop = Big Data SearchSolr + Hadoop = Big Data Search
Solr + Hadoop = Big Data SearchMark Miller
 
Lessons Learned in Deploying the ELK Stack (Elasticsearch, Logstash, and Kibana)
Lessons Learned in Deploying the ELK Stack (Elasticsearch, Logstash, and Kibana)Lessons Learned in Deploying the ELK Stack (Elasticsearch, Logstash, and Kibana)
Lessons Learned in Deploying the ELK Stack (Elasticsearch, Logstash, and Kibana)Cohesive Networks
 
Dynamic DDL: Adding Structure to Streaming Data on the Fly with David Winters...
Dynamic DDL: Adding Structure to Streaming Data on the Fly with David Winters...Dynamic DDL: Adding Structure to Streaming Data on the Fly with David Winters...
Dynamic DDL: Adding Structure to Streaming Data on the Fly with David Winters...Databricks
 
Ubiquitous Solr - A Database's Not-So-Evil Twin: Presented by Ayon Sinha, Wal...
Ubiquitous Solr - A Database's Not-So-Evil Twin: Presented by Ayon Sinha, Wal...Ubiquitous Solr - A Database's Not-So-Evil Twin: Presented by Ayon Sinha, Wal...
Ubiquitous Solr - A Database's Not-So-Evil Twin: Presented by Ayon Sinha, Wal...Lucidworks
 

What's hot (20)

SearchHub - How to Spend Your Summer Keeping it Real: Presented by Grant Inge...
SearchHub - How to Spend Your Summer Keeping it Real: Presented by Grant Inge...SearchHub - How to Spend Your Summer Keeping it Real: Presented by Grant Inge...
SearchHub - How to Spend Your Summer Keeping it Real: Presented by Grant Inge...
 
Webinar: Solr & Fusion for Big Data
Webinar: Solr & Fusion for Big DataWebinar: Solr & Fusion for Big Data
Webinar: Solr & Fusion for Big Data
 
Building a Large Scale SEO/SEM Application with Apache Solr: Presented by Rah...
Building a Large Scale SEO/SEM Application with Apache Solr: Presented by Rah...Building a Large Scale SEO/SEM Application with Apache Solr: Presented by Rah...
Building a Large Scale SEO/SEM Application with Apache Solr: Presented by Rah...
 
Efficient Scalable Search in a Multi-Tenant Environment: Presented by Harry H...
Efficient Scalable Search in a Multi-Tenant Environment: Presented by Harry H...Efficient Scalable Search in a Multi-Tenant Environment: Presented by Harry H...
Efficient Scalable Search in a Multi-Tenant Environment: Presented by Harry H...
 
Searching for Better Code: Presented by Grant Ingersoll, Lucidworks
Searching for Better Code: Presented by Grant Ingersoll, LucidworksSearching for Better Code: Presented by Grant Ingersoll, Lucidworks
Searching for Better Code: Presented by Grant Ingersoll, Lucidworks
 
Solr on HDFS - Past, Present, and Future: Presented by Mark Miller, Cloudera
Solr on HDFS - Past, Present, and Future: Presented by Mark Miller, ClouderaSolr on HDFS - Past, Present, and Future: Presented by Mark Miller, Cloudera
Solr on HDFS - Past, Present, and Future: Presented by Mark Miller, Cloudera
 
Spark as a Platform to Support Multi-Tenancy and Many Kinds of Data Applicati...
Spark as a Platform to Support Multi-Tenancy and Many Kinds of Data Applicati...Spark as a Platform to Support Multi-Tenancy and Many Kinds of Data Applicati...
Spark as a Platform to Support Multi-Tenancy and Many Kinds of Data Applicati...
 
Solr cloud the 'search first' nosql database extended deep dive
Solr cloud the 'search first' nosql database   extended deep diveSolr cloud the 'search first' nosql database   extended deep dive
Solr cloud the 'search first' nosql database extended deep dive
 
Building a Large Scale SEO/SEM Application with Apache Solr
Building a Large Scale SEO/SEM Application with Apache SolrBuilding a Large Scale SEO/SEM Application with Apache Solr
Building a Large Scale SEO/SEM Application with Apache Solr
 
Building Google-in-a-box: using Apache SolrCloud and Bigtop to index your big...
Building Google-in-a-box: using Apache SolrCloud and Bigtop to index your big...Building Google-in-a-box: using Apache SolrCloud and Bigtop to index your big...
Building Google-in-a-box: using Apache SolrCloud and Bigtop to index your big...
 
Streaming Analytics with Spark, Kafka, Cassandra and Akka by Helena Edelson
Streaming Analytics with Spark, Kafka, Cassandra and Akka by Helena EdelsonStreaming Analytics with Spark, Kafka, Cassandra and Akka by Helena Edelson
Streaming Analytics with Spark, Kafka, Cassandra and Akka by Helena Edelson
 
Centralized log-management-with-elastic-stack
Centralized log-management-with-elastic-stackCentralized log-management-with-elastic-stack
Centralized log-management-with-elastic-stack
 
Case study of Rujhaan.com (A social news app )
Case study of Rujhaan.com (A social news app )Case study of Rujhaan.com (A social news app )
Case study of Rujhaan.com (A social news app )
 
Securing Spark Applications by Kostas Sakellis and Marcelo Vanzin
Securing Spark Applications by Kostas Sakellis and Marcelo VanzinSecuring Spark Applications by Kostas Sakellis and Marcelo Vanzin
Securing Spark Applications by Kostas Sakellis and Marcelo Vanzin
 
Stream All Things—Patterns of Modern Data Integration with Gwen Shapira
Stream All Things—Patterns of Modern Data Integration with Gwen ShapiraStream All Things—Patterns of Modern Data Integration with Gwen Shapira
Stream All Things—Patterns of Modern Data Integration with Gwen Shapira
 
Solr + Hadoop = Big Data Search
Solr + Hadoop = Big Data SearchSolr + Hadoop = Big Data Search
Solr + Hadoop = Big Data Search
 
Lessons Learned in Deploying the ELK Stack (Elasticsearch, Logstash, and Kibana)
Lessons Learned in Deploying the ELK Stack (Elasticsearch, Logstash, and Kibana)Lessons Learned in Deploying the ELK Stack (Elasticsearch, Logstash, and Kibana)
Lessons Learned in Deploying the ELK Stack (Elasticsearch, Logstash, and Kibana)
 
Dynamic DDL: Adding Structure to Streaming Data on the Fly with David Winters...
Dynamic DDL: Adding Structure to Streaming Data on the Fly with David Winters...Dynamic DDL: Adding Structure to Streaming Data on the Fly with David Winters...
Dynamic DDL: Adding Structure to Streaming Data on the Fly with David Winters...
 
Ubiquitous Solr - A Database's Not-So-Evil Twin: Presented by Ayon Sinha, Wal...
Ubiquitous Solr - A Database's Not-So-Evil Twin: Presented by Ayon Sinha, Wal...Ubiquitous Solr - A Database's Not-So-Evil Twin: Presented by Ayon Sinha, Wal...
Ubiquitous Solr - A Database's Not-So-Evil Twin: Presented by Ayon Sinha, Wal...
 
What's new in solr june 2014
What's new in solr june 2014What's new in solr june 2014
What's new in solr june 2014
 

Viewers also liked

Build a Great Application in Minutes!: Presented by Stefan Olafsson, Twigkit
Build a Great Application in Minutes!: Presented by Stefan Olafsson, TwigkitBuild a Great Application in Minutes!: Presented by Stefan Olafsson, Twigkit
Build a Great Application in Minutes!: Presented by Stefan Olafsson, TwigkitLucidworks
 
Parallel SQL and Analytics with Solr: Presented by Yonik Seeley, Cloudera
Parallel SQL and Analytics with Solr: Presented by Yonik Seeley, ClouderaParallel SQL and Analytics with Solr: Presented by Yonik Seeley, Cloudera
Parallel SQL and Analytics with Solr: Presented by Yonik Seeley, ClouderaLucidworks
 
Solr+Hadoop = Big Data Search
Solr+Hadoop = Big Data SearchSolr+Hadoop = Big Data Search
Solr+Hadoop = Big Data SearchCloudera, Inc.
 
The First Class Integration of Solr with Hadoop
The First Class Integration of Solr with HadoopThe First Class Integration of Solr with Hadoop
The First Class Integration of Solr with Hadooplucenerevolution
 
Large Scale ETL for Hadoop and Cloudera Search using Morphlines
Large Scale ETL for Hadoop and Cloudera Search using MorphlinesLarge Scale ETL for Hadoop and Cloudera Search using Morphlines
Large Scale ETL for Hadoop and Cloudera Search using Morphlineswhoschek
 
Search UI and Lucidworks View: Presented by Josh Ellinger, Lucidworks
Search UI and Lucidworks View: Presented by Josh Ellinger, LucidworksSearch UI and Lucidworks View: Presented by Josh Ellinger, Lucidworks
Search UI and Lucidworks View: Presented by Josh Ellinger, LucidworksLucidworks
 
Cloudera Search Webinar: Big Data Search, Bigger Insights
Cloudera Search Webinar: Big Data Search, Bigger InsightsCloudera Search Webinar: Big Data Search, Bigger Insights
Cloudera Search Webinar: Big Data Search, Bigger InsightsCloudera, Inc.
 
World of Watson 2016 - Architecting your Analytics House
World of Watson 2016 - Architecting your Analytics HouseWorld of Watson 2016 - Architecting your Analytics House
World of Watson 2016 - Architecting your Analytics HouseKeith Redman
 
Real-Time Analytics with Solr: Presented by Yonik Seeley, Cloudera
Real-Time Analytics with Solr: Presented by Yonik Seeley, ClouderaReal-Time Analytics with Solr: Presented by Yonik Seeley, Cloudera
Real-Time Analytics with Solr: Presented by Yonik Seeley, ClouderaLucidworks
 
Ovum Fireside Chat: Governing the data lake - Understanding what's in there
Ovum Fireside Chat: Governing the data lake - Understanding what's in thereOvum Fireside Chat: Governing the data lake - Understanding what's in there
Ovum Fireside Chat: Governing the data lake - Understanding what's in thereZaloni
 
Integrate Solr with real-time stream processing applications
Integrate Solr with real-time stream processing applicationsIntegrate Solr with real-time stream processing applications
Integrate Solr with real-time stream processing applicationslucenerevolution
 
Time Series Processing with Solr and Spark: Presented by Josef Adersberger, Q...
Time Series Processing with Solr and Spark: Presented by Josef Adersberger, Q...Time Series Processing with Solr and Spark: Presented by Josef Adersberger, Q...
Time Series Processing with Solr and Spark: Presented by Josef Adersberger, Q...Lucidworks
 
The Search Is Over: Integrating Solr and Hadoop in the Same Cluster to Simpli...
The Search Is Over: Integrating Solr and Hadoop in the Same Cluster to Simpli...The Search Is Over: Integrating Solr and Hadoop in the Same Cluster to Simpli...
The Search Is Over: Integrating Solr and Hadoop in the Same Cluster to Simpli...lucenerevolution
 
Using Morphlines for On-the-Fly ETL
Using Morphlines for On-the-Fly ETLUsing Morphlines for On-the-Fly ETL
Using Morphlines for On-the-Fly ETLCloudera, Inc.
 
Search++: Cognitive transformation of human-system interaction: Presented by ...
Search++: Cognitive transformation of human-system interaction: Presented by ...Search++: Cognitive transformation of human-system interaction: Presented by ...
Search++: Cognitive transformation of human-system interaction: Presented by ...Lucidworks
 
Understanding Metadata: Why it's essential to your big data solution and how ...
Understanding Metadata: Why it's essential to your big data solution and how ...Understanding Metadata: Why it's essential to your big data solution and how ...
Understanding Metadata: Why it's essential to your big data solution and how ...Zaloni
 
Aesop change data propagation
Aesop change data propagationAesop change data propagation
Aesop change data propagationRegunath B
 
Visualize Solr Data with Banana: Presented by Andrew Thanalertvisuti, Lucidworks
Visualize Solr Data with Banana: Presented by Andrew Thanalertvisuti, LucidworksVisualize Solr Data with Banana: Presented by Andrew Thanalertvisuti, Lucidworks
Visualize Solr Data with Banana: Presented by Andrew Thanalertvisuti, LucidworksLucidworks
 
Etsy Search: How We Index and Query 26 Million One-of-a-kind Items
Etsy Search: How We Index and Query 26 Million One-of-a-kind ItemsEtsy Search: How We Index and Query 26 Million One-of-a-kind Items
Etsy Search: How We Index and Query 26 Million One-of-a-kind ItemsC4Media
 

Viewers also liked (20)

Build a Great Application in Minutes!: Presented by Stefan Olafsson, Twigkit
Build a Great Application in Minutes!: Presented by Stefan Olafsson, TwigkitBuild a Great Application in Minutes!: Presented by Stefan Olafsson, Twigkit
Build a Great Application in Minutes!: Presented by Stefan Olafsson, Twigkit
 
Parallel SQL and Analytics with Solr: Presented by Yonik Seeley, Cloudera
Parallel SQL and Analytics with Solr: Presented by Yonik Seeley, ClouderaParallel SQL and Analytics with Solr: Presented by Yonik Seeley, Cloudera
Parallel SQL and Analytics with Solr: Presented by Yonik Seeley, Cloudera
 
Solr+Hadoop = Big Data Search
Solr+Hadoop = Big Data SearchSolr+Hadoop = Big Data Search
Solr+Hadoop = Big Data Search
 
The First Class Integration of Solr with Hadoop
The First Class Integration of Solr with HadoopThe First Class Integration of Solr with Hadoop
The First Class Integration of Solr with Hadoop
 
Large Scale ETL for Hadoop and Cloudera Search using Morphlines
Large Scale ETL for Hadoop and Cloudera Search using MorphlinesLarge Scale ETL for Hadoop and Cloudera Search using Morphlines
Large Scale ETL for Hadoop and Cloudera Search using Morphlines
 
Search UI and Lucidworks View: Presented by Josh Ellinger, Lucidworks
Search UI and Lucidworks View: Presented by Josh Ellinger, LucidworksSearch UI and Lucidworks View: Presented by Josh Ellinger, Lucidworks
Search UI and Lucidworks View: Presented by Josh Ellinger, Lucidworks
 
Cloudera Search Webinar: Big Data Search, Bigger Insights
Cloudera Search Webinar: Big Data Search, Bigger InsightsCloudera Search Webinar: Big Data Search, Bigger Insights
Cloudera Search Webinar: Big Data Search, Bigger Insights
 
World of Watson 2016 - Architecting your Analytics House
World of Watson 2016 - Architecting your Analytics HouseWorld of Watson 2016 - Architecting your Analytics House
World of Watson 2016 - Architecting your Analytics House
 
Real-Time Analytics with Solr: Presented by Yonik Seeley, Cloudera
Real-Time Analytics with Solr: Presented by Yonik Seeley, ClouderaReal-Time Analytics with Solr: Presented by Yonik Seeley, Cloudera
Real-Time Analytics with Solr: Presented by Yonik Seeley, Cloudera
 
Ovum Fireside Chat: Governing the data lake - Understanding what's in there
Ovum Fireside Chat: Governing the data lake - Understanding what's in thereOvum Fireside Chat: Governing the data lake - Understanding what's in there
Ovum Fireside Chat: Governing the data lake - Understanding what's in there
 
Integrate Solr with real-time stream processing applications
Integrate Solr with real-time stream processing applicationsIntegrate Solr with real-time stream processing applications
Integrate Solr with real-time stream processing applications
 
Time Series Processing with Solr and Spark: Presented by Josef Adersberger, Q...
Time Series Processing with Solr and Spark: Presented by Josef Adersberger, Q...Time Series Processing with Solr and Spark: Presented by Josef Adersberger, Q...
Time Series Processing with Solr and Spark: Presented by Josef Adersberger, Q...
 
The Search Is Over: Integrating Solr and Hadoop in the Same Cluster to Simpli...
The Search Is Over: Integrating Solr and Hadoop in the Same Cluster to Simpli...The Search Is Over: Integrating Solr and Hadoop in the Same Cluster to Simpli...
The Search Is Over: Integrating Solr and Hadoop in the Same Cluster to Simpli...
 
Using Morphlines for On-the-Fly ETL
Using Morphlines for On-the-Fly ETLUsing Morphlines for On-the-Fly ETL
Using Morphlines for On-the-Fly ETL
 
Tuning Solr & Pipeline for Logs
Tuning Solr & Pipeline for LogsTuning Solr & Pipeline for Logs
Tuning Solr & Pipeline for Logs
 
Search++: Cognitive transformation of human-system interaction: Presented by ...
Search++: Cognitive transformation of human-system interaction: Presented by ...Search++: Cognitive transformation of human-system interaction: Presented by ...
Search++: Cognitive transformation of human-system interaction: Presented by ...
 
Understanding Metadata: Why it's essential to your big data solution and how ...
Understanding Metadata: Why it's essential to your big data solution and how ...Understanding Metadata: Why it's essential to your big data solution and how ...
Understanding Metadata: Why it's essential to your big data solution and how ...
 
Aesop change data propagation
Aesop change data propagationAesop change data propagation
Aesop change data propagation
 
Visualize Solr Data with Banana: Presented by Andrew Thanalertvisuti, Lucidworks
Visualize Solr Data with Banana: Presented by Andrew Thanalertvisuti, LucidworksVisualize Solr Data with Banana: Presented by Andrew Thanalertvisuti, Lucidworks
Visualize Solr Data with Banana: Presented by Andrew Thanalertvisuti, Lucidworks
 
Etsy Search: How We Index and Query 26 Million One-of-a-kind Items
Etsy Search: How We Index and Query 26 Million One-of-a-kind ItemsEtsy Search: How We Index and Query 26 Million One-of-a-kind Items
Etsy Search: How We Index and Query 26 Million One-of-a-kind Items
 

Similar to Searching The Enterprise Data Lake With Solr - Watch Us Do It!: Presented by Paul Nelson, Search Technologies

5 Things that Make Hadoop a Game Changer
5 Things that Make Hadoop a Game Changer5 Things that Make Hadoop a Game Changer
5 Things that Make Hadoop a Game ChangerCaserta
 
Séminaire Big Data Alter Way - Elasticsearch - octobre 2014
Séminaire Big Data Alter Way - Elasticsearch - octobre 2014Séminaire Big Data Alter Way - Elasticsearch - octobre 2014
Séminaire Big Data Alter Way - Elasticsearch - octobre 2014ALTER WAY
 
The modern analytics architecture
The modern analytics architectureThe modern analytics architecture
The modern analytics architectureJoseph D'Antoni
 
Ncku csie talk about Spark
Ncku csie talk about SparkNcku csie talk about Spark
Ncku csie talk about SparkGiivee The
 
The Data Lake and Getting Buisnesses the Big Data Insights They Need
The Data Lake and Getting Buisnesses the Big Data Insights They NeedThe Data Lake and Getting Buisnesses the Big Data Insights They Need
The Data Lake and Getting Buisnesses the Big Data Insights They NeedDunn Solutions Group
 
05_Decision Support and OLAP.pdf
05_Decision Support and OLAP.pdf05_Decision Support and OLAP.pdf
05_Decision Support and OLAP.pdfINyomanSwitrayana
 
Testing Big Data: Automated Testing of Hadoop with QuerySurge
Testing Big Data: Automated  Testing of Hadoop with QuerySurgeTesting Big Data: Automated  Testing of Hadoop with QuerySurge
Testing Big Data: Automated Testing of Hadoop with QuerySurgeRTTS
 
Transform from database professional to a Big Data architect
Transform from database professional to a Big Data architectTransform from database professional to a Big Data architect
Transform from database professional to a Big Data architectSaurabh K. Gupta
 
The Big Data Gusher: Big Data Analytics, the Internet of Things and the Oil B...
The Big Data Gusher: Big Data Analytics, the Internet of Things and the Oil B...The Big Data Gusher: Big Data Analytics, the Internet of Things and the Oil B...
The Big Data Gusher: Big Data Analytics, the Internet of Things and the Oil B...Platfora
 
No sql and sql - open analytics summit
No sql and sql - open analytics summitNo sql and sql - open analytics summit
No sql and sql - open analytics summitOpen Analytics
 
Cloudian 451-hortonworks - webinar
Cloudian 451-hortonworks - webinarCloudian 451-hortonworks - webinar
Cloudian 451-hortonworks - webinarHortonworks
 
Big Data + Sentiment Analysis = Awesome
Big Data + Sentiment Analysis = AwesomeBig Data + Sentiment Analysis = Awesome
Big Data + Sentiment Analysis = AwesomeAdel Rahimi
 
Utah Big Mountain Big Data Baby Steps (4-12-2014) Final
Utah Big Mountain   Big Data Baby Steps (4-12-2014) FinalUtah Big Mountain   Big Data Baby Steps (4-12-2014) Final
Utah Big Mountain Big Data Baby Steps (4-12-2014) FinalNick Baguley
 
Hadoop is dead - long live Hadoop | BiDaTA 2013 Genoa
Hadoop is dead - long live Hadoop | BiDaTA 2013 GenoaHadoop is dead - long live Hadoop | BiDaTA 2013 Genoa
Hadoop is dead - long live Hadoop | BiDaTA 2013 Genoalarsgeorge
 
Jeremy Engle's slides from Redshift / Big Data meetup on July 13, 2017
Jeremy Engle's slides from Redshift / Big Data meetup on July 13, 2017Jeremy Engle's slides from Redshift / Big Data meetup on July 13, 2017
Jeremy Engle's slides from Redshift / Big Data meetup on July 13, 2017AWS Chicago
 
Navigating SAP’s Integration Options (Mastering SAP Technologies 2013)
Navigating SAP’s Integration Options (Mastering SAP Technologies 2013)Navigating SAP’s Integration Options (Mastering SAP Technologies 2013)
Navigating SAP’s Integration Options (Mastering SAP Technologies 2013)Sascha Wenninger
 
Turn Data Into Actionable Insights - StampedeCon 2016
Turn Data Into Actionable Insights - StampedeCon 2016Turn Data Into Actionable Insights - StampedeCon 2016
Turn Data Into Actionable Insights - StampedeCon 2016StampedeCon
 

Similar to Searching The Enterprise Data Lake With Solr - Watch Us Do It!: Presented by Paul Nelson, Search Technologies (20)

5 Things that Make Hadoop a Game Changer
5 Things that Make Hadoop a Game Changer5 Things that Make Hadoop a Game Changer
5 Things that Make Hadoop a Game Changer
 
Séminaire Big Data Alter Way - Elasticsearch - octobre 2014
Séminaire Big Data Alter Way - Elasticsearch - octobre 2014Séminaire Big Data Alter Way - Elasticsearch - octobre 2014
Séminaire Big Data Alter Way - Elasticsearch - octobre 2014
 
The modern analytics architecture
The modern analytics architectureThe modern analytics architecture
The modern analytics architecture
 
Ncku csie talk about Spark
Ncku csie talk about SparkNcku csie talk about Spark
Ncku csie talk about Spark
 
The Data Lake and Getting Buisnesses the Big Data Insights They Need
The Data Lake and Getting Buisnesses the Big Data Insights They NeedThe Data Lake and Getting Buisnesses the Big Data Insights They Need
The Data Lake and Getting Buisnesses the Big Data Insights They Need
 
05_Decision Support and OLAP.pdf
05_Decision Support and OLAP.pdf05_Decision Support and OLAP.pdf
05_Decision Support and OLAP.pdf
 
Testing Big Data: Automated Testing of Hadoop with QuerySurge
Testing Big Data: Automated  Testing of Hadoop with QuerySurgeTesting Big Data: Automated  Testing of Hadoop with QuerySurge
Testing Big Data: Automated Testing of Hadoop with QuerySurge
 
Transform from database professional to a Big Data architect
Transform from database professional to a Big Data architectTransform from database professional to a Big Data architect
Transform from database professional to a Big Data architect
 
The Big Data Gusher: Big Data Analytics, the Internet of Things and the Oil B...
The Big Data Gusher: Big Data Analytics, the Internet of Things and the Oil B...The Big Data Gusher: Big Data Analytics, the Internet of Things and the Oil B...
The Big Data Gusher: Big Data Analytics, the Internet of Things and the Oil B...
 
No sql and sql - open analytics summit
No sql and sql - open analytics summitNo sql and sql - open analytics summit
No sql and sql - open analytics summit
 
Cloudian 451-hortonworks - webinar
Cloudian 451-hortonworks - webinarCloudian 451-hortonworks - webinar
Cloudian 451-hortonworks - webinar
 
Big Data + Sentiment Analysis = Awesome
Big Data + Sentiment Analysis = AwesomeBig Data + Sentiment Analysis = Awesome
Big Data + Sentiment Analysis = Awesome
 
Utah Big Mountain Big Data Baby Steps (4-12-2014) Final
Utah Big Mountain   Big Data Baby Steps (4-12-2014) FinalUtah Big Mountain   Big Data Baby Steps (4-12-2014) Final
Utah Big Mountain Big Data Baby Steps (4-12-2014) Final
 
Big Data, Baby Steps
Big Data, Baby StepsBig Data, Baby Steps
Big Data, Baby Steps
 
The EDW Ecosystem
The EDW EcosystemThe EDW Ecosystem
The EDW Ecosystem
 
Hadoop is dead - long live Hadoop | BiDaTA 2013 Genoa
Hadoop is dead - long live Hadoop | BiDaTA 2013 GenoaHadoop is dead - long live Hadoop | BiDaTA 2013 Genoa
Hadoop is dead - long live Hadoop | BiDaTA 2013 Genoa
 
Jeremy Engle's slides from Redshift / Big Data meetup on July 13, 2017
Jeremy Engle's slides from Redshift / Big Data meetup on July 13, 2017Jeremy Engle's slides from Redshift / Big Data meetup on July 13, 2017
Jeremy Engle's slides from Redshift / Big Data meetup on July 13, 2017
 
Navigating SAP’s Integration Options (Mastering SAP Technologies 2013)
Navigating SAP’s Integration Options (Mastering SAP Technologies 2013)Navigating SAP’s Integration Options (Mastering SAP Technologies 2013)
Navigating SAP’s Integration Options (Mastering SAP Technologies 2013)
 
Turn Data Into Actionable Insights - StampedeCon 2016
Turn Data Into Actionable Insights - StampedeCon 2016Turn Data Into Actionable Insights - StampedeCon 2016
Turn Data Into Actionable Insights - StampedeCon 2016
 
Big data
Big dataBig data
Big data
 

More from Lucidworks

Search is the Tip of the Spear for Your B2B eCommerce Strategy
Search is the Tip of the Spear for Your B2B eCommerce StrategySearch is the Tip of the Spear for Your B2B eCommerce Strategy
Search is the Tip of the Spear for Your B2B eCommerce StrategyLucidworks
 
Drive Agent Effectiveness in Salesforce
Drive Agent Effectiveness in SalesforceDrive Agent Effectiveness in Salesforce
Drive Agent Effectiveness in SalesforceLucidworks
 
How Crate & Barrel Connects Shoppers with Relevant Products
How Crate & Barrel Connects Shoppers with Relevant ProductsHow Crate & Barrel Connects Shoppers with Relevant Products
How Crate & Barrel Connects Shoppers with Relevant ProductsLucidworks
 
Lucidworks & IMRG Webinar – Best-In-Class Retail Product Discovery
Lucidworks & IMRG Webinar – Best-In-Class Retail Product DiscoveryLucidworks & IMRG Webinar – Best-In-Class Retail Product Discovery
Lucidworks & IMRG Webinar – Best-In-Class Retail Product DiscoveryLucidworks
 
Connected Experiences Are Personalized Experiences
Connected Experiences Are Personalized ExperiencesConnected Experiences Are Personalized Experiences
Connected Experiences Are Personalized ExperiencesLucidworks
 
Intelligent Insight Driven Policing with MC+A, Toronto Police Service and Luc...
Intelligent Insight Driven Policing with MC+A, Toronto Police Service and Luc...Intelligent Insight Driven Policing with MC+A, Toronto Police Service and Luc...
Intelligent Insight Driven Policing with MC+A, Toronto Police Service and Luc...Lucidworks
 
[Webinar] Intelligent Policing. Leveraging Data to more effectively Serve Com...
[Webinar] Intelligent Policing. Leveraging Data to more effectively Serve Com...[Webinar] Intelligent Policing. Leveraging Data to more effectively Serve Com...
[Webinar] Intelligent Policing. Leveraging Data to more effectively Serve Com...Lucidworks
 
Preparing for Peak in Ecommerce | eTail Asia 2020
Preparing for Peak in Ecommerce | eTail Asia 2020Preparing for Peak in Ecommerce | eTail Asia 2020
Preparing for Peak in Ecommerce | eTail Asia 2020Lucidworks
 
Accelerate The Path To Purchase With Product Discovery at Retail Innovation C...
Accelerate The Path To Purchase With Product Discovery at Retail Innovation C...Accelerate The Path To Purchase With Product Discovery at Retail Innovation C...
Accelerate The Path To Purchase With Product Discovery at Retail Innovation C...Lucidworks
 
AI-Powered Linguistics and Search with Fusion and Rosette
AI-Powered Linguistics and Search with Fusion and RosetteAI-Powered Linguistics and Search with Fusion and Rosette
AI-Powered Linguistics and Search with Fusion and RosetteLucidworks
 
The Service Industry After COVID-19: The Soul of Service in a Virtual Moment
The Service Industry After COVID-19: The Soul of Service in a Virtual MomentThe Service Industry After COVID-19: The Soul of Service in a Virtual Moment
The Service Industry After COVID-19: The Soul of Service in a Virtual MomentLucidworks
 
Webinar: Smart answers for employee and customer support after covid 19 - Europe
Webinar: Smart answers for employee and customer support after covid 19 - EuropeWebinar: Smart answers for employee and customer support after covid 19 - Europe
Webinar: Smart answers for employee and customer support after covid 19 - EuropeLucidworks
 
Smart Answers for Employee and Customer Support After COVID-19
Smart Answers for Employee and Customer Support After COVID-19Smart Answers for Employee and Customer Support After COVID-19
Smart Answers for Employee and Customer Support After COVID-19Lucidworks
 
Applying AI & Search in Europe - featuring 451 Research
Applying AI & Search in Europe - featuring 451 ResearchApplying AI & Search in Europe - featuring 451 Research
Applying AI & Search in Europe - featuring 451 ResearchLucidworks
 
Webinar: Accelerate Data Science with Fusion 5.1
Webinar: Accelerate Data Science with Fusion 5.1Webinar: Accelerate Data Science with Fusion 5.1
Webinar: Accelerate Data Science with Fusion 5.1Lucidworks
 
Webinar: 5 Must-Have Items You Need for Your 2020 Ecommerce Strategy
Webinar: 5 Must-Have Items You Need for Your 2020 Ecommerce StrategyWebinar: 5 Must-Have Items You Need for Your 2020 Ecommerce Strategy
Webinar: 5 Must-Have Items You Need for Your 2020 Ecommerce StrategyLucidworks
 
Where Search Meets Science and Style Meets Savings: Nordstrom Rack's Journey ...
Where Search Meets Science and Style Meets Savings: Nordstrom Rack's Journey ...Where Search Meets Science and Style Meets Savings: Nordstrom Rack's Journey ...
Where Search Meets Science and Style Meets Savings: Nordstrom Rack's Journey ...Lucidworks
 
Apply Knowledge Graphs and Search for Real-World Decision Intelligence
Apply Knowledge Graphs and Search for Real-World Decision IntelligenceApply Knowledge Graphs and Search for Real-World Decision Intelligence
Apply Knowledge Graphs and Search for Real-World Decision IntelligenceLucidworks
 
Webinar: Building a Business Case for Enterprise Search
Webinar: Building a Business Case for Enterprise SearchWebinar: Building a Business Case for Enterprise Search
Webinar: Building a Business Case for Enterprise SearchLucidworks
 
Why Insight Engines Matter in 2020 and Beyond
Why Insight Engines Matter in 2020 and BeyondWhy Insight Engines Matter in 2020 and Beyond
Why Insight Engines Matter in 2020 and BeyondLucidworks
 

More from Lucidworks (20)

Search is the Tip of the Spear for Your B2B eCommerce Strategy
Search is the Tip of the Spear for Your B2B eCommerce StrategySearch is the Tip of the Spear for Your B2B eCommerce Strategy
Search is the Tip of the Spear for Your B2B eCommerce Strategy
 
Drive Agent Effectiveness in Salesforce
Drive Agent Effectiveness in SalesforceDrive Agent Effectiveness in Salesforce
Drive Agent Effectiveness in Salesforce
 
How Crate & Barrel Connects Shoppers with Relevant Products
How Crate & Barrel Connects Shoppers with Relevant ProductsHow Crate & Barrel Connects Shoppers with Relevant Products
How Crate & Barrel Connects Shoppers with Relevant Products
 
Lucidworks & IMRG Webinar – Best-In-Class Retail Product Discovery
Lucidworks & IMRG Webinar – Best-In-Class Retail Product DiscoveryLucidworks & IMRG Webinar – Best-In-Class Retail Product Discovery
Lucidworks & IMRG Webinar – Best-In-Class Retail Product Discovery
 
Connected Experiences Are Personalized Experiences
Connected Experiences Are Personalized ExperiencesConnected Experiences Are Personalized Experiences
Connected Experiences Are Personalized Experiences
 
Intelligent Insight Driven Policing with MC+A, Toronto Police Service and Luc...
Intelligent Insight Driven Policing with MC+A, Toronto Police Service and Luc...Intelligent Insight Driven Policing with MC+A, Toronto Police Service and Luc...
Intelligent Insight Driven Policing with MC+A, Toronto Police Service and Luc...
 
[Webinar] Intelligent Policing. Leveraging Data to more effectively Serve Com...
[Webinar] Intelligent Policing. Leveraging Data to more effectively Serve Com...[Webinar] Intelligent Policing. Leveraging Data to more effectively Serve Com...
[Webinar] Intelligent Policing. Leveraging Data to more effectively Serve Com...
 
Preparing for Peak in Ecommerce | eTail Asia 2020
Preparing for Peak in Ecommerce | eTail Asia 2020Preparing for Peak in Ecommerce | eTail Asia 2020
Preparing for Peak in Ecommerce | eTail Asia 2020
 
Accelerate The Path To Purchase With Product Discovery at Retail Innovation C...
Accelerate The Path To Purchase With Product Discovery at Retail Innovation C...Accelerate The Path To Purchase With Product Discovery at Retail Innovation C...
Accelerate The Path To Purchase With Product Discovery at Retail Innovation C...
 
AI-Powered Linguistics and Search with Fusion and Rosette
AI-Powered Linguistics and Search with Fusion and RosetteAI-Powered Linguistics and Search with Fusion and Rosette
AI-Powered Linguistics and Search with Fusion and Rosette
 
The Service Industry After COVID-19: The Soul of Service in a Virtual Moment
The Service Industry After COVID-19: The Soul of Service in a Virtual MomentThe Service Industry After COVID-19: The Soul of Service in a Virtual Moment
The Service Industry After COVID-19: The Soul of Service in a Virtual Moment
 
Webinar: Smart answers for employee and customer support after covid 19 - Europe
Webinar: Smart answers for employee and customer support after covid 19 - EuropeWebinar: Smart answers for employee and customer support after covid 19 - Europe
Webinar: Smart answers for employee and customer support after covid 19 - Europe
 
Smart Answers for Employee and Customer Support After COVID-19
Smart Answers for Employee and Customer Support After COVID-19Smart Answers for Employee and Customer Support After COVID-19
Smart Answers for Employee and Customer Support After COVID-19
 
Applying AI & Search in Europe - featuring 451 Research
Applying AI & Search in Europe - featuring 451 ResearchApplying AI & Search in Europe - featuring 451 Research
Applying AI & Search in Europe - featuring 451 Research
 
Webinar: Accelerate Data Science with Fusion 5.1
Webinar: Accelerate Data Science with Fusion 5.1Webinar: Accelerate Data Science with Fusion 5.1
Webinar: Accelerate Data Science with Fusion 5.1
 
Webinar: 5 Must-Have Items You Need for Your 2020 Ecommerce Strategy
Webinar: 5 Must-Have Items You Need for Your 2020 Ecommerce StrategyWebinar: 5 Must-Have Items You Need for Your 2020 Ecommerce Strategy
Webinar: 5 Must-Have Items You Need for Your 2020 Ecommerce Strategy
 
Where Search Meets Science and Style Meets Savings: Nordstrom Rack's Journey ...
Where Search Meets Science and Style Meets Savings: Nordstrom Rack's Journey ...Where Search Meets Science and Style Meets Savings: Nordstrom Rack's Journey ...
Where Search Meets Science and Style Meets Savings: Nordstrom Rack's Journey ...
 
Apply Knowledge Graphs and Search for Real-World Decision Intelligence
Apply Knowledge Graphs and Search for Real-World Decision IntelligenceApply Knowledge Graphs and Search for Real-World Decision Intelligence
Apply Knowledge Graphs and Search for Real-World Decision Intelligence
 
Webinar: Building a Business Case for Enterprise Search
Webinar: Building a Business Case for Enterprise SearchWebinar: Building a Business Case for Enterprise Search
Webinar: Building a Business Case for Enterprise Search
 
Why Insight Engines Matter in 2020 and Beyond
Why Insight Engines Matter in 2020 and BeyondWhy Insight Engines Matter in 2020 and Beyond
Why Insight Engines Matter in 2020 and Beyond
 

Recently uploaded

"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor TurskyiFwdays
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Product School
 
Free and Effective: Making Flows Publicly Accessible, Yumi Ibrahimzade
Free and Effective: Making Flows Publicly Accessible, Yumi IbrahimzadeFree and Effective: Making Flows Publicly Accessible, Yumi Ibrahimzade
Free and Effective: Making Flows Publicly Accessible, Yumi IbrahimzadeCzechDreamin
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualityInflectra
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...Product School
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Alison B. Lowndes
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Product School
 
Search and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesBhaskar Mitra
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfCheryl Hung
 
Salesforce Adoption – Metrics, Methods, and Motivation, Antone Kom
Salesforce Adoption – Metrics, Methods, and Motivation, Antone KomSalesforce Adoption – Metrics, Methods, and Motivation, Antone Kom
Salesforce Adoption – Metrics, Methods, and Motivation, Antone KomCzechDreamin
 
10 Differences between Sales Cloud and CPQ, Blanka Doktorová
10 Differences between Sales Cloud and CPQ, Blanka Doktorová10 Differences between Sales Cloud and CPQ, Blanka Doktorová
10 Differences between Sales Cloud and CPQ, Blanka DoktorováCzechDreamin
 
Demystifying gRPC in .Net by John Staveley
Demystifying gRPC in .Net by John StaveleyDemystifying gRPC in .Net by John Staveley
Demystifying gRPC in .Net by John StaveleyJohn Staveley
 
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...CzechDreamin
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersSafe Software
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3DianaGray10
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...Elena Simperl
 
UiPath Test Automation using UiPath Test Suite series, part 2
UiPath Test Automation using UiPath Test Suite series, part 2UiPath Test Automation using UiPath Test Suite series, part 2
UiPath Test Automation using UiPath Test Suite series, part 2DianaGray10
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...Product School
 
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo DiehlFuture Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo DiehlPeter Udo Diehl
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...Product School
 

Recently uploaded (20)

"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
Free and Effective: Making Flows Publicly Accessible, Yumi Ibrahimzade
Free and Effective: Making Flows Publicly Accessible, Yumi IbrahimzadeFree and Effective: Making Flows Publicly Accessible, Yumi Ibrahimzade
Free and Effective: Making Flows Publicly Accessible, Yumi Ibrahimzade
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
Search and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical Futures
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
 
Salesforce Adoption – Metrics, Methods, and Motivation, Antone Kom
Salesforce Adoption – Metrics, Methods, and Motivation, Antone KomSalesforce Adoption – Metrics, Methods, and Motivation, Antone Kom
Salesforce Adoption – Metrics, Methods, and Motivation, Antone Kom
 
10 Differences between Sales Cloud and CPQ, Blanka Doktorová
10 Differences between Sales Cloud and CPQ, Blanka Doktorová10 Differences between Sales Cloud and CPQ, Blanka Doktorová
10 Differences between Sales Cloud and CPQ, Blanka Doktorová
 
Demystifying gRPC in .Net by John Staveley
Demystifying gRPC in .Net by John StaveleyDemystifying gRPC in .Net by John Staveley
Demystifying gRPC in .Net by John Staveley
 
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 
UiPath Test Automation using UiPath Test Suite series, part 2
UiPath Test Automation using UiPath Test Suite series, part 2UiPath Test Automation using UiPath Test Suite series, part 2
UiPath Test Automation using UiPath Test Suite series, part 2
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
 
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo DiehlFuture Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 

Searching The Enterprise Data Lake With Solr - Watch Us Do It!: Presented by Paul Nelson, Search Technologies

  • 1. O C T O B E R   1 1 -­‐ 1 4 ,   2 0 1 6     •     B O S T O N ,   M A  
  • 2. O C T O B E R   1 1 -­‐ 1 4 ,   2 0 1 6     •     B O S T O N ,   M A   Searching the Enterprise Data Lake with Solr - Watch us do it! Paul Nelson – pnelson@searchtechnologies.com Chief Architect, Search Technologies
  • 3. THERE  WILL  BE  A  DEMO   Stay  Tuned!  
  • 4. 205+  Search  Consultants  Worldwide   San  Diego   San  Jose,  CR   Cincinna6   Manila,  PH   Washington              (HQ)   •  Founded  2005   •  Deep  search  experLse   •  900+  customers  worldwide   •  Consistent  profitability   •  Search  engines  &  Big  Data   •  Vendor  independent   London,  UK   Frankfurt,  DE   Prague,  CZ  
  • 5. Agenda   •  The  Enterprise  Data  Lake  (EDL)   •  Why  Search  the  EDL?   •  The  Process   •  How  To:    Step  By  Step   •  And  then  what?  
  • 6. In  The  Beginning   Applica6on   Computer  Users   Database   Dashboards   Reports   Search  &   Troubleshoo6ng   Alerts  
  • 7. This  Evolved  to  Data  Warehouses   Many  Computer  Users   Dozens  of     Applica6ons  Dozens  of     Applica6ons  Dozens  of     Applica6ons  Dozens  of     Applica6ons  Dozens  of     Applica6ons  Dozens  of     Applica6ons   Extract   Transform   Load   Enterprise   Data  Warehouse   Dashboards   Reports   Search  &   Troubleshoo6ng   Alerts  
  • 8. And  Now  the  Enterprise  Data  Lake   Many,  many,  many    Computer  Users   Enterprise   Data  Lake   Dashboards   Reports   Search  &   Troubleshoo6ng   Alerts   Analyze   Hundreds  of   Applica6ons   Raw  Data   And  Processed   Data  
  • 9. What’s  new  about  the  Data  Lake?   •  Ingest  RAW  DATA   •  Keep  it  FOREVER   •  Make  it  ALL  AVAILABLE   •  Analyze  it  ONLY  WHEN  NEEDED   •  Do  it  at  MASSIVE  SCALE  
  • 10. Why  the  Data  Lake?   •  You  never  know  what’s  important  up  front   –  New  data  mining  techniques  invented  daily   –  Therefore,  keep  everything   •  There  is  too  much  data  variety   –  Therefore,  only  process  what  you  need   •  Save  money  by  not  ETL’ing  useless  stuff   •  There  are  many  different  use  cases   –  Shared  re-­‐use  of  data  by  anyone   –  Data  is  power!  Power  to  the  people!  
  • 11. But  Now  There’s  a  Problem:   •  10’s  of  thousands  of  databases   •  Billions  of  records     How  to  find  the  data  you  need?  
  • 12. SO  LET’S  SEARCH  THE  DATA  LAKE  
  • 13. “People  today  think  search  and  big   data  are  separate  but  in  two  or  three   years,  everyone  will  wonder  why  we   ever  thought  that.”     Doug  Cu?ng   Chief  Architect,  Cloudera   Creator  of  Lucene  &  Hadoop  
  • 14. The  Process   Ingest   1   Research   the  Data   2   Configure  Solr   3   Parse  &     Index   4   Search  &   Analyze   5   Produc6on   6  
  • 15. 1.    Ingest   HDFS   Load   Data   Hadoop  
  • 16. 2.    Research  the  Data   HDFS   Research   Hadoop  
  • 17. 3.    Configure  Solr   HDFS   solrconfig.xml   schema.xml   Hadoop  
  • 18. 4.    Parse  &  Index   HDFS   Index  Morphlines   Hadoop  
  • 19. 5.    Search  &  Analyze   HDFS   Index   Hadoop   Hue  Morphlines  
  • 20. 6.  Move  to  Produc6on   •  Tes6ng,  Quality  Control   –  Field  processing   –  Search  Features   –  Analy6cs   •  Incremental  Processing   –  Flume,  Spark  Streaming,  Incremental  Batches   •  Workflow  /  Scheduled  Jobs  (Oozie)   •  Security  Controls  
  • 21. WATCH  US  DO  IT!  
  • 22. Resources   •  HDFS  File  System  Commands   –  hips://hadoop.apache.org/docs/r2.7.3/hadoop-­‐project-­‐dist/hadoop-­‐common/FileSystemShell.html   •  solrctl  Reference  Guide   –  hips://www.cloudera.com/documenta6on/enterprise/5-­‐7-­‐x/topics/search_solrctl_ref.html     •  Morphlines  Reference  Guide   –  hip://kitesdk.org/docs/1.1.0/morphlines/morphlines-­‐reference-­‐guide.html   –  hips://github.com/typesafehub/config/blob/master/HOCON.md     •  MapReduce  Indexer  Tool   –  hips://github.com/cloudera/search/tree/cdh5-­‐1.0.0_5.2.1/search-­‐mr     •  Crunch  Indexer   –  hips://github.com/cloudera/search/tree/cdh5-­‐1.0.0_5.2.1/search-­‐crunch     •  Lily  HBase  Indexer   –  hip://www.cloudera.com/documenta6on/enterprise/latest/topics/search_hbase_batch_indexer.html    
  • 23. What’s  Next   •  Explore  other  analy6c  interfaces   –  Banana,  Zoom  Data   •  Spark   –  Streaming  Data   –  Complex  Analy6cs  à  Store  results  in  Solr  à  More  analy6cs!   •  Index  Many  More  Collec6ons   –  Create  a  Process:    Data  research  à  Data  Model  Design  à  Implement   •  Self-­‐Service  Inges6on   –  Document  processes  for  others  to  use   –  Templates  for  inges6on   •  Hire  Search  Technologies!