SlideShare a Scribd company logo
Solutions Engineer
@MarcSelwan
Marc Selwan
Enabling Search in your Cassandra Application with Datastax Enterprise
1
Why Search?
Confidential
Confidential
Confidential
The bright blue butterfly hangs on the breeze.
[the] [bright] [blue] [butterfly] [hangs] [on] [the] [breeze]
Terms
Confidential Credit: https://developer.apple.com/library/mac/documentation/userexperience/conceptual/SearchKitConcepts/searchKit_basics/searchKit_basics.html
What is Solr Missing?
Not a
Database
Doesn’t
Cluster
Not
transparently
sharded
Requires ETL
to injest
application data
Doesn’t
Reindex
Confidential
7
OLTP DB Search Cluster
Your Application
DB API Search API
Your
ETL
Transactional
Workloads
Search
Workloads
Open Source Search Reference Architecture
Confidential
+ =
DSE Search Reference Architecture
Confidential
9
Search
+
Cassandra
80
10
3050
70
60
40
20
Your Application
CQL
Easy CQL API
All the goodness of DataStax driver
Distributed, Replicated, Always On
Data locality and shared memory
• Automatic indexing on db insert
• Higher ingestion throughput
• Distributed query optimization
Compared to open source search
• No separate search cluster to manage
• Probably less total hardware required
• No “Split Brain” data inconsistencies
• No ETL or synch to build and maintain
• No app level data management code
Data stored in Cassandra
Indexes stored in Solr/Lucene
Disk
Memory
Solr Cassandra
Disk
Memory
Mem-
Table
Index
Segments
Ram Buffer
Index
Segments
Index
Segments
Mem-
Table
Mem-
table
Index
Segments
SSTables
Commit
Log
Coordinator
Index
Segments
Shard Router
UPDATE videos (videoid, tags)
SET tags = {‘cat tubes’, ‘Al Gore’s Internet’,
‘NoSQL Fairytales’}
WHERE voided = b3a76c6b-7c7f-4af6-964f-803a9283c401
OSS Solr
Disk
Memory
Index
Segments
Ram Buffer
Index
Segments
Index
Segments
Index
Segments
Index
Segments
Not Searchable
Searchable
DSE Search
Disk
Memory
Index
Segments
Ram Buffer
Index
Segments
Index
Segments
Index
Segments
Index
Segments
Searchable
Confidential
Let’s see this in action!
Search in Retail
Filter queries: These are awesome because the result set gets cached in memory.
SELECT * FROM amazon.metadata WHERE solr_query='{"q":"title:Noir~", "fq":"categories:Books", "sort":"title
asc"}' limit 10;
Faceting: Get counts of fields
SELECT * FROM amazon.metadata WHERE solr_query='{"q":"title:Noir~", "facet":{"field":"categories"}}' limit 10;
Geospatial Searches: Supports box and radius
SELECT * FROM amazon.clicks WHERE solr_query='{"q":"asin:*", "fq":"+{!geofilt pt="37.7484,-122.4156"
sfield=location d=1}"}' limit 10;
Joins: Not your relational joins. These queries 'borrow' indexes from other tables to add filter logic. These are
fast!
SELECT * FROM amazon.metadata WHERE solr_query='{"q":"*:*", "fq":"{!join from=asin to=asin force=true
fromIndex=amazon.clicks}area_code:415"}' limit 5;
Fun all in one.
SELECT * FROM amazon.metadata WHERE solr_query='{"q":"*:*", "facet":{"field":"categories"}, "fq":"{!join
from=asin to=asin force=true fromIndex=amazon.clicks}area_code:415"}' limit 5;
How do you get started??
Confidential
1) Spin up a new C* Cluster with search enabled using the DSE installer.
$ sudo service dse cassandra -s
2) Run your schema DDL to create the C* keyspace and tables.
3) Run dse_tool on the videos table*
$ dsetool create_core keyspace.table generateResources=true reindex=true
4) Write a CQL query with a Solr Search in it.
SELECT * FROM keyspace.table
WHERE solr_query=‘column:*’
*This will create lucene indexes on ALL the columns in your table.
Behind the scenes…
dse_tool
schema.xml
solrconfig.xml
CQL Query
$ dsetool create_core killrvideo.videos generateResources=true
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<schema name="autoSolrSchema" version="1.5">
<types>
…
<fields>
<field indexed="true" multiValued="false" name="added_date" stored="true" type="TrieDateField"/>
<field indexed="true" multiValued="false" name="location" stored="true" type="TextField"/>
<field indexed="true" multiValued="false" name="preview_image_location" stored="true" type="TextField"/>
<field indexed="true" multiValued="false" name="name" termVectors="true" stored="true" type="TextField"/>
<field indexed="true" multiValued="true" name="tags" termVectors="true" stored="true" type="TextField"/>
<field indexed="true" multiValued="false" name="userid" stored="true" type="UUIDField"/>
<field indexed="true" multiValued="false" name="videoid" stored="true" type="UUIDField"/>
<field indexed="true" multiValued="false" name="location_type" stored="true" type="TrieIntField"/>
<field indexed="true" multiValued="false" name="description" termVectors="true" stored="true" type="TextField"/>
</fields>
<uniqueKey>videoid</uniqueKey>
</schema>
<!--
=======
Copyright DataStax, Inc.
Please see the included license file for details.
-->
<!--
For more details about configurations options that may appear in
this file, see http://wiki.apache.org/solr/SolrConfigXml.
-->
<config>
<!-- In all configuration below, a prefix of "solr." for class names
is an alias that causes solr to search appropriate packages,
including org.apache.solr.(search|update|request|core|analysis)
You may also specify a fully qualified Java classname if you
have your own custom plugins.
-->
…
SELECT * FROM killrvideo.videos
WHERE solr_query=‘name:*’
Thank you!
25

More Related Content

What's hot

Apache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic DatasetsApache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic Datasets
Alluxio, Inc.
 
Apache Iceberg Presentation for the St. Louis Big Data IDEA
Apache Iceberg Presentation for the St. Louis Big Data IDEAApache Iceberg Presentation for the St. Louis Big Data IDEA
Apache Iceberg Presentation for the St. Louis Big Data IDEA
Adam Doyle
 
Building Data Quality pipelines with Apache Spark and Delta Lake
Building Data Quality pipelines with Apache Spark and Delta LakeBuilding Data Quality pipelines with Apache Spark and Delta Lake
Building Data Quality pipelines with Apache Spark and Delta Lake
Databricks
 
Get Started with the Most Advanced Edition Yet of Neo4j Graph Data Science
Get Started with the Most Advanced Edition Yet of Neo4j Graph Data ScienceGet Started with the Most Advanced Edition Yet of Neo4j Graph Data Science
Get Started with the Most Advanced Edition Yet of Neo4j Graph Data Science
Neo4j
 
Dense Retrieval with Apache Solr Neural Search.pdf
Dense Retrieval with Apache Solr Neural Search.pdfDense Retrieval with Apache Solr Neural Search.pdf
Dense Retrieval with Apache Solr Neural Search.pdf
Sease
 
File Format Benchmark - Avro, JSON, ORC and Parquet
File Format Benchmark - Avro, JSON, ORC and ParquetFile Format Benchmark - Avro, JSON, ORC and Parquet
File Format Benchmark - Avro, JSON, ORC and Parquet
DataWorks Summit/Hadoop Summit
 
Data lineage and observability with Marquez - subsurface 2020
Data lineage and observability with Marquez - subsurface 2020Data lineage and observability with Marquez - subsurface 2020
Data lineage and observability with Marquez - subsurface 2020
Julien Le Dem
 
Apache Atlas: Governance for your Data
Apache Atlas: Governance for your DataApache Atlas: Governance for your Data
Apache Atlas: Governance for your Data
DataWorks Summit/Hadoop Summit
 
Observability for Data Pipelines With OpenLineage
Observability for Data Pipelines With OpenLineageObservability for Data Pipelines With OpenLineage
Observability for Data Pipelines With OpenLineage
Databricks
 
Snowflake Data Science and AI/ML at Scale
Snowflake Data Science and AI/ML at ScaleSnowflake Data Science and AI/ML at Scale
Snowflake Data Science and AI/ML at Scale
Adam Doyle
 
Empower Splunk and other SIEMs with the Databricks Lakehouse for Cybersecurity
Empower Splunk and other SIEMs with the Databricks Lakehouse for CybersecurityEmpower Splunk and other SIEMs with the Databricks Lakehouse for Cybersecurity
Empower Splunk and other SIEMs with the Databricks Lakehouse for Cybersecurity
Databricks
 
High-speed Database Throughput Using Apache Arrow Flight SQL
High-speed Database Throughput Using Apache Arrow Flight SQLHigh-speed Database Throughput Using Apache Arrow Flight SQL
High-speed Database Throughput Using Apache Arrow Flight SQL
ScyllaDB
 
Dynamic Partition Pruning in Apache Spark
Dynamic Partition Pruning in Apache SparkDynamic Partition Pruning in Apache Spark
Dynamic Partition Pruning in Apache Spark
Databricks
 
Architect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh ArchitectureArchitect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh Architecture
Databricks
 
My first 90 days with ClickHouse.pdf
My first 90 days with ClickHouse.pdfMy first 90 days with ClickHouse.pdf
My first 90 days with ClickHouse.pdf
Alkin Tezuysal
 
Data Sharing with Snowflake
Data Sharing with SnowflakeData Sharing with Snowflake
Data Sharing with Snowflake
Snowflake Computing
 
Apache Hadoop Security - Ranger
Apache Hadoop Security - RangerApache Hadoop Security - Ranger
Apache Hadoop Security - Ranger
Isheeta Sanghi
 
A Reference Architecture for ETL 2.0
A Reference Architecture for ETL 2.0 A Reference Architecture for ETL 2.0
A Reference Architecture for ETL 2.0
DataWorks Summit
 
Azure Data Factory v2
Azure Data Factory v2Azure Data Factory v2
Azure Data Factory v2
inovex GmbH
 
Neural Search Comes to Apache Solr_ Approximate Nearest Neighbor, BERT and Mo...
Neural Search Comes to Apache Solr_ Approximate Nearest Neighbor, BERT and Mo...Neural Search Comes to Apache Solr_ Approximate Nearest Neighbor, BERT and Mo...
Neural Search Comes to Apache Solr_ Approximate Nearest Neighbor, BERT and Mo...
Sease
 

What's hot (20)

Apache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic DatasetsApache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic Datasets
 
Apache Iceberg Presentation for the St. Louis Big Data IDEA
Apache Iceberg Presentation for the St. Louis Big Data IDEAApache Iceberg Presentation for the St. Louis Big Data IDEA
Apache Iceberg Presentation for the St. Louis Big Data IDEA
 
Building Data Quality pipelines with Apache Spark and Delta Lake
Building Data Quality pipelines with Apache Spark and Delta LakeBuilding Data Quality pipelines with Apache Spark and Delta Lake
Building Data Quality pipelines with Apache Spark and Delta Lake
 
Get Started with the Most Advanced Edition Yet of Neo4j Graph Data Science
Get Started with the Most Advanced Edition Yet of Neo4j Graph Data ScienceGet Started with the Most Advanced Edition Yet of Neo4j Graph Data Science
Get Started with the Most Advanced Edition Yet of Neo4j Graph Data Science
 
Dense Retrieval with Apache Solr Neural Search.pdf
Dense Retrieval with Apache Solr Neural Search.pdfDense Retrieval with Apache Solr Neural Search.pdf
Dense Retrieval with Apache Solr Neural Search.pdf
 
File Format Benchmark - Avro, JSON, ORC and Parquet
File Format Benchmark - Avro, JSON, ORC and ParquetFile Format Benchmark - Avro, JSON, ORC and Parquet
File Format Benchmark - Avro, JSON, ORC and Parquet
 
Data lineage and observability with Marquez - subsurface 2020
Data lineage and observability with Marquez - subsurface 2020Data lineage and observability with Marquez - subsurface 2020
Data lineage and observability with Marquez - subsurface 2020
 
Apache Atlas: Governance for your Data
Apache Atlas: Governance for your DataApache Atlas: Governance for your Data
Apache Atlas: Governance for your Data
 
Observability for Data Pipelines With OpenLineage
Observability for Data Pipelines With OpenLineageObservability for Data Pipelines With OpenLineage
Observability for Data Pipelines With OpenLineage
 
Snowflake Data Science and AI/ML at Scale
Snowflake Data Science and AI/ML at ScaleSnowflake Data Science and AI/ML at Scale
Snowflake Data Science and AI/ML at Scale
 
Empower Splunk and other SIEMs with the Databricks Lakehouse for Cybersecurity
Empower Splunk and other SIEMs with the Databricks Lakehouse for CybersecurityEmpower Splunk and other SIEMs with the Databricks Lakehouse for Cybersecurity
Empower Splunk and other SIEMs with the Databricks Lakehouse for Cybersecurity
 
High-speed Database Throughput Using Apache Arrow Flight SQL
High-speed Database Throughput Using Apache Arrow Flight SQLHigh-speed Database Throughput Using Apache Arrow Flight SQL
High-speed Database Throughput Using Apache Arrow Flight SQL
 
Dynamic Partition Pruning in Apache Spark
Dynamic Partition Pruning in Apache SparkDynamic Partition Pruning in Apache Spark
Dynamic Partition Pruning in Apache Spark
 
Architect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh ArchitectureArchitect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh Architecture
 
My first 90 days with ClickHouse.pdf
My first 90 days with ClickHouse.pdfMy first 90 days with ClickHouse.pdf
My first 90 days with ClickHouse.pdf
 
Data Sharing with Snowflake
Data Sharing with SnowflakeData Sharing with Snowflake
Data Sharing with Snowflake
 
Apache Hadoop Security - Ranger
Apache Hadoop Security - RangerApache Hadoop Security - Ranger
Apache Hadoop Security - Ranger
 
A Reference Architecture for ETL 2.0
A Reference Architecture for ETL 2.0 A Reference Architecture for ETL 2.0
A Reference Architecture for ETL 2.0
 
Azure Data Factory v2
Azure Data Factory v2Azure Data Factory v2
Azure Data Factory v2
 
Neural Search Comes to Apache Solr_ Approximate Nearest Neighbor, BERT and Mo...
Neural Search Comes to Apache Solr_ Approximate Nearest Neighbor, BERT and Mo...Neural Search Comes to Apache Solr_ Approximate Nearest Neighbor, BERT and Mo...
Neural Search Comes to Apache Solr_ Approximate Nearest Neighbor, BERT and Mo...
 

Viewers also liked

Advanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache CassandraAdvanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache Cassandra
DataStax Academy
 
Data Modeling for Apache Cassandra
Data Modeling for Apache CassandraData Modeling for Apache Cassandra
Data Modeling for Apache Cassandra
DataStax Academy
 
Introduction to DataStax Enterprise Graph Database
Introduction to DataStax Enterprise Graph DatabaseIntroduction to DataStax Enterprise Graph Database
Introduction to DataStax Enterprise Graph Database
DataStax Academy
 
Bad Habits Die Hard
Bad Habits Die Hard Bad Habits Die Hard
Bad Habits Die Hard
DataStax Academy
 
Standing Up Your First Cluster
Standing Up Your First ClusterStanding Up Your First Cluster
Standing Up Your First Cluster
DataStax Academy
 
Cassandra Core Concepts
Cassandra Core ConceptsCassandra Core Concepts
Cassandra Core Concepts
DataStax Academy
 
Introduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache CassandraIntroduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache Cassandra
DataStax Academy
 
Cassandra @ Sony: The good, the bad, and the ugly part 1
Cassandra @ Sony: The good, the bad, and the ugly part 1Cassandra @ Sony: The good, the bad, and the ugly part 1
Cassandra @ Sony: The good, the bad, and the ugly part 1
DataStax Academy
 
Using Event-Driven Architectures with Cassandra
Using Event-Driven Architectures with CassandraUsing Event-Driven Architectures with Cassandra
Using Event-Driven Architectures with Cassandra
DataStax Academy
 
Apache Cassandra and Drivers
Apache Cassandra and DriversApache Cassandra and Drivers
Apache Cassandra and Drivers
DataStax Academy
 
Production Ready Cassandra
Production Ready CassandraProduction Ready Cassandra
Production Ready Cassandra
DataStax Academy
 
Coursera Cassandra Driver
Coursera Cassandra DriverCoursera Cassandra Driver
Coursera Cassandra Driver
DataStax Academy
 
Cassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stackCassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stack
DataStax Academy
 
Cassandra @ Sony: The good, the bad, and the ugly part 2
Cassandra @ Sony: The good, the bad, and the ugly part 2Cassandra @ Sony: The good, the bad, and the ugly part 2
Cassandra @ Sony: The good, the bad, and the ugly part 2
DataStax Academy
 
Introduction to DataStax Enterprise Advanced Replication with Apache Cassandra
Introduction to DataStax Enterprise Advanced Replication with Apache CassandraIntroduction to DataStax Enterprise Advanced Replication with Apache Cassandra
Introduction to DataStax Enterprise Advanced Replication with Apache Cassandra
DataStax Academy
 
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & PythonCassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
DataStax Academy
 
Cassandra 3.0 Data Modeling
Cassandra 3.0 Data ModelingCassandra 3.0 Data Modeling
Cassandra 3.0 Data Modeling
DataStax Academy
 
Cassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart LabsCassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart Labs
DataStax Academy
 
Real Time Analytics with Dse
Real Time Analytics with DseReal Time Analytics with Dse
Real Time Analytics with Dse
DataStax Academy
 

Viewers also liked (20)

Advanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache CassandraAdvanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache Cassandra
 
Data Modeling for Apache Cassandra
Data Modeling for Apache CassandraData Modeling for Apache Cassandra
Data Modeling for Apache Cassandra
 
Introduction to DataStax Enterprise Graph Database
Introduction to DataStax Enterprise Graph DatabaseIntroduction to DataStax Enterprise Graph Database
Introduction to DataStax Enterprise Graph Database
 
Bad Habits Die Hard
Bad Habits Die Hard Bad Habits Die Hard
Bad Habits Die Hard
 
Advanced Cassandra
Advanced CassandraAdvanced Cassandra
Advanced Cassandra
 
Standing Up Your First Cluster
Standing Up Your First ClusterStanding Up Your First Cluster
Standing Up Your First Cluster
 
Cassandra Core Concepts
Cassandra Core ConceptsCassandra Core Concepts
Cassandra Core Concepts
 
Introduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache CassandraIntroduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache Cassandra
 
Cassandra @ Sony: The good, the bad, and the ugly part 1
Cassandra @ Sony: The good, the bad, and the ugly part 1Cassandra @ Sony: The good, the bad, and the ugly part 1
Cassandra @ Sony: The good, the bad, and the ugly part 1
 
Using Event-Driven Architectures with Cassandra
Using Event-Driven Architectures with CassandraUsing Event-Driven Architectures with Cassandra
Using Event-Driven Architectures with Cassandra
 
Apache Cassandra and Drivers
Apache Cassandra and DriversApache Cassandra and Drivers
Apache Cassandra and Drivers
 
Production Ready Cassandra
Production Ready CassandraProduction Ready Cassandra
Production Ready Cassandra
 
Coursera Cassandra Driver
Coursera Cassandra DriverCoursera Cassandra Driver
Coursera Cassandra Driver
 
Cassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stackCassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stack
 
Cassandra @ Sony: The good, the bad, and the ugly part 2
Cassandra @ Sony: The good, the bad, and the ugly part 2Cassandra @ Sony: The good, the bad, and the ugly part 2
Cassandra @ Sony: The good, the bad, and the ugly part 2
 
Introduction to DataStax Enterprise Advanced Replication with Apache Cassandra
Introduction to DataStax Enterprise Advanced Replication with Apache CassandraIntroduction to DataStax Enterprise Advanced Replication with Apache Cassandra
Introduction to DataStax Enterprise Advanced Replication with Apache Cassandra
 
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & PythonCassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
 
Cassandra 3.0 Data Modeling
Cassandra 3.0 Data ModelingCassandra 3.0 Data Modeling
Cassandra 3.0 Data Modeling
 
Cassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart LabsCassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart Labs
 
Real Time Analytics with Dse
Real Time Analytics with DseReal Time Analytics with Dse
Real Time Analytics with Dse
 

Similar to Enabling Search in your Cassandra Application with DataStax Enterprise

DataStax: Enabling Search in your Cassandra Application with DataStax Enterprise
DataStax: Enabling Search in your Cassandra Application with DataStax EnterpriseDataStax: Enabling Search in your Cassandra Application with DataStax Enterprise
DataStax: Enabling Search in your Cassandra Application with DataStax Enterprise
DataStax Academy
 
A Cassandra + Solr + Spark Love Triangle Using DataStax Enterprise
A Cassandra + Solr + Spark Love Triangle Using DataStax EnterpriseA Cassandra + Solr + Spark Love Triangle Using DataStax Enterprise
A Cassandra + Solr + Spark Love Triangle Using DataStax Enterprise
Patrick McFadin
 
From 0 to hero adf cicd pass mdpug oslo feb 2020
From 0 to hero adf cicd pass mdpug oslo feb 2020From 0 to hero adf cicd pass mdpug oslo feb 2020
From 0 to hero adf cicd pass mdpug oslo feb 2020
Halvar Trøyel Nerbø
 
GreenDao Introduction
GreenDao IntroductionGreenDao Introduction
GreenDao Introduction
Booch Lin
 
Beyond the Query – Bringing Complex Access Patterns to NoSQL with DataStax - ...
Beyond the Query – Bringing Complex Access Patterns to NoSQL with DataStax - ...Beyond the Query – Bringing Complex Access Patterns to NoSQL with DataStax - ...
Beyond the Query – Bringing Complex Access Patterns to NoSQL with DataStax - ...
StampedeCon
 
Data Exploration with Apache Drill: Day 1
Data Exploration with Apache Drill:  Day 1Data Exploration with Apache Drill:  Day 1
Data Exploration with Apache Drill: Day 1
Charles Givre
 
SQL KONFERENZ 2020 Azure Key Vault, Azure Dev Ops and Azure Data Factory how...
SQL KONFERENZ 2020  Azure Key Vault, Azure Dev Ops and Azure Data Factory how...SQL KONFERENZ 2020  Azure Key Vault, Azure Dev Ops and Azure Data Factory how...
SQL KONFERENZ 2020 Azure Key Vault, Azure Dev Ops and Azure Data Factory how...
Erwin de Kreuk
 
Beyond the Query: A Cassandra + Solr + Spark Love Triangle Using Datastax Ent...
Beyond the Query: A Cassandra + Solr + Spark Love Triangle Using Datastax Ent...Beyond the Query: A Cassandra + Solr + Spark Love Triangle Using Datastax Ent...
Beyond the Query: A Cassandra + Solr + Spark Love Triangle Using Datastax Ent...
DataStax Academy
 
Elasticsearch sur Azure : Make sense of your (BIG) data !
Elasticsearch sur Azure : Make sense of your (BIG) data !Elasticsearch sur Azure : Make sense of your (BIG) data !
Elasticsearch sur Azure : Make sense of your (BIG) data !
Microsoft
 
TIAD : Automating the modern datacenter
TIAD : Automating the modern datacenterTIAD : Automating the modern datacenter
TIAD : Automating the modern datacenter
The Incredible Automation Day
 
Webinar: What's New in Solr 6
Webinar: What's New in Solr 6Webinar: What's New in Solr 6
Webinar: What's New in Solr 6
Lucidworks
 
Painless Persistence in a Disconnected World
Painless Persistence in a Disconnected WorldPainless Persistence in a Disconnected World
Painless Persistence in a Disconnected World
Christian Melchior
 
MySQL Without the SQL -- Oh My! Longhorn PHP Conference
MySQL Without the SQL -- Oh My!  Longhorn PHP ConferenceMySQL Without the SQL -- Oh My!  Longhorn PHP Conference
MySQL Without the SQL -- Oh My! Longhorn PHP Conference
Dave Stokes
 
C* for Deep Learning (Andrew Jefferson, Tracktable) | Cassandra Summit 2016
C* for Deep Learning (Andrew Jefferson, Tracktable) | Cassandra Summit 2016C* for Deep Learning (Andrew Jefferson, Tracktable) | Cassandra Summit 2016
C* for Deep Learning (Andrew Jefferson, Tracktable) | Cassandra Summit 2016
DataStax
 
Find Anything In Your APEX App - Fuzzy Search with Oracle Text
Find Anything In Your APEX App - Fuzzy Search with Oracle TextFind Anything In Your APEX App - Fuzzy Search with Oracle Text
Find Anything In Your APEX App - Fuzzy Search with Oracle Text
Carsten Czarski
 
MySQL Without the MySQL -- Oh My!
MySQL Without the MySQL -- Oh My!MySQL Without the MySQL -- Oh My!
MySQL Without the MySQL -- Oh My!
Dave Stokes
 
Slick – the modern way to access your Data
Slick – the modern way to access your DataSlick – the modern way to access your Data
Slick – the modern way to access your Data
Jochen Huelss
 
Iac d.damyanov 4.pptx
Iac d.damyanov 4.pptxIac d.damyanov 4.pptx
Iac d.damyanov 4.pptx
Dimitar Damyanov
 
DatamindsConnect2019 Azure Key Vault, Azure Dev Ops and Azure Data Factory ho...
DatamindsConnect2019 Azure Key Vault, Azure Dev Ops and Azure Data Factory ho...DatamindsConnect2019 Azure Key Vault, Azure Dev Ops and Azure Data Factory ho...
DatamindsConnect2019 Azure Key Vault, Azure Dev Ops and Azure Data Factory ho...
Erwin de Kreuk
 
Content search api in sitecore 8.1
Content search api in sitecore 8.1Content search api in sitecore 8.1
Content search api in sitecore 8.1
Anindita Bhattacharya
 

Similar to Enabling Search in your Cassandra Application with DataStax Enterprise (20)

DataStax: Enabling Search in your Cassandra Application with DataStax Enterprise
DataStax: Enabling Search in your Cassandra Application with DataStax EnterpriseDataStax: Enabling Search in your Cassandra Application with DataStax Enterprise
DataStax: Enabling Search in your Cassandra Application with DataStax Enterprise
 
A Cassandra + Solr + Spark Love Triangle Using DataStax Enterprise
A Cassandra + Solr + Spark Love Triangle Using DataStax EnterpriseA Cassandra + Solr + Spark Love Triangle Using DataStax Enterprise
A Cassandra + Solr + Spark Love Triangle Using DataStax Enterprise
 
From 0 to hero adf cicd pass mdpug oslo feb 2020
From 0 to hero adf cicd pass mdpug oslo feb 2020From 0 to hero adf cicd pass mdpug oslo feb 2020
From 0 to hero adf cicd pass mdpug oslo feb 2020
 
GreenDao Introduction
GreenDao IntroductionGreenDao Introduction
GreenDao Introduction
 
Beyond the Query – Bringing Complex Access Patterns to NoSQL with DataStax - ...
Beyond the Query – Bringing Complex Access Patterns to NoSQL with DataStax - ...Beyond the Query – Bringing Complex Access Patterns to NoSQL with DataStax - ...
Beyond the Query – Bringing Complex Access Patterns to NoSQL with DataStax - ...
 
Data Exploration with Apache Drill: Day 1
Data Exploration with Apache Drill:  Day 1Data Exploration with Apache Drill:  Day 1
Data Exploration with Apache Drill: Day 1
 
SQL KONFERENZ 2020 Azure Key Vault, Azure Dev Ops and Azure Data Factory how...
SQL KONFERENZ 2020  Azure Key Vault, Azure Dev Ops and Azure Data Factory how...SQL KONFERENZ 2020  Azure Key Vault, Azure Dev Ops and Azure Data Factory how...
SQL KONFERENZ 2020 Azure Key Vault, Azure Dev Ops and Azure Data Factory how...
 
Beyond the Query: A Cassandra + Solr + Spark Love Triangle Using Datastax Ent...
Beyond the Query: A Cassandra + Solr + Spark Love Triangle Using Datastax Ent...Beyond the Query: A Cassandra + Solr + Spark Love Triangle Using Datastax Ent...
Beyond the Query: A Cassandra + Solr + Spark Love Triangle Using Datastax Ent...
 
Elasticsearch sur Azure : Make sense of your (BIG) data !
Elasticsearch sur Azure : Make sense of your (BIG) data !Elasticsearch sur Azure : Make sense of your (BIG) data !
Elasticsearch sur Azure : Make sense of your (BIG) data !
 
TIAD : Automating the modern datacenter
TIAD : Automating the modern datacenterTIAD : Automating the modern datacenter
TIAD : Automating the modern datacenter
 
Webinar: What's New in Solr 6
Webinar: What's New in Solr 6Webinar: What's New in Solr 6
Webinar: What's New in Solr 6
 
Painless Persistence in a Disconnected World
Painless Persistence in a Disconnected WorldPainless Persistence in a Disconnected World
Painless Persistence in a Disconnected World
 
MySQL Without the SQL -- Oh My! Longhorn PHP Conference
MySQL Without the SQL -- Oh My!  Longhorn PHP ConferenceMySQL Without the SQL -- Oh My!  Longhorn PHP Conference
MySQL Without the SQL -- Oh My! Longhorn PHP Conference
 
C* for Deep Learning (Andrew Jefferson, Tracktable) | Cassandra Summit 2016
C* for Deep Learning (Andrew Jefferson, Tracktable) | Cassandra Summit 2016C* for Deep Learning (Andrew Jefferson, Tracktable) | Cassandra Summit 2016
C* for Deep Learning (Andrew Jefferson, Tracktable) | Cassandra Summit 2016
 
Find Anything In Your APEX App - Fuzzy Search with Oracle Text
Find Anything In Your APEX App - Fuzzy Search with Oracle TextFind Anything In Your APEX App - Fuzzy Search with Oracle Text
Find Anything In Your APEX App - Fuzzy Search with Oracle Text
 
MySQL Without the MySQL -- Oh My!
MySQL Without the MySQL -- Oh My!MySQL Without the MySQL -- Oh My!
MySQL Without the MySQL -- Oh My!
 
Slick – the modern way to access your Data
Slick – the modern way to access your DataSlick – the modern way to access your Data
Slick – the modern way to access your Data
 
Iac d.damyanov 4.pptx
Iac d.damyanov 4.pptxIac d.damyanov 4.pptx
Iac d.damyanov 4.pptx
 
DatamindsConnect2019 Azure Key Vault, Azure Dev Ops and Azure Data Factory ho...
DatamindsConnect2019 Azure Key Vault, Azure Dev Ops and Azure Data Factory ho...DatamindsConnect2019 Azure Key Vault, Azure Dev Ops and Azure Data Factory ho...
DatamindsConnect2019 Azure Key Vault, Azure Dev Ops and Azure Data Factory ho...
 
Content search api in sitecore 8.1
Content search api in sitecore 8.1Content search api in sitecore 8.1
Content search api in sitecore 8.1
 

More from DataStax Academy

Forrester CXNYC 2017 - Delivering great real-time cx is a true craft
Forrester CXNYC 2017 - Delivering great real-time cx is a true craftForrester CXNYC 2017 - Delivering great real-time cx is a true craft
Forrester CXNYC 2017 - Delivering great real-time cx is a true craft
DataStax Academy
 
Getting Started with Graph Databases
Getting Started with Graph DatabasesGetting Started with Graph Databases
Getting Started with Graph Databases
DataStax Academy
 
Cassandra Data Maintenance with Spark
Cassandra Data Maintenance with SparkCassandra Data Maintenance with Spark
Cassandra Data Maintenance with Spark
DataStax Academy
 
Analytics with Spark and Cassandra
Analytics with Spark and CassandraAnalytics with Spark and Cassandra
Analytics with Spark and Cassandra
DataStax Academy
 
Make 2016 your year of SMACK talk
Make 2016 your year of SMACK talkMake 2016 your year of SMACK talk
Make 2016 your year of SMACK talk
DataStax Academy
 
Client Drivers and Cassandra, the Right Way
Client Drivers and Cassandra, the Right WayClient Drivers and Cassandra, the Right Way
Client Drivers and Cassandra, the Right Way
DataStax Academy
 
Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...
Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...
Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...
DataStax Academy
 
Traveler's Guide to Cassandra
Traveler's Guide to CassandraTraveler's Guide to Cassandra
Traveler's Guide to Cassandra
DataStax Academy
 
Cassandra: One (is the loneliest number)
Cassandra: One (is the loneliest number)Cassandra: One (is the loneliest number)
Cassandra: One (is the loneliest number)
DataStax Academy
 
Spark Cassandra Connector: Past, Present and Furure
Spark Cassandra Connector: Past, Present and FurureSpark Cassandra Connector: Past, Present and Furure
Spark Cassandra Connector: Past, Present and Furure
DataStax Academy
 
New features in 3.0
New features in 3.0New features in 3.0
New features in 3.0
DataStax Academy
 

More from DataStax Academy (11)

Forrester CXNYC 2017 - Delivering great real-time cx is a true craft
Forrester CXNYC 2017 - Delivering great real-time cx is a true craftForrester CXNYC 2017 - Delivering great real-time cx is a true craft
Forrester CXNYC 2017 - Delivering great real-time cx is a true craft
 
Getting Started with Graph Databases
Getting Started with Graph DatabasesGetting Started with Graph Databases
Getting Started with Graph Databases
 
Cassandra Data Maintenance with Spark
Cassandra Data Maintenance with SparkCassandra Data Maintenance with Spark
Cassandra Data Maintenance with Spark
 
Analytics with Spark and Cassandra
Analytics with Spark and CassandraAnalytics with Spark and Cassandra
Analytics with Spark and Cassandra
 
Make 2016 your year of SMACK talk
Make 2016 your year of SMACK talkMake 2016 your year of SMACK talk
Make 2016 your year of SMACK talk
 
Client Drivers and Cassandra, the Right Way
Client Drivers and Cassandra, the Right WayClient Drivers and Cassandra, the Right Way
Client Drivers and Cassandra, the Right Way
 
Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...
Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...
Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...
 
Traveler's Guide to Cassandra
Traveler's Guide to CassandraTraveler's Guide to Cassandra
Traveler's Guide to Cassandra
 
Cassandra: One (is the loneliest number)
Cassandra: One (is the loneliest number)Cassandra: One (is the loneliest number)
Cassandra: One (is the loneliest number)
 
Spark Cassandra Connector: Past, Present and Furure
Spark Cassandra Connector: Past, Present and FurureSpark Cassandra Connector: Past, Present and Furure
Spark Cassandra Connector: Past, Present and Furure
 
New features in 3.0
New features in 3.0New features in 3.0
New features in 3.0
 

Recently uploaded

LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptxIOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
Abida Shariff
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPathCommunity
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Jeffrey Haguewood
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
Ralf Eggert
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
Product School
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Inflectra
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Thierry Lestable
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Tobias Schneck
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
Product School
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Product School
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
Fwdays
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
DianaGray10
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User Group
CatarinaPereira64715
 

Recently uploaded (20)

LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptxIOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User Group
 

Enabling Search in your Cassandra Application with DataStax Enterprise

  • 1. Solutions Engineer @MarcSelwan Marc Selwan Enabling Search in your Cassandra Application with Datastax Enterprise 1
  • 4. Confidential The bright blue butterfly hangs on the breeze. [the] [bright] [blue] [butterfly] [hangs] [on] [the] [breeze] Terms
  • 6. What is Solr Missing? Not a Database Doesn’t Cluster Not transparently sharded Requires ETL to injest application data Doesn’t Reindex
  • 7. Confidential 7 OLTP DB Search Cluster Your Application DB API Search API Your ETL Transactional Workloads Search Workloads Open Source Search Reference Architecture
  • 9. DSE Search Reference Architecture Confidential 9 Search + Cassandra 80 10 3050 70 60 40 20 Your Application CQL Easy CQL API All the goodness of DataStax driver Distributed, Replicated, Always On Data locality and shared memory • Automatic indexing on db insert • Higher ingestion throughput • Distributed query optimization Compared to open source search • No separate search cluster to manage • Probably less total hardware required • No “Split Brain” data inconsistencies • No ETL or synch to build and maintain • No app level data management code
  • 10. Data stored in Cassandra Indexes stored in Solr/Lucene
  • 12. Disk Memory Mem- Table Index Segments Ram Buffer Index Segments Index Segments Mem- Table Mem- table Index Segments SSTables Commit Log Coordinator Index Segments Shard Router UPDATE videos (videoid, tags) SET tags = {‘cat tubes’, ‘Al Gore’s Internet’, ‘NoSQL Fairytales’} WHERE voided = b3a76c6b-7c7f-4af6-964f-803a9283c401
  • 17. Filter queries: These are awesome because the result set gets cached in memory. SELECT * FROM amazon.metadata WHERE solr_query='{"q":"title:Noir~", "fq":"categories:Books", "sort":"title asc"}' limit 10; Faceting: Get counts of fields SELECT * FROM amazon.metadata WHERE solr_query='{"q":"title:Noir~", "facet":{"field":"categories"}}' limit 10; Geospatial Searches: Supports box and radius SELECT * FROM amazon.clicks WHERE solr_query='{"q":"asin:*", "fq":"+{!geofilt pt="37.7484,-122.4156" sfield=location d=1}"}' limit 10; Joins: Not your relational joins. These queries 'borrow' indexes from other tables to add filter logic. These are fast! SELECT * FROM amazon.metadata WHERE solr_query='{"q":"*:*", "fq":"{!join from=asin to=asin force=true fromIndex=amazon.clicks}area_code:415"}' limit 5; Fun all in one. SELECT * FROM amazon.metadata WHERE solr_query='{"q":"*:*", "facet":{"field":"categories"}, "fq":"{!join from=asin to=asin force=true fromIndex=amazon.clicks}area_code:415"}' limit 5;
  • 18. How do you get started??
  • 19. Confidential 1) Spin up a new C* Cluster with search enabled using the DSE installer. $ sudo service dse cassandra -s 2) Run your schema DDL to create the C* keyspace and tables. 3) Run dse_tool on the videos table* $ dsetool create_core keyspace.table generateResources=true reindex=true 4) Write a CQL query with a Solr Search in it. SELECT * FROM keyspace.table WHERE solr_query=‘column:*’ *This will create lucene indexes on ALL the columns in your table.
  • 20. Behind the scenes… dse_tool schema.xml solrconfig.xml CQL Query $ dsetool create_core killrvideo.videos generateResources=true <?xml version="1.0" encoding="UTF-8" standalone="no"?> <schema name="autoSolrSchema" version="1.5"> <types> … <fields> <field indexed="true" multiValued="false" name="added_date" stored="true" type="TrieDateField"/> <field indexed="true" multiValued="false" name="location" stored="true" type="TextField"/> <field indexed="true" multiValued="false" name="preview_image_location" stored="true" type="TextField"/> <field indexed="true" multiValued="false" name="name" termVectors="true" stored="true" type="TextField"/> <field indexed="true" multiValued="true" name="tags" termVectors="true" stored="true" type="TextField"/> <field indexed="true" multiValued="false" name="userid" stored="true" type="UUIDField"/> <field indexed="true" multiValued="false" name="videoid" stored="true" type="UUIDField"/> <field indexed="true" multiValued="false" name="location_type" stored="true" type="TrieIntField"/> <field indexed="true" multiValued="false" name="description" termVectors="true" stored="true" type="TextField"/> </fields> <uniqueKey>videoid</uniqueKey> </schema> <!-- ======= Copyright DataStax, Inc. Please see the included license file for details. --> <!-- For more details about configurations options that may appear in this file, see http://wiki.apache.org/solr/SolrConfigXml. --> <config> <!-- In all configuration below, a prefix of "solr." for class names is an alias that causes solr to search appropriate packages, including org.apache.solr.(search|update|request|core|analysis) You may also specify a fully qualified Java classname if you have your own custom plugins. --> … SELECT * FROM killrvideo.videos WHERE solr_query=‘name:*’

Editor's Notes

  1. Search is everywhere. Tell story about kids with search and recommendations. Properly written queries, on the other hand, are completely precise. A query will give you exactly the result you ask for. Exactly what you asked for, which may or not be exactly what you want. The hardest part of a query is asking the right question. Search, on the other hand, is not precise. A dumb search just gives you a randomly ordered list of items in which your search term occurs. A smart search will provide you a ranked list of items that are strongly related to the search terms you entered, even if they do not match the exactly. This means that even though some of the results will doubtless be unrelated, and sometimes absurd, there is a very good chance that there is data you can use pretty high up in those results, even if you didn’t ask quite the right question. And there may well be relevant data you didn’t even think to ask for. Being smart in this way is a key benefit of search. Queries cannot be smart. Queries must always give you exactly what you asked for. There can be no tolerance for serendipity in query results. Search can be smart, but query must be dumb and strictly obedient. “Find me all webpages that contain “Cassandra” and “Optimization”” “Recommend artists who are “like” “Taylor Swift”” “Highlight the keywords “Awesome,” “Good,” and “Amazing.”” Find me all shoes with: Size: 12 Price: $15 - $60 Brand: Nike
  2. Lucene is the base to nearly every popular search engine out there, including elastic search (which is more a of fried taco shell)Open source , Fast, high performance, scalable search/IR library and was Initially developed by Doug Cutting (Also author of Hadoop) Provides advanced Search options like synonyms, stopwords, based on similarity, proximity. Faceted search (search by price, size, manufacturer, etc.) Geospatial search (combining location information, filtering by distance, and more) Solr was Created by Yonik Seeley for CNET as a Enterprise Search platform for Apache Lucene Open source, Highly reliable, scalable, fault tolerant. Its Support distributed Indexing (SolrCloud),Replication, and load balanced querying. Solr “wraps” the Lucene open sourc information retrieval engine You customize Solr via configuration & plug-ins Free-form text search includes wildcards and phrases Query support includes filtering (ranges, geo-spatial) and sorting You can run Solr as a webapp, or in stand-alone (embedded) mode You make search requests using HTTP GET requests Documents are added and deleted via HTTP POST requests
  3. These are extremely efficient and fast structures that live on disk. Simple queries are typically fulfilled in < 1ms. More complex ones, < 10ms
  4. Not a database- A user has to create an interface to add a database, no built in security, no SQL like language Requires ETL- data needs to be moved from an DBMS or other system to be indexed, increasing operational complexity, risk and reducing time for the data to be available. no automatic indexing- no reindex after lucene upgrade, to reindiex, you need to reinvest the data must reindex data after upgrade Not transparently sharded- shared is only available in SolrCloud, and is not transparently sharded. Doesn’t cluster- no HA or failover builtin. no load rebalancing when need to scale out.
  5. Cassandra has a lot of the things that Solr is missing, so the great minds at Datastax engineering put them together to create all the good things in one, easy to use package
  6. What is cassandra? High level overview: OSS NoSQL database optimized for extreme OLTP workloads. I will proceed that you understand the basics of C* from here; but if you don’t, please check out some of the other great intro videos on C* available on DSA.
  7. You get a lot of the benefits from using Solr on top of C* - No ETL! Just load the data in cassandra and it’s automatically replicated to Solr! Data is automatically replicated from the Cassandra DC to the Solr DC No single point of failure No ETL
  8. Let’s look closer at the how these 2 technologies are integrated. C Cassandra stores the data, and solr will create the lucene indexes required to perform searches on that particular node’s data.
  9. Cassandra and Solr reside together on a single DSE Search node in the same JVM. Consequently, you will want to size these nodes a bit larger than a typical cassandra node, esp when it comes to memory. This also does cause a very busy garbage collector, so it is recommended that you pay close attention to your JVM and garbage collection settings
  10. Review C* write path. The solr write path in DSE search is very similar. Coordinator communicates with the shard router to determine which node(s) get the write. DSE figures out which shards to query Favors local shards over remote shards Reduces fan-out by using multiple shards on each node This assumes replication factor > 1 Cassandra Router implementation called ShardRouter DSE sends a distributed query to remote nodes Uses random (shuffle) to pick which nodes by default Other shard shuffle strategies available. Each query has an fq=id:[token range start TO token range end] This limits the request to a subset of the shards on a node The data is first written to the ram buffer. From the ram buffer, based on the soft commit threshold, that data is written to index segments that live on disk, but are not fsynced. At the time the C* memtables flush to SSTables on disk, the data in the in-memory index segments are fsyncd and “hard committed” to disk.
  11. In traditional OSS Solr, the ram buffer is not searchable so data is not available to be searched until it is soft commited. These soft commit are not durable, so if the JVM crashes before these are synced to disk, you may lose some of your indexes. Creates a delay in being able to read newly invested data and the Soft commit threshold needs to be tuned… this is typically 1 - 10second delay.
  12. Datastax has modified OSS Solr so that the Ram buffer is now searchable, a bit like Cassandra memtables. This means that the data is available to be searched very shortly after being injested by Cassandra… less than 1 second in some cases. Now your real time application data will be available to the application to be searched without having to wait for soft commit. This will not only improve your application’s performance but also allow more data to be indexed in a shorter period of time.
  13. “Find me all webpages that contain “Cassandra” and “Optimization”” “Recommend artists who are “like” “Taylor Swift”” “Highlight the keywords “Awesome,” “Good,” and “Amazing.”” Find me all shoes with: Size: 12 Price: $15 - $60 Brand: Nike
  14. This ONLY allows search on tags, and the “index” table needs to be maintained manually. This is fine if this is all you want to do, and is very common. Location_type, currently ith only 2 values, will work with a2i, name on the other hand, would not. So how do you search by name? Secondary indexes will allow you to search on the whole field, but those aren’t efficient for higher cardinality values, like email address Secondary index creates additional data structures on each node that hold table partitions Each ‘local index’ indexes values in rows stored locally Query on an indexed column requires accessing ‘local indexes’ on all nodes Expensive Query on a partition key and an indexed column requires accessing a ‘local index’ on one (or a few) nodes Efficient But say you want to search the text in description or title? What now?