Your SlideShare is downloading. ×
An Introduction to Distributed Search with Datastax Enterprise Search
An Introduction to Distributed Search with Datastax Enterprise Search
An Introduction to Distributed Search with Datastax Enterprise Search
An Introduction to Distributed Search with Datastax Enterprise Search
An Introduction to Distributed Search with Datastax Enterprise Search
An Introduction to Distributed Search with Datastax Enterprise Search
An Introduction to Distributed Search with Datastax Enterprise Search
An Introduction to Distributed Search with Datastax Enterprise Search
An Introduction to Distributed Search with Datastax Enterprise Search
An Introduction to Distributed Search with Datastax Enterprise Search
An Introduction to Distributed Search with Datastax Enterprise Search
An Introduction to Distributed Search with Datastax Enterprise Search
An Introduction to Distributed Search with Datastax Enterprise Search
An Introduction to Distributed Search with Datastax Enterprise Search
An Introduction to Distributed Search with Datastax Enterprise Search
An Introduction to Distributed Search with Datastax Enterprise Search
An Introduction to Distributed Search with Datastax Enterprise Search
An Introduction to Distributed Search with Datastax Enterprise Search
An Introduction to Distributed Search with Datastax Enterprise Search
An Introduction to Distributed Search with Datastax Enterprise Search
An Introduction to Distributed Search with Datastax Enterprise Search
An Introduction to Distributed Search with Datastax Enterprise Search
An Introduction to Distributed Search with Datastax Enterprise Search
An Introduction to Distributed Search with Datastax Enterprise Search
An Introduction to Distributed Search with Datastax Enterprise Search
An Introduction to Distributed Search with Datastax Enterprise Search
An Introduction to Distributed Search with Datastax Enterprise Search
An Introduction to Distributed Search with Datastax Enterprise Search
An Introduction to Distributed Search with Datastax Enterprise Search
An Introduction to Distributed Search with Datastax Enterprise Search
An Introduction to Distributed Search with Datastax Enterprise Search
An Introduction to Distributed Search with Datastax Enterprise Search
Upcoming SlideShare
Loading in...5
×

Thanks for flagging this SlideShare!

Oops! An error has occurred.

×
Saving this for later? Get the SlideShare app to save on your phone or tablet. Read anywhere, anytime – even offline.
Text the download link to your phone
Standard text messaging rates apply

An Introduction to Distributed Search with Datastax Enterprise Search

2,758

Published on

This is an overview of distributed search with Cassandra and Solr.

This is an overview of distributed search with Cassandra and Solr.

Published in: Technology
0 Comments
4 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
2,758
On Slideshare
0
From Embeds
0
Number of Embeds
2
Actions
Shares
0
Downloads
32
Comments
0
Likes
4
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
No notes for slide

Transcript

  • 1. TOO BIGTO FAILAn Introduction to Distributed Search with Cassandra and SolrOpenSource Connections@PatriciaGorlapgorla@o19s.com
  • 2. ABOUT MESystems AnalystProgrammingInformation Retrieval
  • 3. Created at Facebook topower inbox searchDistributed data store runon commodity serversHighly availableNo one single point of failureCASSANDRA
  • 4. WHO USES CASSANDRA?
  • 5. SEARCH + CASSANDRA, 1• First implementation: Solandra (originally Lucandra)• Replaced Lucene index with Cassandra column families
  • 6. SEARCH + CASSANDRA, 2• DataStax Enterprise Search• Uses native Lucene index• All data is retrieved from Cassandra
  • 7. Datastax Enterprise Search ClusterDistributedLinearly ScalableHighly AvailableEventually ConsistentFull-text searchAggregation
  • 8. SETTING UPTHE SCHEMA<fields><field name="id" type="string" indexed="true" stored="true"/><field name="name" type="text" indexed="true" stored="true"/><field name="body" type="text" indexed="true" stored="true"/><field name="title" type="text" indexed="true" stored="true"/><field name="date" type="string" indexed="true" stored="true"/></fields>
  • 9. WRITINGTO CLUSTER• Write to either Cassandra clients or Solr API• Write process is the same• True atomic updates to Cassandra
  • 10. Cassandra nodes are set up according to row-key hash.
  • 11. Data can be written directly to Cassandra
  • 12. Data is distributed according to row key hash and replicationfactor
  • 13. DSE first writes toCassandra
  • 14. And then updates thesecondary index on Solr
  • 15. The quorum responds with success / failure
  • 16. Data is now distributed evenly
  • 17. READING FROM CLUSTER• Read either Cassandra-side or through Solr API• Cassandra: fast reads*• Solr: full-text search• Read direction affects performance• Data is stored in Cassandra
  • 18. Query is sent to node
  • 19. Node uses gossip to find who has the information
  • 20. QUERYING CASSANDRA• Can query Solr or Cassandra directly• Limited syntax with CQL, can use solr_query parameter
  • 21. Querying Cassandradirectly
  • 22. Cassandra retrievesinformation from columnfamily
  • 23. Querying Solr index
  • 24. Row-key hashes arestored in Solr, andCassandra is queried forstored data
  • 25. Cassandra node sendsrequest to node with thecorresponding hash,returns information
  • 26. Data is always synced
  • 27. Both nodes respond with information
  • 28. Updates can be committed and searched over in real time
  • 29. PRODUCTION USE• Will want a mix of analytics, search nodes
  • 30. An OLTP - OLAP integrated solution
  • 31. TRADEOFFS• Changing the Solr schema requires reindex (standard for Solr)• No multi-valued fields or composite columns
  • 32. Q&A@PatriciaGorlapgorla@o19s.como19s.com/blog

×