Polyglot Persistence in the Real World: Cassandra + S3 + MapReduce

Anton Yazovskiy, Thumbtack Technology
This talk focuses on building a system from scratch, showing how to perform analytical queries in near real time while still getting the benefits of Cassandra's high-performance database engine. The key subjects of my talk are:
● The splendors and miseries of NoSQL
● Apache Cassandra use cases
● Difficulties of using MapReduce directly on Cassandra
● Amazon cloud solutions: Elastic MapReduce and S3
● “Real-enough” time analysis
In particular, the talk dives into ways of handling different kinds of semi-ad-hoc queries with Cassandra and the pitfalls of designing a schema around a specific analytics use case. Particular attention is paid to dealing with time series data, which can present a real problem when using column-family or key-value store databases.

Transcript

  1. Polyglot Persistence in the Real World – Anton Yazovskiy, Thumbtack Technology
  2. ›  Software Engineer at Thumbtack Technology
     ›  an active user of various NoSQL solutions
     ›  consulting with a focus on scalability
     ›  a significant part of my work is advising people on which solutions to use and why
     ›  big fan of BigData and clouds
  3. ›  NoSQL – not a silver bullet
     ›  Choices that we make
     ›  Cassandra: operational workload
     ›  Cassandra: analytical workload
     ›  The best of both worlds
     ›  Some benchmarks
     ›  Conclusions
  4. •  well-known ways to scale: scale in/out, scale by function, data denormalization
     •  really works
     •  each has disadvantages
     •  mostly a manual process (NewSQL)
     (image: http://qsec.deviantart.com)
  5. ›  solves exactly this kind of problem
     ›  rapid application development (aggregate data model)
     ›  schema flexibility
     ›  auto-scale-out
     ›  auto-failover
     ›  amount of data it is able to handle
     ›  shared-nothing architecture, no SPOF
     ›  performance
  6. ›  splendors and miseries of the aggregate
     ›  CAP theorem dilemma: Consistency, Availability, Partition Tolerance
  7. (diagram: analytical and operational workloads mapped against consistency, availability, performance and reliability)
  8. (the same analytical/operational diagram) – "I want it all"
  9. Cassandra (released by Facebook in 2008)
     ›  elastic scalability & linear performance *
     ›  dynamic schema
     ›  very high write throughput
     ›  tunable per-request consistency (see the sketch after this slide)
     ›  fault-tolerant design
     ›  multiple-datacenter and cloud readiness
     ›  CAS transaction support *
     * http://www.datastax.com/what-we-offer/products-services/datastax-enterprise/apache-cassandra
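     A minimal sketch of what "tunable per-request consistency" looks like in code. The talk itself works against the Thrift API; the example below assumes the DataStax Java driver and a hypothetical events table, purely for illustration.

     // Per-statement consistency: fast writes acknowledged by one replica,
     // reads that need the latest data ask for QUORUM. Assumes the DataStax
     // Java driver; keyspace and table are hypothetical.
     import com.datastax.driver.core.Cluster;
     import com.datastax.driver.core.ConsistencyLevel;
     import com.datastax.driver.core.Session;
     import com.datastax.driver.core.SimpleStatement;

     public class TunableConsistencyDemo {
         public static void main(String[] args) {
             Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect("demo");

             // write acknowledged by a single replica for speed...
             SimpleStatement write = new SimpleStatement(
                     "INSERT INTO events (user_id, ts, status) VALUES ('user_123', 1380400000, 'success')");
             write.setConsistencyLevel(ConsistencyLevel.ONE);
             session.execute(write);

             // ...while a read that must see the latest data asks for QUORUM
             SimpleStatement read = new SimpleStatement(
                     "SELECT * FROM events WHERE user_id = 'user_123'");
             read.setConsistencyLevel(ConsistencyLevel.QUORUM);
             session.execute(read);

             cluster.close();
         }
     }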
  10. Cassandra use cases:
      ›  large data set on commodity hardware
      ›  tradeoff between speed and reliability
      ›  heavy-write workload
      ›  time-series data
      http://www.datastax.com/what-we-offer/products-services/datastax-enterprise/apache-cassandra
  11. (diagram: Cassandra placed on the analytical/operational chart between performance and reliability) – small demo after this slide
  12. (diagram: rows keyed by timestamp spread across SERVER 1 and SERVER 2)
      select * from table where timestamp > 12344567 and timestamp < 13237457
      ›  range queries across the cluster are expensive
      ›  unless you shard by timestamp
      ›  ... which becomes a bottleneck for a heavy-write workload
  13. ›  all columns are sorted by name
      ›  row – aggregate item (never sharded)
      (diagram: a column family with rows "row key 1" and "row key 2", each holding sorted columns column 1..N with their values)
      queries: get slice, get key, get range, plus combinations of these queries and composite columns
      (super columns are discouraged and omitted here)
  14. ›  all columns are sorted by name
      ›  row – aggregate item (never sharded)
      get_slice(row_key, from, to, count)
      (diagram: rows "row key 1".."row key 5" spread across SERVER 1 and SERVER 2, each row holding timestamp-named columns)
      get_slice("row key 1", from:"timestamp 1", null, 11)
  15. ›  all columns are sorted by name
      ›  row – aggregate item (never sharded)
      get_slice(row_key, from, to, count)
      get_slice("row key 1", from:"timestamp 1", null, 11)
      next page:  get_slice("row key 1", from:"timestamp 11", null, 11)
      prev. page: get_slice("row key 1", null, to:"timestamp 11", 11)
      (see the paging sketch after this slide)
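      A minimal sketch of the paging pattern from slides 14–15, assuming a hypothetical getSlice(rowKey, from, to, count) wrapper around Cassandra's slice query; column names are returned as plain strings here purely for illustration.

      // Page through a wide row with repeated slice calls: ask for one extra
      // column per page and use it as the `from` of the next page, exactly as
      // the slide does with count = 11 for pages of 10.
      import java.util.List;

      public class SlicePager {

          interface SliceClient {
              // hypothetical wrapper over get_slice(row_key, from, to, count);
              // returns at most `count` column names, sorted, starting at `from`
              List<String> getSlice(String rowKey, String from, String to, int count);
          }

          static final int PAGE_SIZE = 10;

          static void readAll(SliceClient client, String rowKey) {
              String from = null;                              // start of the row
              while (true) {
                  List<String> columns = client.getSlice(rowKey, from, null, PAGE_SIZE + 1);
                  int visible = Math.min(columns.size(), PAGE_SIZE);
                  for (int i = 0; i < visible; i++) {
                      System.out.println(columns.get(i));      // process the column
                  }
                  if (columns.size() <= PAGE_SIZE) {
                      break;                                   // no more pages
                  }
                  from = columns.get(PAGE_SIZE);               // extra column starts the next page
              }
          }
      }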
  16. Time-range queries with filter:
      ›  "get all events for User J from N to M"
      ›  "get all success events for User J from N to M"
      ›  "get all events for all users from N to M"
  17. Time-range queries with filter:
      ›  "get all events for User J from N to M"
      ›  "get all success events for User J from N to M"
      ›  "get all events for all users from N to M"
      (each query gets its own row: events::User_123, events::success, events::success::User_123, each holding timestamp-named columns with the event as the value; see the fan-out sketch after this slide)
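      A minimal sketch of the fan-out-on-write idea behind those row keys, assuming a hypothetical writer.insert(rowKey, columnName, value) wrapper over a Cassandra column insert; the row-key scheme (events::<user>, events::<status>, events::<status>::<user>) follows the slide.

      // Write one event into every row that a later filtered time-range query
      // will read; the column name is the event timestamp so columns stay
      // sorted by time.
      public class EventFanOutWriter {

          interface ColumnWriter {
              void insert(String rowKey, String columnName, String value);   // hypothetical wrapper
          }

          private final ColumnWriter writer;

          public EventFanOutWriter(ColumnWriter writer) {
              this.writer = writer;
          }

          public void writeEvent(String userId, String status, long timestamp, String payload) {
              String column = String.format("%013d", timestamp);   // zero-pad so string order == time order

              writer.insert("events::" + userId, column, payload);                 // all events for a user
              writer.insert("events::" + status, column, payload);                 // all events with a status
              writer.insert("events::" + status + "::" + userId, column, payload); // status + user
          }
      }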
  18. Counters:
      ›  "get # of events for User J grouped by hour"
      ›  "get # of events for User J grouped by day"
      (rows events::User_123 and events::success::User_123 hold counter columns named by hour bucket, e.g. 1380400000 -> 842, 1380403600 -> 1024)
      (group by day – same, but in a different column family for TTL support; see the counter sketch after this slide)
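      A minimal sketch of the hour/day bucketing behind those counter rows, assuming a hypothetical counters.increment(columnFamily, rowKey, bucket) wrapper over Cassandra counter columns; the column-family names are placeholders.

      // Increment per-hour and per-day counters for an event. The bucket is
      // the timestamp truncated to hour-sized / day-sized intervals (the
      // slide's example columns 1380400000 and 1380403600 are one hour apart).
      public class EventCounters {

          interface CounterClient {
              void increment(String columnFamily, String rowKey, long bucket);   // hypothetical wrapper
          }

          private static final long HOUR = 3600L;
          private static final long DAY  = 24 * 3600L;

          private final CounterClient counters;

          public EventCounters(CounterClient counters) {
              this.counters = counters;
          }

          public void countEvent(String userId, String status, long timestampSeconds) {
              long hourBucket = timestampSeconds - (timestampSeconds % HOUR);
              long dayBucket  = timestampSeconds - (timestampSeconds % DAY);

              // hourly counters in one column family...
              counters.increment("counters_by_hour", "events::" + userId, hourBucket);
              counters.increment("counters_by_hour", "events::" + status + "::" + userId, hourBucket);

              // ...and daily counters in a separate column family, so each can
              // carry its own TTL, as the slide notes.
              counters.increment("counters_by_day", "events::" + userId, dayBucket);
              counters.increment("counters_by_day", "events::" + status + "::" + userId, dayBucket);
          }
      }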
  19. ›  the row key should consist of a combination of fields with high cardinality of values: name, id, etc.
      ›  boolean values are a bad option; composite columns are a good option for that
      ›  a timestamp may help to spread historical data
      ›  otherwise, scalability will not be linear
  20. In theory, possible in real time:
      ›  averages, 3-dimensional filters, group by, etc.
      But:
      ›  hard to tune the data model
      ›  lack of aggregation options
      ›  aggregation over historical data
  21. "I want interactive reports"
      "Reports could be a little bit out of date, but I want to control this delay value"
      (diagram: Cassandra feeding reports that are auto-updated "somehow"; a sketch of such a scheduled refresh follows this slide)
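      A minimal sketch of one way to keep reports "a little bit out of date" with a controllable delay: a scheduler that re-runs the aggregation at a fixed interval. The runAggregationJob() call is a hypothetical stand-in for submitting the MapReduce step shown later in the deck.

      // Refresh reports on a fixed schedule; the delay between refreshes is
      // the maximum staleness a report can have.
      import java.util.concurrent.Executors;
      import java.util.concurrent.ScheduledExecutorService;
      import java.util.concurrent.TimeUnit;

      public class ReportRefresher {

          private final ScheduledExecutorService scheduler =
                  Executors.newSingleThreadScheduledExecutor();

          public void start(long delayMinutes) {
              // "Reports could be a little bit out of date, but I want to control this delay value"
              scheduler.scheduleWithFixedDelay(this::runAggregationJob, 0, delayMinutes, TimeUnit.MINUTES);
          }

          private void runAggregationJob() {
              // hypothetical: submit the MapReduce step and fold its output
              // back into the report tables the application reads
          }

          public void stop() {
              scheduler.shutdown();
          }
      }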
  22. Running MapReduce directly on Cassandra:
      ›  impact on the production system, or
      ›  higher total cost of ownership
      ›  difficulties with scalability
      ›  hard to support with multiple clusters
      http://www.datastax.com/docs/0.7/map_reduce/hadoop_mr
  23. http://aws.amazon.com
  24. Amazon Elastic MapReduce:
      ›  Hadoop tech stack
      ›  automatic deployment
      ›  management API
      ›  temporary cluster
      ›  Amazon S3 as data storage *
      * copy from S3 to EMR HDFS and back
  25. Create a new EMR cluster and submit a job (see the fuller sketch after slide 26):

      JobFlowInstancesConfig instances = ..
      instances.setHadoopVersion(..)
      instances.setInstanceCount(dataNodeCount + 1)
      instances.setMasterInstanceType(..)
      instances.setSlaveInstanceType(..)

      RunJobFlowRequest req = ..(name, instances)
      req.addSteps(new StepConfig(name, jar))

      AmazonElasticMapReduce emr = ..
      emr.runJobFlow(req)
  26. Execute a job on a running cluster:

      StepConfig stepConfig = new StepConfig(name, jar)
      AddJobFlowStepsRequest addReq = …
      addReq.setJobFlowId(jobFlowId)
      addReq.setSteps(Arrays.asList(stepConfig))
      AmazonElasticMapReduce emr = …
      emr.addJobFlowSteps(addReq)
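      A fuller sketch of slides 25–26, assuming the AWS SDK for Java 1.x; credential handling, instance types, S3 paths and job names are hypothetical placeholders rather than values from the talk.

      // Launch an EMR cluster, run one jar step on it, then add another step
      // to the same (still running) cluster.
      import java.util.Arrays;

      import com.amazonaws.auth.BasicAWSCredentials;
      import com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduceClient;
      import com.amazonaws.services.elasticmapreduce.model.AddJobFlowStepsRequest;
      import com.amazonaws.services.elasticmapreduce.model.HadoopJarStepConfig;
      import com.amazonaws.services.elasticmapreduce.model.JobFlowInstancesConfig;
      import com.amazonaws.services.elasticmapreduce.model.RunJobFlowRequest;
      import com.amazonaws.services.elasticmapreduce.model.RunJobFlowResult;
      import com.amazonaws.services.elasticmapreduce.model.StepConfig;

      public class EmrLauncher {
          public static void main(String[] args) {
              AmazonElasticMapReduceClient emr =
                      new AmazonElasticMapReduceClient(new BasicAWSCredentials("ACCESS_KEY", "SECRET_KEY"));

              int dataNodeCount = 3;
              JobFlowInstancesConfig instances = new JobFlowInstancesConfig()
                      .withHadoopVersion("1.0.3")
                      .withInstanceCount(dataNodeCount + 1)          // + 1 master node
                      .withMasterInstanceType("m1.medium")
                      .withSlaveInstanceType("m1.medium")
                      .withKeepJobFlowAliveWhenNoSteps(true);        // keep alive so steps can be added later

              HadoopJarStepConfig jar = new HadoopJarStepConfig("s3://my-bucket/jobs/aggregation.jar");
              StepConfig firstStep = new StepConfig("aggregate-events", jar);

              RunJobFlowRequest req = new RunJobFlowRequest("report-cluster", instances)
                      .withSteps(firstStep)
                      .withLogUri("s3://my-bucket/emr-logs/");

              RunJobFlowResult result = emr.runJobFlow(req);
              String jobFlowId = result.getJobFlowId();

              // later: execute another job on the already-running cluster
              StepConfig nextStep = new StepConfig("aggregate-events-incremental", jar);
              AddJobFlowStepsRequest addReq = new AddJobFlowStepsRequest()
                      .withJobFlowId(jobFlowId)
                      .withSteps(Arrays.asList(nextStep));
              emr.addJobFlowSteps(addReq);
          }
      }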
  27. Cluster lifecycle: long-running or transient
      ›  cold start = ~20 min
      ›  tradeoff: cluster cost vs. availability
      ›  compression and combiner tuning may speed up jobs a lot (see the sketch after this slide)
      ›  common problems for all big-data processing tools: monitoring, testability and debugging (MRUnit, local Hadoop, smaller data sets)
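      A minimal sketch of the "compression and combiner tuning" point. Property names are the Hadoop 2.x ones (older Hadoop uses slightly different keys), and the word-count-style mapper/reducer is a hypothetical stand-in for the talk's aggregation job.

      // Turn on map-output compression and plug in a combiner; both cut the
      // volume of data going through the shuffle. Input/output paths and
      // formats are omitted for brevity.
      import java.io.IOException;

      import org.apache.hadoop.conf.Configuration;
      import org.apache.hadoop.io.LongWritable;
      import org.apache.hadoop.io.Text;
      import org.apache.hadoop.io.compress.CompressionCodec;
      import org.apache.hadoop.io.compress.SnappyCodec;
      import org.apache.hadoop.mapreduce.Job;
      import org.apache.hadoop.mapreduce.Mapper;
      import org.apache.hadoop.mapreduce.Reducer;

      public class TunedJob {

          // emits (userId, 1) for every input line "userId<TAB>event..."
          public static class EventMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
              private static final LongWritable ONE = new LongWritable(1);
              @Override
              protected void map(LongWritable key, Text line, Context ctx)
                      throws IOException, InterruptedException {
                  String userId = line.toString().split("\t", 2)[0];
                  ctx.write(new Text(userId), ONE);
              }
          }

          // sums the counts; also usable as a combiner because the sum is associative
          public static class SumReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
              @Override
              protected void reduce(Text key, Iterable<LongWritable> values, Context ctx)
                      throws IOException, InterruptedException {
                  long sum = 0;
                  for (LongWritable v : values) sum += v.get();
                  ctx.write(key, new LongWritable(sum));
              }
          }

          public static Job configure() throws IOException {
              Configuration conf = new Configuration();
              conf.setBoolean("mapreduce.map.output.compress", true);
              conf.setClass("mapreduce.map.output.compress.codec", SnappyCodec.class, CompressionCodec.class);

              Job job = Job.getInstance(conf, "aggregate-events");
              job.setJarByClass(TunedJob.class);
              job.setMapperClass(EventMapper.class);
              job.setCombinerClass(SumReducer.class);   // pre-aggregate on the map side
              job.setReducerClass(SumReducer.class);
              job.setOutputKeyClass(Text.class);
              job.setOutputValueClass(LongWritable.class);
              return job;
          }
      }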
  28. Writing to Cassandra and SQL together (pseudocode):

      long txId = cassandra.persist(entity);
      try {
          sql.insert(some);
          sql.update(someElse);
          cassandra.commit(txId);
          sql.commit();
      } catch (Exception e) {
          sql.rollback();
          cassandra.rollback(txId);
      }
  29. insert into CHANGES (key, commited, data)
      values ('tx_id-58e0a7d7-eebc', 'false', ..)

      update CHANGES set commited = 'true'
      where key = 'tx_id-58e0a7d7-eebc'

      delete from CHANGES
      where key = 'tx_id-58e0a7d7-eebc'
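      A minimal sketch of driving those CHANGES marker rows from Java with plain JDBC; the table layout and the commited flag come from the slide, while the connection handling and the data payload are hypothetical placeholders.

      // Record a change marker before touching Cassandra, flip it to committed
      // once both stores succeed, and remove it when it is no longer needed.
      import java.sql.Connection;
      import java.sql.PreparedStatement;
      import java.sql.SQLException;

      public class ChangeLog {

          private final Connection connection;   // assumed to be managed elsewhere

          public ChangeLog(Connection connection) {
              this.connection = connection;
          }

          public void begin(String txKey, byte[] data) throws SQLException {
              try (PreparedStatement ps = connection.prepareStatement(
                      "insert into CHANGES (key, commited, data) values (?, 'false', ?)")) {
                  ps.setString(1, txKey);
                  ps.setBytes(2, data);
                  ps.executeUpdate();
              }
          }

          public void markCommitted(String txKey) throws SQLException {
              try (PreparedStatement ps = connection.prepareStatement(
                      "update CHANGES set commited = 'true' where key = ?")) {
                  ps.setString(1, txKey);
                  ps.executeUpdate();
              }
          }

          public void cleanup(String txKey) throws SQLException {
              try (PreparedStatement ps = connection.prepareStatement(
                      "delete from CHANGES where key = ?")) {
                  ps.setString(1, txKey);
                  ps.executeUpdate();
              }
          }
      }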
  30. Non-production setup:
      •  3 nodes (Cassandra)
      •  m1.medium EC2 instances
      •  1 data center
      •  1 app instance
  31. Real-time metrics update (sync):
      ›  average latency – 60 msec
      ›  processes > 2,000 events per second
      ›  generates > 1,000 reports per second
      Real-time metrics update (async):
      ›  processes > 15,000 events per second
      Uploading to AWS S3: slow, but multi-threading helps (see the sketch after this slide)
      * it is more than enough, but what if …
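      A minimal sketch of the "multi-threading helps" point for S3 uploads, assuming the AWS SDK for Java 1.x; the bucket name, key prefix, pool size and credentials source are hypothetical.

      // Upload many files to S3 in parallel: individual PUTs are slow, but a
      // small thread pool keeps several uploads in flight.
      import java.io.File;
      import java.util.List;
      import java.util.concurrent.ExecutorService;
      import java.util.concurrent.Executors;
      import java.util.concurrent.TimeUnit;

      import com.amazonaws.auth.BasicAWSCredentials;
      import com.amazonaws.services.s3.AmazonS3Client;

      public class ParallelS3Uploader {

          private final AmazonS3Client s3 =
                  new AmazonS3Client(new BasicAWSCredentials("ACCESS_KEY", "SECRET_KEY"));

          public void upload(List<File> files) throws InterruptedException {
              ExecutorService pool = Executors.newFixedThreadPool(8);   // 8 uploads in flight
              for (File file : files) {
                  pool.submit(() -> s3.putObject("my-events-bucket", "events/" + file.getName(), file));
              }
              pool.shutdown();
              pool.awaitTermination(1, TimeUnit.HOURS);
          }
      }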
  32. ›  distributed systems force you to make decisions
      ›  systems like Cassandra trade consistency for speed
      ›  the CAP theorem is oversimplified – you have many more options
      ›  polyglot persistence can make this world a better place
      ›  do not try to hammer every nail with the same hammer
  33. ›  Cassandra – great for time-series data and heavy-write workloads…
      ›  … but use cases should be clearly defined
  34. ›  Amazon S3 – is great: simple, slow, but predictable storage
      ›  Amazon EMR:
         ›  integration with S3 – great
         ›  very good API, but …
         ›  … isn't a magic trick and requires knowledge of Hadoop and skills for effective usage
  35. Contacts:
      ayazovskiy@thumbtack.net
      @yazovsky
      www.linkedin.com/in/yazovsky
      http://www.thumbtack.net
      http://thumbtack.net/whitepapers