
Scalable data systems at Traveloka



Scalable data systems, Traveloka @ Ruang Guru, 2018-09-20



  1. 1. Scalable Data Systems How to store and process data at scale, and what the challenges are Rendy B. Junior - Data System Architect @ traveloka
  2. 2. So my friends… Our data is growing, so we need to scale where we store it. We all hope this is not an afterthought...
  3. 3. Start simple - replicas Replicas: the same set of data on multiple nodes, usually to separate read throughput by load characteristic. Challenges: replicas are eventually consistent, and… write throughput does not increase... (Diagram: production reads and writes go to one node; analytical queries go to a replica, both holding data A, B, C.)
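A minimal sketch of the idea on this slide, with illustrative names (not any particular database's API): writes all go through the primary and are copied to every replica, while reads are spread across the replicas. Read throughput scales with the number of replicas; write throughput does not, since every node must apply every write.

```python
import itertools

class ReplicatedStore:
    """Toy replicated store: one primary takes writes, replicas serve reads."""

    def __init__(self, num_replicas=2):
        self.primary = {}
        self.replicas = [{} for _ in range(num_replicas)]
        self._rr = itertools.cycle(range(num_replicas))  # round-robin read routing

    def write(self, key, value):
        # Every write hits the primary AND every replica, which is why adding
        # replicas does not add write throughput. Replication is synchronous
        # here for simplicity; real systems often replicate asynchronously,
        # which is where eventual consistency comes from.
        self.primary[key] = value
        for replica in self.replicas:
            replica[key] = value

    def read(self, key):
        # Reads are spread across replicas, e.g. production traffic on one
        # and analytical queries on another.
        replica = self.replicas[next(self._rr)]
        return replica.get(key)

store = ReplicatedStore()
for k in ("A", "B", "C"):
    store.write(k, k.lower())
print(store.read("B"))  # "b", served by one of the replicas
```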
  4. 4. Let’s scale write throughput - shard Shards: different sets of data on different nodes. Since you have more nodes to write to, you get more write throughput. (Diagram: production reads and writes are split across one node holding A, B and another holding C.)
  5. 5. Sharding Techniques Any idea? :)
  6. 6. Sharding Techniques
     ● Range Sharding
       ○ Efficient scans. Requires a coordinator. Eg. MongoDB
     ● Hash Sharding
       ○ Efficient key-based retrieval. No coordinator.
       ○ Must reassign all rows when you need to rebalance
     ● Consistent Hashing
       ○ Assigns values not to a physical server but to a logical partition
       ○ No need to reassign all rows; just move the logical partitions
       ○ Eg. Cassandra, DynamoDB
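A toy sketch of the consistent-hashing idea (the partition counts and node names are made up for illustration; real systems like Cassandra use a token ring rather than this simple table): keys hash to fixed logical partitions, and rebalancing moves whole partitions between nodes without rehashing any row.

```python
import hashlib

NUM_PARTITIONS = 64  # logical partitions, many more than physical nodes

def partition_of(key):
    # A key always hashes to the same logical partition, forever.
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % NUM_PARTITIONS

def assign_partitions(nodes):
    # Spread the logical partitions over the physical nodes.
    return {p: nodes[p % len(nodes)] for p in range(NUM_PARTITIONS)}

def rebalance(table, new_node):
    # Hand the new node a fair share of logical partitions; every other
    # partition (and every key's partition id) stays put.
    num_nodes = len(set(table.values())) + 1
    share = NUM_PARTITIONS // num_nodes
    new_table = dict(table)
    for p in range(share):
        new_table[p] = new_node
    return new_table

table = assign_partitions(["node-1", "node-2", "node-3"])
bigger = rebalance(table, "node-4")

moved = sum(1 for p in range(NUM_PARTITIONS) if table[p] != bigger[p])
print(f"partitions moved: {moved} of {NUM_PARTITIONS}")  # 16 of 64
```

Contrast with plain hash sharding (`hash(key) % num_nodes`): there, changing the node count changes the target of nearly every key, forcing a reassignment of all rows.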
  7. 7. Other names... Usually in managed databases, where the sharding method is set by default ● Distribution key, eg. Redshift ● Partition key, eg. DynamoDB ● And other names...
  8. 8. It’s simple! 1. Shards : distribute write load 2. Replicas : distribute reads (even more), high availability, durability Congrats! You now understand the fundamental concept behind any scalable data system!!! It’s all about shards and replicas And in most cases, it is configurable...
  9. 9. Something people love, but distributed systems hate Take a guess...
  10. 10. Something people love, but distributed systems hate Hotspots! A hotspot is injustice… "Injustice anywhere is a threat to justice everywhere." - Martin Luther King Jr. How to solve: rebalance, or fix the sharding config Sometimes it is inevitable from the database’s point of view, eg. a single-row hotspot
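A small sketch of why a single-row hotspot defeats any sharding scheme (key names and shard count invented for illustration): one "celebrity" key dominates the workload, so whichever shard owns it takes almost all the load, and no rebalancing can split a single key.

```python
import hashlib
from collections import Counter

def shard_of(key, num_shards=4):
    # Plain hash sharding: each key deterministically maps to one shard.
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % num_shards

# A skewed workload: one hot key gets 1000 requests, 100 other keys get 1 each.
requests = ["user-1"] * 1000 + [f"user-{i}" for i in range(2, 102)]

load = Counter(shard_of(k) for k in requests)
print(sorted(load.items()))  # one shard carries almost all the load

# Rebalancing cannot fix this: every request for "user-1" lands on the same
# shard no matter where partitions are placed. This is the instrumentation
# you want: measure per-shard (or per-key) load to spot the injustice early.
```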
  11. 11. Scaling from a Different View - Distributed File System Storage and compute scale out separately - loosely coupled. Rebalancing on the compute side is a lot faster since it only redistributes metadata Eg. BigTable / HBase
  12. 12. Serverless? Fewer things to worry about, but less configurable
     ● Usually sharding is set by default
     ● Usually rebalancing is a background process we don’t see
     ● Usually the replica set is a black box; we only know the SLA
     That doesn’t mean there are no problems at all... There are limitations, or fewer features. Also, serverless won’t fix a bad design, eg. a hotspot will happen anywhere if you choose a bad key...
  13. 13. Some tips
     ● Always test how scalable your database is
       ○ Sometimes it works in theory, but not in practice
     ● Make sure you design it correctly before scaling up
       ○ Money doesn’t solve everything...
     ● Cloud? Scale vertically before horizontally
       ○ Why? Usually, the price of 2 machines with 1 core each ~ 1 machine with 2 cores
       ○ Why? Horizontal = more network overhead
     ● Beware of injustice!
       ○ I mean hotspots… put proper instrumentation in place
     ● Read more
       ○ New types of databases keep coming...
  14. 14. So how do we scale the data processing part... It’s a distributed process, but it’s stateful...
  15. 15. Similar but different Similar: a “shard”, related to the unit of parallelism - how many parts the data is split into among workers But for fault tolerance, make it resilient instead of keeping more replicas Eg. Spark builds a lineage graph, so if one node fails it can recompute the partitions missing due to the node failure
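A toy sketch of the lineage idea (plain Python, not Spark's actual API): each child partition records the function it was derived with, so a partition lost to a node failure can be recomputed from its parent instead of being kept in extra replicas.

```python
# Parent dataset, split into partitions spread across workers.
parent = {0: [1, 2, 3], 1: [4, 5, 6], 2: [7, 8, 9]}

# Lineage: child = map(square) over parent. We store the *function*,
# not extra copies of the derived data.
transform = lambda x: x * x
child = {p: [transform(x) for x in rows] for p, rows in parent.items()}

# The node holding partition 1 fails and its data is lost...
del child[1]

# ...so we recompute just the missing partition by replaying the lineage,
# rather than reading it back from a replica.
child[1] = [transform(x) for x in parent[1]]
print(child[1])  # [16, 25, 36]
```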
  16. 16. More on Parallelism Need to learn how to do this efficiently with your tools. Usually:
     ● Determine how many files you should have, and their size
     ● Determine how many CPUs your job gets
     Sample rules of thumb: x cores = x*n partitions, ~200MB per file
     Some tools inspect the data source to decide the number of workers (autoscaling). Eg. Dataflow, using BoundedSource.estimate_size in the SDK
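The rules of thumb above can be sketched as a small helper. The 200MB target and the cores multiplier are the slide's sample numbers, not universal constants; the function name is invented for illustration.

```python
def num_partitions(total_bytes, cores, per_core_factor=2,
                   target_file_bytes=200 * 1024 * 1024):
    """Pick a partition count satisfying both rules of thumb:
    x cores -> x*n partitions, and roughly 200MB per file."""
    by_cores = cores * per_core_factor              # keep every core busy
    by_size = -(-total_bytes // target_file_bytes)  # ceil division
    # Take the larger, so files stay near the target size AND cores stay busy.
    return max(by_cores, by_size)

# 100 GiB of input on a 16-core cluster:
print(num_partitions(100 * 1024**3, cores=16))  # 512 (driven by file size)
# 1 GiB of input on the same cluster:
print(num_partitions(1 * 1024**3, cores=16))    # 32 (driven by core count)
```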
  17. 17. Another concept in processing - Shuffling Some tips: - Filter early - Broadcast small data - Learn engine-specific optimizations...
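The two portable tips can be sketched in plain Python (the table names and data are invented; engines like Spark expose the same idea as broadcast variables / map-side joins): filter before joining so less data moves, and ship the small table to every worker instead of shuffling the big one by key.

```python
# Large fact "table" spread across workers, and a small dimension table.
bookings = [("JKT", 120), ("SIN", 80), ("JKT", 30), ("BKK", 55)]
cities = {"JKT": "Jakarta", "SIN": "Singapore", "BKK": "Bangkok"}

# Filter early: drop rows BEFORE the join, so the shuffle (if any) and the
# join itself touch less data.
big = [(code, amount) for code, amount in bookings if amount >= 50]

# Broadcast the small data: copy the whole small dict to every worker and
# join locally (a map-side join), instead of shuffling the big side by key.
joined = [(cities[code], amount) for code, amount in big]
print(joined)  # [('Jakarta', 120), ('Singapore', 80), ('Bangkok', 55)]
```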
  18. 18. Another concept (again) - Worker Management How to distribute tasks among nodes. The idea: usually there is an agent to which we can submit a job request with a certain specification (CPU, RAM, …) Eg. YARN, Mesos