Using Cassandra for RTB systems

A presentation from the Tel-Aviv Cassandra 2014 Meetup.

Transcript

  • 1. Real Time Bidding with Apache Cassandra
  • 2. RTB @ Kenshoo: Introducing RTB - Concepts - Architecture - Challenges
  • 3. Real Time Bidding (RTB)
       ● Real-time bidding is a dynamic auction process where each impression is bid for in (near) real time, as opposed to a static auction.
       ● Kenshoo is engaged in Facebook Exchange (FBX).
       ● In FBX, each bid has a lifetime of 120ms. All transactions have to complete within that period, and the winning ad is presented to the user.
       ● Kenshoo employs ad re-targeting, where search engine campaigns are extended to the social network, giving a much higher ROI for our customers.
  • 4. Flow (diagram of the bid flow, starting at the website)
  • 5. RTB Logical Architecture (diagram). Components: RTB Front, Opt Out, Bidder, Win, Error, Pixel Matcher, Cassandra (cookie to segment(s)), RTB Backend (bid decision trees, campaigns metadata), RTB Brain, RTB Reporter.
  • 6. RTB @ Kenshoo: Focus on RTB Cassandra - Architecture - Challenges
  • 7. Requirements
       ● Handle 25K+ requests within the 120ms bid time-frame, including network latencies.
       ● Ability to scale up to 1M requests per minute while keeping the current latency.
       ● Handle ~10K writes/second with low latency.
       ● Multi-DC configuration; all nodes must be synced in real time.
       ● Seamless operations: compactions and repairs.
       ● High security.
  • 8. C* Physical Architecture (diagram: App nodes in the (US) West and (US) East regions, linked over the Internet by a GRE VPN, with VPN connections to FBX West and FBX East)
  • 9. C* Cluster Information
       ● Cassandra version 1.2.6
       ● Oracle Java 7
       ● Manual tokens; vnodes are coming soon
       ● Multi-DC configuration (network topology; a keyspace sketch follows this slide)
       ● DC connectivity between VPCs via Linux GRE
       ● Amazon c3.2xlarge instance type
       ● Ubuntu 13.10 with EXT4
       ● SSD (ephemeral) storage
       ● The ring (diagram)
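
The deck does not include any schema or CQL, but the multi-DC ring described above implies a keyspace replicated across both regions. The sketch below shows how such a keyspace might be defined with NetworkTopologyStrategy; the data-center names, replication factors, contact address, and keyspace name are assumptions, and the Python driver is used here only as a convenient way to run the CQL.

    # Hypothetical sketch of a two-DC keyspace for a ring like the one above.
    # DC names, replication factors and the contact address are placeholders,
    # not taken from the presentation.
    from cassandra.cluster import Cluster

    # Cassandra 1.2.x speaks native protocol v1 only.
    cluster = Cluster(["10.0.0.10"], protocol_version=1)
    session = cluster.connect()

    session.execute("""
        CREATE KEYSPACE rtb
        WITH replication = {
            'class': 'NetworkTopologyStrategy',
            'us-west': 3,
            'us-east': 3
        }
    """)
    cluster.shutdown()

One common choice with such a layout (not stated in the deck) is to read and write at LOCAL_QUORUM, so requests avoid cross-region latency while replication to the other data center happens in the background.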
  • 10. C* Cluster Network Between Sites
       ● For security reasons we:
            ○ Do not use EC2Snitch or EC2MultiRegionSnitch
            ○ Connected the nodes via VPN (Linux GRE; see the sketch below)
       ● Linux GRE is fast, reliable, and provides high throughput (~1Gb/s)
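
The actual tunnel configuration is not part of the deck; the following is a rough sketch, under assumed addresses and an assumed interface name, of the kind of point-to-point Linux GRE tunnel that could link a node in one region to its peer in the other.

    # Rough sketch of a point-to-point Linux GRE tunnel between two nodes in
    # different regions. All addresses and the interface name are placeholders;
    # the presentation does not show its real configuration.
    import subprocess

    LOCAL_PUBLIC_IP = "203.0.113.10"    # this node's public address (placeholder)
    REMOTE_PUBLIC_IP = "198.51.100.20"  # peer node in the other region (placeholder)
    TUNNEL_ADDRESS = "10.255.0.1/30"    # private address used inside the tunnel

    commands = [
        ["ip", "tunnel", "add", "gre1", "mode", "gre",
         "remote", REMOTE_PUBLIC_IP, "local", LOCAL_PUBLIC_IP, "ttl", "255"],
        ["ip", "link", "set", "gre1", "up"],
        ["ip", "addr", "add", TUNNEL_ADDRESS, "dev", "gre1"],
    ]
    for cmd in commands:
        subprocess.check_call(cmd)  # requires root privileges

Note that plain GRE does not encrypt traffic (as slide 15 points out); it is a bandwidth choice rather than an encryption layer.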
  • 11. C* Cluster Storage
       ● We started with Amazon EBS:
            ○ With a small number of nodes (up to 4), you want persistent storage and to avoid running repairs if you lose a node
            ○ 4x EBS devices in a RAID10 configuration provide up to 1,000 IOPS, with bursts of up to 2,000 IOPS
            ○ Cheap in AWS
       ● 8 nodes with ephemeral devices:
            ○ Lower risk: if you lose a node, recovery isn't as heavy on the whole cluster
            ○ We used RAID0 (a RAID sketch follows the next slide)
            ○ Higher performance (double that of EBS)
            ○ Free, bundled with the instances
  • 12. C* Cluster Storage, continued
       ● 16 nodes with ephemeral devices:
            ○ When load became heavy we grew to 16 nodes
            ○ Compactions and repairs harmed the cluster latency
            ○ We had to use Provisioned IOPS devices for C* maintenance
       ● C3 instance type with SSD:
            ○ Came just in time, providing ephemeral SSD storage
            ○ Solved our performance problems and enabled seamless compactions and repairs
            ○ Amazon currently has scarce deployment of this hardware and the nodes are not stable
            ○ Not yet available in all regions
            ○ C3 node deployment is not always possible due to AWS capacity issues
            ○ Amazon has promised to resolve the C3 issues next month
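
The slides mention RAID0 over the ephemeral devices and EXT4, but not the commands used. As a hedged sketch only, assembling such an array might look like the following; the device names, array name, and mount point are assumptions.

    # Hypothetical sketch of striping the ephemeral devices into RAID0 and
    # mounting them for Cassandra data. Device names, array name and mount
    # point are placeholders; the deck does not show the actual commands.
    import subprocess

    DEVICES = ["/dev/xvdb", "/dev/xvdc"]   # placeholder ephemeral devices
    ARRAY = "/dev/md0"
    MOUNT_POINT = "/var/lib/cassandra"     # assumed Cassandra data path

    subprocess.check_call(
        ["mdadm", "--create", ARRAY, "--level=0",
         "--raid-devices=" + str(len(DEVICES))] + DEVICES)
    subprocess.check_call(["mkfs.ext4", ARRAY])        # EXT4, as on slide 9
    subprocess.check_call(["mount", ARRAY, MOUNT_POINT])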
  • 13. C* Cluster Performance
  • 14. Monitoring
       ● We rely heavily on DataStax OpsCenter.
       ● We pull OpsCenter metrics out for graphing.
       ● We wrote our own read/write speed test against a separate, dedicated keyspace on each node to detect bottlenecks and problematic nodes (a sketch follows this slide).
       ● We sample the data separately from the application to detect whether a problem originates in C* or in the application.
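
The speed-test code itself is not included in the deck. A minimal sketch of the idea, assuming a dedicated keyspace named speedtest with a table probe (id int PRIMARY KEY, payload text), could look like this; the node address, sample size, and payload are also assumptions.

    # Minimal sketch of a per-node read/write speed test against a dedicated
    # keyspace. Keyspace, table, node address, sample size and payload are
    # assumptions; the presentation does not show its actual test code.
    import time
    from cassandra.cluster import Cluster
    from cassandra.policies import WhiteListRoundRobinPolicy

    NODE = "127.0.0.1"  # the node under test (placeholder)

    # Pin all queries to the node under test so it acts as coordinator.
    cluster = Cluster([NODE], protocol_version=1,
                      load_balancing_policy=WhiteListRoundRobinPolicy([NODE]))
    session = cluster.connect("speedtest")  # assumed dedicated keyspace

    write_stmt = session.prepare("INSERT INTO probe (id, payload) VALUES (?, ?)")
    read_stmt = session.prepare("SELECT payload FROM probe WHERE id = ?")

    def timed_ms(stmt, params):
        start = time.time()
        session.execute(stmt, params)
        return (time.time() - start) * 1000.0

    writes = [timed_ms(write_stmt, (i, "x" * 256)) for i in range(100)]
    reads = [timed_ms(read_stmt, (i,)) for i in range(100)]
    print("avg write %.2f ms, avg read %.2f ms"
          % (sum(writes) / len(writes), sum(reads) / len(reads)))
    cluster.shutdown()

Comparing these per-node numbers across the ring is what makes slow or misbehaving nodes stand out.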
  • 15. What have we learned
       ● Storage:
            ○ Use SSD:
                 ■ It provides high and stable disk performance
                 ■ It neutralizes the effect of compactions and repairs on the cluster
                 ■ Worth the money
       ● Network:
            ○ Use the highest-bandwidth VPN possible
            ○ GRE is great (it lacks encryption, but provides the best bandwidth)
       ● Maintenance:
            ○ Run compact daily: it does miracles for performance under heavy load
            ○ If you are not on SSD, disable Thrift on the node before running compaction
            ○ Do compactions in sequence, node by node (see the sketch below)
            ○ On high-load systems, avoid repair as much as possible; it's better to decommission and recommission a node than to run repair!
            ○ If you have to repair, always use the "-pr" flag and, if possible, use the incremental repair option (requires heavy scripting)
       ● Monitoring:
            ○ Write a sampler and speed tester for each node to detect bottlenecks and the sources of performance issues
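
The heavy scripting referred to above is not shown in the deck. As a hedged sketch, the node-by-node routine of disabling Thrift, compacting, and re-enabling Thrift could be driven like this; the hostnames, SSH access, and nodetool being on each node's PATH are assumptions.

    # Hypothetical sketch of the "compact in sequence, node by node" routine,
    # driving nodetool over SSH. Hostnames are placeholders; passwordless SSH
    # and nodetool on the PATH are assumed. The deck does not show its scripts.
    import subprocess

    NODES = ["cass-01", "cass-02", "cass-03"]  # placeholder hostnames

    def nodetool(host, *args):
        subprocess.check_call(["ssh", host, "nodetool"] + list(args))

    for host in NODES:
        nodetool(host, "disablethrift")  # stop serving clients (non-SSD nodes)
        try:
            nodetool(host, "compact")    # major compaction, one node at a time
        finally:
            nodetool(host, "enablethrift")

When a repair cannot be avoided, the same loop could run "nodetool repair -pr" on one node at a time instead.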
  • 16. Thank you
