Grab is a “superapp,” an all-in-one app used by tens of millions of users daily across Southeast Asia to shop, hail rides, make payments and order food deliveries. Discover how Grab is using Scylla today to continuously evolve and expand its platform and capabilities to new use cases.
2. Chao Wang
Engineer Dev Manager
Arun Sandu
Engineer
■ Chao works on Grab’s Trust team, helping build the ML-driven
fraud detection platform
■ Has worked on engineering and machine learning challenges
for more than 5 years
■ Arun works on the design and automation of reliable, scalable NoSQL
datastores operating in a cloud environment across Grab.
■ Previously worked at Starbucks, building highly resilient, scalable
infrastructure and a distributed datastore for the Rewards program in
the North America and Japan markets.
3. Agenda
▪ What is Grab?
▪ Fraud detection at Grab
▪ Use cases
▪ Optimizations
5. Delivering everyday services to improve
the quality of life for Southeast Asians
TRANSPORT FOOD DELIVERY
REWARDS
MOBILE WALLET
FINANCIAL
SERVICES
GROCERIES
7. Counter Service – Real-time Aggregation
■ We use various types of counters to detect potential fraud, identity, and safety risks
■ Examples: the number of rides a passenger completes within X hours, or the
volume of declined transactions for a driver-passenger pair within X hours
■ e.g. if passenger A and driver B together take more than 10 rides in the last hour,
that is very suspicious, and we may take action on further rides
■ e.g. booking_count:passenger_id:71008:driver_id:3546, value = 20
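The counter key format and the "more than 10 rides in the last hour" rule can be sketched as follows. This is a minimal illustration, not Grab's implementation: the function names `counter_key` and `is_suspicious` and the constant `SUSPICIOUS_RIDES_PER_HOUR` are hypothetical, and only the key layout and the threshold come from the slide.

```python
from typing import Mapping

# Threshold taken from the slide's example; the constant name is hypothetical.
SUSPICIOUS_RIDES_PER_HOUR = 10


def counter_key(metric: str, dimensions: Mapping[str, int]) -> str:
    """Build a key such as 'booking_count:passenger_id:71008:driver_id:3546'.

    Dimensions are appended in the mapping's insertion order.
    """
    parts = [metric]
    for name, value in dimensions.items():
        parts.extend([name, str(value)])
    return ":".join(parts)


def is_suspicious(ride_count: int, threshold: int = SUSPICIOUS_RIDES_PER_HOUR) -> bool:
    """Return True when the counter value exceeds the threshold."""
    return ride_count > threshold


key = counter_key("booking_count", {"passenger_id": 71008, "driver_id": 3546})
flagged = is_suspicious(20)  # the slide's example value of 20 exceeds the threshold
```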
8. The conventional method
▪ Offline big data process
▪ Data analysts and engineers work on the scripts
Bottleneck
▪ Not real time, and real-time results are important!
▪ Long development life cycle for new data points
Challenges
▪ Scalability
▪ Self-service
▪ Manageable and extendable
11. Improve the Aggregation Performance
Query: 3 hours ago to now
Buckets: Daily, Hourly, 15 Mins
Timeline: 07:30am, 07:45am, 08:00am, 09:00am, 10:00am, Now (10:35am)

Min table
key    timestamp            value
pax_1  2020-10-11 07:30:00  1
pax_1  2020-10-11 07:45:00  5

Hourly table
key    timestamp            value
pax_1  2020-10-11 08:00:00  6
pax_1  2020-10-11 09:00:00  1
pax_1  2020-10-11 10:00:00  8
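One way to read this slide: a "3 hours ago to now" query is answered from a few pre-aggregated buckets instead of raw events, using 15-minute buckets up to the first hour boundary and hourly buckets after that. The sketch below reproduces the slide's example under stated assumptions: `plan_buckets` and the in-memory dictionaries are hypothetical stand-ins for the real tables, daily buckets are omitted, and the window start is rounded down to its 15-minute bucket, so the result slightly over-covers the window.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

QUARTER = timedelta(minutes=15)
HOUR = timedelta(hours=1)


def plan_buckets(start: datetime, now: datetime) -> List[Tuple[str, datetime]]:
    """Return (table, bucket_start) pairs covering the window from start to now."""
    # Round the window start down to the 15-minute bucket that contains it.
    cursor = start.replace(minute=start.minute - start.minute % 15,
                           second=0, microsecond=0)
    plan = []
    while cursor <= now:
        if cursor.minute == 0:
            # On an hour boundary: one hourly bucket replaces four 15-minute ones.
            plan.append(("hourly", cursor))
            cursor += HOUR
        else:
            plan.append(("min", cursor))
            cursor += QUARTER
    return plan


# Values for key pax_1, copied from the two tables on the slide.
TABLES = {
    "min": {datetime(2020, 10, 11, 7, 30): 1, datetime(2020, 10, 11, 7, 45): 5},
    "hourly": {datetime(2020, 10, 11, 8): 6, datetime(2020, 10, 11, 9): 1,
               datetime(2020, 10, 11, 10): 8},
}

now = datetime(2020, 10, 11, 10, 35)
plan = plan_buckets(now - timedelta(hours=3), now)
total = sum(TABLES[table].get(bucket, 0) for table, bucket in plan)
```

With the slide's data, the plan touches five buckets (two 15-minute, three hourly) and sums them to 21, rather than scanning every event in the window.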
12. Use cases at Grab
■ Ads
■ KairosDB
■ Stream Processing
■ Segmentation Platform
13. Ads
Supports logging of every user event and click, reporting of
impressions and statistics, capping, etc.
KairosDB
Distributed, scalable time series database that
uses Scylla as its storage backend for
metrics data.
15. Segmentation Platform
■ Experiments on targeted segments
■ Driver loyalty rewards
■ Check user eligibility and apply promos in
real time
■ Target ads and run campaigns
■ Promo recommendations using ML models
■ Target customers for any communication
16. Frontend UI
■ Create, delete and refresh segments.
■ Schedule jobs for segment creation
■ Passenger lookup in a segment
19. Latencies
The Datadog agent was hogging CPU, which degraded Scylla performance.
                  Before  After
P99 Read Latency  100ms   25ms
Error rate        1%      0
20. Cost savings
■ TTL with default expiry
■ Rate limiter for writes and reads based on the desired qps
■ New mc SSTable storage format
■ Delete unused segments
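The "rate limiter for writes and reads based on the desired qps" bullet does not say which algorithm is used. A token bucket is one common choice, sketched here as an assumption rather than as Grab's design. The class name `TokenBucket`, its parameters, and the injectable `clock` (used only to make the example deterministic) are all hypothetical.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow roughly `qps` operations per second, with bursts up to `burst`."""

    def __init__(self, qps: float, burst: float,
                 clock: Callable[[], float] = time.monotonic):
        self.qps = qps
        self.capacity = burst
        self.tokens = burst          # start with a full bucket
        self.clock = clock
        self.last = clock()

    def allow(self) -> bool:
        """Consume one token if available; otherwise reject the request."""
        now = self.clock()
        # Refill in proportion to elapsed time, capped at the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.qps)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Deterministic demo with a fake clock instead of wall time.
fake_now = [0.0]
limiter = TokenBucket(qps=2, burst=2, clock=lambda: fake_now[0])
first_three = [limiter.allow() for _ in range(3)]  # burst of 2, then rejected
fake_now[0] = 0.5                                  # half a second later: one token refilled
after_refill = limiter.allow()
```

A caller would wrap each read or write in `if limiter.allow(): ...` and back off or queue the request otherwise, with separate limiters for reads and writes if they have different target rates.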
Tombstones
■ A scheduled major compaction helped achieve better latencies.
Large Partitions
■ Dynamically create partitions based on the size of the segments.
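The slide does not describe how partitions are sized, so the following is only one plausible reading: derive a partition count from the segment size, then spread members across (segment, bucket) partition keys with a stable hash. `TARGET_ROWS_PER_PARTITION`, `partition_count`, and `partition_for` are hypothetical names, and the target of 100,000 rows is an illustrative value, not a figure from the talk.

```python
import math
import zlib
from typing import Tuple

# Hypothetical upper bound on rows per partition; tune to keep partitions small.
TARGET_ROWS_PER_PARTITION = 100_000


def partition_count(segment_size: int,
                    target: int = TARGET_ROWS_PER_PARTITION) -> int:
    """Number of partitions needed so that none exceeds `target` rows on average."""
    return max(1, math.ceil(segment_size / target))


def partition_for(segment_id: str, passenger_id: int,
                  num_partitions: int) -> Tuple[str, int]:
    """Map a passenger to a (segment_id, bucket) partition key.

    crc32 is used because it is stable across processes, unlike Python's hash().
    """
    bucket = zlib.crc32(str(passenger_id).encode("utf-8")) % num_partitions
    return (segment_id, bucket)


small = partition_count(40_000)     # fits in a single partition
large = partition_count(250_000)    # split across several partitions
pkey = partition_for("promo_segment", 71008, large)
```

In this sketch, a passenger lookup needs the segment's partition count to recompute the bucket, so that count would have to be stored alongside the segment's metadata.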