Improving Mobile Payments with Real-Time Spark

● Madhukara Phatak
● Big data consultant and trainer at datamantra.io
● Consults in Hadoop, Spark and Scala
● www.madhukaraphatak.com

Agenda
● Mobile as a driver for big data
● Our customer's solution
● Existing data solution
● Improved solution
● Technical details
● Future enhancements
● Q & A

Mobile as a big data driver
● Mobile has changed the way we interact with the world
● Most buying and selling happens on mobile today
○ Myntra went fully mobile
○ Flipkart and Amazon say 50% of their purchases happen on mobile
○ Quikr and OLX are mobile-based selling platforms
○ Ola etc.

Challenges in Mobile
● Customers expect the service to be available 24/7
● Tiny screens make typical software flows very challenging
● Flaky mobile network connectivity makes it tougher
● Users are constantly on the move, which results in dropped interactions
● No more downtime
● Everything has to be done in real time

Mobile payments
● Almost every app mentioned earlier needs some kind of payment
● Getting payments right on mobile is very hard
● Globally, 21% of online shoppers abandon their basket due to payment failures or delays
● Some companies are building SDKs to help app developers
● Our customer is one of them

Our customer's solution
● A mobile SDK for applications that simplifies payments
● The SDK provides a better user interface, such as big buttons for generating an OTP or for other flows
● The SDK also helps fill in the different kinds of forms presented by different banks, using a consistent UI
● Better user experience across applications
● The application sends anonymous payment details across apps to our customer's servers

Some numbers
● 40+ customers
● Over 1 million transactions per month as of March
● Around 55% success rate (5% above average)
● Supports major banks, payment gateways and wallet providers
● Will soon be available outside the mobile payment space

Why does data matter?
● As the number of transactions increases, things will go wrong
● There are many different combinations that can go wrong
● Examples
○ Airtel OTP failing with State Bank net banking
○ Customers stuck on the password page
○ Not able to read OTP from some specific
● Understanding customer pain and reacting to it is paramount
● Every bit of help results in a payment

Initial BI solution
[Diagram labels: Events, JSON Data, Hourly Push, S3FS, Session Wise Aggregations]

Initial BI solution
● The phone SDK pushes events such as transaction initiation and payment completion to logging servers
● Logging servers roll the log every hour and push it to S3
● A single-node Spark machine aggregates data by session and pushes it to MySQL
● Google BigQuery is used for ad hoc querying
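
The session-wise aggregation step can be pictured with a minimal plain-Python sketch. It is not the team's Spark code; the field names `session_id`, `type` and `ts` and the event type strings are assumptions made for illustration.

```python
import json
from collections import defaultdict


def aggregate_sessions(json_lines):
    """Group JSON event lines by session and summarize each session."""
    sessions = defaultdict(list)
    for line in json_lines:
        event = json.loads(line)
        # "session_id" is a hypothetical field name
        sessions[event["session_id"]].append(event)

    summaries = {}
    for session_id, events in sessions.items():
        events.sort(key=lambda e: e["ts"])  # "ts" is a hypothetical epoch-seconds field
        summaries[session_id] = {
            "events": len(events),
            "duration_s": events[-1]["ts"] - events[0]["ts"],
            "completed": any(e["type"] == "payment_complete" for e in events),
        }
    return summaries


# Three made-up log lines covering two sessions
sample_log = [
    '{"session_id": "s1", "type": "transaction_initiated", "ts": 100}',
    '{"session_id": "s2", "type": "transaction_initiated", "ts": 105}',
    '{"session_id": "s1", "type": "payment_complete", "ts": 160}',
]
session_summaries = aggregate_sessions(sample_log)
```

In a Spark job the same grouping would typically be expressed as a key-by-session step followed by a per-key reduction; the dictionary here only stands in for that.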

Challenges with BI solution
● Batch processing
● Geared more towards a report-generation flow
● Very minimal use of Spark APIs, as the team was not well aware of their potential
● No integration with the mobile SDK for a feedback loop

Requirements for consulting
● Bring the same reporting calculations to real time
● Understand user behaviour and track each user's flow over a session
● Close the loop by providing automatic alerts based on the metric calculations
● Some new specific business cases such as loyalty management
● Improve the team's expertise in Spark

Choosing Spark Streaming
● The company was already invested in Spark, so Spark Streaming was a no-brainer
● Porting the Spark batch code to streaming was mostly straightforward, as both share the same API
● The company used Python as its Spark API language, which streaming also supported
● So we did not consider Storm and went ahead with Spark Streaming

First version
[Diagram labels: Events, JSON Data, Five Minute Push, FileStream, Session Wise Aggregations]

First version
● We used the fileStream API of Spark Streaming, which allowed us to poll an S3 bucket every few minutes
● A new rolling appender was added to the log servers to push logs to S3 every 5 minutes
● The exact same batch code was used for the calculations, which made the transition very easy
● All downstream applications remained the same

Second version
[Diagram labels: Events, JSON Data, Hourly Push, Realtime, Session Wise Aggregations]

Amazon Kinesis
● A Kafka-like distributed message queue from Amazon
● Used as a managed, Kafka-like source on AWS
● Highly scalable, with low latency
● Persistence with fault tolerance across multiple availability zones
● Great integration with Spark

Second version
● Amazon Kinesis was added as the real-time stream source
● Logging servers push logs to Kinesis as they arrive
● The streaming application pulls data from Kinesis every few minutes
● Support for multiple partitions was added for parallel streams
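
To see how records spread across partitions (shards, in Kinesis terms), recall that Kinesis hashes each record's partition key with MD5 into a 128-bit hash key, and each shard owns a range of that space. The sketch below assumes the hash space is split evenly across shards and that the session id is used as the partition key; both are illustrative assumptions, not details from the project.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 yields a 128-bit hash key


def shard_for(partition_key, num_shards):
    """Index of the shard that would own this key under an even split (assumed)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).hexdigest()
    hash_key = int(digest, 16)
    return hash_key * num_shards // HASH_SPACE


# Hypothetical session ids used as partition keys
shard_indexes = [shard_for(f"session-{i}", 4) for i in range(100)]
```

Keying by session would keep all events of one session on the same shard, so a single receiver sees the whole session in order.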

Challenges with Python
● The Spark Streaming API for Python was introduced in 1.2, whereas Spark Streaming for Scala/Java has been available since 0.8
● No AWS Kinesis connector for Python was available as of March
● The team had to write its own
● No support for Python in Spark Job Server

Challenges from batch to streaming
● Sessions typically last 1-10 minutes. Batch is easy, since most of the time a session completes within one hour of data, but sessions are challenging for real-time data
● Designing state for sessions
● Designing checkpointing and deciding on the interval
● Weird checkpointing issues with S3 due to eventual consistency
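
One way to think about session state is as a per-session update function with the `(new_values, previous_state)` shape that Spark Streaming's `updateStateByKey` expects, where returning `None` drops the key. The sketch below is plain Python under assumptions: the state fields, the 10-minute timeout and the extra `now` argument (which would have to be bound before handing the function to Spark, for example with `functools.partial`) are illustrative and not the team's design.

```python
SESSION_TIMEOUT_S = 600  # assumption: sessions last at most about 10 minutes


def update_session(new_events, state, now):
    """Fold one batch of events for a single session into its running state.

    Returns the new state, or None once a closed session has been reported.
    """
    if state is not None and state["closed"] and not new_events:
        return None  # closed in the previous batch, nothing new: drop the key

    if state is not None and not state["closed"]:
        current = dict(state)  # copy so the previous state is not mutated
    else:
        current = {"events": 0, "last_ts": None, "completed": False, "closed": False}

    for event in new_events:
        current["events"] += 1
        if current["last_ts"] is None or event["ts"] > current["last_ts"]:
            current["last_ts"] = event["ts"]
        if event["type"] == "payment_complete":  # hypothetical event type
            current["completed"] = True

    idle_s = now - current["last_ts"] if current["last_ts"] is not None else 0
    current["closed"] = current["completed"] or idle_s >= SESSION_TIMEOUT_S
    return current


# A session that starts and then goes silent until it times out
s1 = update_session([{"type": "transaction_initiated", "ts": 100}], None, now=300)
s2 = update_session([], s1, now=600)
s3 = update_session([], s2, now=900)
s4 = update_session([], s3, now=1200)

# A session that completes in its second batch
done = update_session([{"type": "payment_complete", "ts": 320}], s1, now=600)
```

Keeping a closed session for one extra batch gives downstream code a chance to emit its final summary before the key disappears.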

Improvements to batch code
● Most of the code was written in the RDD paradigm, as it was the only one known to the team
● The team was trained on Spark SQL and Spark Streaming
● The majority of the code was ported to a Spark SQL based solution to improve readability and maintainability
● Recently moved to DataFrame-based code

Third version
[Diagram labels: Events, JSON Data, Hourly Push, Realtime, Session Wise Aggregations]

Choosing Mesos
● Mesos is a great cluster manager for Spark-only workloads
● Has a specific coarse-grained mode that suits real-time systems
● Minimal overhead compared to YARN
● Easy to set up on EC2

Fourth version
[Diagram labels: Events, JSON Data, Hourly Push, Realtime, Session Wise Aggregations]

Grafana
● Added Grafana for visualization and dashboards
● Grafana = Graphite + InfluxDB
● Moved away from MySQL to the time series database InfluxDB
● Scales much better than MySQL
● Data scientists or product managers can monitor customers using these dashboards
● Integrates with the mobile SDK
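
For a sense of what a metric point might look like on its way to InfluxDB, the sketch below formats one in InfluxDB's line protocol (`measurement,tags fields timestamp`). It assumes an InfluxDB version that accepts line protocol; the measurement, tag and field names are hypothetical, and the helper handles only boolean, integer and float fields.

```python
def escape(value):
    """Escape commas, equals signs and spaces in tag/field keys and tag values."""
    text = str(value)
    for ch in (",", "=", " "):
        text = text.replace(ch, "\\" + ch)
    return text


def format_field(value):
    """Render a field value: booleans as true/false, ints with an 'i' suffix."""
    if isinstance(value, bool):  # check bool first, since bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"
    return repr(float(value))


def to_line(measurement, tags, fields, ts_ns):
    """Build one line-protocol record; the measurement name is used unescaped."""
    tag_part = "".join(f",{escape(k)}={escape(v)}" for k, v in sorted(tags.items()))
    field_part = ",".join(
        f"{escape(k)}={format_field(v)}" for k, v in sorted(fields.items())
    )
    return f"{measurement}{tag_part} {field_part} {ts_ns}"


# Made-up metric values for illustration
line = to_line(
    "payment_sessions",
    {"operator": "airtel", "bank": "state bank"},
    {"success_rate": 0.55, "sessions": 120},
    1427846400000000000,
)
```

Tags carry the dimensions a dashboard would filter on, while fields carry the measured values.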
