Flipkart's Hybrid Cloud Infrastructure Strategy for Optimal Cost Efficiency and Business Agility
1. Flipkart's Hybrid Cloud Infrastructure Strategy for Optimal Cost Efficiency and Business Agility
Sudhir Reddy
Senior Principal Architect, Flipkart
2. Agenda
● Flipkart’s Infrastructure Requirements
● Private, Public, or Hybrid Cloud
● Flipkart’s Hybrid Cloud Strategy
● Building Blocks for Hybrid Cloud Applications
● Cloud Optimizations
3. Flipkart
Flipkart Business
Peak User Traffic
New Customer Acquisition
BAU Growth
New Features
Short Duration
Brand Visibility
>450M Users
>15M Products
Millions of Daily Shipments
Market Growth
4. Flipkart Infrastructure Requirements
● Flexibility in provisioning for dynamic BAU growth
● Peak compute varies anywhere from 2x to 10x
● Business Continuity - Highly reliable and available
● Cost Effective - Higher ROI
Millions of Cores
100s of GPUs
100s of Gbps of Internet Connectivity
Exabyte Storage
Million RPS
5. Flipkart Data Center Evolution
2014
● Co-located DC in Mumbai
● 1st ever Big Billion Day
● Evaluated public cloud - Scale & TCO unviable
2015
● Dedicated DC in Chennai
● Very Large Expansion
● Built In-house IaaS & PaaS stack
2018-19
● Business Continuity - Second DC in Hyderabad
● Lower CAPEX - Sourced from Manufacturers
● Lower OPEX - Renewable Energy
2022-23
● New Chennai DC
● Existing CH DC EOL Plan
● Ground-up design in collaboration with the DC facility provider
● Better Power Usage Effectiveness (PUE)
2023-24
● Hybrid DC
● Strategic partnership with GCP
● Two Private DC + Two Public DC
● Leverage the best of both worlds
6. Why Hybrid Cloud?
Public Cloud Private Cloud Hybrid Cloud
Capacity Flexibility Better Best
Business Agility Best Best
Developer Productivity Best Better
Cost Better Best
Lock-In Protection Best Better
Security and Compliance Better Better Best
Support Better Best Better
7. Flipkart’s Strategy - Hybrid Cloud
Business Agility
Faster feature development through the use of new, advanced technologies
Cost Structure
Favorable TCO and terms make infra a competitive differentiator
Variable Scale 5-10x BAU
8. Flipkart Hybrid Cloud View
2 Private Data Centers
2 Public Data Centers
Data Platform on Public Cloud
Bursting for Sale Peaks on Public Cloud
Applications BCP on Public Cloud
Note: We are still working towards this end state.
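The "bursting for sale peaks" idea can be illustrated with a minimal sketch. It assumes a simple policy that the slides do not spell out: fill private capacity first and send only the overflow to the public cloud. The names `Placement` and `place_load` and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    private_units: float  # load served from the private data centers
    public_units: float   # load burst to the public cloud


def place_load(demand: float, private_capacity: float) -> Placement:
    """Fill private capacity first; send only the overflow to the public cloud."""
    if demand < 0 or private_capacity < 0:
        raise ValueError("demand and private_capacity must be non-negative")
    private = min(demand, private_capacity)
    return Placement(private_units=private, public_units=demand - private)


# Hypothetical numbers: BAU load is 1.0 and private capacity is sized at 2x BAU.
bau = place_load(demand=1.0, private_capacity=2.0)
sale_peak = place_load(demand=8.0, private_capacity=2.0)
print(bau)        # nothing bursts at BAU
print(sale_peak)  # only the overflow above private capacity bursts
```

Under this assumed policy, public capacity is paid for only during the short sale window instead of being provisioned year-round for the peak.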
17. Custom Pubsub Client
● 40x faster, achieved by stripping out Pub/Sub client features that were not needed for a receiver-like pattern
● Avro serialization and a custom compression/decompression library for Pub/Sub, with event batching
GCP Pub/Sub
~70% Cost Reduction
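A rough sketch of why batching plus compression shrinks the bytes sent through a messaging service. This is not Flipkart's client: JSON and `zlib` from the Python standard library stand in for Avro and the custom compression library, and `pack_batch` / `unpack_batch` are hypothetical names.

```python
import json
import zlib
from typing import Any, Dict, List


def pack_batch(events: List[Dict[str, Any]]) -> bytes:
    """Serialize a list of events into a single compressed payload."""
    raw = json.dumps(events, separators=(",", ":")).encode("utf-8")
    return zlib.compress(raw, 6)


def unpack_batch(payload: bytes) -> List[Dict[str, Any]]:
    """Inverse of pack_batch: decompress, then deserialize."""
    return json.loads(zlib.decompress(payload).decode("utf-8"))


# Hypothetical events that share field names and values, as clickstream data often does.
events = [{"event_id": i, "type": "page_view", "user": f"u{i}"} for i in range(100)]

batched = pack_batch(events)                            # one message for 100 events
one_by_one = sum(len(pack_batch([e])) for e in events)  # 100 separate messages

print(len(batched), one_by_one)
```

Compressing a whole batch lets the compressor exploit repetition across events, which single-event messages cannot do. The real saving depends on the payload shape and on how the service bills, neither of which the slide specifies.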
18. GCP - Dataflow
Moved to an in-house framework built on top of Spark Streaming
● Reads data from Pub/Sub and writes it to GCS with minor transformations
● Increased cluster CPU utilization from 20% to 90%
~75% Cost Reduction
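The utilization and cost figures can be related with a short back-of-the-envelope calculation. It assumes cost scales linearly with provisioned cores, which the slide does not state; `cores_required` and the workload size are hypothetical.

```python
def cores_required(busy_cores: float, utilization: float) -> float:
    """Cores to provision so that `busy_cores` of real work runs at `utilization`."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return busy_cores / utilization


busy = 1_000.0  # hypothetical steady workload, measured in fully busy cores
before = cores_required(busy, 0.20)
after = cores_required(busy, 0.90)
reduction = 1 - after / before

print(f"{before:.0f} -> {after:.0f} cores, {reduction:.0%} fewer")
```

Under that assumption the same work needs 20/90 of the original cores, a reduction of roughly 78%, which is in the same range as the ~75% reported.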
19. HBase to GCP BigTable
~50% Cost Reduction
● HBase Client compatible API
● Autoscaling BigTable for BAU and Large Scale spikes
20. GCP Dataproc to YARN Autoscaler
Custom YARN autoscaler service
○ Faster capacity ramp-up and aggressive ramp-down
○ Deploys an application in a single zone based on its node label
○ Scaling at the node-label level
○ No Dataproc licensing cost
○ Mixes spot VMs and non-preemptible VMs
~20% Cost Reduction
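The autoscaler behaviour listed above could be sketched as a per-node-label decision function. This is an illustrative policy, not the actual service: `LabelState`, `target_nodes`, `split_by_vm_type`, the thresholds and the 50% spot share are all assumptions.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class LabelState:
    label: str               # YARN node label, e.g. one label per zone
    current_nodes: int
    pending_containers: int  # containers waiting for capacity
    idle_nodes: int          # nodes with no running containers


def target_nodes(state: LabelState, containers_per_node: int,
                 min_nodes: int, max_nodes: int) -> int:
    """Ramp up in one step to cover all pending work; otherwise drop every idle node."""
    if state.pending_containers > 0:
        extra = math.ceil(state.pending_containers / containers_per_node)
        wanted = state.current_nodes + extra
    else:
        wanted = state.current_nodes - state.idle_nodes
    return max(min_nodes, min(max_nodes, wanted))


def split_by_vm_type(nodes: int, spot_fraction: float) -> Tuple[int, int]:
    """Return (spot, non_preemptible) counts; rounding favors non-preemptible VMs."""
    spot = math.floor(nodes * spot_fraction)
    return spot, nodes - spot


busy = LabelState("zone-a", current_nodes=10, pending_containers=45, idle_nodes=0)
quiet = LabelState("zone-a", current_nodes=19, pending_containers=0, idle_nodes=12)

scaled_up = target_nodes(busy, containers_per_node=5, min_nodes=2, max_nodes=100)
scaled_down = target_nodes(quiet, containers_per_node=5, min_nodes=2, max_nodes=100)
print(scaled_up, scaled_down, split_by_vm_type(scaled_up, spot_fraction=0.5))
```

Deciding per node label keeps each scaling decision inside one zone, and releasing all idle nodes at once is one way to read "aggressive ramp-down".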
21. GCP - BigQuery
● Trade off performance vs. cost: faster is not always better
● Tooling to truncate older mutations for streaming ingestion
~85% Cost Reduction
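The slide does not describe how the truncation tooling works. One plausible reading is that streaming ingestion appends every change as a new row, and the tooling keeps only the newest version per key. A minimal in-memory sketch of that idea follows; `Mutation` and `latest_only` are hypothetical names, and a real implementation would run inside the warehouse rather than in Python.

```python
from typing import Dict, Iterable, List, NamedTuple


class Mutation(NamedTuple):
    key: str      # primary key of the logical row
    version: int  # increases per key, e.g. an event timestamp
    payload: str


def latest_only(mutations: Iterable[Mutation]) -> List[Mutation]:
    """Keep the highest-version mutation per key and drop the older ones."""
    newest: Dict[str, Mutation] = {}
    for m in mutations:
        kept = newest.get(m.key)
        if kept is None or m.version > kept.version:
            newest[m.key] = m
    return sorted(newest.values(), key=lambda m: m.key)


# Hypothetical append-only log: order-1 was updated twice, order-2 once.
log = [
    Mutation("order-1", 1, "created"),
    Mutation("order-2", 1, "created"),
    Mutation("order-1", 2, "packed"),
    Mutation("order-1", 3, "shipped"),
    Mutation("order-2", 2, "cancelled"),
]
compacted = latest_only(log)
print(len(log), "->", len(compacted), "rows")
```

Fewer stored and scanned rows lowers both storage and query cost, at the price of losing mutation history for the truncated range.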