Big Data @ Bukalapak
Ibrahim Arief – @ibamarief
VP of Engineering – Bukalapak
SARCCOM & BCA Tech Talk – October 2017
Strictly Confidential
Short Intro – Speaker
• VP of Engineering – Bukalapak (ID)
  • 2016 – present
• Engineering Lead – bol.com (NL)
  • 2014 – 2016
• Comp Sci PhD dropout ☺ – NTNU Gjøvik (NO)
  • 2013 – 2015
Short Intro – Bukalapak
• One of the largest e-marketplaces in Southeast Asia
• 15 million users, 1 Trillion IDR/month
• 900+ Total Employees
• 350+ in Product Development Group
  • 120+ in Product (PM, UX, UI, DS, QAT)
  • 200+ in Engineering (FE, BE, QAE, MOB, AI)
  • 30+ in Technology (SRE, SysEng)
• 20+ Product Development Teams
How big is our Big Data?
Billions of data points per day
(censored, sorry ☺)
Since 2014, we…
>1.5 PB of data
Hundreds of millions of images
Hundreds of millions of products
Hundreds of millions of messages
(current size, predicted to triple every year)
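To make "triple every year" concrete, here is a rough projection. The 1.5 PB starting point and the 3x yearly factor are taken from the slides. The function name, the three-year horizon, and the assumption that growth compounds exactly once per year are illustrative only, not a forecast from the talk.

```python
# Hypothetical projection helper. Only the 1.5 PB start and the 3x factor
# come from the slides; the name and the horizon are assumptions.
def project_storage_pb(current_pb: float, yearly_factor: float, years: int) -> list[float]:
    """Return projected sizes in PB for year 0..years, compounding once per year."""
    return [current_pb * yearly_factor ** year for year in range(years + 1)]


projection = project_storage_pb(current_pb=1.5, yearly_factor=3.0, years=3)
for year, size_pb in enumerate(projection):
    print(f"year +{year}: {size_pb:.1f} PB")
```

Under these assumptions the footprint passes 40 PB after three years, which gives a sense of why the storage architecture matters.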
How do we handle all that data?
i.e. our Big Data Architecture
Old (2016): fragmented data, hard to crunch data for AI
New (2017): data lake & warehouse, robust pipeline for AI & analytics
1 PB Elastic Cluster
Small 192-core Spark Cluster
What do we use all that data for?
(hint: not just for generating business reports ☺)
Realtime high-level health insight
Fast awareness if releases are unhealthy, enabling rapid reaction & mitigation
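The slides do not show how release health is computed. One simple way to get this kind of fast awareness is to compare the error rate right after a release against a pre-release baseline. The sketch below assumes that approach. The `Window` type, the 2x threshold, and the minimum-traffic guard are all hypothetical choices, not Bukalapak's actual rules.

```python
from dataclasses import dataclass


@dataclass
class Window:
    """Request and error counts over one time window (hypothetical shape)."""
    requests: int
    errors: int

    @property
    def error_rate(self) -> float:
        return self.errors / self.requests if self.requests else 0.0


def release_is_unhealthy(baseline: Window, after: Window,
                         max_ratio: float = 2.0, min_requests: int = 100) -> bool:
    """Flag a release whose error rate exceeds max_ratio times the baseline rate."""
    if after.requests < min_requests:
        return False  # too little post-release traffic to judge
    # Floor the baseline at one error per baseline window so that a
    # zero-error baseline does not turn a single error into an alarm.
    floor = 1.0 / baseline.requests if baseline.requests else 0.0
    return after.error_rate > max_ratio * max(baseline.error_rate, floor)


before_release = Window(requests=10_000, errors=50)   # 0.5% errors
bad_release = Window(requests=2_000, errors=40)       # 2.0% errors
good_release = Window(requests=2_000, errors=12)      # 0.6% errors
print(release_is_unhealthy(before_release, bad_release))
print(release_is_unhealthy(before_release, good_release))
```

A production check would likely watch several signals (latency, order volume, and so on). A ratio on one metric is only the smallest version of the idea.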
Old Recommender AI – Similarity-Based Search
Showing boring similar products ☹
New Recommender AI – Crunching 1.2B Monthly Views
Showing inspirational alternatives ☺
Data-Driven → A/B Tested
Amazing incremental growth ☺☺☺
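The talk does not describe the new recommender's algorithm. One common way to turn view logs into recommendations is co-view counting: products often viewed in the same session are suggested together. The sketch below assumes that technique purely for illustration. The session format, the function names, and the toy data are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_co_views(sessions):
    """Count, for each product, how often every other product shares a session with it."""
    co_views = defaultdict(Counter)
    for viewed in sessions:
        # Use a set so repeated views within one session count once.
        for a, b in permutations(set(viewed), 2):
            co_views[a][b] += 1
    return co_views


def recommend(co_views, product_id, k=3):
    """Return up to k products most often co-viewed with product_id."""
    return [pid for pid, _ in co_views.get(product_id, Counter()).most_common(k)]


# Toy view log: each inner list is one user's session (made-up data).
sessions = [
    ["phone", "case", "charger"],
    ["phone", "case"],
    ["phone", "charger", "case", "powerbank"],
]
co_views = build_co_views(sessions)
print(recommend(co_views, "phone", k=2))
```

Unlike similarity-based search, this can surface complementary products (a case for a phone) instead of near-duplicates. Whether it beats the old approach is the kind of question an A/B test answers.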
Wrap Up
• Big Data @ Bukalapak → >1.5 PB
• Big Data → not just for business reports
• Big Data for realtime health insight can save $$$
• Big Data for AI can and does generate $$$
• We’re hiring! ☺ Check out careers.bukalapak.com
