Demystifying Big Data
Brown Bag
Everything starts small
Traditional Approach
Simple Process
Result
What’s next?
The unanswered questions of a lifetime.
Unquenchable thirst for improvement
❏ How to sell more?
❏ How to optimize inventory?
❏ How to engage customers more?
❏ What do my customers like?
❏ How to reduce operating costs?
Torture the data,
and it will confess
to anything
Ronald Coase
How to get Data?
Humans…..
Ever Growing Data
❏ Historical data plays an important role.
❏ Data explodes while processing.
❏ More data beats better algorithms.
So What is Big Data?
Data that tends to grow beyond what a single machine can process.
Getting the Right Tool
Data Parallel Processing
❏ Distribute the data (with replication).
❏ Move computation close to the data.
❏ Process each section of the data separately.
❏ Aggregate the results.
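The steps above can be sketched on a single machine. This is only an illustration, not how Hadoop or Spark implement them: threads stand in for separate machines, replication is left out, and the function names (partition, count_words, aggregate, run) are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(records, num_partitions):
    """Distribute the data: split records into sections, round-robin."""
    return [records[i::num_partitions] for i in range(num_partitions)]


def count_words(section):
    """Process one section on its own, without looking at the others."""
    counts = Counter()
    for line in section:
        counts.update(line.lower().split())
    return counts


def aggregate(partials):
    """Aggregate the results: merge the per-section counts."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def run(records, num_partitions=3):
    sections = partition(records, num_partitions)
    # Threads stand in for worker machines (an assumption of this sketch).
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        partials = list(pool.map(count_words, sections))
    return aggregate(partials)


logs = ["big data", "data beats algorithms", "more data"]
result = run(logs)
```

The answer does not depend on how many sections the data is split into, which is what lets the same job spread over more machines as the data grows.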
Advantages of Data Parallel Model
❏ No single-machine hardware restriction, e.g. memory, CPU.
❏ No scalability issue.
❏ Cost effectiveness.
❏ No single point of failure.
That’s nice, so
problem solved. But
the presentation says
Hadoop, Spark?
Challenges of Data Parallelism
❏ Data partitioning, distribution and accumulation
❏ Fault Tolerance.
❏ Distributed Coordination and management.
❏ Abstracting away the distributed complexity.
Big Data Ecosystem
❏ Distributed Data Storage System:
    ❏ Data distribution.
    ❏ Data replication.
    ❏ High throughput with no single point of failure.
❏ Distributed Data Processing System:
    ❏ Distributing code close to the data.
    ❏ Abstracting distributed complexity from the programmer.
    ❏ Fault tolerance and handling computation failure.
    ❏ Aggregating results.
❏ Distributed Coordination and Resource Management:
    ❏ Resource allocation.
    ❏ Distributed configuration management.
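To make the storage side concrete, here is a minimal sketch of data distribution with replication. The round-robin placement rule, the node names, and the functions place_blocks and readable_blocks are assumptions made for illustration; real storage systems use more elaborate placement policies.

```python
def place_blocks(num_blocks, nodes, replication_factor=2):
    """Assign each block to `replication_factor` distinct nodes, round-robin."""
    if replication_factor > len(nodes):
        raise ValueError("replication_factor cannot exceed the number of nodes")
    placement = {}
    for block in range(num_blocks):
        placement[block] = [
            nodes[(block + r) % len(nodes)] for r in range(replication_factor)
        ]
    return placement


def readable_blocks(placement, failed_nodes):
    """Blocks that still have at least one replica on a healthy node."""
    return {
        block
        for block, replicas in placement.items()
        if any(node not in failed_nodes for node in replicas)
    }


nodes = ["node-a", "node-b", "node-c"]  # hypothetical node names
placement = place_blocks(6, nodes, replication_factor=2)

# With two replicas per block, losing any single node loses no data.
after_one_failure = readable_blocks(placement, {"node-b"})
# Losing two nodes can make some blocks unreadable.
after_two_failures = readable_blocks(placement, {"node-a", "node-b"})
```

With a replication factor of 2, every block survives one node failure, which is the sense in which the storage layer has no single point of failure.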
Distributed Data Storage System
Distributed Data Processing System
Distributed Coordination and Resource management.
Lambda Architecture
How to Sell more?
Recommendation.
Speed Layer
1. Web Log
2. Product Views
3. Similar Product
4. Update user product recommendation
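The four speed-layer steps could look roughly like the sketch below. The slides do not say how similar products are found, so counting co-views (products viewed by the same user) is an assumption here, as are the class name SpeedLayerRecommender and the (user, product) event format.

```python
from collections import Counter, defaultdict


class SpeedLayerRecommender:
    """Updates recommendations incrementally as view events arrive."""

    def __init__(self):
        self.views_by_user = defaultdict(set)   # user -> products viewed
        self.co_views = defaultdict(Counter)    # product -> co-viewed counts

    def on_view(self, user, product):
        """Steps 1-3: take one product view and update similarity counts."""
        seen = self.views_by_user[user]
        if product in seen:
            return  # repeat views are ignored in this sketch
        for other in seen:
            self.co_views[product][other] += 1
            self.co_views[other][product] += 1
        seen.add(product)

    def recommend(self, user, k=3):
        """Step 4: rank unseen products by how often they were co-viewed."""
        seen = self.views_by_user[user]
        scores = Counter()
        for product in seen:
            for other, count in self.co_views[product].items():
                if other not in seen:
                    scores[other] += count
        return [product for product, _ in scores.most_common(k)]


recommender = SpeedLayerRecommender()
# Hypothetical (user, product) pairs as they might be parsed from a web log.
events = [("u1", "A"), ("u1", "B"), ("u2", "A"),
          ("u2", "B"), ("u2", "C"), ("u3", "A")]
for user, product in events:
    recommender.on_view(user, product)

suggestions = recommender.recommend("u3")
```

Each event only touches the counters for one user's products, so the recommendation can be refreshed as soon as a view is logged rather than waiting for a batch run.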
How to optimize inventory?
Prediction
Batch Layer
1. User Data
2. Location Cluster per item
3. Location Cluster per item Data
3. Current Warehouse inventory
4. Inventory transfer.
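A rough sketch of this batch flow follows. It assumes that a "location cluster" can be approximated by grouping orders by a location label, and that a transfer moves surplus stock toward unmet demand. Neither detail is given in the slides, and the function names and sample data are hypothetical.

```python
from collections import Counter


def demand_by_location(orders):
    """Count demand per (item, location) from historical user orders."""
    return Counter(orders)


def plan_transfers(demand, stock):
    """Propose moves from overstocked to understocked locations, per item."""
    transfers = []
    keys = list(stock) + list(demand)
    for item in sorted({it for it, _ in keys}):
        locations = sorted({loc for it, loc in keys if it == item})
        balance = {
            loc: stock.get((item, loc), 0) - demand.get((item, loc), 0)
            for loc in locations
        }
        surplus = [[loc, b] for loc, b in balance.items() if b > 0]
        deficit = [[loc, -b] for loc, b in balance.items() if b < 0]
        for need in deficit:
            for spare in surplus:
                if need[1] == 0:
                    break
                qty = min(spare[1], need[1])
                if qty == 0:
                    continue
                # (item, from_location, to_location, quantity)
                transfers.append((item, spare[0], need[0], qty))
                spare[1] -= qty
                need[1] -= qty
    return transfers


# Hypothetical sample data: (item, location) per order, and current stock.
orders = [("umbrella", "mumbai")] * 3 + [("umbrella", "delhi")]
stock = {("umbrella", "mumbai"): 1, ("umbrella", "delhi"): 4}

transfers = plan_transfers(demand_by_location(orders), stock)
```

Because this runs over the full order history, it fits the batch layer: it is recomputed periodically rather than on every event.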
THANK YOU
Akash Mishra
akashm@thoughtworks.com
