7. Unquenchable thirst for improvement
❏ How to sell more?
❏ How to optimize inventory?
❏ How to engage customers more?
❏ What do my customers like?
❏ How to reduce operational cost?
13. Data Parallel Processing
❏ Distribute the data (with replication)
❏ Move computation close to the data
❏ Process each section of the data separately
❏ Aggregate the results.
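The four steps above can be sketched on a single machine. This is only an illustration under stated assumptions: worker threads stand in for cluster nodes, the word-count task is an example workload, and the names `partition`, `count_words`, and `parallel_word_count` are hypothetical rather than taken from any framework.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(records, num_partitions):
    """Split records into num_partitions contiguous, roughly equal chunks."""
    size, extra = divmod(len(records), num_partitions)
    chunks, start = [], 0
    for i in range(num_partitions):
        end = start + size + (1 if i < extra else 0)
        chunks.append(records[start:end])
        start = end
    return chunks


def count_words(chunk):
    """Process one section of the data independently of the others."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def parallel_word_count(records, num_partitions=3):
    # 1. Distribute the data (replication is omitted in this sketch).
    chunks = partition(records, num_partitions)
    # 2-3. Hand each chunk to its own worker; threads stand in for nodes.
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        partials = list(pool.map(count_words, chunks))
    # 4. Aggregate the partial results into one answer.
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


lines = ["big data is big", "data moves slowly", "move code to data"]
result = parallel_word_count(lines)
```

Each worker sees only its own chunk, so the only shared step is the final merge. That is why the model scales by adding workers.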
14. Advantages of Data Parallel Model
❏ No hardware restrictions (e.g., memory, CPU).
❏ No scalability issues.
❏ Cost-effectiveness.
❏ No single point of failure.
16. Challenges of Data Parallelism
❏ Data partitioning, distribution and accumulation
❏ Fault tolerance.
❏ Distributed coordination and management.
❏ Abstracting away the distributed complexity.
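As one illustration of the fault-tolerance challenge, the sketch below retries a task on the next node that holds a replica of the same partition. Everything in it is an assumption made for the example: the node names, the `DOWN_NODES` set that simulates an outage, and the helper names `run_with_failover` and `sum_partition`.

```python
def run_with_failover(task, partition_id, replicas):
    """Try each node holding a replica of the partition until one succeeds."""
    errors = []
    for node in replicas:
        try:
            return task(node, partition_id)
        except RuntimeError as exc:
            # Record the failure and fall through to the next replica.
            errors.append((node, str(exc)))
    raise RuntimeError(
        f"partition {partition_id} failed on all replicas: {errors}"
    )


# Hypothetical cluster state: node-a is simulated as unreachable.
DOWN_NODES = {"node-a"}
PARTITIONS = {0: [1, 2, 3], 1: [4, 5]}


def sum_partition(node, partition_id):
    """Example task: sum one partition on the given node."""
    if node in DOWN_NODES:
        raise RuntimeError(f"{node} is unreachable")
    return node, sum(PARTITIONS[partition_id])


served_by, total = run_with_failover(sum_partition, 0, ["node-a", "node-b"])
```

The caller never learns that the first node failed. Hiding that detail is the kind of abstraction the last bullet refers to.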
17. Big Data Ecosystem
❏ Distributed Data Storage System:
  ❏ Data distribution.
  ❏ Data replication.
  ❏ High throughput with no single point of failure.
❏ Distributed Data Processing System:
  ❏ Distributing code close to data.
  ❏ Abstracting distributed complexity from the programmer.
  ❏ Fault tolerance and handling computation failure.
  ❏ Aggregating results.
❏ Distributed Coordination and Resource Management:
  ❏ Resource allocation.
  ❏ Distributed configuration management.
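To make the storage-side ideas of distribution and replication concrete, here is a minimal sketch that places each data block on several nodes and then checks which blocks would be lost if some nodes went down. The round-robin policy and the names `place_blocks` and `blocks_lost_if_down` are assumptions chosen for illustration. Real storage systems use more careful placement policies.

```python
def place_blocks(block_ids, nodes, replication_factor=3):
    """Assign each block to replication_factor distinct nodes, round-robin."""
    if replication_factor > len(nodes):
        raise ValueError("replication factor exceeds the number of nodes")
    placement = {}
    for index, block in enumerate(block_ids):
        placement[block] = [
            nodes[(index + r) % len(nodes)] for r in range(replication_factor)
        ]
    return placement


def blocks_lost_if_down(placement, failed_nodes):
    """Return blocks whose every replica sits on a failed node."""
    failed = set(failed_nodes)
    return [b for b, holders in placement.items() if set(holders) <= failed]


nodes = ["node-1", "node-2", "node-3", "node-4"]
placement = place_blocks(
    ["b0", "b1", "b2", "b3", "b4"], nodes, replication_factor=2
)

# With two replicas per block, losing any single node loses no data.
lost_one = blocks_lost_if_down(placement, ["node-2"])
# Losing both holders of a block does lose it.
lost_two = blocks_lost_if_down(placement, ["node-1", "node-2"])
```

With a replication factor of 2, every block survives any single node failure. Data is lost only when all holders of a block fail together.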