5. Apache Hadoop
Pros:
• Batch operations
• Scalability
• User defined methods
Cons:
• The problem must be resolved in context of a single
job
• Filesystem based
12. HDD vs MEMORY?
• Memory speed is in nanoseconds
• 10GbE Network speed is in microseconds (~50)
• Flash speed is in microseconds (between 20-500+)
• Disk speed is in milliseconds (between 4-7)
14. Apache Spark
Pros:
• In memory operations up to 100x times faster then
Hadoop MapReduce
• On disc operations up to 10x times faster then Hadoop
MapReduce
• In-memory
• Batch operations & near real time
• Interactive
• Not bound to hadoop
• Easy to start for developers