Big Data, Fast Data - MapReduce in Hazelcast

BIG DATA - FAST DATA
USING MAPREDUCE IN HAZELCAST
Source: http://www.newscientist.com/gallery/dn17805-computer-museums-of-the-world/11
www.hazelcast.com

WHO AM I
Christoph Engelbert (@noctarius2k)
8+ years of Java Weirdoness
Performance, GC, traffic topics
Apache DirectMemory PMC
Previous companies incl. Ubisoft and HRS
CastMapR - MapReduce for Hazelcast 3

TOPICS
Hazelcast
Distributed Computing
Map & Reduce
Demonstration
Questions

HAZELCAST
A SHORT SPACE TRIP

WHAT IS HAZELCAST?
In-Memory Data Grid
Data Partitioning (Sharding)
Java Collections Implementation
Distributed Computing Platform

WHY HAZELCAST?

WHY IN-MEMORY COMPUTING?

TREND OF PRICES
Data source: http://www.jcmit.com/memoryprice.htm

SPEED DIFFERENCE
Data source: http://i.imgur.com/ykOjTVw.png

DISTRIBUTED COMPUTING
OR MULTICORE CPU ON STEROIDS

THE IDEA OF DISTRIBUTED COMPUTING
Source: https://www.flickr.com/photos/stefan_ledwina/1853508040

THE BEGINNING
Source: http://en.wikipedia.org/wiki/File:KL_Advanced_Micro_Devices_AM9080.jpg

MULTICORE IS NOT NEW
Source: http://en.wikipedia.org/wiki/File:80386with387.JPG

CLUSTER IT
Source: http://rarecpus.com/images2/cpu_cluster.jpg

SUPER COMPUTER
Source: http://www.dkrz.de/about/aufgaben/dkrz-geschichte/rechnerhistorie-1

CLOUD COMPUTING
Source: https://farm6.staticflickr.com/5523/11407118963_e0e0870846_b_d.jpg

MAP & REDUCE
THE BLACK MAGIC FROM PLANET GOOGLE

USE CASES
Log Analysis
Data Querying
Aggregation and summing
Distributed Sort
ETL (Extract, Transform, Load)
and more...

SIMPLE STEPS
Read
Map / Transform
Reduce

FULL STEPS
Read
Map / Transform
Combining
Grouping / Shuffling
Reduce
Collating
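
The six steps above can be sketched end to end as a word count. This is a minimal single-process sketch in plain Python, not Hazelcast API code: the two-element `partitions` list stands in for data owned by two cluster nodes, and all function names are made up for illustration.

```python
from collections import defaultdict

# Hypothetical input: two "partitions", standing in for data held by two nodes.
partitions = [
    ["big data fast data", "fast data"],
    ["big cluster"],
]

def map_phase(document):
    # Map / Transform: emit one (word, 1) pair per word.
    return [(word, 1) for word in document.split()]

def combine_phase(pairs):
    # Combining: pre-aggregate on the node that produced the pairs.
    partial = defaultdict(int)
    for word, count in pairs:
        partial[word] += count
    return list(partial.items())

def shuffle_phase(combined_per_partition):
    # Grouping / Shuffling: collect all partial counts that share a key.
    grouped = defaultdict(list)
    for pairs in combined_per_partition:
        for word, count in pairs:
            grouped[word].append(count)
    return grouped

def reduce_phase(grouped):
    # Reduce: fold each key's partial counts into a single value.
    return {word: sum(counts) for word, counts in grouped.items()}

def collate_phase(reduced):
    # Collating: post-process the reduced result, here by sorting on
    # descending count and then alphabetically.
    return sorted(reduced.items(), key=lambda item: (-item[1], item[0]))

def run(partitions):
    combined = []
    for documents in partitions:  # Read: each node reads only its own data
        pairs = [pair for doc in documents for pair in map_phase(doc)]
        combined.append(combine_phase(pairs))
    return collate_phase(reduce_phase(shuffle_phase(combined)))

result = run(partitions)
print(result)  # [('data', 3), ('big', 2), ('fast', 2), ('cluster', 1)]
```

Only the output of `combine_phase` would have to cross the network in a real cluster, which is why combining sits before shuffling.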

MAPREDUCE WORKFLOW

SOME PSEUDO CODE (1/3)
MAPPING
Data are mapped / transformed into a set of key-value pairs
map( key:String, document:String ):Void ->
    for each w:word in document:
        emit( w, 1 )
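
A runnable take on the mapper pseudo code, as a sketch in plain Python rather than the Hazelcast API. The generator's `yield` plays the role of `emit`; the lowercasing and the regular expression used to split words are assumptions, since the slide does not say how a document is tokenized.

```python
import re

def map_document(key, document):
    """Yield a (word, 1) pair for every word in the document.

    The key (for example a document name) is not needed for word counting.
    """
    # Assumed tokenization: lowercase, then runs of letters, digits, apostrophes.
    for word in re.findall(r"[a-z0-9']+", document.lower()):
        yield (word, 1)

pairs = list(map_document("doc-1", "Fast data, big data!"))
print(pairs)  # [('fast', 1), ('data', 1), ('big', 1), ('data', 1)]
```

Note that the mapper emits duplicates such as `('data', 1)` twice; it does no counting of its own.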

SOME PSEUDO CODE (2/3)
COMBINING
Multiple values are combined into an intermediate result to reduce network traffic
combine( word:String, counts:List[Int] ):Void ->
    emit( word, sum( counts ) )
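
To make the traffic saving concrete, here is a small sketch of a combiner in plain Python (not Hazelcast API code). The `emitted` list is illustrative data for what one node's mapper might produce; comparing list lengths before and after shows how many pairs no longer need to be sent.

```python
from collections import Counter

def combine(pairs):
    """Sum the counts per word locally, before anything is sent to a reducer."""
    partial = Counter()
    for word, count in pairs:
        partial[word] += count
    return sorted(partial.items())

# Pairs as a mapper on one node might emit them (illustrative data).
emitted = [("data", 1), ("fast", 1), ("data", 1), ("data", 1), ("fast", 1)]
combined = combine(emitted)

print(combined)                     # [('data', 3), ('fast', 2)]
print(len(emitted), len(combined))  # 5 2
```

The result is only an intermediate one: other nodes may hold further counts for the same words, so a reduce step is still required.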

SOME PSEUDO CODE (3/3)
REDUCING
Values are reduced / aggregated to the requested result
reduce( word:String, counts:List[Int] ):Int ->
    return sum( counts )
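
The reducer pseudo code translates almost one to one. In this plain-Python sketch (again not Hazelcast API code) the `grouped` dictionary is made-up data representing what arrives after shuffling: one list of partial counts per word, one entry per contributing node.

```python
def reduce_counts(word, counts):
    """Return the total for one word from the partial counts of all nodes."""
    return sum(counts)

# Grouped input as it might arrive after shuffling (illustrative data).
grouped = {"data": [3, 1, 2], "fast": [2], "big": [1, 1]}
totals = {word: reduce_counts(word, counts) for word, counts in grouped.items()}

print(totals)  # {'data': 6, 'fast': 2, 'big': 2}
```

Because addition is associative and commutative, the combiner and the reducer can share the same summing logic, and the order in which partial counts arrive does not change the total.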

FOR MATHEMATICIANS
Process: (K × V)* → (L × W)* ⇒ [(l1, w1), …, (lm, wm)]
Mapping: (K × V) → (L × W)* ⇒ (k, v) → [(l1, w1), …, (ln, wn)]
Reducing: L × W* → X* ⇒ (l, [w1, …, wn]) → [x1, …, xm]
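
For non-mathematicians, the same signatures can be written as type hints. This is a sketch in plain Python under one simplification: `process` returns a dictionary from each intermediate key to its reducer output instead of a flat list of pairs. The names `Mapper`, `Reducer`, and `process` are chosen here for illustration only.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple, TypeVar

K = TypeVar("K")  # input key
V = TypeVar("V")  # input value
L = TypeVar("L")  # intermediate key
W = TypeVar("W")  # intermediate value
X = TypeVar("X")  # element of the reducer output

Mapper = Callable[[K, V], Iterable[Tuple[L, W]]]  # (K x V) -> (L x W)*
Reducer = Callable[[L, List[W]], List[X]]         # L x W* -> X*

def process(records: Iterable[Tuple[K, V]],
            mapper: Mapper, reducer: Reducer) -> Dict[L, List[X]]:
    """Apply the mapper to every record, group by intermediate key, reduce."""
    grouped: Dict[L, List[W]] = defaultdict(list)
    for k, v in records:
        for key_l, w in mapper(k, v):
            grouped[key_l].append(w)
    return {key_l: reducer(key_l, ws) for key_l, ws in grouped.items()}

# Word count: K = document name, V = text, L = word, W = X = int.
counts = process(
    [("d1", "big data"), ("d2", "fast data")],
    mapper=lambda name, text: ((word, 1) for word in text.split()),
    reducer=lambda word, ones: [sum(ones)],
)
print(counts)  # {'big': [1], 'data': [2], 'fast': [1]}
```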

MAPREDUCE PROGRAMS IN GOOGLE SOURCE TREE
Source: http://research.google.com/archive/mapreduce-osdi04-slides/index-auto-0005.html

DEMONSTRATION

THANK YOU!
ANY QUESTIONS?
@noctarius2k
@hazelcast
http://www.sourceprojects.com
http://github.com/noctarius
Images: All images are licensed under Creative Commons