SlideShare a Scribd company logo
BIG DATA - FAST DATA
USING MAPREDUCE IN HAZELCAST
Source:http://www.newscientist.com/gallery/dn17805-computer-museums-of-the-world/11
www.hazelcast.com
WHO AM I
Christoph Engelbert(@noctarius2k)
8+ years of JavaWeirdoness
Performance, GC, traffic topics
Apache DirectMemoryPMC
Previous companies incl. Ubisoftand HRS
CastMapRMapReduce for Hazelcast3
www.hazelcast.com
TOPICS
Hazelcast
Distributed Computing
Map &Reduce
Demonstration
Questions
www.hazelcast.com
HAZELCAST
A SHORT SPACE TRIP
www.hazelcast.com
WHAT IS HAZELCAST?
In-MemoryData-Grid
DataPartioning(Sharding)
JavaCollections Implementation
Distributed ComputingPlatform
www.hazelcast.com
WHY HAZELCAST?
www.hazelcast.com
WHY IN-MEMORY
COMPUTING?
www.hazelcast.com
TREND OF PRICES
DataSource:http://www.jcmit.com/memoryprice.htm
www.hazelcast.com
SPEED DIFFERENCE
DataSource:http://i.imgur.com/ykOjTVw.png
www.hazelcast.com
DISTRIBUTED
COMPUTING
OR
MULTICORE CPU ON STEROIDS
www.hazelcast.com
THE IDEA OF DISTRIBUTED COMPUTING
Source:https://www.flickr.com/photos/stefan_ledwina/1853508040
www.hazelcast.com
THE BEGINNING
Source:http://en.wikipedia.org/wiki/File:KL_Advanced_Micro_Devices_AM9080.jpg
www.hazelcast.com
MULTICORE IS NOT NEW
Source:http://en.wikipedia.org/wiki/File:80386with387.JPG
www.hazelcast.com
CLUSTER IT
Source:http://rarecpus.com/images2/cpu_cluster.jpg
www.hazelcast.com
SUPER COMPUTER
Source:http://www.dkrz.de/about/aufgaben/dkrz-geschichte/rechnerhistorie-1
www.hazelcast.com
CLOUD COMPUTING
Source:https://farm6.staticflickr.com/5523/11407118963_e0e0870846_b_d.jpg
www.hazelcast.com
MAP & REDUCE
THE BLACK MAGIC FROM PLANET GOOGLE
www.hazelcast.com
USE CASES
LogAnalysis
DataQuerying
Aggregation and summing
Distributed Sort
ETL (ExtractTransform Load)
and more...
www.hazelcast.com
SIMPLE STEPS
Read
Map /Transform
Reduce
www.hazelcast.com
FULL STEPS
Read
Map /Transform
Combining
Grouping/Shuffling
Reduce
Collating
www.hazelcast.com
MAPREDUCE WORKFLOW
www.hazelcast.com
Dataare mapped /transformed in asetof key-value pairs
SOME PSEUDO CODE (1/3)
MAPPING
map( key:String, document:String ):Void ->
for each w:word in document:
emit( w, 1 )
www.hazelcast.com
Multiple values are combined to an
intermediate resultto preserve traffic
SOME PSEUDO CODE (2/3)
COMBINING
combine( word:String, counts:List[Int] ):Void ->
emit( word, sum( counts ) )
www.hazelcast.com
Values are reduced /aggregated to the requested result
SOME PSEUDO CODE (3/3)
REDUCING
reduce( word:String, counts:List[Int] ):Int ->
return sum( counts )
www.hazelcast.com
FOR MATHEMATICIANS
Process: (K x V)*→ (L x W)* ⇒ [(l1, w1), …, (lm, wm)]
Mapping: (K x V) → (L x W)* ⇒ (k, v) → [(l1, w1), …, (ln, wn)]
Reducing: L x W*→ X* ⇒ (l, [w1, …, wn]) → [x1, …,xn]
www.hazelcast.com
MAPREDUCE PROGRAMS IN
GOOGLE SOURCE TREE
Source:http://research.google.com/archive/mapreduce-osdi04-slides/index-auto-0005.html
www.hazelcast.com
DEMONSTRATION
www.hazelcast.com
@noctarius2k
@hazelcast
http://www.sourceprojects.com
http://github.com/noctarius
THANK YOU!
ANY QUESTIONS?
Images:AllimagesarelicensedunderCreativeCommons
www.hazelcast.com

More Related Content

Viewers also liked

Viewers also liked (12)

JCache - Gimme Caching - JavaLand
JCache - Gimme Caching - JavaLandJCache - Gimme Caching - JavaLand
JCache - Gimme Caching - JavaLand
 
Gimme Caching - The JCache Way
Gimme Caching - The JCache WayGimme Caching - The JCache Way
Gimme Caching - The JCache Way
 
Gimme Caching, the Hazelcast JCache Way
Gimme Caching, the Hazelcast JCache WayGimme Caching, the Hazelcast JCache Way
Gimme Caching, the Hazelcast JCache Way
 
My Old Friend Malloc
My Old Friend MallocMy Old Friend Malloc
My Old Friend Malloc
 
The Delivery Hero - A Simpsons As A Service Storyboard
The Delivery Hero - A Simpsons As A Service StoryboardThe Delivery Hero - A Simpsons As A Service Storyboard
The Delivery Hero - A Simpsons As A Service Storyboard
 
Unsafe Java World - Crossing the Borderline - JokerConf 2014 Saint Petersburg
Unsafe Java World - Crossing the Borderline - JokerConf 2014 Saint PetersburgUnsafe Java World - Crossing the Borderline - JokerConf 2014 Saint Petersburg
Unsafe Java World - Crossing the Borderline - JokerConf 2014 Saint Petersburg
 
Distributed computing with Hazelcast - JavaOne 2014
Distributed computing with Hazelcast - JavaOne 2014Distributed computing with Hazelcast - JavaOne 2014
Distributed computing with Hazelcast - JavaOne 2014
 
JVM Mechanics: When Does the JVM JIT & Deoptimize?
JVM Mechanics: When Does the JVM JIT & Deoptimize?JVM Mechanics: When Does the JVM JIT & Deoptimize?
JVM Mechanics: When Does the JVM JIT & Deoptimize?
 
A Post-Apocalyptic sun.misc.Unsafe World
A Post-Apocalyptic sun.misc.Unsafe WorldA Post-Apocalyptic sun.misc.Unsafe World
A Post-Apocalyptic sun.misc.Unsafe World
 
Understanding Garbage Collection
Understanding Garbage CollectionUnderstanding Garbage Collection
Understanding Garbage Collection
 
In-Memory Computing - Distributed Systems - Devoxx UK 2015
In-Memory Computing - Distributed Systems - Devoxx UK 2015In-Memory Computing - Distributed Systems - Devoxx UK 2015
In-Memory Computing - Distributed Systems - Devoxx UK 2015
 
Hazelcast - In-Memory DataGrid
Hazelcast - In-Memory DataGridHazelcast - In-Memory DataGrid
Hazelcast - In-Memory DataGrid
 

Similar to Big Data, Fast Data - MapReduce in Hazelcast

What does it take to make google work at scale
What does it take to make google work at scale What does it take to make google work at scale
What does it take to make google work at scale
君 廖
 
Emphemeral hadoop clusters in the cloud
Emphemeral hadoop clusters in the cloudEmphemeral hadoop clusters in the cloud
Emphemeral hadoop clusters in the cloud
gfodor
 
Cassandra summit-2013
Cassandra summit-2013Cassandra summit-2013
Cassandra summit-2013
dfilppi
 

Similar to Big Data, Fast Data - MapReduce in Hazelcast (20)

Map Reduce in Hazelcast - Hazelcast User Group London Version
Map Reduce in Hazelcast - Hazelcast User Group London VersionMap Reduce in Hazelcast - Hazelcast User Group London Version
Map Reduce in Hazelcast - Hazelcast User Group London Version
 
Distributed Computing in Hazelcast - Geekout 2014 Edition
Distributed Computing in Hazelcast - Geekout 2014 EditionDistributed Computing in Hazelcast - Geekout 2014 Edition
Distributed Computing in Hazelcast - Geekout 2014 Edition
 
Ultra Fast Deep Learning in Hybrid Cloud using Intel Analytics Zoo & Alluxio
Ultra Fast Deep Learning in Hybrid Cloud using Intel Analytics Zoo & AlluxioUltra Fast Deep Learning in Hybrid Cloud using Intel Analytics Zoo & Alluxio
Ultra Fast Deep Learning in Hybrid Cloud using Intel Analytics Zoo & Alluxio
 
What does it take to make google work at scale
What does it take to make google work at scale What does it take to make google work at scale
What does it take to make google work at scale
 
Big Data on the Cloud
Big Data on the CloudBig Data on the Cloud
Big Data on the Cloud
 
What does it take to make google work at scale
What does it take to make google work at scale What does it take to make google work at scale
What does it take to make google work at scale
 
Stream Processing and Real-Time Data Pipelines
Stream Processing and Real-Time Data PipelinesStream Processing and Real-Time Data Pipelines
Stream Processing and Real-Time Data Pipelines
 
Emphemeral hadoop clusters in the cloud
Emphemeral hadoop clusters in the cloudEmphemeral hadoop clusters in the cloud
Emphemeral hadoop clusters in the cloud
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
 
Think Distributed: The Hazelcast Way
Think Distributed: The Hazelcast WayThink Distributed: The Hazelcast Way
Think Distributed: The Hazelcast Way
 
Distributed caching and computing v3.7
Distributed caching and computing v3.7Distributed caching and computing v3.7
Distributed caching and computing v3.7
 
NYC_2016_slides
NYC_2016_slidesNYC_2016_slides
NYC_2016_slides
 
Top-5-Performance-JaxLondon-2023.pptx
Top-5-Performance-JaxLondon-2023.pptxTop-5-Performance-JaxLondon-2023.pptx
Top-5-Performance-JaxLondon-2023.pptx
 
Cassandra summit-2013
Cassandra summit-2013Cassandra summit-2013
Cassandra summit-2013
 
Big Data, Simple and Fast: Addressing the Shortcomings of Hadoop
Big Data, Simple and Fast: Addressing the Shortcomings of HadoopBig Data, Simple and Fast: Addressing the Shortcomings of Hadoop
Big Data, Simple and Fast: Addressing the Shortcomings of Hadoop
 
Tech
TechTech
Tech
 
JavaFest. Grzegorz Piwowarek. Hazelcast - Hitchhiker’s Guide
JavaFest. Grzegorz Piwowarek. Hazelcast - Hitchhiker’s GuideJavaFest. Grzegorz Piwowarek. Hazelcast - Hitchhiker’s Guide
JavaFest. Grzegorz Piwowarek. Hazelcast - Hitchhiker’s Guide
 
Big Data and OSS at IBM
Big Data and OSS at IBMBig Data and OSS at IBM
Big Data and OSS at IBM
 
Intro to Apache Spark
Intro to Apache SparkIntro to Apache Spark
Intro to Apache Spark
 
Distributed caching-computing v3.8
Distributed caching-computing v3.8Distributed caching-computing v3.8
Distributed caching-computing v3.8
 

More from Christoph Engelbert

More from Christoph Engelbert (16)

Data Pipeline Plumbing
Data Pipeline PlumbingData Pipeline Plumbing
Data Pipeline Plumbing
 
Gute Nachrichten, Schlechte Nachrichten
Gute Nachrichten, Schlechte NachrichtenGute Nachrichten, Schlechte Nachrichten
Gute Nachrichten, Schlechte Nachrichten
 
Of Farm Topologies and Time-Series Data
Of Farm Topologies and Time-Series DataOf Farm Topologies and Time-Series Data
Of Farm Topologies and Time-Series Data
 
What I learned about IoT Security ... and why it's so hard!
What I learned about IoT Security ... and why it's so hard!What I learned about IoT Security ... and why it's so hard!
What I learned about IoT Security ... and why it's so hard!
 
PostgreSQL: The Time-Series Database You (Actually) Want
PostgreSQL: The Time-Series Database You (Actually) WantPostgreSQL: The Time-Series Database You (Actually) Want
PostgreSQL: The Time-Series Database You (Actually) Want
 
Road to (Enterprise) Observability
Road to (Enterprise) ObservabilityRoad to (Enterprise) Observability
Road to (Enterprise) Observability
 
Oops-Less Operation
Oops-Less OperationOops-Less Operation
Oops-Less Operation
 
Instan(t)a-neous Monitoring
Instan(t)a-neous MonitoringInstan(t)a-neous Monitoring
Instan(t)a-neous Monitoring
 
Don't Go, Java!
Don't Go, Java!Don't Go, Java!
Don't Go, Java!
 
TypeScript Go(es) Embedded
TypeScript Go(es) EmbeddedTypeScript Go(es) Embedded
TypeScript Go(es) Embedded
 
Hazelcast Jet - Riding the Jet Streams
Hazelcast Jet - Riding the Jet StreamsHazelcast Jet - Riding the Jet Streams
Hazelcast Jet - Riding the Jet Streams
 
CBOR - The Better JSON
CBOR - The Better JSONCBOR - The Better JSON
CBOR - The Better JSON
 
Project Panama - Beyond the (JVM) Wall
Project Panama - Beyond the (JVM) WallProject Panama - Beyond the (JVM) Wall
Project Panama - Beyond the (JVM) Wall
 
Gimme Caching - The JCache Way
Gimme Caching - The JCache WayGimme Caching - The JCache Way
Gimme Caching - The JCache Way
 
Distributed Computing with Hazelcast - Brazil Tour
Distributed Computing with Hazelcast - Brazil TourDistributed Computing with Hazelcast - Brazil Tour
Distributed Computing with Hazelcast - Brazil Tour
 
Distributed Computing - An Interactive Introduction
Distributed Computing - An Interactive IntroductionDistributed Computing - An Interactive Introduction
Distributed Computing - An Interactive Introduction
 

Recently uploaded

Laundry management system project report.pdf
Laundry management system project report.pdfLaundry management system project report.pdf
Laundry management system project report.pdf
Kamal Acharya
 
Online blood donation management system project.pdf
Online blood donation management system project.pdfOnline blood donation management system project.pdf
Online blood donation management system project.pdf
Kamal Acharya
 
RS Khurmi Machine Design Clutch and Brake Exercise Numerical Solutions
RS Khurmi Machine Design Clutch and Brake Exercise Numerical SolutionsRS Khurmi Machine Design Clutch and Brake Exercise Numerical Solutions
RS Khurmi Machine Design Clutch and Brake Exercise Numerical Solutions
Atif Razi
 
ONLINE VEHICLE RENTAL SYSTEM PROJECT REPORT.pdf
ONLINE VEHICLE RENTAL SYSTEM PROJECT REPORT.pdfONLINE VEHICLE RENTAL SYSTEM PROJECT REPORT.pdf
ONLINE VEHICLE RENTAL SYSTEM PROJECT REPORT.pdf
Kamal Acharya
 

Recently uploaded (20)

KIT-601 Lecture Notes-UNIT-3.pdf Mining Data Stream
KIT-601 Lecture Notes-UNIT-3.pdf Mining Data StreamKIT-601 Lecture Notes-UNIT-3.pdf Mining Data Stream
KIT-601 Lecture Notes-UNIT-3.pdf Mining Data Stream
 
Democratizing Fuzzing at Scale by Abhishek Arya
Democratizing Fuzzing at Scale by Abhishek AryaDemocratizing Fuzzing at Scale by Abhishek Arya
Democratizing Fuzzing at Scale by Abhishek Arya
 
ASME IX(9) 2007 Full Version .pdf
ASME IX(9)  2007 Full Version       .pdfASME IX(9)  2007 Full Version       .pdf
ASME IX(9) 2007 Full Version .pdf
 
A CASE STUDY ON ONLINE TICKET BOOKING SYSTEM PROJECT.pdf
A CASE STUDY ON ONLINE TICKET BOOKING SYSTEM PROJECT.pdfA CASE STUDY ON ONLINE TICKET BOOKING SYSTEM PROJECT.pdf
A CASE STUDY ON ONLINE TICKET BOOKING SYSTEM PROJECT.pdf
 
The Benefits and Techniques of Trenchless Pipe Repair.pdf
The Benefits and Techniques of Trenchless Pipe Repair.pdfThe Benefits and Techniques of Trenchless Pipe Repair.pdf
The Benefits and Techniques of Trenchless Pipe Repair.pdf
 
Construction method of steel structure space frame .pptx
Construction method of steel structure space frame .pptxConstruction method of steel structure space frame .pptx
Construction method of steel structure space frame .pptx
 
2024 DevOps Pro Europe - Growing at the edge
2024 DevOps Pro Europe - Growing at the edge2024 DevOps Pro Europe - Growing at the edge
2024 DevOps Pro Europe - Growing at the edge
 
İTÜ CAD and Reverse Engineering Workshop
İTÜ CAD and Reverse Engineering WorkshopİTÜ CAD and Reverse Engineering Workshop
İTÜ CAD and Reverse Engineering Workshop
 
Event Management System Vb Net Project Report.pdf
Event Management System Vb Net  Project Report.pdfEvent Management System Vb Net  Project Report.pdf
Event Management System Vb Net Project Report.pdf
 
Scaling in conventional MOSFET for constant electric field and constant voltage
Scaling in conventional MOSFET for constant electric field and constant voltageScaling in conventional MOSFET for constant electric field and constant voltage
Scaling in conventional MOSFET for constant electric field and constant voltage
 
Laundry management system project report.pdf
Laundry management system project report.pdfLaundry management system project report.pdf
Laundry management system project report.pdf
 
Explosives Industry manufacturing process.pdf
Explosives Industry manufacturing process.pdfExplosives Industry manufacturing process.pdf
Explosives Industry manufacturing process.pdf
 
NO1 Pandit Amil Baba In Bahawalpur, Sargodha, Sialkot, Sheikhupura, Rahim Yar...
NO1 Pandit Amil Baba In Bahawalpur, Sargodha, Sialkot, Sheikhupura, Rahim Yar...NO1 Pandit Amil Baba In Bahawalpur, Sargodha, Sialkot, Sheikhupura, Rahim Yar...
NO1 Pandit Amil Baba In Bahawalpur, Sargodha, Sialkot, Sheikhupura, Rahim Yar...
 
Peek implant persentation - Copy (1).pdf
Peek implant persentation - Copy (1).pdfPeek implant persentation - Copy (1).pdf
Peek implant persentation - Copy (1).pdf
 
Online blood donation management system project.pdf
Online blood donation management system project.pdfOnline blood donation management system project.pdf
Online blood donation management system project.pdf
 
RS Khurmi Machine Design Clutch and Brake Exercise Numerical Solutions
RS Khurmi Machine Design Clutch and Brake Exercise Numerical SolutionsRS Khurmi Machine Design Clutch and Brake Exercise Numerical Solutions
RS Khurmi Machine Design Clutch and Brake Exercise Numerical Solutions
 
Architectural Portfolio Sean Lockwood
Architectural Portfolio Sean LockwoodArchitectural Portfolio Sean Lockwood
Architectural Portfolio Sean Lockwood
 
Natalia Rutkowska - BIM School Course in Kraków
Natalia Rutkowska - BIM School Course in KrakówNatalia Rutkowska - BIM School Course in Kraków
Natalia Rutkowska - BIM School Course in Kraków
 
Cloud-Computing_CSE311_Computer-Networking CSE GUB BD - Shahidul.pptx
Cloud-Computing_CSE311_Computer-Networking CSE GUB BD - Shahidul.pptxCloud-Computing_CSE311_Computer-Networking CSE GUB BD - Shahidul.pptx
Cloud-Computing_CSE311_Computer-Networking CSE GUB BD - Shahidul.pptx
 
ONLINE VEHICLE RENTAL SYSTEM PROJECT REPORT.pdf
ONLINE VEHICLE RENTAL SYSTEM PROJECT REPORT.pdfONLINE VEHICLE RENTAL SYSTEM PROJECT REPORT.pdf
ONLINE VEHICLE RENTAL SYSTEM PROJECT REPORT.pdf
 

Big Data, Fast Data - MapReduce in Hazelcast