Big Data looks
tiny

from

Stratosphere
Kostas Tzoumas
kostas.tzoumas@tu-berlin.de
Data is an important asset
video & audio streams, sensor data, RFID, GPS, user online
behavior, scientific simulations, web...
Data
Analysis
3
Four “I”s for Big Analysis
text mining, interactive and ad hoc analysis, machine
learning, graph analysis, statistical alg...
Hadoop
Hadoop’s selling point is its
low effective storage cost.
Hadoop clusters are becoming a data vortex, attracting
cr...
Advanced
Analytics
Analytics that model the data to reveal hidden
relationships, not just describe the data.
E.g., machine...
SQL

MapReduce
BigAnalytics

BigSQL
NoMapReduce
scripting
XQuery?
SQL--

column a query
store++
plan

wrong
platform

scalable
parallel sort
8
Data Scientist:

The Sexiest Job of the 21st Century
Meet the people who
can coax treasure out of
messy, unstructured data...
Taken from http://www.oracle.com/technetwork/java/jvmls2013vitek-2013524.pdf
10
Hadoop is...
1. A programming model called MapReduce
2. An implementation of said programming
model, called Hadoop MapRedu...
1. A programming model called
MapReduce
val!input!=!TextFile(textInput)
val!words!=!input.flatMap!{!line!=>!line.split(“!“...
(Romeo, (1,1,1))
(art, (1,1))
(thou, (1,1))

Reduce

(Romeo, 3)
(art, 2)
(thou, 2)

(What, 1)
(art, 1)
(thou, 1)
(hurt, 1)...
Hand-coded join in Hadoop
MapReduce

public!class!ReduceSideBookAndAuthorJoin!extends!HadoopJob!{
!!private!static!final!P...
5. Interfaces to Hadoop MapReduce
(Pig, Hive, Cascading, ...)

Reduce

Reduce
Reduce

Map

Map
Map

Lacking in
declarativi...
6. An ML library called Mahout.
Iterative programs in Hadoop
Client

16

Reduce

Iteration 3
Map

Reduce

Iteration 2
Map
...
Iterations in MapReduce too
slow. Design a new runtime
system and use the Hadoop
Incremental Iterations matter
scheduler t...
Observations
1. MapReduce programming model good for grouping & counting.
2. MapReduce programming model not good for much...
Stratosphere

Big Data

19
Stratosphere: a brief history
2009: DFG-funded research group from
TUB, HUB, HPI starts research on
“Information Managemen...
Stratosphere status
Next stable release (v0.4) coming up
around end of November. Snapshot
available to download; maturity
...
22
Desiderata for next-gen big
data platforms: Usability

10 million
Excel users

3 million
R users

70,000
Hadoop
users
23

...
Desiderata for next-gen big
data platforms: Performance
Stratosphere!

Hadoop!

0!

100!

200!

300!

400!

500!

600!

70...
Data characteristics change

Each color is a differently written
program that produces the same result but has very
differ...
filter

MATC

aggregate

project

Enumeration Algorithm: 4 5 6 7 8r0
0 123
r0 = this Read Set

UDF Code Analysis

0 123456...
one pass
dataflow

many pass
dataflow

MapReduce

Impala, ...

Stratosphere

Text

✔

✔

✔

Aggregation

✔

✔

✔

ETL

✔

✔
...
Giraph is a Stratosphere
Incremental
program Iterations: Doing Pregel
Working Set has messages sent by the vertices

Wi+1
...
To recap:
Stratosphere is an open-source system that runs
on top of Hadoop Yarn and HDFS, but replaces
Hadoop MapReduce wi...
A next-generation Big Data
platform is being developed
in Berlin.

Help us shape
the future of
Stratosphere!

30

http://w...
Upcoming SlideShare
Loading in …5
×

Dr. Kostas Tzoumas: Big Data Looks Tiny From Stratosphere at Big Data Beers (Nov. 20, 2013)

2,087 views

Published on

Presentation by Dr. Kostas Tzoumas at the Big Data Beers Meetup [1] (Nov. 20, 2013) introducing the Stratosphere Platform for Big Data Analytics.

Check out http://stratosphere.eu for more information.

[1] http://www.meetup.com/Big-Data-Beers/events/147397982/

Published in: Technology, Education
0 Comments
1 Like
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
2,087
On SlideShare
0
From Embeds
0
Number of Embeds
905
Actions
Shares
0
Downloads
29
Comments
0
Likes
1
Embeds 0
No embeds

No notes for slide

Dr. Kostas Tzoumas: Big Data Looks Tiny From Stratosphere at Big Data Beers (Nov. 20, 2013)

  1. 1. Big Data looks tiny from Stratosphere Kostas Tzoumas kostas.tzoumas@tu-berlin.de
  2. 2. Data is an important asset video & audio streams, sensor data, RFID, GPS, user online behavior, scientific simulations, web archives, ... Volume Handle petabytes of data Velocity Handle high data arrival rates Variety Handle many heterogeneous data sources Veracity 2 Handle inherent uncertainty of data
  3. 3. Data Analysis 3
  4. 4. Four “I”s for Big Analysis text mining, interactive and ad hoc analysis, machine learning, graph analysis, statistical algorithms Iterative Model the data, do not just describe it Incremental Maintain the model under high arrival rates Interactive Step-by-step data exploration on very large data Integrative 4 Fluent unified interfaces for different data models
  5. 5. Hadoop Hadoop’s selling point is its low effective storage cost. Hadoop clusters are becoming a data vortex, attracting cross-departmental data and changing the data usage culture in companies. Hadoop MapReduce was the wrong abstraction and implementation to begin with and will be superseded by better systems. 5
  6. 6. Advanced Analytics Analytics that model the data to reveal hidden relationships, not just describe the data. E.g., machine learning, predictive stats, graph analysis Increasingly important from a market perspective. Very different than SQL analytics: different languages and access patterns (iterative vs. one-pass programs). Hadoop toolchain poor; R, Matlab, etc not parallel. 6
  7. 7. SQL MapReduce BigAnalytics BigSQL NoMapReduce
  8. 8. scripting XQuery? SQL-- column a query store++ plan wrong platform scalable parallel sort 8
  9. 9. Data Scientist: The Sexiest Job of the 21st Century Meet the people who can coax treasure out of messy, unstructured data. FROM!(! by Thomas H. Davenport !!FROM!pv_users! and D.J. Patil !!MAP!pv_users.userid,!pv_users.date! !!USING!'map_script'! !!AS!dt,!uid! !!CLUSTER0BY0dt)!map_output! INSERT0OVERWRITE0TABLE0pv_users_reduced! !!REDUCE!map_output.dt,!map_output.uid! !!USING!'reduce_script'! !!AS!date,!count;! ≠ hen Jonathan Goldman arrived for work in June 2006 at LinkedIn, the business networking site, the place still felt like a start-up. The company had just under 8 million accounts, and the number was A"="load"'WordcountInput.txt';" growing quickly as existing memB"="MAPREDUCE"wordcount.jar"store"A"into"'inputDir‘"load" """"'outputDir'"as"(word:chararray,"count:"int)" colbers invited their friends and """"'org.myorg.WordCount"inputDir"outputDir';" weren’t leagues to join. But users C"="sort"B"by"count;" seeking out connections with the people who were already on the site at the rate executives had expected. Something was apparently missing in the social experience. As one LinkedIn manager put it, “It was like arriving at a conference reception and realizing you don’t know anyone. So you just stand in the corner sipping your drink—and you 9
  10. 10. Taken from http://www.oracle.com/technetwork/java/jvmls2013vitek-2013524.pdf 10
  11. 11. Hadoop is... 1. A programming model called MapReduce 2. An implementation of said programming model, called Hadoop MapReduce 3. A file system, called HDFS 4. A resource manager, called Yarn 5. Interfaces to Hadoop MapReduce (Pig, Hive, Cascading, ...) 6. An ML library called Mahout. 7. Recently, a collection of runtime systems (Tez, Impala, Spark, Stratosphere, ...) 11 * Inspired by Jens Dittrich
  12. 12. 1. A programming model called MapReduce val!input!=!TextFile(textInput) val!words!=!input.flatMap!{!line!=>!line.split(“!“)!} val!counts!=!words.groupBy!{!word!=>!word!}.count()! val!output!=!counts.write6(wordsOutput,!CsvOutputFormat()) map reduce ( ( ) [ ) [ “Romeo, Romeo, wherefore art thou Romeo?” = (Romeo,(1,1,1)) (wherefore,1), (art,1) (thou,1) ] (Romeo,1), (Romeo,1) (wherefore,1), (art,1) (thou,1), (Romeo,1) = 12 (Romeo,3) (wherefore,1), (art,1) (thou,1) ]
  13. 13. (Romeo, (1,1,1)) (art, (1,1)) (thou, (1,1)) Reduce (Romeo, 3) (art, 2) (thou, 2) (What, 1) (art, 1) (thou, 1) (hurt, 1) (wherefore, 1) (What, 1) (hurt, 1) Reduce “What, art thou hurt?” Map “Romeo, Romeo, wherefore art thou Romeo?” (Romeo, 1) (Romeo, 1) (wherefore, 1) (art, 1) (thou, 1) (Romeo, 1) Map 2. An implementation of said programming model, called Hadoop MapReduce (wherefore, 1) (What, 1) (hurt, 1) Data written to disk Data shuffled over network 13
  14. 14. Hand-coded join in Hadoop MapReduce public!class!ReduceSideBookAndAuthorJoin!extends!HadoopJob!{ !!private!static!final!Pattern!SEPARATOR!=!Pattern.compile("t"); !!@Override !!public!int!run(String[]!args)!throws!Exception!{ !!!!Map<String,String>!parsedArgs!=!parseArgs(args); !!!!Path!authors!=!new!Path(parsedArgs.get("OOauthors")); !!!!Path!books!=!new!Path(parsedArgs.get("OObooks")); !!!!Path!outputPath!=!new!Path(parsedArgs.get("OOoutput")); !!!!Job!join!=!new!Job(new!Configuration(getConf())); !!!!Configuration!jobConf!=!join.getConfiguration(); !!!!MultipleInputs.addInputPath(join,!authors,!TextInputFormat.class,!ConvertAuthorsMapper.class); !!!!MultipleInputs.addInputPath(join,!books,!TextInputFormat.class,!ConvertBooksMapper.class); !!!!join.setMapOutputKeyClass(SecondarySortedAuthorID.class); !!!!join.setMapOutputValueClass(AuthorOrTitleAndYearOfPublication.class); !!!!jobConf.setBoolean("mapred.compress.map.output",!true); !!!!join.setReducerClass(JoinReducer.class); !!!!join.setOutputKeyClass(Text.class); !!!!join.setOutputValueClass(NullWritable.class); !!!!join.setJarByClass(JoinReducer.class); !!!!join.setJobName("reduceSideBookAuthorJoin"); !!!!join.setOutputFormatClass(TextOutputFormat.class); !!!!jobConf.set("mapred.output.dir",!outputPath.toString()); !!!!join.setGroupingComparatorClass(SecondarySortedAuthorID.GroupingComparator.class); !!!!join.waitForCompletion(true); !!!!return!0; !!} !!static!class!ConvertAuthorsMapper !!!!!!extends!Mapper<Object,Text,SecondarySortedAuthorID,AuthorOrTitleAndYearOfPublication>!{ !!!!@Override !!!!protected!void!map(Object!key,!Text!value,!Context!ctx)!throws!IOException,!InterruptedException! { !!!!!!String!line!=!value.toString(); !!!!!!if!(line.length()!>!0)!{ !!!!!!!!String[]!tokens!=!SEPARATOR.split(line.toString()); !!!!!!!!long!authorID!=!Long.parseLong(tokens[0]); !!!!!!!!String!author!=!tokens[1]; !!!!!!!!ctx.write(new!SecondarySortedAuthorID(authorID,!true),!new! AuthorOrTitleAndYearOfPublication(author)); !!!!!!} !!!!} !!} !!static!class!ConvertBooksMapper !!!!!!extends!Mapper<Object,Text,SecondarySortedAuthorID,AuthorOrTitleAndYearOfPublication>!{ !!!!@Override !!!!protected!void!map(Object!key,!Text!line,!Context!ctx)!throws!IOException,!InterruptedException!{ !!!!!!String[]!tokens!=!SEPARATOR.split(line.toString()); !!!!!!long!authorID!=!Long.parseLong(tokens[0]); !!!!!!short!yearOfPublication!=!Short.parseShort(tokens[1]); !!!!!!String!title!=!tokens[2]; !!!!!!ctx.write(new!SecondarySortedAuthorID(authorID,!false),!new! AuthorOrTitleAndYearOfPublication(title, !!!!!!!!!!yearOfPublication)); !!!!} !!} !!static!class!JoinReducer !!!!!!extends!Reducer<SecondarySortedAuthorID,AuthorOrTitleAndYearOfPublication,Text,NullWritable>!{ !!!!@Override !!!!protected!void!reduce(SecondarySortedAuthorID!key,!Iterable<AuthorOrTitleAndYearOfPublication>! values,!Context!ctx) !!!!!!!!throws!IOException,!InterruptedException!{ !!!!!!String!author!=!null; !!!!!!for!(AuthorOrTitleAndYearOfPublication!value!:!values)!{ !!!!!!!!if!(author!==!null!&&!!value.containsAuthor())!{ !!!!!!!!!!throw!new!IllegalStateException("No!author!found!for!book:!"!+!value.getTitle()); !!!!!!!!}!else!if!(author!==!null!&&!value.containsAuthor())!{ !!!!!!!!!!author!=!value.getAuthor(); !!!!!!!!}!else!{ !!!!!!!!!!ctx.write(new!Text(author!+!'t'!+!value.getTitle()!+!'t'!+!value.getYearOfPublication()), !!!!!!!!!!!!!!NullWritable.get()); !!!!!!!!} !!!!!!} !!!!} !!} !!static!class!SecondarySortedAuthorID!implements!WritableComparable<SecondarySortedAuthorID>!{ !!!!private!boolean!containsAuthor; !!!!private!long!id; !!!!static!{ !!!!!!WritableComparator.define(SecondarySortedAuthorID.class,!new!SecondarySortComparator()); !!!!} !!!!SecondarySortedAuthorID()!{} !!!!SecondarySortedAuthorID(long!id,!boolean!containsAuthor)!{ !!!!!!this.id!=!id; !!!!!!this.containsAuthor!=!containsAuthor; !!!!} !!!!@Override !!!!public!int!compareTo(SecondarySortedAuthorID!other)!{ !!!!!!return!ComparisonChain.start() !!!!!!!!!!.compare(id,!other.id) !!!!!!!!!!.result(); !!!!} !!!!@Override !!!!public!void!write(DataOutput!out)!throws!IOException!{ !!!!!!out.writeBoolean(containsAuthor); !!!!!!out.writeLong(id); !!!!} !!!!@Override !!!!public!void!readFields(DataInput!in)!throws!IOException!{ !!!!!!containsAuthor!=!in.readBoolean(); !!!!!!id!=!in.readLong(); !!!!} !!!!@Override !!!!public!boolean!equals(Object!o)!{ !!!!!!if!(o!instanceof!SecondarySortedAuthorID)!{ !!!!!!!!return!id!==!((SecondarySortedAuthorID)!o).id; !!!!!!} !!!!!!return!false; !!!!} !!!!@Override !!!!public!int!hashCode()!{ !!!!!!return!Longs.hashCode(id); !!!!} !!!!static!class!SecondarySortComparator!extends!WritableComparator!implements!Serializable!{ !!!!!!protected!SecondarySortComparator()!{ !!!!!!!!super(SecondarySortedAuthorID.class,!true); !!!!!!} !!!!!!@Override !!!!!!public!int!compare(WritableComparable!a,!WritableComparable!b)!{ !!!!!!!!SecondarySortedAuthorID!keyA!=!(SecondarySortedAuthorID)!a; !!!!!!!!SecondarySortedAuthorID!keyB!=!(SecondarySortedAuthorID)!b; !!!!!!!!return!ComparisonChain.start() !!!!!!!!!!!!.compare(keyA.id,!keyB.id) !!!!!!!!!!!!.compare(!keyA.containsAuthor,!!keyB.containsAuthor) !!!!!!!!!!!!.result(); !!!!!!} !!!!} !!!!static!class!GroupingComparator!extends!WritableComparator!implements!Serializable!{ !!!!!!protected!GroupingComparator()!{ !!!!!!!!super(SecondarySortedAuthorID.class,!true); !!!!!!} !!!!} !!} 14 !!static!class!AuthorOrTitleAndYearOfPublication!implements! Writable!{ !!!!private!boolean!containsAuthor; !!!!private!String!author; !!!!private!String!title; !!!!private!Short!yearOfPublication; !!!!AuthorOrTitleAndYearOfPublication()!{} !!!!AuthorOrTitleAndYearOfPublication(String!author)!{ !!!!!!this.containsAuthor!=!true; !!!!!!this.author!=!Preconditions.checkNotNull(author); !!!!} !!!!AuthorOrTitleAndYearOfPublication(String!title,!short! yearOfPublication)!{ !!!!!!this.containsAuthor!=!false; !!!!!!this.title!=!Preconditions.checkNotNull(title); !!!!!!this.yearOfPublication!=!yearOfPublication; !!!!} !!!!public!boolean!containsAuthor()!{ !!!!!!return!containsAuthor; !!!!} !!!!public!String!getAuthor()!{ !!!!!!return!author; !!!!} !!!!public!String!getTitle()!{ !!!!!!return!title; !!!!} !!!!public!Short!getYearOfPublication()!{ !!!!!!return!yearOfPublication; !!!!} !!!!@Override !!!!public!void!write(DataOutput!out)!throws!IOException!{ !!!!!!out.writeBoolean(containsAuthor); !!!!!!if!(containsAuthor)!{ !!!!!!!!out.writeUTF(author); !!!!!!}!else!{ !!!!!!!!out.writeUTF(title); !!!!!!!!out.writeShort(yearOfPublication); !!!!!!} !!!!} !!!!@Override !!!!public!void!readFields(DataInput!in)!throws!IOException!{ !!!!!!author!=!null; !!!!!!title!=!null; !!!!!!yearOfPublication!=!null; !!!!!!containsAuthor!=!in.readBoolean(); !!!!!!if!(containsAuthor)!{ !!!!!!!!author!=!in.readUTF(); !!!!!!}!else!{ !!!!!!!!title!=!in.readUTF(); !!!!!!!!yearOfPublication!=!in.readShort(); !!!!!!} !!!!} !!} }
  15. 15. 5. Interfaces to Hadoop MapReduce (Pig, Hive, Cascading, ...) Reduce Reduce Reduce Map Map Map Lacking in declarativity 15 Operators exchange data via HDFS Sort the only grouping operator Need many MapReduce rounds
  16. 16. 6. An ML library called Mahout. Iterative programs in Hadoop Client 16 Reduce Iteration 3 Map Reduce Iteration 2 Map Reduce Map Iteration 1
  17. 17. Iterations in MapReduce too slow. Design a new runtime system and use the Hadoop Incremental Iterations matter scheduler to exploit sparse computational dependencies. ■ Changes to the iteration's result for Connected Components in each superstep # Vertices (thousands) 1400 1200 1000 800 600 400 200 0 0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 Naïve (Bulk) Superstep 17 Incremental
  18. 18. Observations 1. MapReduce programming model good for grouping & counting. 2. MapReduce programming model not good for much else. 3. Hadoop implementation of MapReduce trades performance for fault-tolerance (disk-based data shuffling). 4. MapReduce programming model not suited for SQL. Need to hack around it with multiple MapReduce rounds. 5. Hadoop’s implementation of MapReduce not suited for SQL. 6. MapReduce programming model and its Hadoop implementation not suited for iterations. Need to hack around it with implementing iterations in client or embedding a new runtime in a Map function. 18
  19. 19. Stratosphere Big Data 19
  20. 20. Stratosphere: a brief history 2009: DFG-funded research group from TUB, HUB, HPI starts research on “Information Management in the Cloud.” 2010-2012: Stratosphere released as open source (v0.1, v0.2) and becomes known in academic community. Companies and Universities in Europe become part of Stratosphere. 2013 and beyond: Transition from a research project to a stable and usable open source system, developer community, and real-world use cases. 20
  21. 21. Stratosphere status Next stable release (v0.4) coming up around end of November. Snapshot available to download; maturity equivalent to Apache incubations. 21 Community picking up: external developers from Universities (KTH, SICS, Inria, and others), hackathons in Berlin, Paris, Budapest, companies are starting to use Stratosphere (Deutsche Telekom, Internet Memory, Mediaplus).
  22. 22. 22
  23. 23. Desiderata for next-gen big data platforms: Usability 10 million Excel users 3 million R users 70,000 Hadoop users 23 “the market faces certain challenges such as unavailability of qualified and experienced work professionals, who can effectively handle the Hadoop architecture.”
  24. 24. Desiderata for next-gen big data platforms: Performance Stratosphere! Hadoop! 0! 100! 200! 300! 400! 500! 600! 700! Performance difference from days to minutes enables real time decision making and widespread use of data within the organization. 24
  25. 25. Data characteristics change Each color is a differently written program that produces the same result but has very different performance depending on small changes in the data set and the analysis requirements Query optimizers: the enabling technology for SQL data warehousing and BI Successful industrial application of artificial intelligence Data characteristics change Currently, only Stratosphere can optimize non-relational data analysis programs. (a) Complex Plan Diagram (b) Reduced Plan Diagram Figure 2: Complex Plan and Reduced Plan Diagram (Query 8, OptA) 25
  26. 26. filter MATC aggregate project Enumeration Algorithm: 4 5 6 7 8r0 0 123 r0 = this Read Set UDF Code Analysis 0 123456 78 = this r1 = @parameter0 // Iter ter0 // project Iterator r2 = @parameter1 // Coll ter1 // Collector Use a combination of compiler and • Checks reorder conditions and switches successive operators database technology to lift optimization Supported Transformations: • Filter push-down beyond relational algebra. Derive • Join reordering • Invariant group transformations properties of user-defined functions via • Non-relational operators are integrated code analysis and use these to mimic a relational database optimizer. r0 = this r1 = @parameter0 // Record / Reco Prerequisites: r2 = @parameter1 // Collector / Coll r1 = @parameter0 // Record / Reco • Descents data //filter r2 = @parameter1 ow recursively top-down / Collector Coll $r5 = r1.getField(8) 8) Control-Flow, Def-Use, Use-Def lists $r6 = r0.date_lb $i0 = $r5.compareTo($r6) o($r6) • Fixed API to access records $d0 = r1.getField(6) r0 = this r1 / Reco r3 = r1.next() = @parameter0 // Record r0.extendedprice = $d0 = @parameter0 // Record = r r2 = @parameter1 // Collector / Coll r1 / Reco d0 r3.getField(4) ) $d1 = r1.getField(7) r2 = @parameter1 // Collector / Coll $d0 = r1.getField(6) goto 2 r0.discount = $d1 r0.extendedprice = $d0 $r5 = r1.getField(8) 8) $d1 $r6 = r0.date_lb 1: r3 = r1.next()= r1.getField(7) $r7 = r0.revenue // PactRecord r0.discount = $d1 $i0 = $r5.compareTo($r6) o($r6) $d1 = r3.getField(4) Extracted Information: $d2 = r0.extendedprice $i0 < 0 goto 1 if $r9 = r1.getField(8) 8) d0 = d0 + $d1 $r7 = r0.revenue // PactRecord $d3 = • Field sets= r0.date_ub write accesses on records r0.discount track read and $d2 = r0.extendedprice $r10 $r9 = r1.getField(8) 8) $d4 = 0 - $d3 2: $z0 = r1.hasNext() $d3 = r0.discount • Upper and$r9.compareTo($r10) $i1 = lower output cardinality bounds $r10 = r0.date_ub $d5 = $d2 * $d4 $d4 1 if $z0 != 0 goto = 0 - $d3 if $i1 >= 0 goto 1 $i1 = $r9.compareTo($r10) $d5 = $d2 * $d4 $r7.setValue($d5) if $i1 >= 0 goto 1 r3.setField(4, d0) 4, Safety: $r7.setValue($d5) r1.setNull(6) ) r2.collect(r1) r2.collect(r3) r1.setNull(6) ) • All record access instructions are detected r2.collect(r1) r1.setNull(7) r1.setNull(7) • Supersets of actual Read/Write sets are returned r1.setNull(8) 1: return r1.setNull(8) 1: return if $i0 < 0 goto 1 • Supersets allow fewer but always safe transformations $r8 = r0.revenue nds 0 123456 78 Details Set[HPS+12] Bounds Write in Ou et Out-Card [0,1] 0 123456 78 0 123456 78 [0,1] Data Flow Transformations Reorder Conditions: output [0,1] 0 123456 78 [0,1] 0 123456 78 Physical Optimization output 0 123456 78 0 12345678 0 12345678 0 123456 78 5 0 123456 78 0 0 123456 78 0 123456 78 0 123456 78 5 0 12345678 MATCH 5 0 123456 78 0 123456 78 0 123456 78 0 0 123456 78 aggregate 0 123456 78 0 123456 78 0 12345678 0 123456 7 supplier8 Interesting Properties: 5 0 123456 78 0 0 123456 78 0 123456 78 0 123456 78 0 123456 78 • Checks reorder conditions and switches successive operators REDUCE join 0 12345678 REDUCE aggregate MAP supplier project 5 0 123456 78 0 123456 78 0 123456 78 Details in [HPS+12] 0 123456 78 0 123456 78 0 123456 78 0 123456 78 0 123456 78 0 123456 78 0 123456 78 REDUCE aggregate 0 123456 78 0 123456 78 0 123456 78 MAP lineitem filter Partition project output lineitem Execution Plan Selection: Details in [BEH+10] • Chooses execution strategies for 2nd-order functions Local Forward MATCH Hybrid-Hash • Chooses shipping strategies to distribute data [WK09] Warneke, Kao, • Strategies known from parallel databases Local Forward Parallel Execution REDUCE 0 12345678 0 12345678 0 12345678 0 12345678 0 12345678 0 12345678 Partition supplier MAP filter 0 123456 78 MAP • Massively parallel execution of 26 DAG-structured data ows • Sequential processing tasks • • R 0 123456 78 0 12345678 0 12345678 0 12345678 0 12345678 MAP 0 123456 78 Pipeline Local Forward 0 12345678 0 12345678 • lineitem MAP Pipeline • 0 123456 78 Execution Engine: • 0 123456 78 0 123456 78 lineitem E filter COMBINE 3 4 5 6 7 8 0 12 0 1 Part-Sort2 3 4 5 6 7 8 MAP Local Forward project 0 123456 78 lineitem 0 123456 78 Partition REDUCE 0 123456 78 • Exploits UDF annotations for size estimates • Cost model combines network, disk I/O and CPU costs Parallel Execution lineitem Pa 0 123456 78 aggregate supplier MAP 0 123456 78 0 123456 78 hysical Optimization [0,1] 5 REDUCE 2 3 4 5 6 7 8 0 1 Sort 0 1 2 3 4 5 6 7 8 supplier 0 123456 78 project 0 123456 78 0 123456 78 supplier 0 123456 78 0 123456 78 MAP Cost-based Plan Selection: 0 123456 78 0 123456 78 Local Forward 5 0 123456 78 0 123456 78 filter lineitem Local Forward MATCH join MATCH 5 0 0 Hybrid-Hash 1 2 3 4 5 6 7 8 0 1 2 3 4 5 6 7 8 0 123456 78 0 123456 78 0 123456 78 0 123456 78 0 123456 78 0 123456 78 0 123456 78 0 123456 78 MAP 0 0 123456 78 0 123456 78 • Sorting, Grouping, Partitioning MAP MAP MAP supplier filter filter project • Invariant group project transformations • Property preservation reasoning with write sets • Non-relational operators are integrated filter 0 123456 78 0 12345678 5 0 123456 78 0 123456 78 0 • Filter push-down 1 2 3 4 5 6 7 8 0 123456 78 • Join reordering MAP r3.setField(4, d0) 4, r2.collect(r3) MAP 0 12345678 0 12345678 • Chooses shipping strategies to distribute data join Enumeration Algorithm: • Descents data ow recursively top-down • Strategies known from parallel databases MATCH Supported Transformations: 2: $z0 = r1.hasNext() if $z0 != 0 goto 1 0 123456 78 5 0 123456 78 0 0 123456 78 0 123456 78 0 123456 78 0 123456 78 5 0 123456 78 0 123456 78 0 123456 78 0 12345678 0 123456 78 0 12345678 0 12345678 0 123456 78 0 123456 78 1: r3 = r1.next() $d1 = r3.getField(4) d0 = d0 + $d1 output 0 12345678 0 12345678 0 123456 78 0 12345678 5 0 123456 78 project goto 2 output output 0 12345678 0 12345678 REDUCE MATCH Execution Plan Selection: aggregate join 2. Preservation of groupsREDUCE for grouping operators nd-order functions MATCH • Chooses executionMATCH strategies for 2 • Groups must remain unchanged or be completely removed join aggregate join 0 123456 78 MAP r3 = r1.next() d0 = r r3.getField(4) ) output output 1. No Write-Read / Write-Write con icts on 0 1 2 3 4 5 6 7 8 record elds 0 12345678 0 12345 78 • Similar to con ict detection 6in optimistic 1concurrency control 0 2345678 [0,1] 0 r0 = this r1 = @parameter0 // Iter 0 1 2 3 4 5 6 7 8 ter0 // Iterator r2 = @parameter1 // Coll 0 1 2 3 4 5 6 7 8 ter1 // Collector $r8 = r0.revenue r1.setField(4, $r8) r2.collect(r1) 1) r1.setField(4, $r8) r2.collect(r1) 1) Details in [HPS+12] and [HKT12] 5 0 123456 78 aggregate r0 = this • Static Code Analysis Framework provides join 0 123456 78 Local Forward output output output lineitem MATCH Hybrid-Hash REDUCE Sort MATCH Hybrid-Hash supplier REDUCE Sort MATCH Hybrid-Hash supplier REDUCE Sort D supplier [H
  27. 27. one pass dataflow many pass dataflow MapReduce Impala, ... Stratosphere Text ✔ ✔ ✔ Aggregation ✔ ✔ ✔ ETL ✔ ✔ ✔ SQL Hive is too slow ✔ ✔ Advanced analytics Mahout is slow and low level Madlib is too slow ✔ A fast, massively parallel database-inspired backend. map reduce Truly scales to disk-resident large data sets using database technology (e.g., hybrid hashing and external sort-merge for implementing key matching). Built-in support for iterative programs via “iterate” operator: predictive and advanced analytics (machine learning, graph processing, stats) are all iterative. 27
  28. 28. Giraph is a Stratosphere Incremental program Iterations: Doing Pregel Working Set has messages sent by the vertices Wi+1 Create Messages from new state Graph Topology Delta set has state of changed vertices Di+1 Aggregate messages and derive new state Match . U CoGroup N (left outer) Wi Si Stratosphere – Parallel Analytics Beyond MapReduce 28
  29. 29. To recap: Stratosphere is an open-source system that runs on top of Hadoop Yarn and HDFS, but replaces Hadoop MapReduce with a new runtime engine designed for iterative and DAG-shaped programs, offers a program optimizer that frees programmer from low-level decisions, is scalable to large clusters and disk-resident data sets, and is programmable in Java and Scala (and more to come). 29
  30. 30. A next-generation Big Data platform is being developed in Berlin. Help us shape the future of Stratosphere! 30 http://www.flickr.com/photos/andiearbeit/4354455624/lightbox/

×