Partitioning SKA Dataflows
for Optimal Graph Execution
Data-Intensive Astronomy (DIA), ICRAR
chen.wu@icrar.org
Chen Wu
Andreas Wicenec, Rodrigo Tobar
The Square Kilometre Array
https://www.skatelescope.org
Western Australia
•  Precursor (fully operational): Murchison Widefield Array (MWA)
South Africa
•  Precursor: MeerKAT
SKA Science Data Processor (SDP) high-level dataflow
•  Data ingestion at 0.5 TB/s per site
•  Data management
•  Data processing: 130 PFlops per site
•  Data analysis and visualisation
SKA Data Challenges
•  Multiple concurrent observing projects
•  Data sharing between projects
•  Capital and operational budget limited
•  Power, Cooling
•  Acquisition, maintenance & software development costs
•  Throughput: produce ~0.2-10 Tera Voxels/second
•  Automatic 23/7 type of operation
•  Data parallelism: Millions of related tasks on thousands of nodes
Data deluge
Telescope    Raw Data Rate    Archive Growth
MWA          1.4 TB/hour      5 PB/year
LSST         1.5 TB/hour      6 PB/year
ASKAP        9 TB/hour        5.5 PB/year
SKA1-Low     1,400 TB/hour    150 PB/year
arxiv.org/abs/1702.07617
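To put the table in perspective, the short calculation below converts the raw data rates to TB/s and compares a year of raw data with the yearly archive growth. It is only a back-of-the-envelope sketch: it assumes decimal units (1 PB = 1000 TB) and continuous observing all year, which overstates the raw volume of any real telescope. The function names are illustrative.

```python
# Figures copied from the table above.
RAW_TB_PER_HOUR = {"MWA": 1.4, "LSST": 1.5, "ASKAP": 9, "SKA1-Low": 1400}
ARCHIVE_PB_PER_YEAR = {"MWA": 5, "LSST": 6, "ASKAP": 5.5, "SKA1-Low": 150}

HOURS_PER_YEAR = 24 * 365   # assumption: the telescope observes continuously
TB_PER_PB = 1000            # assumption: decimal units


def raw_tb_per_second(telescope):
    """Raw data rate converted from TB/hour to TB/s."""
    return RAW_TB_PER_HOUR[telescope] / 3600


def reduction_factor(telescope):
    """Ratio of one year of raw data to one year of archive growth."""
    raw_pb_per_year = RAW_TB_PER_HOUR[telescope] * HOURS_PER_YEAR / TB_PER_PB
    return raw_pb_per_year / ARCHIVE_PB_PER_YEAR[telescope]


for name in RAW_TB_PER_HOUR:
    print(f"{name:9s} {raw_tb_per_second(name):8.4f} TB/s  "
          f"raw/archive ~ {reduction_factor(name):6.1f}x")
```

Under these assumptions SKA1-Low produces roughly 0.39 TB/s of raw data, the same order of magnitude as the 0.5 TB/s per-site ingestion figure quoted earlier, and its raw output must shrink by a factor of about 80 before archiving.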
DALiuGE
•  Defined once, executed anywhere (well)
–  Separation
–  Coherence
•  Work with existing software components
•  Extended dataflow model
–  Unlock “hidden” parallelisms
–  Data is given autonomy
•  Decentralised execution via event propagations
•  Built-in Data lifecycle management
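The idea of decentralised execution via event propagation can be illustrated with a toy model. The sketch below is not the DALiuGE API: the `Drop` class and its methods are hypothetical names chosen for illustration. Each node fires its consumers when it completes, and a consumer runs as soon as its last input has arrived, so no central scheduler is involved.

```python
class Drop:
    """Hypothetical stand-in for a dataflow node (not the DALiuGE API)."""

    def __init__(self, name):
        self.name = name
        self.consumers = []
        self.pending_inputs = 0

    def add_consumer(self, other):
        """Declare that `other` depends on this drop."""
        self.consumers.append(other)
        other.pending_inputs += 1

    def complete(self, log):
        """Mark this drop as completed and propagate the event downstream."""
        log.append(self.name)
        for consumer in self.consumers:
            consumer.on_input_completed(log)

    def on_input_completed(self, log):
        """Event handler: run once every input has completed."""
        self.pending_inputs -= 1
        if self.pending_inputs == 0:
            self.complete(log)


def run_diamond():
    """Build a -> {b, c} -> d and trigger execution from the root only."""
    a, b, c, d = (Drop(n) for n in "abcd")
    a.add_consumer(b)
    a.add_consumer(c)
    b.add_consumer(d)
    c.add_consumer(d)
    log = []
    a.complete(log)
    return log


print(run_diamond())
```

Only the root is triggered explicitly; `d` runs after both `b` and `c` have completed because it counts its pending inputs itself.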
Related work
•  Dataflow (DAG) computation model [7]
–  Unlock “hidden” parallelisms
•  DAG mapping (QAP) is a hard problem [5]
•  Exact solutions
–  Assignment graph [2], allocation graph [19] (max flow)
–  O(|V| * P) → works on small graphs on small clusters
•  Heuristics
–  One-phase (HEFT) [18]
•  Direct mapping from Ranked List A to Ranked List B
–  Two-phase [13, 16]:
•  (1) Partitioning (offline)
•  (2) Mapping (online)
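A one-phase list-scheduling heuristic in the spirit of HEFT can be sketched as follows. This is a simplified illustration, not the algorithm of [18]: it assumes identical processors, strictly positive task costs, and no insertion policy. The function and parameter names are hypothetical. Tasks are ranked by upward rank (own cost plus the most expensive path to an exit task), then each task goes to the processor that gives the earliest finish time.

```python
from functools import lru_cache


def heft_schedule(succ, cost, comm, num_procs):
    """succ: task -> list of successors; cost: task -> compute time (> 0);
    comm: (u, v) -> transfer time when u and v run on different processors."""
    pred = {t: [] for t in succ}
    for u, successors in succ.items():
        for v in successors:
            pred[v].append(u)

    @lru_cache(maxsize=None)
    def upward_rank(t):
        return cost[t] + max(
            (comm[(t, s)] + upward_rank(s) for s in succ[t]), default=0)

    # Decreasing upward rank is a valid topological order for positive costs.
    order = sorted(succ, key=upward_rank, reverse=True)
    proc_free = [0] * num_procs
    placement, finish = {}, {}
    for t in order:
        best = None
        for p in range(num_procs):
            # Inputs from the same processor arrive without transfer cost.
            ready = max(
                (finish[u] + (0 if placement[u] == p else comm[(u, t)])
                 for u in pred[t]), default=0)
            end = max(ready, proc_free[p]) + cost[t]
            if best is None or end < best[0]:
                best = (end, p)
        finish[t], placement[t] = best
        proc_free[best[1]] = best[0]
    return placement, finish


# Diamond-shaped example: A -> {B, C} -> D on two processors.
succ = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
cost = {"A": 2, "B": 3, "C": 3, "D": 1}
comm = {("A", "B"): 1, ("A", "C"): 1, ("B", "D"): 1, ("C", "D"): 1}
placement, finish = heft_schedule(succ, cost, comm, num_procs=2)
print(placement, max(finish.values()))
```

In this example B stays with A on processor 0, C moves to processor 1, and D follows C, giving a makespan of 7.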
Related work
•  Resource Demand Abstraction (RDA)
–  Aggregated workload “per partition”
–  Estimates and capacity planning
•  Existing two-phase methods mostly
–  multi-processors on a single node
–  We need multi-level scheduling/mapping
•  Goal ≠ Maximum parallelism
–  Resource footprint vs. execution latency
•  Graph partitioning vs. Dataflow partitioning
–  [1, 5, 20] vs. [16],…
–  dataflows vs. long running MPI processes
[Figure: example graph with nodes A, B, C, D, E, F, G and H]
Graph execution
Partition problem
M(·) is a function that outputs the number M of partitions given a PGT and a solution p.
T(·) is a function that outputs the completion time T given a PGT and a partition solution p.
R_i(t) denotes the aggregated resource demand from all running Drops in partition i at time t.
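The three quantities defined above can be evaluated for a candidate solution p with a few lines of code. The sketch below makes several assumptions that are not in the slides: a Drop starts as soon as all its inputs are available, an edge that crosses partitions adds a fixed data-movement delay, and each Drop has a constant resource demand while it runs. All function and parameter names are hypothetical.

```python
def num_partitions(p):
    """M(p): number of distinct partitions used by solution p (drop -> id)."""
    return len(set(p.values()))


def finish_times(pred, cost, comm, p):
    """Earliest finish time of every drop; cross-partition edges add comm."""
    finish = {}

    def finish_time(t):
        if t not in finish:
            start = max(
                (finish_time(u) + (0 if p[u] == p[t] else comm[(u, t)])
                 for u in pred[t]), default=0)
            finish[t] = start + cost[t]
        return finish[t]

    for t in pred:
        finish_time(t)
    return finish


def completion_time(pred, cost, comm, p):
    """T(p): time at which the last drop finishes."""
    return max(finish_times(pred, cost, comm, p).values())


def resource_demand(pred, cost, comm, demand, p, i, t):
    """R_i(t): total demand of drops in partition i that are running at t."""
    finish = finish_times(pred, cost, comm, p)
    return sum(demand[d] for d in p
               if p[d] == i and finish[d] - cost[d] <= t < finish[d])


# Diamond-shaped example: A -> {B, C} -> D, one core per drop.
pred = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}
cost = {"A": 2, "B": 3, "C": 3, "D": 1}
comm = {("A", "B"): 1, ("A", "C"): 1, ("B", "D"): 1, ("C", "D"): 1}
demand = {d: 1 for d in pred}

split = {"A": 0, "B": 0, "C": 1, "D": 0}   # C runs in its own partition
merged = {d: 0 for d in pred}              # everything in one partition
for p in (split, merged):
    print(num_partitions(p), completion_time(pred, cost, comm, p),
          resource_demand(pred, cost, comm, demand, p, 0, 3))
```

The example shows the trade-off between resource footprint and latency: the split solution never needs more than one core per partition but finishes at time 8, while the merged solution finishes at time 6 and needs two cores at time 3.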
Partition algorithm (somewhat greedy)
Partition algorithm (WIP: less greedy)
•  Stochastic Local Search Heuristics
–  Meta-Heuristics
•  Particle Swarm Optimisation
•  Genetic algorithm
–  Statistical mechanics
•  Simulated annealing (MCMC)
•  Mean field annealing
•  Constraints-based Local Search
•  Reinforcement learning (MDP)
–  Monte Carlo Tree Search (cf. AlphaGo)

Comparison on LOFAR Imaging (No deadline, DoP = 4)
Method                                             Min Cost    # Parts    Run Time
Direct Heuristics (Edge zeroing)                   403         50         3
Particle Swarm Optimisation                        423         57         5
Simulated annealing                                713         73         64
Monte Carlo Tree Search (250 ms "thinking" time)   403         51         57
Monte Carlo Tree Search (150 ms "thinking" time)   408         52         35
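The "edge zeroing" direct heuristic in the comparison can be sketched as a greedy merge. The slides do not give the authors' implementation, so the code below is an illustration under stated assumptions: edges are visited in order of decreasing data volume, and two partitions are merged (the edge is "zeroed") whenever the merged partition stays within `max_size` Drops. The size cap is a deliberately simple stand-in for the DoP constraint, and all names are hypothetical.

```python
def edge_zeroing(nodes, edges, max_size):
    """Greedy partitioning sketch.

    edges: dict (u, v) -> data volume moved along the edge.
    max_size: stand-in constraint (assumption), max drops per partition.
    Returns a dict drop -> partition representative.
    """
    parent = {n: n for n in nodes}
    size = {n: 1 for n in nodes}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    heaviest_first = sorted(edges.items(), key=lambda kv: kv[1], reverse=True)
    for (u, v), _volume in heaviest_first:
        ru, rv = find(u), find(v)
        if ru != rv and size[ru] + size[rv] <= max_size:
            parent[rv] = ru          # zero the edge: both ends share a partition
            size[ru] += size[rv]
    return {n: find(n) for n in nodes}


def cut_cost(edges, part):
    """Total data volume that still crosses partition boundaries."""
    return sum(w for (u, v), w in edges.items() if part[u] != part[v])


nodes = ["A", "B", "C", "D"]
edges = {("A", "B"): 10, ("A", "C"): 1, ("B", "D"): 5, ("C", "D"): 8}
part = edge_zeroing(nodes, edges, max_size=2)
print(part, cut_cost(edges, part))
```

With a cap of two Drops per partition the two heaviest edges are zeroed, leaving partitions {A, B} and {C, D} and a remaining data movement of 6.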
Partitioning constraint (DoP)
•  How to preserve constraints → Graph theory to the rescue!
–  Brute force does not work well due to the huge number of antichains
–  Dilworth's theorem (normal antichain)
•  Let bpg = bipartite_graph(DAG)
•  DoP == Poset Width == len(max_antichain) == len(min_num_chain) == cardinality(dag) - len(max_matching(bpg))
–  Maximum Weighted K-families (weighted antichain)
•  Split graph → Admissible Graph → Residual Graph (using maxflow) → Pi
•  Drops that satisfy a Pi equation are in the maximum weighted antichain
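The unweighted identity above (DoP = poset width = number of nodes minus the size of a maximum bipartite matching) can be checked with a small self-contained sketch. It assumes that the bipartite graph is built from the transitive closure of the DAG, which is what Dilworth's theorem requires, and it uses a plain augmenting-path matching rather than any particular library. The function name `dag_width` is hypothetical.

```python
def dag_width(succ):
    """Size of the largest antichain of a DAG given as node -> successors."""
    nodes = list(succ)

    # Transitive closure: reach[u] is every node reachable from u.
    reach = {}

    def reachable(u):
        if u not in reach:
            seen = set()
            for v in succ[u]:
                seen.add(v)
                seen |= reachable(v)
            reach[u] = seen
        return reach[u]

    for u in nodes:
        reachable(u)

    # Maximum bipartite matching (left copy u, right copy v, edge if u < v).
    match = {}  # right node -> left node currently matched to it

    def try_augment(u, visited):
        for v in reach[u]:
            if v in visited:
                continue
            visited.add(v)
            if v not in match or try_augment(match[v], visited):
                match[v] = u
                return True
        return False

    matching_size = sum(try_augment(u, set()) for u in nodes)
    # Dilworth: width == minimum chain cover == |V| - maximum matching.
    return len(nodes) - matching_size


diamond = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
hourglass = {"A": ["C"], "B": ["C"], "C": ["D", "E"], "D": [], "E": []}
print(dag_width(diamond), dag_width(hourglass))
```

The hourglass graph shows why the closure matters: matching on the direct edges alone would give 5 - 2 = 3, but no three of its nodes are mutually independent, and the closure-based result is 2.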
Case study – CHILES
CHILES Meeting, Perth, Jan 2017
Given: 4 cores per node
Output schedule: 4 nodes needed
exec_time: 32
total_data_movement: 55

Given: 2 cores per node
Output schedule: 6 nodes needed
exec_time: 37
total_data_movement: 65
# of nodes vs. per node DoP
Graph monitoring
Graph execution on Tianhe2
70K Drops running on 500 compute nodes of the Tianhe-2 supercomputer for a simulated LOFAR imaging run
•  Gray – Drops not yet started
•  Yellow – Drops being executed
•  Green – Drops that completed execution
•  Red – Drops that failed
Summary
•  SKA Dataflows
•  Related work
•  Graph execution engine → DALiuGE
•  Partitioning problem
•  Partitioning algorithm (current + WIP)
•  Partitioning constraint → DoP
•  Case study and preliminary results
