SlideShare a Scribd company logo
Data-Intensive Computing for
Competent Genetic Algorithms:
A Pilot Study using Meandre


Xavier Llorà

National Center for Supercomputing Applications
University of Illinois at Urbana-Champaign
Urbana, Illinois, 61801

xllora@ncsa.illinois.edu
http://www.ncsa.illinois.edu/~xllora
Outline

•   Data-intensive computing and HPC?
•   Is this related at all to evolutionary computation?
•   Data-intensive computing with Meandre
•   GAs and competent GAs
•   Data-intensive computing for GAs
2 Minute HPC History

• The eighties and early nineties picture
   •   Commodity hardware rare, slow, and costly
   •   Supercomputers were extremely expensive
   •   Most of them hand crafted and only few units
   •   Two competing families
        • CISC (e.g. Cray C90 with up to 16 processors)
        • RISC (e.g. Connection Machine CM-5 with up 4,096 processors)
• Late nineties commodity hardware hit main stream
   • Start becoming popular, cheaper, and faster
   • Economy of scale
   • Massive parallel computers build from commodity components become a
     viable option
Two Visions

• C90 like supercomputers were like a comfy pair of trainers
   •   Oriented to scientific computing
   •   Complex vector oriented supercomputers
   •   Shared memory (lots of them)
   •   Multiprocessor enabled via some intercommunication networks
   •   Single system image
• CM5 like computers did not get massive traction, but a bit
   •   General purpose (as long as you can chop the work in simple units)
   •   Lots of simple processors available
   •   Distributed memory pushed new programming models (message passing)
   •   Complex interconnection networks
• NCSA have shared memory, distributed memory, and gpgpu based
Miniaturization Building Bridges

• Multicores and gpgpus are reviving the C90 flavor
• The CM-5 flavor now survives as distributed clusters of not so
  simple units
Control Models of Parallelization in EC

Run 1   Run 5        Run 9                 Master

                              Individual            Evaluation
Run 2   Run 6        Run 10

Run 3   Run 7        Run 11
                              Slave         Slave         Slave




         Migration
But Data is also Part of the Equation

• Google and Yahoo! revived an old route
• Usually refers to:
   • Infrastructure
   • Programming techniques/paradigms
• Google made it main stream after their MapReduce model
• Yahoo! provides and open source implementation
   • Hadoop (MapReduce)
   • HDFS (Hadoop distributed filesystem)
• Store petabytes reliably on commodity hardware (fault tolerant)
• Programming model
   • Map: Equivalent to the map operation on functional programming
   • Reduce: The reduction phase after maps are computed
A Simple Example

     n
               2
           x → reduce(map(x, sqr), sum)
   i=0

    x              x               x                     x

         map           map             map                   map

    x2             x2             x2                     x2

          reduce         reduce        reduce   reduce

                                  sum
Is This Related to EC?

• How can we easily benefit of the current core race painlessly?
• NCSA’s Blue Waters estimated may top on 100K
• Yes on several facets
   • Large optimization problems need to deal with large population sizes
     (Sastry, Goldberg & Llorà, 2007)
   • Large-scale data mining using genetic-based machine learning (Llorà et
     al. 2007)
   • Competent GAs model building extremely costly and data rich (Pelikan
     et al. 2001)
• The goal?
   • Rethink parallelization as data flow processes
   • Show that traditional models can be map to data-intensive computing
     models
   • Foster you curiosity
Data-Intensive Computing with Meandre
The Meandre Infrastructure Challenges

• NCSA infrastructure effort on data-intensive computing
• Transparency
   • From a single laptop to a HPC cluster
   • Not bound to a particular computation fabric
   • Allow heterogeneous development
• Intuitive programming paradigm
   • Modular Components assembled into Flows
   • Foster Collaboration and Sharing
• Open Source
• Service Orientated Architecture (SOA)
Basic Infrastructure Philosophy

•   Dataflow execution paradigm
•   Semantic-web driven
•   Web oriented
•   Facilitate distributed computing
•   Support publishing services
•   Promote reuse, sharing, and collaboration
•   More information at http://seasr.org/meandre
Data Flow Execution in Meandre

• A simple example c ← a+b
• A traditional control-driven language

        a = 1
        b = 2
        c = a+b


• Execution following the sequence of instructions
• One step at a time
      • a+b+c+d requires 3 steps
      • Could be easily parallelized
Data Flow Execution in Meandre

• Data flow execution is driven by data
• The previous example may have 2 possible data flow versions

  Stateless data flow

   value(a)
                               +                     value(c)
   value(b)


  State-based data flow


   value(b)    value(a)        +                     value(c)
                                   ?
The Basic Building Blocks: Components




                        Component




      RDF descriptor of the         The component
      components behavior           implementation
Go with the Flow: Creating Complex Tasks

• Directed multigraph of components creates a flow



  Push
  Text
                 Concatenate          To Upper      Print
                    Text              Case Text     Text
  Push
  Text
Automatic Parallelization:
Speed and Robustness
• Meandre ZigZag language allow automatic parallelization

                                      To Upper
                                      Case Text
 Push
 Text                                 To Upper
               Concatenate            Case Text             Print
                  Text                                      Text
 Push
 Text

                                       To Upper
                                       Case Text
GAs and Competent GAs
Selectorecombinative GAs

1. Initialize the population with random individuals
2. Evaluate the fitness value of the individuals
3. Select good solutions by using s-wise tournament selection
   without replacement (Goldberg, Korb & Deb, 1989)
4. Create new individuals by recombining the selected population
   using uniform crossover (Sywerda, 1989)
5. Evaluate the fitness valueof all offspring
6. Repeat steps 3-5 until convergence criteria are met
Extended Compact Genetic Algorithm

• Harik et al. 2006
• Initialize the population (usually random initialization)
• Evaluate the fitness of individuals
• Select promising solutions (e.g., tournament selection)
• Build the probabilistic model
   • Optimize structure & parameters to best fit selected individuals
   • Automatic identification of sub-structures
• Sample the model to create new candidate solutions
   • Effective exchange of building blocks
• Repeat steps 2–7 till some convergence criteria are met
eCGA Model Building Process

• Use model-building procedure of extended compact GA
   • Partition genes into (mutually) independent groups
   • Start with the lowest complexity model
   • Search for a least-complex, most-accurate model

  
               Model Structure
           
        
            
   Metric
  [X0] [X1] [X2] [X3] [X4] [X5] [X6] [X7] [X8] [X9] [X10] [X11]
       1.0000
  [X0] [X1] [X2] [X3] [X4X5] [X6] [X7] [X8] [X9] [X10] [X11]
          0.9933
  [X0] [X1] [X2] [X3] [X4X5X7] [X6] [X8] [X9] [X10] [X11]
         
   0.9819
  [X0] [X1] [X2] [X3] [X4X5X6X7] [X8] [X9] [X10] [X11]
            
   0.9644
  
         
             
     …
       
            
            
  [X0] [X1] [X2] [X3] [X4X5X6X7] [X8X9X10X11]
        
            
   0.9273
  
       
            
     …
             
         
            
  [X0X1X2X3] [X4X5X6X7] [X8X9X10X11]
       
         
                0.8895
Data-Intensive Flows for Competent GAs
Selectorecombinative GA
sGAs Execution Profile and Parallelization

         • Intel 2.8Ghz QuadCore, 4Gb RAM. Average of 20 runs.
                                                        sbp
  sbp                                                  soed
                                                eps/mapper
                                               eps/paralell/0
 soed
                                               eps/paralell/1
                                               eps/paralell/2
  eps                                          eps/paralell/3
                                                eps/reducer
                                                     twrops
twrops                                        ucbps/mapper
                                             ucbps/paralell/0
                                             ucbps/paralell/1
  noit
                                             ucbps/paralell/2
                                             ucbps/paralell/3
                                              ucbps/reducer
ucbps
                                                        noit



                                                                0




                                                                    5000




                                                                           10000




                                                                                   15000




                                                                                           20000
         0




              5000




                     10000




                             15000




                                     20000
eCGA Model Model building
eCGA Execution Profile and Parallelization

          • Intel 2.8Ghz QuadCore, 4Gb RAM. Average of 20 runs.
                                                                                  init_ecga
        init_ecga                                                update_partitions/mapper

                                                                 update_partitions/mapper

                                                                 update_partitions/mapper
update_partitions
                                                                update_partitions/paralell/0

                                                                update_partitions/paralell/1

                                                                update_partitions/paralell/2
greedy_ecga_mb
                                                                update_partitions/paralell/3

                                                                  update_partitions/reduce

                                                                          greedy_ecga_mb
     print_model
                                                                               print_model




                                                                                               0


                                                                                                   10000


                                                                                                           20000


                                                                                                                   30000


                                                                                                                           40000


                                                                                                                                   50000
                    0


                        10000


                                20000


                                        30000


                                                40000


                                                        50000
eCGA Model Building Speedup

• Intel 2.8Ghz QuadCore, 4Gb RAM. Average of 20 runs.
• Speedup against original eCGA model building
                                                       5                                 ●
            Speedup vs. Original eCGA Model Building

                                                       4




                                                                                     ●
                                                       3




                                                               ●
                                                       2




                                                           ●
                                                       1




                                                           1   2                     3   4

                                                                   Number of cores
Scalability on NUMA Systems

•   Run on NCSA’s SGI Altix Cobalt
•   1,120 processors and up to 5 TB of RAM
•   SGI NUMAlink
•   NUMA architecture
•   Test for speedup behavior
•   Average of 20 independent runs
•   Automatic parallelization of the partition evaluation
•   Results still show the linear trend (despite the NUMA)
    • 16 processors, speedup = 14.01
    • 32 processors, speedup = 27.96
Wrapping Up
Summary

• Evolutionary computation is data rich
• Data-intensive computing can provide to EC:
   •   Tap into parallelism quite painless
   •   Provide a simple programming and modeling
   •   Boost reusability
   •   Tackle otherwise intractable problems
• Shown that equivalent data-intensive computing versions of
  traditional algorithms exist
• Linear parallelism can be tap transparently
Data-Intensive Computing for
Competent Genetic Algorithms:
A Pilot Study using Meandre


Xavier Llorà

National Center for Supercomputing Applications
University of Illinois at Urbana-Champaign
Urbana, Illinois, 61801

xllora@ncsa.illinois.edu
http://www.ncsa.illinois.edu/~xllora

More Related Content

What's hot

cosbench-openstack.pdf
cosbench-openstack.pdfcosbench-openstack.pdf
cosbench-openstack.pdf
OpenStack Foundation
 
Hanborq Optimizations on Hadoop MapReduce
Hanborq Optimizations on Hadoop MapReduceHanborq Optimizations on Hadoop MapReduce
Hanborq Optimizations on Hadoop MapReduce
Hanborq Inc.
 
Accelerating Spark MLlib and DataFrame with Vector Processor “SX-Aurora TSUBASA”
Accelerating Spark MLlib and DataFrame with Vector Processor “SX-Aurora TSUBASA”Accelerating Spark MLlib and DataFrame with Vector Processor “SX-Aurora TSUBASA”
Accelerating Spark MLlib and DataFrame with Vector Processor “SX-Aurora TSUBASA”
Databricks
 
Oct 2012 HUG: Hadoop .Next (0.23) - Customer Impact and Deployment
Oct 2012 HUG: Hadoop .Next (0.23) - Customer Impact and DeploymentOct 2012 HUG: Hadoop .Next (0.23) - Customer Impact and Deployment
Oct 2012 HUG: Hadoop .Next (0.23) - Customer Impact and Deployment
Yahoo Developer Network
 
Hadoop World 2011: The Hadoop Stack - Then, Now and in the Future - Eli Colli...
Hadoop World 2011: The Hadoop Stack - Then, Now and in the Future - Eli Colli...Hadoop World 2011: The Hadoop Stack - Then, Now and in the Future - Eli Colli...
Hadoop World 2011: The Hadoop Stack - Then, Now and in the Future - Eli Colli...
Cloudera, Inc.
 
SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...
SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...
SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...
Chester Chen
 
ROMA User-Customizable NoSQL Database in Ruby
ROMA User-Customizable NoSQL Database in RubyROMA User-Customizable NoSQL Database in Ruby
ROMA User-Customizable NoSQL Database in Ruby
Rakuten Group, Inc.
 
Lessons learned scaling big data in cloud
Lessons learned   scaling big data in cloudLessons learned   scaling big data in cloud
Lessons learned scaling big data in cloud
Vijay Rayapati
 
London hug
London hugLondon hug
London hug
Ted Dunning
 
Scaling Big Data Mining Infrastructure Twitter Experience
Scaling Big Data Mining Infrastructure Twitter ExperienceScaling Big Data Mining Infrastructure Twitter Experience
Scaling Big Data Mining Infrastructure Twitter Experience
DataWorks Summit
 
Introduction to Hadoop - ACCU2010
Introduction to Hadoop - ACCU2010Introduction to Hadoop - ACCU2010
Introduction to Hadoop - ACCU2010
Gavin Heavyside
 
Cassandra nyc 2011 ilya maykov - ooyala - scaling video analytics with apac...
Cassandra nyc 2011   ilya maykov - ooyala - scaling video analytics with apac...Cassandra nyc 2011   ilya maykov - ooyala - scaling video analytics with apac...
Cassandra nyc 2011 ilya maykov - ooyala - scaling video analytics with apac...ivmaykov
 
Cloudera Sessions - Clinic 1 - Getting Started With Hadoop
Cloudera Sessions - Clinic 1 - Getting Started With HadoopCloudera Sessions - Clinic 1 - Getting Started With Hadoop
Cloudera Sessions - Clinic 1 - Getting Started With Hadoop
Cloudera, Inc.
 
Hivemall tech talk at Redwood, CA
Hivemall tech talk at Redwood, CAHivemall tech talk at Redwood, CA
Hivemall tech talk at Redwood, CA
Makoto Yui
 
Drill at the Chug 9-19-12
Drill at the Chug 9-19-12Drill at the Chug 9-19-12
Drill at the Chug 9-19-12
Ted Dunning
 
Hadoop on VMware
Hadoop on VMwareHadoop on VMware
Hadoop on VMware
Richard McDougall
 
Hadoop on Virtual Machines
Hadoop on Virtual MachinesHadoop on Virtual Machines
Hadoop on Virtual Machines
Richard McDougall
 

What's hot (20)

cosbench-openstack.pdf
cosbench-openstack.pdfcosbench-openstack.pdf
cosbench-openstack.pdf
 
Cosbench apac
Cosbench apacCosbench apac
Cosbench apac
 
Hanborq Optimizations on Hadoop MapReduce
Hanborq Optimizations on Hadoop MapReduceHanborq Optimizations on Hadoop MapReduce
Hanborq Optimizations on Hadoop MapReduce
 
Accelerating Spark MLlib and DataFrame with Vector Processor “SX-Aurora TSUBASA”
Accelerating Spark MLlib and DataFrame with Vector Processor “SX-Aurora TSUBASA”Accelerating Spark MLlib and DataFrame with Vector Processor “SX-Aurora TSUBASA”
Accelerating Spark MLlib and DataFrame with Vector Processor “SX-Aurora TSUBASA”
 
Oct 2012 HUG: Hadoop .Next (0.23) - Customer Impact and Deployment
Oct 2012 HUG: Hadoop .Next (0.23) - Customer Impact and DeploymentOct 2012 HUG: Hadoop .Next (0.23) - Customer Impact and Deployment
Oct 2012 HUG: Hadoop .Next (0.23) - Customer Impact and Deployment
 
Hadoop World 2011: The Hadoop Stack - Then, Now and in the Future - Eli Colli...
Hadoop World 2011: The Hadoop Stack - Then, Now and in the Future - Eli Colli...Hadoop World 2011: The Hadoop Stack - Then, Now and in the Future - Eli Colli...
Hadoop World 2011: The Hadoop Stack - Then, Now and in the Future - Eli Colli...
 
SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...
SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...
SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...
 
NoSQL: Cassadra vs. HBase
NoSQL: Cassadra vs. HBaseNoSQL: Cassadra vs. HBase
NoSQL: Cassadra vs. HBase
 
ROMA User-Customizable NoSQL Database in Ruby
ROMA User-Customizable NoSQL Database in RubyROMA User-Customizable NoSQL Database in Ruby
ROMA User-Customizable NoSQL Database in Ruby
 
Lessons learned scaling big data in cloud
Lessons learned   scaling big data in cloudLessons learned   scaling big data in cloud
Lessons learned scaling big data in cloud
 
London hug
London hugLondon hug
London hug
 
Scaling Big Data Mining Infrastructure Twitter Experience
Scaling Big Data Mining Infrastructure Twitter ExperienceScaling Big Data Mining Infrastructure Twitter Experience
Scaling Big Data Mining Infrastructure Twitter Experience
 
Introduction to Hadoop - ACCU2010
Introduction to Hadoop - ACCU2010Introduction to Hadoop - ACCU2010
Introduction to Hadoop - ACCU2010
 
Cassandra nyc 2011 ilya maykov - ooyala - scaling video analytics with apac...
Cassandra nyc 2011   ilya maykov - ooyala - scaling video analytics with apac...Cassandra nyc 2011   ilya maykov - ooyala - scaling video analytics with apac...
Cassandra nyc 2011 ilya maykov - ooyala - scaling video analytics with apac...
 
Cloudera Sessions - Clinic 1 - Getting Started With Hadoop
Cloudera Sessions - Clinic 1 - Getting Started With HadoopCloudera Sessions - Clinic 1 - Getting Started With Hadoop
Cloudera Sessions - Clinic 1 - Getting Started With Hadoop
 
Hivemall tech talk at Redwood, CA
Hivemall tech talk at Redwood, CAHivemall tech talk at Redwood, CA
Hivemall tech talk at Redwood, CA
 
Drill at the Chug 9-19-12
Drill at the Chug 9-19-12Drill at the Chug 9-19-12
Drill at the Chug 9-19-12
 
Hadoop on VMware
Hadoop on VMwareHadoop on VMware
Hadoop on VMware
 
Hadoop on Virtual Machines
Hadoop on Virtual MachinesHadoop on Virtual Machines
Hadoop on Virtual Machines
 
Zh tw cloud computing era
Zh tw cloud computing eraZh tw cloud computing era
Zh tw cloud computing era
 

Similar to Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study using Meandre

MapReduce: A useful parallel tool that still has room for improvement
MapReduce: A useful parallel tool that still has room for improvementMapReduce: A useful parallel tool that still has room for improvement
MapReduce: A useful parallel tool that still has room for improvement
Kyong-Ha Lee
 
Realizing Robust and Scalable Evolutionary Algorithms toward Exascale Era
Realizing Robust and Scalable Evolutionary Algorithms toward Exascale EraRealizing Robust and Scalable Evolutionary Algorithms toward Exascale Era
Realizing Robust and Scalable Evolutionary Algorithms toward Exascale Era
Masaharu Munetomo
 
Parallel and Distributed Computing on Low Latency Clusters
Parallel and Distributed Computing on Low Latency ClustersParallel and Distributed Computing on Low Latency Clusters
Parallel and Distributed Computing on Low Latency Clusters
Vittorio Giovara
 
Sharding Containers: Make Go Apps Computer-Friendly Again by Andrey Sibiryov
Sharding Containers: Make Go Apps Computer-Friendly Again by Andrey Sibiryov Sharding Containers: Make Go Apps Computer-Friendly Again by Andrey Sibiryov
Sharding Containers: Make Go Apps Computer-Friendly Again by Andrey Sibiryov
Docker, Inc.
 
Floating Point Operations , Memory Chip Organization , Serial Bus Architectur...
Floating Point Operations , Memory Chip Organization , Serial Bus Architectur...Floating Point Operations , Memory Chip Organization , Serial Bus Architectur...
Floating Point Operations , Memory Chip Organization , Serial Bus Architectur...
VAISHNAVI MADHAN
 
Floating Point Operations , Memory Chip Organization , Serial Bus Architectur...
Floating Point Operations , Memory Chip Organization , Serial Bus Architectur...Floating Point Operations , Memory Chip Organization , Serial Bus Architectur...
Floating Point Operations , Memory Chip Organization , Serial Bus Architectur...
KRamasamy2
 
try
trytry
Building a Database for the End of the World
Building a Database for the End of the WorldBuilding a Database for the End of the World
Building a Database for the End of the World
jhugg
 
Coates bosc2010 clouds-fluff-and-no-substance
Coates bosc2010 clouds-fluff-and-no-substanceCoates bosc2010 clouds-fluff-and-no-substance
Coates bosc2010 clouds-fluff-and-no-substanceBOSC 2010
 
Millions quotes per second in pure java
Millions quotes per second in pure javaMillions quotes per second in pure java
Millions quotes per second in pure javaRoman Elizarov
 
Hadoop - Introduction to HDFS
Hadoop - Introduction to HDFSHadoop - Introduction to HDFS
Hadoop - Introduction to HDFS
Vibrant Technologies & Computers
 
Seminar Presentation Hadoop
Seminar Presentation HadoopSeminar Presentation Hadoop
Seminar Presentation HadoopVarun Narang
 
Everything We Learned About In-Memory Data Layout While Building VoltDB
Everything We Learned About In-Memory Data Layout While Building VoltDBEverything We Learned About In-Memory Data Layout While Building VoltDB
Everything We Learned About In-Memory Data Layout While Building VoltDB
jhugg
 
Performance Analysis of Lattice QCD with APGAS Programming Model
Performance Analysis of Lattice QCD with APGAS Programming ModelPerformance Analysis of Lattice QCD with APGAS Programming Model
Performance Analysis of Lattice QCD with APGAS Programming Model
Koichi Shirahata
 
Sc12 workshop-writeup
Sc12 workshop-writeupSc12 workshop-writeup
Sc12 workshop-writeup
Aaron Zauner
 
Ling liu part 02:big graph processing
Ling liu part 02:big graph processingLing liu part 02:big graph processing
Ling liu part 02:big graph processing
jins0618
 
MapReduce on Zero VM
MapReduce on Zero VM MapReduce on Zero VM
MapReduce on Zero VM
Joy Rahman
 
Scalable machine learning
Scalable machine learningScalable machine learning
Scalable machine learning
Arnaud Rachez
 
Launching Your First Big Data Project on AWS
Launching Your First Big Data Project on AWSLaunching Your First Big Data Project on AWS
Launching Your First Big Data Project on AWS
Amazon Web Services
 
AWS Webcast - An Introduction to High Performance Computing on AWS
AWS Webcast - An Introduction to High Performance Computing on AWSAWS Webcast - An Introduction to High Performance Computing on AWS
AWS Webcast - An Introduction to High Performance Computing on AWS
Amazon Web Services
 

Similar to Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study using Meandre (20)

MapReduce: A useful parallel tool that still has room for improvement
MapReduce: A useful parallel tool that still has room for improvementMapReduce: A useful parallel tool that still has room for improvement
MapReduce: A useful parallel tool that still has room for improvement
 
Realizing Robust and Scalable Evolutionary Algorithms toward Exascale Era
Realizing Robust and Scalable Evolutionary Algorithms toward Exascale EraRealizing Robust and Scalable Evolutionary Algorithms toward Exascale Era
Realizing Robust and Scalable Evolutionary Algorithms toward Exascale Era
 
Parallel and Distributed Computing on Low Latency Clusters
Parallel and Distributed Computing on Low Latency ClustersParallel and Distributed Computing on Low Latency Clusters
Parallel and Distributed Computing on Low Latency Clusters
 
Sharding Containers: Make Go Apps Computer-Friendly Again by Andrey Sibiryov
Sharding Containers: Make Go Apps Computer-Friendly Again by Andrey Sibiryov Sharding Containers: Make Go Apps Computer-Friendly Again by Andrey Sibiryov
Sharding Containers: Make Go Apps Computer-Friendly Again by Andrey Sibiryov
 
Floating Point Operations , Memory Chip Organization , Serial Bus Architectur...
Floating Point Operations , Memory Chip Organization , Serial Bus Architectur...Floating Point Operations , Memory Chip Organization , Serial Bus Architectur...
Floating Point Operations , Memory Chip Organization , Serial Bus Architectur...
 
Floating Point Operations , Memory Chip Organization , Serial Bus Architectur...
Floating Point Operations , Memory Chip Organization , Serial Bus Architectur...Floating Point Operations , Memory Chip Organization , Serial Bus Architectur...
Floating Point Operations , Memory Chip Organization , Serial Bus Architectur...
 
try
trytry
try
 
Building a Database for the End of the World
Building a Database for the End of the WorldBuilding a Database for the End of the World
Building a Database for the End of the World
 
Coates bosc2010 clouds-fluff-and-no-substance
Coates bosc2010 clouds-fluff-and-no-substanceCoates bosc2010 clouds-fluff-and-no-substance
Coates bosc2010 clouds-fluff-and-no-substance
 
Millions quotes per second in pure java
Millions quotes per second in pure javaMillions quotes per second in pure java
Millions quotes per second in pure java
 
Hadoop - Introduction to HDFS
Hadoop - Introduction to HDFSHadoop - Introduction to HDFS
Hadoop - Introduction to HDFS
 
Seminar Presentation Hadoop
Seminar Presentation HadoopSeminar Presentation Hadoop
Seminar Presentation Hadoop
 
Everything We Learned About In-Memory Data Layout While Building VoltDB
Everything We Learned About In-Memory Data Layout While Building VoltDBEverything We Learned About In-Memory Data Layout While Building VoltDB
Everything We Learned About In-Memory Data Layout While Building VoltDB
 
Performance Analysis of Lattice QCD with APGAS Programming Model
Performance Analysis of Lattice QCD with APGAS Programming ModelPerformance Analysis of Lattice QCD with APGAS Programming Model
Performance Analysis of Lattice QCD with APGAS Programming Model
 
Sc12 workshop-writeup
Sc12 workshop-writeupSc12 workshop-writeup
Sc12 workshop-writeup
 
Ling liu part 02:big graph processing
Ling liu part 02:big graph processingLing liu part 02:big graph processing
Ling liu part 02:big graph processing
 
MapReduce on Zero VM
MapReduce on Zero VM MapReduce on Zero VM
MapReduce on Zero VM
 
Scalable machine learning
Scalable machine learningScalable machine learning
Scalable machine learning
 
Launching Your First Big Data Project on AWS
Launching Your First Big Data Project on AWSLaunching Your First Big Data Project on AWS
Launching Your First Big Data Project on AWS
 
AWS Webcast - An Introduction to High Performance Computing on AWS
AWS Webcast - An Introduction to High Performance Computing on AWSAWS Webcast - An Introduction to High Performance Computing on AWS
AWS Webcast - An Introduction to High Performance Computing on AWS
 

More from Xavier Llorà

Meandre 2.0 Alpha Preview
Meandre 2.0 Alpha PreviewMeandre 2.0 Alpha Preview
Meandre 2.0 Alpha Preview
Xavier Llorà
 
Soaring the Clouds with Meandre
Soaring the Clouds with MeandreSoaring the Clouds with Meandre
Soaring the Clouds with Meandre
Xavier Llorà
 
From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0
From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0
From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0
Xavier Llorà
 
Large Scale Data Mining using Genetics-Based Machine Learning
Large Scale Data Mining using   Genetics-Based Machine LearningLarge Scale Data Mining using   Genetics-Based Machine Learning
Large Scale Data Mining using Genetics-Based Machine Learning
Xavier Llorà
 
Scalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new Trends
Scalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new TrendsScalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new Trends
Scalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new Trends
Xavier Llorà
 
Pittsburgh Learning Classifier Systems for Protein Structure Prediction: Sca...
Pittsburgh Learning Classifier Systems for Protein  Structure Prediction: Sca...Pittsburgh Learning Classifier Systems for Protein  Structure Prediction: Sca...
Pittsburgh Learning Classifier Systems for Protein Structure Prediction: Sca...
Xavier Llorà
 
Towards a Theoretical Towards a Theoretical Framework for LCS Framework fo...
Towards a Theoretical  Towards a Theoretical  Framework for LCS  Framework fo...Towards a Theoretical  Towards a Theoretical  Framework for LCS  Framework fo...
Towards a Theoretical Towards a Theoretical Framework for LCS Framework fo...
Xavier Llorà
 
Learning Classifier Systems for Class Imbalance Problems
Learning Classifier Systems  for Class Imbalance  ProblemsLearning Classifier Systems  for Class Imbalance  Problems
Learning Classifier Systems for Class Imbalance Problems
Xavier Llorà
 
A Retrospective Look at A Retrospective Look at Classifier System ResearchCl...
A Retrospective Look at  A Retrospective Look at  Classifier System ResearchCl...A Retrospective Look at  A Retrospective Look at  Classifier System ResearchCl...
A Retrospective Look at A Retrospective Look at Classifier System ResearchCl...
Xavier Llorà
 
XCS: Current capabilities and future challenges
XCS: Current capabilities and future  challengesXCS: Current capabilities and future  challenges
XCS: Current capabilities and future challenges
Xavier Llorà
 
Negative Selection for Algorithm for Anomaly Detection
Negative Selection for Algorithm for Anomaly DetectionNegative Selection for Algorithm for Anomaly Detection
Negative Selection for Algorithm for Anomaly Detection
Xavier Llorà
 
Searle, Intentionality, and the Future of Classifier Systems
Searle, Intentionality, and the  Future of Classifier SystemsSearle, Intentionality, and the  Future of Classifier Systems
Searle, Intentionality, and the Future of Classifier Systems
Xavier Llorà
 
Computed Prediction: So far, so good. What now?
Computed Prediction:  So far, so good. What now?Computed Prediction:  So far, so good. What now?
Computed Prediction: So far, so good. What now?
Xavier Llorà
 
NIGEL 2006 welcome
NIGEL 2006 welcomeNIGEL 2006 welcome
NIGEL 2006 welcome
Xavier Llorà
 
Linkage Learning for Pittsburgh LCS: Making Problems Tractable
Linkage Learning for Pittsburgh LCS: Making Problems TractableLinkage Learning for Pittsburgh LCS: Making Problems Tractable
Linkage Learning for Pittsburgh LCS: Making Problems Tractable
Xavier Llorà
 
Meandre: Semantic-Driven Data-Intensive Flows in the Clouds
Meandre: Semantic-Driven Data-Intensive Flows in the CloudsMeandre: Semantic-Driven Data-Intensive Flows in the Clouds
Meandre: Semantic-Driven Data-Intensive Flows in the Clouds
Xavier Llorà
 
ZigZag: The Meandring Language
ZigZag: The Meandring LanguageZigZag: The Meandring Language
ZigZag: The Meandring Language
Xavier Llorà
 
HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...
HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...
HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...
Xavier Llorà
 
Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning...
Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning...Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning...
Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning...
Xavier Llorà
 
Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...
Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...
Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...
Xavier Llorà
 

More from Xavier Llorà (20)

Meandre 2.0 Alpha Preview
Meandre 2.0 Alpha PreviewMeandre 2.0 Alpha Preview
Meandre 2.0 Alpha Preview
 
Soaring the Clouds with Meandre
Soaring the Clouds with MeandreSoaring the Clouds with Meandre
Soaring the Clouds with Meandre
 
From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0
From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0
From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0
 
Large Scale Data Mining using Genetics-Based Machine Learning
Large Scale Data Mining using   Genetics-Based Machine LearningLarge Scale Data Mining using   Genetics-Based Machine Learning
Large Scale Data Mining using Genetics-Based Machine Learning
 
Scalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new Trends
Scalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new TrendsScalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new Trends
Scalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new Trends
 
Pittsburgh Learning Classifier Systems for Protein Structure Prediction: Sca...
Pittsburgh Learning Classifier Systems for Protein  Structure Prediction: Sca...Pittsburgh Learning Classifier Systems for Protein  Structure Prediction: Sca...
Pittsburgh Learning Classifier Systems for Protein Structure Prediction: Sca...
 
Towards a Theoretical Towards a Theoretical Framework for LCS Framework fo...
Towards a Theoretical  Towards a Theoretical  Framework for LCS  Framework fo...Towards a Theoretical  Towards a Theoretical  Framework for LCS  Framework fo...
Towards a Theoretical Towards a Theoretical Framework for LCS Framework fo...
 
Learning Classifier Systems for Class Imbalance Problems
Learning Classifier Systems  for Class Imbalance  ProblemsLearning Classifier Systems  for Class Imbalance  Problems
Learning Classifier Systems for Class Imbalance Problems
 
A Retrospective Look at A Retrospective Look at Classifier System ResearchCl...
A Retrospective Look at  A Retrospective Look at  Classifier System ResearchCl...A Retrospective Look at  A Retrospective Look at  Classifier System ResearchCl...
A Retrospective Look at A Retrospective Look at Classifier System ResearchCl...
 
XCS: Current capabilities and future challenges
XCS: Current capabilities and future  challengesXCS: Current capabilities and future  challenges
XCS: Current capabilities and future challenges
 
Negative Selection for Algorithm for Anomaly Detection
Negative Selection for Algorithm for Anomaly DetectionNegative Selection for Algorithm for Anomaly Detection
Negative Selection for Algorithm for Anomaly Detection
 
Searle, Intentionality, and the Future of Classifier Systems
Searle, Intentionality, and the  Future of Classifier SystemsSearle, Intentionality, and the  Future of Classifier Systems
Searle, Intentionality, and the Future of Classifier Systems
 
Computed Prediction: So far, so good. What now?
Computed Prediction:  So far, so good. What now?Computed Prediction:  So far, so good. What now?
Computed Prediction: So far, so good. What now?
 
NIGEL 2006 welcome
NIGEL 2006 welcomeNIGEL 2006 welcome
NIGEL 2006 welcome
 
Linkage Learning for Pittsburgh LCS: Making Problems Tractable
Linkage Learning for Pittsburgh LCS: Making Problems TractableLinkage Learning for Pittsburgh LCS: Making Problems Tractable
Linkage Learning for Pittsburgh LCS: Making Problems Tractable
 
Meandre: Semantic-Driven Data-Intensive Flows in the Clouds
Meandre: Semantic-Driven Data-Intensive Flows in the CloudsMeandre: Semantic-Driven Data-Intensive Flows in the Clouds
Meandre: Semantic-Driven Data-Intensive Flows in the Clouds
 
ZigZag: The Meandring Language
ZigZag: The Meandring LanguageZigZag: The Meandring Language
ZigZag: The Meandring Language
 
HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...
HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...
HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...
 
Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning...
Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning...Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning...
Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning...
 
Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...
Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...
Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...
 

Recently uploaded

GIÁO ÁN DẠY THÊM (KẾ HOẠCH BÀI BUỔI 2) - TIẾNG ANH 8 GLOBAL SUCCESS (2 CỘT) N...
GIÁO ÁN DẠY THÊM (KẾ HOẠCH BÀI BUỔI 2) - TIẾNG ANH 8 GLOBAL SUCCESS (2 CỘT) N...GIÁO ÁN DẠY THÊM (KẾ HOẠCH BÀI BUỔI 2) - TIẾNG ANH 8 GLOBAL SUCCESS (2 CỘT) N...
GIÁO ÁN DẠY THÊM (KẾ HOẠCH BÀI BUỔI 2) - TIẾNG ANH 8 GLOBAL SUCCESS (2 CỘT) N...
Nguyen Thanh Tu Collection
 
Fish and Chips - have they had their chips
Fish and Chips - have they had their chipsFish and Chips - have they had their chips
Fish and Chips - have they had their chips
GeoBlogs
 
Unit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdfUnit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdf
Thiyagu K
 
The Art Pastor's Guide to Sabbath | Steve Thomason
The Art Pastor's Guide to Sabbath | Steve ThomasonThe Art Pastor's Guide to Sabbath | Steve Thomason
The Art Pastor's Guide to Sabbath | Steve Thomason
Steve Thomason
 
Supporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptxSupporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptx
Jisc
 
How to Create Map Views in the Odoo 17 ERP
How to Create Map Views in the Odoo 17 ERPHow to Create Map Views in the Odoo 17 ERP
How to Create Map Views in the Odoo 17 ERP
Celine George
 
Palestine last event orientationfvgnh .pptx
Palestine last event orientationfvgnh .pptxPalestine last event orientationfvgnh .pptx
Palestine last event orientationfvgnh .pptx
RaedMohamed3
 
The Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdfThe Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdf
kaushalkr1407
 
How to Split Bills in the Odoo 17 POS Module
How to Split Bills in the Odoo 17 POS ModuleHow to Split Bills in the Odoo 17 POS Module
How to Split Bills in the Odoo 17 POS Module
Celine George
 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
siemaillard
 
Home assignment II on Spectroscopy 2024 Answers.pdf
Home assignment II on Spectroscopy 2024 Answers.pdfHome assignment II on Spectroscopy 2024 Answers.pdf
Home assignment II on Spectroscopy 2024 Answers.pdf
Tamralipta Mahavidyalaya
 
The Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official PublicationThe Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official Publication
Delapenabediema
 
Ethnobotany and Ethnopharmacology ......
Ethnobotany and Ethnopharmacology ......Ethnobotany and Ethnopharmacology ......
Ethnobotany and Ethnopharmacology ......
Ashokrao Mane college of Pharmacy Peth-Vadgaon
 
PART A. Introduction to Costumer Service
PART A. Introduction to Costumer ServicePART A. Introduction to Costumer Service
PART A. Introduction to Costumer Service
PedroFerreira53928
 
The geography of Taylor Swift - some ideas
The geography of Taylor Swift - some ideasThe geography of Taylor Swift - some ideas
The geography of Taylor Swift - some ideas
GeoBlogs
 
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup   New Member Orientation and Q&A (May 2024).pdfWelcome to TechSoup   New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
TechSoup
 
2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...
Sandy Millin
 
Operation Blue Star - Saka Neela Tara
Operation Blue Star   -  Saka Neela TaraOperation Blue Star   -  Saka Neela Tara
Operation Blue Star - Saka Neela Tara
Balvir Singh
 
special B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdfspecial B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdf
Special education needs
 
How to Break the cycle of negative Thoughts
How to Break the cycle of negative ThoughtsHow to Break the cycle of negative Thoughts
How to Break the cycle of negative Thoughts
Col Mukteshwar Prasad
 

Recently uploaded (20)

GIÁO ÁN DẠY THÊM (KẾ HOẠCH BÀI BUỔI 2) - TIẾNG ANH 8 GLOBAL SUCCESS (2 CỘT) N...
GIÁO ÁN DẠY THÊM (KẾ HOẠCH BÀI BUỔI 2) - TIẾNG ANH 8 GLOBAL SUCCESS (2 CỘT) N...GIÁO ÁN DẠY THÊM (KẾ HOẠCH BÀI BUỔI 2) - TIẾNG ANH 8 GLOBAL SUCCESS (2 CỘT) N...
GIÁO ÁN DẠY THÊM (KẾ HOẠCH BÀI BUỔI 2) - TIẾNG ANH 8 GLOBAL SUCCESS (2 CỘT) N...
 
Fish and Chips - have they had their chips
Fish and Chips - have they had their chipsFish and Chips - have they had their chips
Fish and Chips - have they had their chips
 
Unit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdfUnit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdf
 
The Art Pastor's Guide to Sabbath | Steve Thomason
The Art Pastor's Guide to Sabbath | Steve ThomasonThe Art Pastor's Guide to Sabbath | Steve Thomason
The Art Pastor's Guide to Sabbath | Steve Thomason
 
Supporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptxSupporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptx
 
How to Create Map Views in the Odoo 17 ERP
How to Create Map Views in the Odoo 17 ERPHow to Create Map Views in the Odoo 17 ERP
How to Create Map Views in the Odoo 17 ERP
 
Palestine last event orientationfvgnh .pptx
Palestine last event orientationfvgnh .pptxPalestine last event orientationfvgnh .pptx
Palestine last event orientationfvgnh .pptx
 
The Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdfThe Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdf
 
How to Split Bills in the Odoo 17 POS Module
How to Split Bills in the Odoo 17 POS ModuleHow to Split Bills in the Odoo 17 POS Module
How to Split Bills in the Odoo 17 POS Module
 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
 
Home assignment II on Spectroscopy 2024 Answers.pdf
Home assignment II on Spectroscopy 2024 Answers.pdfHome assignment II on Spectroscopy 2024 Answers.pdf
Home assignment II on Spectroscopy 2024 Answers.pdf
 
The Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official PublicationThe Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official Publication
 
Ethnobotany and Ethnopharmacology ......
Ethnobotany and Ethnopharmacology ......Ethnobotany and Ethnopharmacology ......
Ethnobotany and Ethnopharmacology ......
 
PART A. Introduction to Costumer Service
PART A. Introduction to Costumer ServicePART A. Introduction to Costumer Service
PART A. Introduction to Costumer Service
 
The geography of Taylor Swift - some ideas
The geography of Taylor Swift - some ideasThe geography of Taylor Swift - some ideas
The geography of Taylor Swift - some ideas
 
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup   New Member Orientation and Q&A (May 2024).pdfWelcome to TechSoup   New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
 
2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...
 
Operation Blue Star - Saka Neela Tara
Operation Blue Star   -  Saka Neela TaraOperation Blue Star   -  Saka Neela Tara
Operation Blue Star - Saka Neela Tara
 
special B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdfspecial B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdf
 
How to Break the cycle of negative Thoughts
How to Break the cycle of negative ThoughtsHow to Break the cycle of negative Thoughts
How to Break the cycle of negative Thoughts
 

Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study using Meandre

  • 1. Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study using Meandre Xavier Llorà National Center for Supercomputing Applications University of Illinois at Urbana-Champaign Urbana, Illinois, 61801 xllora@ncsa.illinois.edu http://www.ncsa.illinois.edu/~xllora
  • 2. Outline • Data-intensive computing and HPC? • Is this related at all to evolutionary computation? • Data-intensive computing with Meandre • GAs and competent GAs • Data-intensive computing for GAs
  • 3. 2 Minute HPC History • The eighties and early nineties picture • Commodity hardware rare, slow, and costly • Supercomputers were extremely expensive • Most of them hand crafted and only few units • Two competing families • CISC (e.g. Cray C90 with up to 16 processors) • RISC (e.g. Connection Machine CM-5 with up 4,096 processors) • Late nineties commodity hardware hit main stream • Start becoming popular, cheaper, and faster • Economy of scale • Massive parallel computers build from commodity components become a viable option
  • 4. Two Visions • C90 like supercomputers were like a comfy pair of trainers • Oriented to scientific computing • Complex vector oriented supercomputers • Shared memory (lots of them) • Multiprocessor enabled via some intercommunication networks • Single system image • CM5 like computers did not get massive traction, but a bit • General purpose (as long as you can chop the work in simple units) • Lots of simple processors available • Distributed memory pushed new programming models (message passing) • Complex interconnection networks • NCSA have shared memory, distributed memory, and gpgpu based
  • 5. Miniaturization Building Bridges • Multicores and gpgpus are reviving the C90 flavor • The CM-5 flavor now survives as distributed clusters of not so simple units
  • 6. Control Models of Parallelization in EC Run 1 Run 5 Run 9 Master Individual Evaluation Run 2 Run 6 Run 10 Run 3 Run 7 Run 11 Slave Slave Slave Migration
  • 7. But Data is also Part of the Equation • Google and Yahoo! revived an old route • Usually refers to: • Infrastructure • Programming techniques/paradigms • Google made it main stream after their MapReduce model • Yahoo! provides and open source implementation • Hadoop (MapReduce) • HDFS (Hadoop distributed filesystem) • Store petabytes reliably on commodity hardware (fault tolerant) • Programming model • Map: Equivalent to the map operation on functional programming • Reduce: The reduction phase after maps are computed
  • 8. A Simple Example n 2 x → reduce(map(x, sqr), sum) i=0 x x x x map map map map x2 x2 x2 x2 reduce reduce reduce reduce sum
  • 9. Is This Related to EC? • How can we easily benefit of the current core race painlessly? • NCSA’s Blue Waters estimated may top on 100K • Yes on several facets • Large optimization problems need to deal with large population sizes (Sastry, Goldberg & Llorà, 2007) • Large-scale data mining using genetic-based machine learning (Llorà et al. 2007) • Competent GAs model building extremely costly and data rich (Pelikan et al. 2001) • The goal? • Rethink parallelization as data flow processes • Show that traditional models can be map to data-intensive computing models • Foster you curiosity
  • 11. The Meandre Infrastructure Challenges • NCSA infrastructure effort on data-intensive computing • Transparency • From a single laptop to a HPC cluster • Not bound to a particular computation fabric • Allow heterogeneous development • Intuitive programming paradigm • Modular Components assembled into Flows • Foster Collaboration and Sharing • Open Source • Service Orientated Architecture (SOA)
  • 12. Basic Infrastructure Philosophy • Dataflow execution paradigm • Semantic-web driven • Web oriented • Facilitate distributed computing • Support publishing services • Promote reuse, sharing, and collaboration • More information at http://seasr.org/meandre
  • 13. Data Flow Execution in Meandre • A simple example c ← a+b • A traditional control-driven language a = 1 b = 2 c = a+b • Execution following the sequence of instructions • One step at a time • a+b+c+d requires 3 steps • Could be easily parallelized
  • 14. Data Flow Execution in Meandre • Data flow execution is driven by data • The previous example may have 2 possible data flow versions Stateless data flow value(a) + value(c) value(b) State-based data flow value(b) value(a) + value(c) ?
  • 15. The Basic Building Blocks: Components Component RDF descriptor of the The component components behavior implementation
  • 16. Go with the Flow: Creating Complex Tasks • Directed multigraph of components creates a flow Push Text Concatenate To Upper Print Text Case Text Text Push Text
  • 17. Automatic Parallelization: Speed and Robustness • Meandre ZigZag language allow automatic parallelization To Upper Case Text Push Text To Upper Concatenate Case Text Print Text Text Push Text To Upper Case Text
  • 19. Selectorecombinative GAs 1. Initialize the population with random individuals 2. Evaluate the fitness value of the individuals 3. Select good solutions by using s-wise tournament selection without replacement (Goldberg, Korb & Deb, 1989) 4. Create new individuals by recombining the selected population using uniform crossover (Sywerda, 1989) 5. Evaluate the fitness valueof all offspring 6. Repeat steps 3-5 until convergence criteria are met
  • 20. Extended Compact Genetic Algorithm • Harik et al. 2006 • Initialize the population (usually random initialization) • Evaluate the fitness of individuals • Select promising solutions (e.g., tournament selection) • Build the probabilistic model • Optimize structure & parameters to best fit selected individuals • Automatic identification of sub-structures • Sample the model to create new candidate solutions • Effective exchange of building blocks • Repeat steps 2–7 till some convergence criteria are met
  • 21. eCGA Model Building Process • Use model-building procedure of extended compact GA • Partition genes into (mutually) independent groups • Start with the lowest complexity model • Search for a least-complex, most-accurate model Model Structure Metric [X0] [X1] [X2] [X3] [X4] [X5] [X6] [X7] [X8] [X9] [X10] [X11] 1.0000 [X0] [X1] [X2] [X3] [X4X5] [X6] [X7] [X8] [X9] [X10] [X11] 0.9933 [X0] [X1] [X2] [X3] [X4X5X7] [X6] [X8] [X9] [X10] [X11] 0.9819 [X0] [X1] [X2] [X3] [X4X5X6X7] [X8] [X9] [X10] [X11] 0.9644 … [X0] [X1] [X2] [X3] [X4X5X6X7] [X8X9X10X11] 0.9273 … [X0X1X2X3] [X4X5X6X7] [X8X9X10X11] 0.8895
  • 22. Data-Intensive Flows for Competent GAs
  • 24. sGAs Execution Profile and Parallelization • Intel 2.8Ghz QuadCore, 4Gb RAM. Average of 20 runs. sbp sbp soed eps/mapper eps/paralell/0 soed eps/paralell/1 eps/paralell/2 eps eps/paralell/3 eps/reducer twrops twrops ucbps/mapper ucbps/paralell/0 ucbps/paralell/1 noit ucbps/paralell/2 ucbps/paralell/3 ucbps/reducer ucbps noit 0 5000 10000 15000 20000 0 5000 10000 15000 20000
  • 25. eCGA Model Model building
  • 26. eCGA Execution Profile and Parallelization • Intel 2.8Ghz QuadCore, 4Gb RAM. Average of 20 runs. init_ecga init_ecga update_partitions/mapper update_partitions/mapper update_partitions/mapper update_partitions update_partitions/paralell/0 update_partitions/paralell/1 update_partitions/paralell/2 greedy_ecga_mb update_partitions/paralell/3 update_partitions/reduce greedy_ecga_mb print_model print_model 0 10000 20000 30000 40000 50000 0 10000 20000 30000 40000 50000
  • 27. eCGA Model Building Speedup • Intel 2.8Ghz QuadCore, 4Gb RAM. Average of 20 runs. • Speedup against original eCGA model building 5 ● Speedup vs. Original eCGA Model Building 4 ● 3 ● 2 ● 1 1 2 3 4 Number of cores
  • 28. Scalability on NUMA Systems • Run on NCSA’s SGI Altix Cobalt • 1,120 processors and up to 5 TB of RAM • SGI NUMAlink • NUMA architecture • Test for speedup behavior • Average of 20 independent runs • Automatic parallelization of the partition evaluation • Results still show the linear trend (despite the NUMA) • 16 processors, speedup = 14.01 • 32 processors, speedup = 27.96
  • 30. Summary • Evolutionary computation is data rich • Data-intensive computing can provide to EC: • Tap into parallelism quite painless • Provide a simple programming and modeling • Boost reusability • Tackle otherwise intractable problems • Shown that equivalent data-intensive computing versions of traditional algorithms exist • Linear parallelism can be tap transparently
  • 31. Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study using Meandre Xavier Llorà National Center for Supercomputing Applications University of Illinois at Urbana-Champaign Urbana, Illinois, 61801 xllora@ncsa.illinois.edu http://www.ncsa.illinois.edu/~xllora