HPC, Grid and Cloud Computing
   - The Past, Present and Future


             Jason Shih
    Academia Sinica Grid Computing
      FBI Minimalism (極簡主義), Nov 3rd, 2010
Outline


 Trend in HPC
 Grid: eScience Research @ PetaScale
 Cloud Hype and Observation
 Future Exploration Path of Computing
 Summary
About ASGC
 Max CERN/T1-ASGC Point2Point Inbound: 9.3 Gbps!
 1. Most Reliable T1: 98.83%!
 2. Very Highly Performing and Most Stable Site in CCRC08!
 Asia Pacific Regional Operation Center
 A Worldwide Grid Infrastructure
   >280 sites, >45 countries
   >80,000 CPUs, >20 PetaBytes
   >14,000 users, >200 VOs
   >250,000 jobs/day
 Large Hadron Collider (LHC)
   100 meters underground
   27 km in circumference
   Located in Geneva
 Grid Application Platform
   Lightweight Problem Solving Framework!
 Avian Flu Drug Discovery
 Best Demo Award of EGEE’07!
Emerging Trend and Technologies: 2009 -2010
Hype Cycle for Storage Technologies - 2010
Trend in High Performance Computing
Ugly? Performance of HPC Clusters

 272 (52%) of the world's fastest clusters have efficiency lower
 than 80% (Rmax/Rpeak)
 Only 115 (18%) could drive over 90% of theoretical peak
                            Sampling from Top500 HPC cluster
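
The efficiency figures above are the ratio of measured LINPACK performance (Rmax) to theoretical peak (Rpeak). A minimal sketch of that calculation follows; the cluster names and numbers in it are made-up placeholders, not Top500 entries, and the helper names are hypothetical.

```python
def efficiency(rmax, rpeak):
    """Return cluster efficiency as Rmax / Rpeak (a fraction, ideally <= 1)."""
    if rpeak <= 0:
        raise ValueError("Rpeak must be positive")
    return rmax / rpeak


def share_below(clusters, threshold):
    """Fraction of clusters whose efficiency is strictly below `threshold`."""
    below = [c for c in clusters if efficiency(c["rmax"], c["rpeak"]) < threshold]
    return len(below) / len(clusters)


# Hypothetical sample in TFlops; values are illustrative only.
sample = [
    {"name": "cluster-a", "rmax": 45.0, "rpeak": 90.0},
    {"name": "cluster-b", "rmax": 80.0, "rpeak": 100.0},
    {"name": "cluster-c", "rmax": 95.0, "rpeak": 100.0},
    {"name": "cluster-d", "rmax": 70.0, "rpeak": 100.0},
]

if __name__ == "__main__":
    for c in sample:
        print(c["name"], f"{efficiency(c['rmax'], c['rpeak']):.0%}")
    print("share below 80%:", share_below(sample, 0.80))
```

Applied to a full Top500 list, the same two helpers would give the "share below 80%" and "share above 90%" style figures quoted on the slide.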




                   Trend of Cluster Efficiency 2005-2009
Performance and Efficiency
 The top 20% of clusters contribute 60% of total
 computing power (27.98 PF)
 5 clusters with efficiency < 30%
Impact Factor: Interconnectivity
     - Capacity and Cluster Efficiency

 Over 52% of clusters are based on GbE
   With efficiency of only around 50%
 InfiniBand adopted by ~36% of HPC clusters
HPC Cluster - Interconnect Using IB
 SDR, DDR and QDR in Top500
   Promising efficiency >= 80%
   Majority of IB-ready clusters adopt
   DDR (87%) (Nov 2009)
   Contributes 44% of total computing
   power
     ~28 Pflops
   Avg efficiency ~78%
Trend in HPC Interconnects: Infiniband Roadmap
Common semantics



 Programmer productivity
 Ease of deployment
 HPC filesystems are more mature, with a wider feature set:
   Highly concurrent reads and writes
   In the comfort zone of programmers (vs. cloud FS)
 Wide support, adoption, acceptance possible
   pNFS working toward equivalence
   Reuse standard data management tools
     Backup, disaster recovery and tiering
Evolution of Processors
Trend in HPC
Some Observations & Looking for Future (I)
 Computing Paradigm
  (Almost) Free FLOPS
  (Almost) Free Logic Operations
  Data Access (Memory) Is A Major Bottleneck
  Synchronization Is the Most Expensive
  Data Communication Is A Big Factor in Performance
  I/O Still A Major Programming Consideration
  MPI Coding Is the Motherhood of Large Scale Computing
  Computing in Conjunction with Massive Data Management
  Finding Parallelism Is Not the Whole Issue in Programming
  Data Layout
  Data Movement
  Data Reuse
  Frequency of Interconnected Data Communication
Some Observations & Looking for Future (II)
 Emerging New Possibility
   Massive “Small” Computing Elements with On Board Memory
   Computing Nodes Can Be Configured Dynamically (including Failure
   Recovery)
   Network Switch (within on site complex) Will Nearly Match Memory
   Performance
   Parallel I/O Support for Massive Parallel System
   Asynchronous Computing/Communication Operation
   Sophisticated Data Pre-fetch Scheme (Hardware/Algorithm)
   Automated Dynamic Load Balancing Method
   Very High Order Difference Scheme (also Implicit Method)
   Full Coupling of Formerly Split Operators
   Fine Numerical Computational Grid (grid number > 10,000)
   Full Simulation of Protein
   Full Coupling of Computational Model
   Grid Computing for All
Some Observations & Looking for Future (III)




          System will get more complicated &
      Computing Tool will get more sophisticated:


          Vendor Support & User Readiness?
Grid: eScience Research @ PetaScale
WLCG Computing Model
   - The Tier Structure
 Tier-0 (CERN)
   Data recording
   Initial data reconstruction
   Data distribution
 Tier-1 (11 centres)
   Permanent storage
   Re-processing
   Analysis
 Tier-2 (~130 centres)
   Simulation
   End-user analysis
Enabling Grids for E-sciencE




 Archeology
 Astronomy
 Astrophysics
 Civil Protection
 Comp. Chemistry
 Earth Sciences
 Finance
 Fusion
 Geophysics
 High Energy Physics
 Life Sciences
 Multimedia
 Material Sciences
 …

EGEE-II INFSO-RI-031688                                  EGEE07, Budapest, 1-5 October 2007
Objectives

 Building a sustainable research and collaboration
 infrastructure
 Supporting research through e-Science, for data-intensive
 sciences and applications that require cross-disciplinary
 distributed collaboration
ASGC Milestone
 Operational since the deployment of LCG0 in 2002
 ASGC CA established in 2005 (IGTF in the same year)
 Tier-1 Center responsibilities since 2005
 Federated Taiwan Tier-2 center (Taiwan Analysis Facility, TAF)
 is also co-located at ASGC
 Representative of the EGEE e-Science Asia Federation since joining
 EGEE in 2004
 Providing Asia Pacific Regional Operation Center (APROC)
 services to the region-wide WLCG/EGEE production infrastructure
 since 2005
 Initiated the Avian Flu Drug Discovery Project in collaboration with
 EGEE in 2006
 Start of the EUAsiaGrid Project in April 2008
LHC First Beam – Computing at the Petascale


 LHCb: B-physics, CP violation
 ALICE: Heavy ions, pp
 CMS: General purpose, pp, heavy ions
 ATLAS: General purpose, pp, heavy ions
Size of LHC Detector

 ATLAS Detector
   7,000 Tons
   25 Meters in Height
   45 Meters in Length
 [Figure labels: ATLAS, Bld. 40, CMS]
Standard Cosmology

 Good model from 0.01 sec after Big Bang
 Supported by considerable observational evidence
 [Figure: Energy, Density, Temperature vs. Time]
 Elementary Particle Physics
   From the Standard Model into the unknown: towards energies of
   1 TeV and beyond: the Terascale
 Towards Quantum Gravity
   From the unknown into the unknown...
 http://www.damtp.cam.ac.uk/user/gr/public/bb_history.html
 UNESCO Information Preservation debate, April 2007 - Jamie.Shiers@cern.ch
WLCG Timeline

 First beam on LHC: Sep. 10, 2008
 Severe incident after 3w operation (3.5TeV)
Petabyte Scale Data Challenges

 Why Petabyte?
  Experiment Computing Model
  Comparing with conventional data management
 Challenges
  Performance: LAN and WAN activities
    Sufficient B/W between CPU Farm
    Eliminate Uplink Bottleneck (Switch Tiers)
  Fast Response to Critical Events
    Fabric Infrastructure & Service Level Agreement
  Scalability and Manageability
    Robust DB engine (Oracle RAC)
    KB and Adequate Administration (Training)
Tier Model and Data Management Components
Disk Pool Configuration
     - T1 MSS (CASTOR)
Distribution of Free Capacity
     - Per Disk Servers vs. per Pool
Storage Server Generation
     - Drive vs. Net Capacity (Raid6)

 [Figure labels, capacity per disk server (DS): 15 TB/DS, 21 TB/DS, 31 TB/DS, 40 TB/DS]
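
As background for the drive vs. net capacity comparison: RAID 6 reserves the equivalent of two drives per array for parity, so usable capacity is (n - 2) x drive size. A minimal sketch, assuming one RAID 6 array per disk server and ignoring hot spares and filesystem overhead; the drive count and size used below are hypothetical and not taken from the slide.

```python
def raid6_net_capacity_tb(num_drives, drive_size_tb):
    """Usable capacity of one RAID 6 array: two drives' worth goes to parity."""
    if num_drives < 4:
        raise ValueError("RAID 6 needs at least 4 drives")
    return (num_drives - 2) * drive_size_tb


def raid6_overhead_fraction(num_drives):
    """Share of raw capacity consumed by parity in one RAID 6 array."""
    if num_drives < 4:
        raise ValueError("RAID 6 needs at least 4 drives")
    return 2 / num_drives


if __name__ == "__main__":
    # Hypothetical server: 24 drives of 1 TB each.
    print("raw TB:", 24 * 1.0)
    print("net TB:", raid6_net_capacity_tb(24, 1.0))
    print("parity overhead:", f"{raid6_overhead_fraction(24):.1%}")
```

Larger arrays lose a smaller share of raw capacity to parity, which is one reason net capacity per disk server grows faster than the drive size alone would suggest.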
IDC Collocation
 Facility installation completed on Mar 27th
 Tape system delayed until after Apr 9th
   Realignment
   RMA for faulty parts
Storage Farm
 ~110 RAID subsystems deployed since 2003
 Supporting both Tier-1 and Tier-2 storage fabric
 DAS connection to front-end blade servers
   Flexible switching of front-end servers according to
   performance requirements
   4-8G Fibre Channel connectivity
Computing/Storage System Infrastructure
Throughput of WLCG Experiments
 Throughput defined as job efficiency x number of jobs running
 Characteristics of the 4 LHC experiments suggest that inefficiency
 is due to poor coding.
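
A minimal sketch of the throughput metric defined above. It assumes job efficiency is measured as CPU time over wall-clock time, a common convention that the slide does not spell out; the function names and sample numbers are hypothetical.

```python
def job_efficiency(cpu_seconds, wall_seconds):
    """CPU time divided by wall-clock time (assumed definition of job efficiency)."""
    if wall_seconds <= 0:
        raise ValueError("wall-clock time must be positive")
    return cpu_seconds / wall_seconds


def throughput(running_jobs):
    """Mean job efficiency multiplied by the number of running jobs.

    running_jobs: list of (cpu_seconds, wall_seconds) tuples.
    """
    if not running_jobs:
        return 0.0
    mean_eff = sum(job_efficiency(c, w) for c, w in running_jobs) / len(running_jobs)
    return mean_eff * len(running_jobs)


if __name__ == "__main__":
    # Two hypothetical jobs: one stalled half the time, one fully CPU-bound.
    jobs = [(50.0, 100.0), (100.0, 100.0)]
    print("effective throughput:", throughput(jobs))
```

Under this reading, a site running many jobs that mostly wait on I/O scores lower than one running fewer, well-behaved jobs, which is the point the slide makes about inefficient code.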
Reliability from Different Perspectives
Storage Fabric Management
     – The Challenges: Events Management
Open Cloud Consortium



Cloud Hype and Observation
Cloud Hype
 Metacomputing (~1987, L. Smarr)
 Grid Computing (~1997, I. Foster, K. Kesselman)
 Cloud Computing (~2007, E. Schmidt?)
Type of Infrastructure

 Proprietary solutions by public providers
   Turnkey solutions developed internally as they own
    the software and hardware solution/tech.
 Cloud-specific support
   Developers of specific hardware and/or software
    solutions that are utilized by service providers or used
    internally when building private clouds
 Traditional providers
   Leverage or tweak their existing
Grid and Cloud:
     Comparison
 Cost & Performance
 Scale & Usability
 Service Mapping
 Interoperability
 Application Scenarios
Cloud Computing:
      “X” as a Service
 Type of Cloud
 Layered Service Model
 Reference Model
Virtualization is not Cloud computing

 Performance Overhead
 Full virtualization (FV) vs. paravirtualization (PV)
    Disk I/O and network throughput (VM scalability)




Ref: Linux-based virtualization for HPC clusters.
Cloud Infrastructure
     Best Practice & Real-World Performance
 Start Up: 60 ~ 44s
 Restart: 30 ~ 27s
 Deletion: 60 ~ <5s
 Migrate
   30 VM ~ 26.8s
   60 VM ~ 40s
   120 VM ~ 89s
 Stop
   30 VM ~ 27.4s
   60 VM ~ 26s
   120 VM ~ 57s
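
To compare how these operations scale, the batch timings above can be normalized to seconds per VM. A small sketch using the migrate and stop figures from the slide; it assumes the split "1" / "20 VM" entries read as 120 VMs and that each time is the total for the whole batch, neither of which the slide states explicitly.

```python
# Batch size (number of VMs) -> total seconds, as read from the slide.
MIGRATE = {30: 26.8, 60: 40.0, 120: 89.0}
STOP = {30: 27.4, 60: 26.0, 120: 57.0}


def seconds_per_vm(timings):
    """Normalize total batch time to seconds per VM for each batch size."""
    return {n: total / n for n, total in timings.items()}


if __name__ == "__main__":
    for name, timings in (("migrate", MIGRATE), ("stop", STOP)):
        for n, per_vm in sorted(seconds_per_vm(timings).items()):
            print(f"{name:8s} {n:4d} VMs: {per_vm:.2f} s/VM")
```

If the per-VM cost stays roughly flat or falls as the batch grows, the operation is scaling well; a rising per-VM cost would point at contention in the management layer.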
Virtualization: HEP Best Practice
Grid over Cloud or
     Cloud over Grid?
Power Consumption Challenge
Conclusion: My Opinion


 Future of Computing: Technology-Push & Demand-Pull
 Emergence of a new science paradigm
 Virtualization: a promising technology, but
 overemphasized
 Green: Cloud Service Transparency & Common
 Platform
  More Computing Power ~ Power Consumption
  Challenge
 Private clouds will be the predominant way
  Commercial (public) clouds are not expected to evolve fast
Acknowledgment


 Thanks for valuable discussion/inputs from TCloud
 (Cloud OS: Elaster)
 Professional technical support from Silvershine
 Tech. at the beginning of the collaboration.

The interesting thing about Cloud Computing is that we’ve
defined Cloud Computing to include everything that we
already do….. I don’t understand what we would do
differently in the light of Cloud Computing other than
change the wording of some of our ads.
      Larry Ellison, quoted in the Wall Street Journal, Sep 26, 2008
Issues

 Scalability?
   Infrastructure operation vs. performance
 Assessment
 Application aware – Cloud service
 Cost analysis
 Data center power usage – PUE
 Cloud Myth
 Top 10 Cloud Computing Trends
   http://www.focus.com/articles/hosting-bandwidth/top-10-cloud-computing-trends/
 Use Cases & Best Practices
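
For the PUE item above: Power Usage Effectiveness is total facility power divided by IT equipment power, so 1.0 is the ideal lower bound. A minimal sketch of the calculation; the kW figures are hypothetical and the function names are illustrative.

```python
def pue(total_facility_kw, it_equipment_kw):
    """Power Usage Effectiveness: total facility power / IT equipment power."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    if total_facility_kw < it_equipment_kw:
        raise ValueError("total facility power cannot be below IT power")
    return total_facility_kw / it_equipment_kw


def overhead_kw(total_facility_kw, it_equipment_kw):
    """Power spent on cooling, power distribution, lighting and other non-IT loads."""
    return total_facility_kw - it_equipment_kw


if __name__ == "__main__":
    # Hypothetical data center: 1500 kW at the meter, 1000 kW reaching IT gear.
    print("PUE:", pue(1500.0, 1000.0))
    print("non-IT overhead (kW):", overhead_kw(1500.0, 1000.0))
```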
Issues (II)


 Volunteer computing (BOINC)?
   Total capacity & performance
   Success stories & research disciplines
 What’s hindering cloud adoption? Try humans.
   http://gigaom.com/cloud/whats-hindering-cloud-adoption-how-about-humans/
 Future projection?
   service readiness? Service level? Technical barriers?

2024.06.01 Introducing a competency framework for languag learning materials ...2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...
Sandy Millin
 
Chapter 4 - Islamic Financial Institutions in Malaysia.pptx
Chapter 4 - Islamic Financial Institutions in Malaysia.pptxChapter 4 - Islamic Financial Institutions in Malaysia.pptx
Chapter 4 - Islamic Financial Institutions in Malaysia.pptx
Mohd Adib Abd Muin, Senior Lecturer at Universiti Utara Malaysia
 
Group Presentation 2 Economics.Ariana Buscigliopptx
Group Presentation 2 Economics.Ariana BuscigliopptxGroup Presentation 2 Economics.Ariana Buscigliopptx
Group Presentation 2 Economics.Ariana Buscigliopptx
ArianaBusciglio
 
Aficamten in HCM (SEQUOIA HCM TRIAL 2024)
Aficamten in HCM (SEQUOIA HCM TRIAL 2024)Aficamten in HCM (SEQUOIA HCM TRIAL 2024)
Aficamten in HCM (SEQUOIA HCM TRIAL 2024)
Ashish Kohli
 
MASS MEDIA STUDIES-835-CLASS XI Resource Material.pdf
MASS MEDIA STUDIES-835-CLASS XI Resource Material.pdfMASS MEDIA STUDIES-835-CLASS XI Resource Material.pdf
MASS MEDIA STUDIES-835-CLASS XI Resource Material.pdf
goswamiyash170123
 
Best Digital Marketing Institute In NOIDA
Best Digital Marketing Institute In NOIDABest Digital Marketing Institute In NOIDA
Best Digital Marketing Institute In NOIDA
deeptiverma2406
 
BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...
BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...
BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...
Nguyen Thanh Tu Collection
 
Lapbook sobre os Regimes Totalitários.pdf
Lapbook sobre os Regimes Totalitários.pdfLapbook sobre os Regimes Totalitários.pdf
Lapbook sobre os Regimes Totalitários.pdf
Jean Carlos Nunes Paixão
 
DRUGS AND ITS classification slide share
DRUGS AND ITS classification slide shareDRUGS AND ITS classification slide share
DRUGS AND ITS classification slide share
taiba qazi
 
Exploiting Artificial Intelligence for Empowering Researchers and Faculty, In...
Exploiting Artificial Intelligence for Empowering Researchers and Faculty, In...Exploiting Artificial Intelligence for Empowering Researchers and Faculty, In...
Exploiting Artificial Intelligence for Empowering Researchers and Faculty, In...
Dr. Vinod Kumar Kanvaria
 
Delivering Micro-Credentials in Technical and Vocational Education and Training
Delivering Micro-Credentials in Technical and Vocational Education and TrainingDelivering Micro-Credentials in Technical and Vocational Education and Training
Delivering Micro-Credentials in Technical and Vocational Education and Training
AG2 Design
 
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdfANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
Priyankaranawat4
 

Recently uploaded (20)

S1-Introduction-Biopesticides in ICM.pptx
S1-Introduction-Biopesticides in ICM.pptxS1-Introduction-Biopesticides in ICM.pptx
S1-Introduction-Biopesticides in ICM.pptx
 
Azure Interview Questions and Answers PDF By ScholarHat
Azure Interview Questions and Answers PDF By ScholarHatAzure Interview Questions and Answers PDF By ScholarHat
Azure Interview Questions and Answers PDF By ScholarHat
 
How to Build a Module in Odoo 17 Using the Scaffold Method
How to Build a Module in Odoo 17 Using the Scaffold MethodHow to Build a Module in Odoo 17 Using the Scaffold Method
How to Build a Module in Odoo 17 Using the Scaffold Method
 
How to Add Chatter in the odoo 17 ERP Module
How to Add Chatter in the odoo 17 ERP ModuleHow to Add Chatter in the odoo 17 ERP Module
How to Add Chatter in the odoo 17 ERP Module
 
Landownership in the Philippines under the Americans-2-pptx.pptx
Landownership in the Philippines under the Americans-2-pptx.pptxLandownership in the Philippines under the Americans-2-pptx.pptx
Landownership in the Philippines under the Americans-2-pptx.pptx
 
Advantages and Disadvantages of CMS from an SEO Perspective
Advantages and Disadvantages of CMS from an SEO PerspectiveAdvantages and Disadvantages of CMS from an SEO Perspective
Advantages and Disadvantages of CMS from an SEO Perspective
 
A Survey of Techniques for Maximizing LLM Performance.pptx
A Survey of Techniques for Maximizing LLM Performance.pptxA Survey of Techniques for Maximizing LLM Performance.pptx
A Survey of Techniques for Maximizing LLM Performance.pptx
 
Normal Labour/ Stages of Labour/ Mechanism of Labour
Normal Labour/ Stages of Labour/ Mechanism of LabourNormal Labour/ Stages of Labour/ Mechanism of Labour
Normal Labour/ Stages of Labour/ Mechanism of Labour
 
2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...
 
Chapter 4 - Islamic Financial Institutions in Malaysia.pptx
Chapter 4 - Islamic Financial Institutions in Malaysia.pptxChapter 4 - Islamic Financial Institutions in Malaysia.pptx
Chapter 4 - Islamic Financial Institutions in Malaysia.pptx
 
Group Presentation 2 Economics.Ariana Buscigliopptx
Group Presentation 2 Economics.Ariana BuscigliopptxGroup Presentation 2 Economics.Ariana Buscigliopptx
Group Presentation 2 Economics.Ariana Buscigliopptx
 
Aficamten in HCM (SEQUOIA HCM TRIAL 2024)
Aficamten in HCM (SEQUOIA HCM TRIAL 2024)Aficamten in HCM (SEQUOIA HCM TRIAL 2024)
Aficamten in HCM (SEQUOIA HCM TRIAL 2024)
 
MASS MEDIA STUDIES-835-CLASS XI Resource Material.pdf
MASS MEDIA STUDIES-835-CLASS XI Resource Material.pdfMASS MEDIA STUDIES-835-CLASS XI Resource Material.pdf
MASS MEDIA STUDIES-835-CLASS XI Resource Material.pdf
 
Best Digital Marketing Institute In NOIDA
Best Digital Marketing Institute In NOIDABest Digital Marketing Institute In NOIDA
Best Digital Marketing Institute In NOIDA
 
BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...
BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...
BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...
 
Lapbook sobre os Regimes Totalitários.pdf
Lapbook sobre os Regimes Totalitários.pdfLapbook sobre os Regimes Totalitários.pdf
Lapbook sobre os Regimes Totalitários.pdf
 
DRUGS AND ITS classification slide share
DRUGS AND ITS classification slide shareDRUGS AND ITS classification slide share
DRUGS AND ITS classification slide share
 
Exploiting Artificial Intelligence for Empowering Researchers and Faculty, In...
Exploiting Artificial Intelligence for Empowering Researchers and Faculty, In...Exploiting Artificial Intelligence for Empowering Researchers and Faculty, In...
Exploiting Artificial Intelligence for Empowering Researchers and Faculty, In...
 
Delivering Micro-Credentials in Technical and Vocational Education and Training
Delivering Micro-Credentials in Technical and Vocational Education and TrainingDelivering Micro-Credentials in Technical and Vocational Education and Training
Delivering Micro-Credentials in Technical and Vocational Education and Training
 
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdfANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
 

HPC, Grid and Cloud Computing - The Past, Present, and Future Challenge

  • 1. HPC, Grid and Cloud Computing - The Past, Present and Future Jason Shih Academia Sinica Grid computing FBI 極簡主義, Nov 3rd, 2010
  • 2. Outline  Trend in HPC  Grid: eScience Research @ PetaScale  Cloud Hype and Observation  Future Exploration Path of Computing  Summary
  • 3. About ASGC: Max CERN/T1-ASGC point-to-point inbound: 9.3 Gbps; most reliable T1: 98.83%; very high performing and most stable site in CCRC08; Asia Pacific Regional Operation Center. A worldwide Grid infrastructure: >280 sites, >45 countries, >80,000 CPUs, >20 petabytes, >14,000 users, >200 VOs, >250,000 jobs/day. Large Hadron Collider (LHC): 100 meters underground, 27 km in circumference, located in Geneva. Grid Application Platform: lightweight problem-solving framework. Avian Flu Drug Discovery. Best Demo Award of EGEE’07.
  • 4. Emerging Trend and Technologies: 2009 -2010
  • 5. Hype Cycle for Storage Technologies - 2010
  • 6. Trend in High Performance Computing
  • 7. Ugly? Performance of HPC Cluster: 272 (52%) of the world's fastest clusters have efficiency (Rmax/Rpeak) lower than 80%; only 115 (18%) could drive over 90% of theoretical peak; sampling from Top500 HPC clusters. Trend of Cluster Efficiency 2005-2009
  • 8. Performance and Efficiency: 20% of the top-performing clusters contribute 60% of total computing power (27.98 PF); 5 clusters have efficiency below 30%
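The efficiency figure used on these slides is the ratio Rmax/Rpeak (measured over theoretical peak performance). A minimal sketch of that calculation follows, assuming hypothetical cluster names and TFLOPS values that are not real Top500 entries; the 80% and 90% cut-offs are the ones quoted on the slide.

```python
def cluster_efficiency(rmax_tflops: float, rpeak_tflops: float) -> float:
    """Return Rmax/Rpeak as a fraction (1.0 means 100% of theoretical peak)."""
    if rpeak_tflops <= 0:
        raise ValueError("Rpeak must be positive")
    return rmax_tflops / rpeak_tflops


def efficiency_band(eff: float) -> str:
    """Classify an efficiency using the 80% and 90% cut-offs from the slide."""
    if eff > 0.90:
        return "over 90%"
    if eff >= 0.80:
        return "80-90%"
    return "below 80%"


# Hypothetical (Rmax, Rpeak) pairs in TFLOPS, for illustration only.
samples = {
    "gbe_cluster": (50.0, 100.0),
    "ib_cluster": (78.0, 100.0),
    "tuned_cluster": (92.0, 100.0),
}
bands = {
    name: efficiency_band(cluster_efficiency(rmax, rpeak))
    for name, (rmax, rpeak) in samples.items()
}
print(bands)
```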
  • 9. Impact Factor: Interconnectivity - Capacity and Cluster Efficiency: over 52% of clusters are based on GbE, with efficiency of only around 50%; InfiniBand is adopted by ~36% of HPC clusters
  • 10. HPC Cluster - Interconnect Using IB: SDR, DDR and QDR in Top500; promising efficiency >= 80%; the majority of IB-ready clusters adopt DDR (87%) (Nov 2009); contributing 44% of the total computing power of ~28 Pflops; avg efficiency ~78%
  • 11. Trend in HPC Interconnects: Infiniband Roadmap
  • 12. Common semantics; programmer productivity; ease of deployment; HPC filesystems are more mature, with a wider feature set; high concurrent read and write; in the comfort zone of programmers (vs cloudFS); wide support, adoption, acceptance possible; pNFS working to be equivalent; reuse standard data management tools; backup, disaster recovery and tiering
  • 15. Some Observations & Looking for Future (I)  Computing Paradigm  (Almost) Free FLOPS  (Almost) Logic Operation  Data Access (Memory) Is A Major Bottleneck  Synchronization Is the Most Expensive  Data Communication Is A Big Factor in Performance  I/O Still A Major Programming Consideration  MPI Coding Is the Motherhood of Large Scale Computing  Computing in Conjunction of Massive Data Management  Finding Parallelism Is Not A Whole Issue In Programming  Data Layout  Data Movement  Data Reuse  Frequency of Interconnected Data Communication
  • 16. Some Observations & Looking for Future (II): Emerging New Possibility; Massive “Small” Computing Elements with On-Board Memory; Computing Nodes Can Be Configured Dynamically (Including Failure Recovery); Network Switch (Within On-Site Complex) Will Nearly Match Memory Performance; Parallel I/O Support for Massively Parallel Systems; Asynchronous Computing/Communication Operation; Sophisticated Data Pre-fetch Scheme (Hardware/Algorithm); Automated Dynamic Load Balancing Method; Very High Order Difference Scheme (also Implicit Method); Full Coupling of Formerly Split Operators; Fine Numerical Computational Grid (grid number > 10,000); Full Simulation of Protein; Full Coupling of Computational Model; Grid Computing for All
  • 17. Some Observations & Looking for Future (III): Systems will get more complicated and computing tools more sophisticated: vendor support & user readiness?
  • 18. Grid: eScience Research @ PetaScale
  • 19. WLCG Computing Model - The Tier Structure: Tier-0 (CERN): data recording, initial data reconstruction, data distribution; Tier-1 (11 centres): permanent storage, re-processing, analysis; Tier-2 (~130 centres): simulation, end-user analysis
  • 20. Enabling Grids for E-sciencE Archeology Astronomy Astrophysics Civil Protection Comp. Chemistry Earth Sciences Finance Fusion Geophysics High Energy Physics Life Sciences Multimedia Material Sciences … EGEE-II INFSO-RI-031688 EGEE07, Budapest, 1-5 October 2007 4
  • 21. Objectives: build a sustainable research and collaboration infrastructure; support e-Science research in data-intensive sciences and applications that require cross-disciplinary distributed collaboration
  • 22. ASGC Milestone: operational since the deployment of LCG0 in 2002; ASGC CA established in 2005 (IGTF in the same year); Tier-1 Center responsibility started in 2005; the federated Taiwan Tier-2 center (Taiwan Analysis Facility, TAF) is also collocated at ASGC; representative of the EGEE e-Science Asia Federation since joining EGEE in 2004; providing Asia Pacific Regional Operation Center (APROC) services to the region-wide WLCG/EGEE production infrastructure since 2005; initiated the Avian Flu Drug Discovery Project in collaboration with EGEE in 2006; EUAsiaGrid Project started in April 2008
  • 23. LHC First Beam – Computing at the Petascale: LHCb: B-physics, CP violation; ALICE: heavy ions, pp; CMS: general purpose, pp, heavy ions; ATLAS: general purpose, pp, heavy ions
  • 24. Size of LHC Detector: ATLAS detector: 7,000 tons, 25 meters in height, 45 meters in length (figure labels: ATLAS, Bld. 40, CMS)
  • 25. Standard Cosmology: good model from 0.01 sec after Big Bang; supported by considerable observational evidence. Elementary Particle Physics: from the Standard Model into the unknown, towards energies of 1 TeV and beyond: the Terascale. Towards Quantum Gravity: from the unknown into the unknown... (figure axes: Time; Energy, Density, Temperature) http://www.damtp.cam.ac.uk/user/gr/public/bb_history.html UNESCO Information Preservation debate, April 2007 - Jamie.Shiers@cern.ch
  • 26. WLCG Timeline  First Beam on LHC, Sep. 10, 2008  Severe Incident after 3w operation (3.5TeV)
  • 27. Petabyte Scale Data Challenges: Why petabyte?; experiment computing model; comparison with conventional data management; challenges; performance: LAN and WAN activities; sufficient B/W between CPU farm; eliminate uplink bottleneck (switch tiers); fast response to critical events; fabric infrastructure & service level agreement; scalability and manageability; robust DB engine (Oracle RAC); KB and adequate administration (training)
  • 28. Tier Model and Data Management Components
  • 29. Disk Pool Configuration - T1 MSS (CASTOR)
  • 30. Distribution of Free Capacity - Per Disk Servers vs. per Pool
  • 31. Storage Server Generation - Drive vs. Net Capacity (RAID 6) (chart labels, in TB per disk server: 21 TB/DS, 31 TB/DS, 40 TB/DS, 15 TB/DS)
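The gap between drive capacity and net capacity under RAID 6 comes from the two drives' worth of parity in each array. A minimal sketch of that arithmetic follows, assuming a single RAID 6 group per server; the drive counts, drive sizes, and the `hot_spares` parameter are hypothetical and are not taken from the slide's chart.

```python
def raid6_net_capacity_tb(n_drives: int, drive_tb: float, hot_spares: int = 0) -> float:
    """Net capacity of one RAID 6 group: two drives' worth of space holds parity."""
    active = n_drives - hot_spares
    if active < 4:
        raise ValueError("RAID 6 needs at least 4 active drives")
    return (active - 2) * drive_tb


# Hypothetical disk server: 16 drives of 2 TB each.
raw_tb = 16 * 2.0
net_tb = raid6_net_capacity_tb(16, 2.0)
net_with_spare_tb = raid6_net_capacity_tb(16, 2.0, hot_spares=1)
print(raw_tb, net_tb, net_with_spare_tb)
```

Filesystem overhead and vendor-specific reserved space would reduce the usable figure further; the sketch ignores both.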
  • 32. IDC Collocation: facility installation completed on Mar 27th; tape system delayed until after Apr 9th; realignment; RMA for faulty parts
  • 33. Storage Farm: ~110 RAID subsystems deployed since 2003; supporting both Tier-1 and Tier-2 storage fabric; DAS connection to front-end blade servers; flexible switching of front-end servers according to performance requirements; 4-8G Fibre Channel connectivity
  • 35. Throughput of WLCG Experiments: throughput is defined as job efficiency x number of running jobs; the characteristics of the 4 LHC experiments suggest that inefficiency is due to poor coding.
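The throughput definition on this slide (job efficiency times number of running jobs) can be sketched as below. The slide does not define job efficiency; the sketch assumes the common CPU-time over wall-clock-time ratio, and the `Job` class and sample numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Job:
    cpu_seconds: float   # CPU time consumed so far
    wall_seconds: float  # wall-clock time elapsed so far


def job_efficiency(jobs: Sequence[Job]) -> float:
    """Aggregate efficiency, assumed here to be total CPU time / total wall time."""
    total_wall = sum(j.wall_seconds for j in jobs)
    if total_wall <= 0:
        raise ValueError("no wall-clock time recorded")
    return sum(j.cpu_seconds for j in jobs) / total_wall


def throughput(jobs: Sequence[Job]) -> float:
    """Throughput as defined on the slide: job efficiency x number of running jobs."""
    return job_efficiency(jobs) * len(jobs)


# Two hypothetical running jobs: one CPU-bound, one stalled on I/O half the time.
running = [Job(900.0, 1000.0), Job(500.0, 1000.0)]
value = throughput(running)
print(value)
```

Read this way, the figure is the number of fully busy job slots: two running jobs at 70% aggregate efficiency count as roughly 1.4 effective jobs.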
  • 36. Reliability from Different Perspectives
  • 37. Storage Fabric Management – The Challenges: Events Management
  • 38. Open Cloud Consortium Cloud Hype and Observation
  • 40. Cloud Hype  Metacomputing (~1987, L. Smarr)  Grid Computing (~1997, I. Foster, K. Kesselman)  Cloud Computing (~2007, E. Schmidt?)
  • 41. Type of Infrastructure: Proprietary solutions by public providers: turnkey solutions developed internally as they own the software and hardware solution/tech.; Cloud specific support: developers of specific hardware and/or software solutions that are utilized by service providers or used internally when building private cloud; Traditional providers: leverage or tweak their existing
  • 42. Grid and Cloud: Comparison  Cost & Performance  Scale & Usability  Service Mapping  Interoperability  Application Scenarios
  • 43. Cloud Computing: “X” as a Service: Type of Cloud; Layered Service Model; Reference Model
  • 44. Virtualization is not Cloud computing  Performance Overhead  FV vs. PV  Disk I/O and network throughput (VM scalability) Ref: Linux-based virtualization for HPC clusters.
  • 45. Cloud Infrastructure Best Practice & Real-World Performance: Start Up: 60 ~ 44s; Restart: 30 ~ 27s; Deletion: 60 ~ <5s; Migrate: 30 VM ~ 26.8s, 60 VM ~ 40s, 120 VM ~ 89s; Stop: 30 VM ~ 27.4s, 60 VM ~ 26s, 120 VM ~ 57s
  • 49. Grid over Cloud or Cloud over Grid?
  • 51. Conclusion: My Opinion: Future of computing: technology-push & demand-pull; emergence of a new science paradigm; virtualization: a promising technology, but overemphasized; green: cloud service transparency & common platform; more computing power ~ power consumption challenge; private clouds will be the predominant way; commercial (public) clouds are not expected to evolve fast
  • 52. Acknowledgment: Thanks for valuable discussion and input from TCloud (Cloud OS: Elaster); professional technical support from Silvershine Tech. at the beginning of the collaboration. "The interesting thing about Cloud Computing is that we’ve defined Cloud Computing to include everything that we already do….. I don’t understand what we would do differently in the light of Cloud Computing other than change the wording of some of our ads." Larry Ellison, quoted in the Wall Street Journal, Sep 26, 2008
  • 53. Issues: Scalability? Infrastructure operation vs. performance; assessment; application aware – cloud service; cost analysis; data center power usage – PUE; cloud myth; Top 10 Cloud Computing Trends: http://www.focus.com/articles/hosting-bandwidth/top-10-cloud-computing-trends/; use cases & best practices
  • 54. Issues (II): Volunteer computing (BOINC)? Total capacity & performance; success stories & research disciplines; What’s hindering cloud adoption? Try humans. http://gigaom.com/cloud/whats-hindering-cloud-adoption-how-about-humans/; future projection? Service readiness? Service level? Technical barriers?
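For the data center power usage item (PUE) listed under Issues, the metric is conventionally the total facility power divided by the power delivered to IT equipment, so 1.0 is the ideal lower bound. A minimal sketch follows, assuming hypothetical kW readings; no measured values from ASGC are implied.

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power Usage Effectiveness: total facility power / IT equipment power."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    if total_facility_kw < it_equipment_kw:
        raise ValueError("total facility power cannot be below IT equipment power")
    return total_facility_kw / it_equipment_kw


# Hypothetical readings: 100 kW of IT load plus 50 kW of cooling and power distribution.
it_kw = 100.0
overhead_kw = 50.0
value = pue(it_kw + overhead_kw, it_kw)
print(value)
```

A value of 1.5 here means that every watt of computing costs an extra half watt of facility overhead, which is the link to the "more computing power ~ power consumption challenge" point in the conclusion.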