NLF@chem.uu.nl
11 February 2009
The Grid : a primer
1/24
Outline
• The Grid concept
• Grid architecture
• Middleware – the core
• Interact with the Grid: first steps
• Virtual Organizations
• enmr.eu VO
• Site grid administration
2/24
Our world … today!
Network infrastructure ON
Global Sharing resources “OFF”
3/24
One step further … The Grid
Network infrastructure ON
Global Sharing resources ON
“Coordinated resource sharing and problem solving in dynamic,
multi-institutional virtual organizations”.*
* Foster, I. et al., Int. J. Supercomput. Appl. (2001) 15(3)
4/24
Why do scientists need the Grid?
High-energy physics (15 PB/year)
15 PB ~ 20*10^6 CD’s
Complex problems !!
Many iterations !!
Virtual cooperation !!
Genome projects, data mining,
tackling protein folding,
protein structure, …
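The "15 PB ~ 20*10^6 CDs" figure above can be checked with a short calculation. This is a minimal sketch; the 700 MB CD capacity and the decimal definition of a petabyte are assumptions, since the slide does not state which values were used.

```python
# Assumptions (not stated on the slide): decimal units and a 700 MB CD.
PETABYTE = 10**15           # bytes in one petabyte (decimal definition)
CD_CAPACITY = 700 * 10**6   # bytes, a common nominal CD-ROM capacity


def cds_needed(petabytes: float, cd_bytes: int = CD_CAPACITY) -> float:
    """Return how many CDs of cd_bytes each are needed to hold the data."""
    return petabytes * PETABYTE / cd_bytes


yearly_cds = cds_needed(15)
print(f"{yearly_cds / 1e6:.1f} million CDs per year")
```

Under these assumptions the result is about 21 million CDs, which agrees with the order of magnitude quoted on the slide.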
5/24
Building a Grid
1. The architecture
2. The hardware
3. The middleware
6/24
Building a Grid - architecture
Network
Resources
Middleware
Application
User-centric
7/24
Building a Grid - Grid Fabric (I)
Delivery of Advanced Network Technology to Europe
State-of-the-art (1985) = 56 Kbps
Network characterization
• Size
• Throughput
8/24
Building a Grid - Grid fabric (II)
Computer performance
Rank  System family   Rmax (TFlop/s)
1     IBM cluster     1105
52    IBM pSeries     48.9
75    IBM BlueGene    35.1
93    IBM BlueGene    27.5
A flop (floating-point operation) is a basic computational operation
9/24
Building a Grid - middleware
"Middleware" is the software that organizes and
integrates the resources in a grid.
https://twiki.cern.ch/twiki/bin/view/LCG/GenericInstallGuide310
gLite*
10/24
How to interact with the Grid
The UI (User Interface) service
Three ways to access the Grid: UI service, web portal, or UI PnP
11/24
Enabling Grids for E-science
• > 140 institutions
• > 300 sites
• 50 countries
• > 10,000 users
• > 80,000 CPU cores 24/7
WOULD YOU TRUST YOUR COMPUTER TO A COMPLETE STRANGER?
Worldwide LHC Computing Grid (WLCG)
12/24
Registered EGEE Virtual Organizations
Application domain     Active VOs   Users
High-energy Physics    36           7994
Life Sciences          8            333
...                    ...          ...
Total                  155          16263
Stats: 10 Feb 2009
VO name    Scope              Registered users
biomd      Global             223
bio        Regional - Italy   57
enmr.eu    Global             54
13/24
New application web portal
http://haddock.chem.uu.nl/enmr
14/24
www.enmr.eu
15/24
The eyes of the Grid
http://gridice-enmr.cerm.unifi.it/site/site.php
16/24
How to become an enmr.eu user
www.gridcafe.org
17/24
Trust is the key!
18/24
http://ca.dutchgrid.nl/request/
19/24
Fill the form AND
! Pay attention !
1. Organization
2. Organizational Unit
3. Certificate Level: medium
4. Check & re-check!
RA: A. Bonvin
http://ca.dutchgrid.nl/request/
+ ID card
DN : /O=dutchgrid/O=users/O=universiteit-utrecht/OU=chem/CN=Your Name
Proof of Possession
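The DN shown above is a slash-separated list of attribute=value pairs. The following sketch shows how such a string could be split into its components. The function names are hypothetical, and the parser is a simplification that assumes no value contains a "/" character.

```python
from typing import List, Optional, Tuple


def parse_dn(dn: str) -> List[Tuple[str, str]]:
    """Split a slash-separated DN into (attribute, value) pairs, in order."""
    pairs = []
    for part in dn.strip().split("/"):
        if not part:
            continue  # skip the empty piece before the leading slash
        attribute, _, value = part.partition("=")
        pairs.append((attribute, value))
    return pairs


def common_name(dn: str) -> Optional[str]:
    """Return the CN value of the DN, or None if it has no CN component."""
    for attribute, value in parse_dn(dn):
        if attribute == "CN":
            return value
    return None


example = "/O=dutchgrid/O=users/O=universiteit-utrecht/OU=chem/CN=Your Name"
print(parse_dn(example))
print(common_name(example))
```

Checking the O and OU components this way is one possible means of catching a mistyped Organization or Organizational Unit before submitting the request.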
20/24
Wait a couple of days …
21/24
https://voms2.cnaf.infn.it:8443/voms/enmr.eu/Login.do
Problems ?
Alexandre
Johan
Nuno
Follow email
instructions !!
22/24
Wise sentence …
“If you think this is cumbersome… it is nothing
compared to get the grid running.”
van der Zwan, J. ; 26-01-2009 14:43
23/24
Site Grid Administration
A glimpse
Goal:
• Keep the grid running 24/7
Facts:
• More than 30 middleware updates/year
• Bugs, bugs, and more bugs
• … nevertheless the grid is running
How to deal with:
• Test before putting a service into production
• Any more ideas?
Sandbox:
• Pre-production: test, destroy, and re-build
• The art of computer virtualization*: takes 2 min.
http://www.xen.org/
24/24
Acknowledgments
/O=dutchgrid/O=users/O=universiteit-utrecht/OU=chem/CN=Alexandre Bonvin
/O=dutchgrid/O=users/O=universiteit-utrecht/OU=chem/CN=Johan van der Zwan
/O=dutchgrid/O=users/O=universiteit-utrecht/OU=chem/CN=Marc van Dijk
/O=dutchgrid/O=users/O=universiteit-utrecht/OU=chem/CN=Sjoerd De Vries
/O=dutchgrid/O=users/O=universiteit-utrecht/OU=chem/CN=Tsjerk Wassenaar
*.*
[Diagram labels: Hardware-centric, Middleware layer, Application layer, User abstraction]
25/24
Questions raised / Comments
• Virus protection: are there mechanisms in place?
• Is there a priority system for the use of resources?
• Panos liked the presentation, except for the slide about Grid administration
(he said it was out of context)
• In a heterogeneous system, one gets different results for the same
initial problem. However, this also happens in the laboratory. It is
nevertheless possible to choose which machines to use on the grid and
which not to use.
• Klartje asked whether other programs can be put on the Grid.
Bonvin replied that the program can be sent along with the data.
• Dirk asked whether communication between the computers is
encrypted.
26/24
Moore’s Law
Some STATS :
1. Computer power doubles every … 18 months
2. Network performance doubles every … 9 months
3. Data storage density is doubling every … 12 months
“The number of transistors that could be squeezed on
to a silicon chip was doubling every year.” Moore, G. 1965
With every passing year, the Grid concept becomes more feasible
• Distributed processors can be more tightly integrated
• Computer grids are able to solve increasingly complex problems
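The doubling times listed above translate into growth factors with the formula 2^(elapsed months / doubling months). The sketch below applies it over a three-year window; the window length is an arbitrary choice for illustration, not a figure from the slides.

```python
def growth_factor(years: float, doubling_months: float) -> float:
    """Return the multiplicative growth after `years` for a given doubling time."""
    return 2 ** (years * 12 / doubling_months)


# Doubling times in months, as quoted on the slide.
DOUBLING_MONTHS = {
    "computer power": 18,
    "network performance": 9,
    "data storage density": 12,
}

YEARS = 3  # illustrative window, chosen for this example only
factors = {name: growth_factor(YEARS, m) for name, m in DOUBLING_MONTHS.items()}
for name, factor in factors.items():
    print(f"{name}: x{factor:.0f} in {YEARS} years")
```

With these figures, network performance grows four times as much as computer power over the same period, which is the trend that makes pooling distributed processors increasingly practical.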
27/24
gLite INFNGRID – deployment status
Update   Date
40       ? (04 Feb 2009, CERN)
38-39    23 Jan 2009
35-37    05 Dec 2008
32-34    07 Nov 2008
30-31    23 Sep 2008
...      ...
13       19 Feb 2008
...      ...
INFNGRID gLite 3.1 (SL4)
28/24
29/24
• The GRID is a collection of geographically distributed resources
• GRID users:
  • Are organized in Virtual Organizations
  • Need to run programs without the need to know
    • Where to run a job
    • Where to get the input data from
    • Where to store the output data
• The GRID consists of
  • An Authorisation and Authentication System
  • An Information System
  • A Workload Management System
  • A Data Management System
  • An Accounting System
  • Various monitoring services
  • Various installation services
The GRID architecture: general view
30/24
• The Authentication and Authorization System:
  • Contains the list of all the people authorized to use the GRID, divided by VO
  • All machines running Grid services verify the user's credentials and
    map the GRID users to the local users of the machine
• The Information System:
  • Provides information about gLite resources and their status
  • Information is published by the individual resources and copied into central databases
  • Used by:
    - WMS: to match resources against job requirements and to rank them
    - DMS: to choose storage resources
    - Monitoring systems
The GRID architecture: general view
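The two authorization steps described on this slide, checking VO membership and mapping a Grid user to a local user, can be illustrated with a toy model. This is a sketch under stated assumptions and not how gLite implements it: the class name, the pool-account names (`enmr001`, `enmr002`), and the in-memory dictionaries are all hypothetical.

```python
from typing import Dict, List, Set


class GridSite:
    """Toy model: authorize a DN by VO membership, then lease a local account."""

    def __init__(self, vo_members: Dict[str, Set[str]], pool_accounts: List[str]):
        self.vo_members = vo_members              # VO name -> set of member DNs
        self.free_accounts = list(pool_accounts)  # local accounts not yet leased
        self.leases: Dict[str, str] = {}          # DN -> leased local account

    def map_user(self, dn: str, vo: str) -> str:
        """Return the local account for dn, or raise if dn is not in the VO."""
        if dn not in self.vo_members.get(vo, set()):
            raise PermissionError(f"{dn} is not a member of VO {vo}")
        if dn not in self.leases:
            if not self.free_accounts:
                raise RuntimeError("no free local accounts left")
            self.leases[dn] = self.free_accounts.pop(0)
        return self.leases[dn]


USER_DN = "/O=dutchgrid/O=users/O=universiteit-utrecht/OU=chem/CN=Your Name"
site = GridSite({"enmr.eu": {USER_DN}}, ["enmr001", "enmr002"])
print(site.map_user(USER_DN, "enmr.eu"))
```

The same DN always receives the same local account once leased, so repeated jobs from one user run under one local identity.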
31/24
• The Workload Management System:
  • Manages jobs submitted by users:
    - matches the job requirements to the available resources
    - schedules the job for execution on an appropriate computing cluster
    - tracks the job status
    - allows the user to retrieve the job output when ready
• The Data Management System:
  • Allows users to:
    - move files in and out of the Grid
    - replicate files among different locations
    - locate files
  • This is achieved by:
    - transferring data via a number of protocols (GridFTP is the most commonly used)
    - interacting with a central file catalog
The GRID architecture: general view
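The matching and scheduling steps of the Workload Management System can be sketched as a filter followed by a ranking. This is an illustrative model only: the attribute names, the cluster host names, and the "most free CPUs first" ranking rule are assumptions, not the actual gLite matchmaking logic.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Resource:
    name: str
    free_cpus: int
    max_wallclock_min: int
    supported_vos: FrozenSet[str]


@dataclass(frozen=True)
class Job:
    vo: str
    cpus: int
    wallclock_min: int


def match(job: Job, resources: List[Resource]) -> List[Resource]:
    """Keep only the resources that satisfy every job requirement."""
    return [
        r for r in resources
        if job.vo in r.supported_vos
        and r.free_cpus >= job.cpus
        and r.max_wallclock_min >= job.wallclock_min
    ]


def rank(candidates: List[Resource]) -> List[Resource]:
    """Order candidates by free CPUs, most first (hypothetical ranking rule)."""
    return sorted(candidates, key=lambda r: r.free_cpus, reverse=True)


def schedule(job: Job, resources: List[Resource]) -> Optional[str]:
    """Return the name of the best matching resource, or None if none match."""
    ranked = rank(match(job, resources))
    return ranked[0].name if ranked else None


# Hypothetical clusters, as the Information System might describe them.
CLUSTERS = [
    Resource("ce-a.example.org", 4, 720, frozenset({"enmr.eu"})),
    Resource("ce-b.example.org", 64, 2880, frozenset({"enmr.eu", "bio"})),
    Resource("ce-c.example.org", 128, 2880, frozenset({"atlas"})),
]
job = Job(vo="enmr.eu", cpus=8, wallclock_min=600)
print(schedule(job, CLUSTERS))
```

Here the first cluster is rejected for having too few free CPUs and the third for not supporting the job's VO, so the job goes to the second.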
32/24
• Monitoring Services:
  • GridICE: monitors the usage of Grid resources
    (number of jobs running, storage space available, …)
  • R-GMA: allows users to monitor applications and
    store results in a relational database
  • Some monitoring systems check the status of Grid services
    - intended mainly for the GRID operations staff
• Dedicated Fabric Management Services:
  • Manage installation, upgrade, and maintenance of local Grid services:
    - LCFGng (discontinued)
    - Quattor
    - YAIM (semi-automatic tool based on APT/YUM and shell scripts)
The GRID architecture: general view
33/24
Grid analogy
Electrical power grid: You never worry about where the electricity you are using comes from.
The Grid: You would never worry about where the computer power you are using comes from.

Electrical power grid: The infrastructure that makes this possible is called "the power grid".
The Grid: The infrastructure that makes this possible is called "the Grid".

Electrical power grid: The power grid is pervasive: electricity is available essentially everywhere and you can simply access it through a standard wall socket.
The Grid: The Grid would be pervasive: remote computing resources would be accessible from different platforms, and you would simply access the Grid through your web browser.

Electrical power grid: The power grid is a utility: you ask for electricity, and you get it. You also pay for what you get.
The Grid: The Grid is a utility: you ask for computer power or storage capacity and you get it. You also pay for what you get.

"The Grid" doesn't yet exist in this form; however, the world already
has hundreds of smaller grids...
34/24
Cryptography
• A Scytale
35/24
Evolution of the HDD
Morris, R.J.T. et al., IBM Systems Journal (2003) 42(2):205
