GRID COMPUTING & ITS APPLICATIONS / LHC GRID
Alokeparna Choudhury
Stream: CSE
Roll No. 20091005
Reg. No. 2783 of 2009-10
University Institute of Technology
INTRODUCTION TO GRID COMPUTING
WHAT IS A GRID?
• In grid computing, the word "grid" comes from the idea of a grating of crisscrossed parallel bars.
• A grid is a network of horizontal and vertical lines, uniformly spaced by means of a coordinate system.
• In grid computing, computing data and resources are organized in a grid network.
WHY DO WE NEED GRIDS? 
• Many large-scale problems cannot be solved by a single computer.
• Data and resources are distributed globally.
GRID ARCHITECTURE
CONTD. 
• The architecture of a grid computing system consists of four layers.
• The lowest, the fabric layer, provides interfaces to local resources at a specific site.
• The connectivity layer consists of communication protocols for supporting grid transactions. In addition, it contains security protocols to authenticate users and resources.
• The resource layer is responsible for managing a single resource.
• The collective layer handles access to multiple resources. It typically consists of services for resource discovery, allocation, and scheduling of tasks onto multiple resources.
• Finally, the application layer consists of the applications that operate within a virtual organization and make use of the grid computing environment (a minimal sketch of the stack follows).
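The four layers can be pictured as a stack of thin wrappers. Below is a minimal, hypothetical Python sketch of that stack; every class and method name is invented for illustration and does not come from any real grid middleware.

class FabricLayer:
    """Lowest layer: interface to the local resources at one site."""
    def __init__(self, site, cores):
        self.site, self.cores = site, cores
    def run(self, task):
        return f"{task} ran on {self.cores} cores at {self.site}"

class ConnectivityLayer:
    """Communication plus security: authenticate users before transactions."""
    def __init__(self, authorized):
        self.authorized = set(authorized)
    def authenticate(self, user):
        return user in self.authorized

class ResourceLayer:
    """Manages a single resource through its fabric interface."""
    def __init__(self, fabric):
        self.fabric = fabric
    def submit(self, task):
        return self.fabric.run(task)

class CollectiveLayer:
    """Access to multiple resources: discovery, allocation, scheduling."""
    def __init__(self, resources):
        self.resources = resources
    def schedule(self, task):
        # Toy allocation policy: send the task to the site with the most cores.
        best = max(self.resources, key=lambda r: r.fabric.cores)
        return best.submit(task)

# Application layer: a virtual organization's job built on the layers below.
conn = ConnectivityLayer(authorized={"alice"})
grid = CollectiveLayer([ResourceLayer(FabricLayer("site-A", 64)),
                        ResourceLayer(FabricLayer("site-B", 256))])
if conn.authenticate("alice"):
    print(grid.schedule("reconstruction-job"))

In a real deployment the connectivity layer would mediate every cross-site call; compressing it to a single authentication check keeps the sketch short.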
FROM GRIDS TO CLOUD COMPUTING
• Logical steps:
  – Make the grids public
  – Provide much simpler interfaces (and more limited control)
  – Charge for the usage of resources
• However, cloud computing finds a large user base in science grids, owing to:
  – Intense computations
  – Huge storage needs
CONTD. 
• Much of the grid research community is now working on clouds.
• It is quite feasible to have a cloud within a computational grid, just as it is possible to have a computational grid as part of a cloud.
• In a computational grid, one large job is divided into many small portions and executed on multiple machines. This characteristic is fundamental to a grid, but not to a cloud (a minimal sketch of the pattern follows).
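As a hedged illustration of that defining characteristic, the sketch below splits one large summation into portions and runs them on a pool of local worker processes; in a real grid the workers would be machines at different sites, and the chunk and worker counts here are arbitrary choices.

import multiprocessing as mp

def partial_sum(bounds):
    """One small portion of the large job: sum a subrange."""
    lo, hi = bounds
    return sum(range(lo, hi))

if __name__ == "__main__":
    n, chunks = 10_000_000, 8
    step = n // chunks
    # Divide the one large job into many small portions...
    portions = [(i * step, (i + 1) * step if i < chunks - 1 else n)
                for i in range(chunks)]
    # ...and execute them on multiple workers (machines, in a real grid).
    with mp.Pool(processes=4) as pool:
        total = sum(pool.map(partial_sum, portions))
    assert total == n * (n - 1) // 2   # closed-form check of the result
    print(total)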
SOME GRID APPLICATIONS 
• Distributed supercomputing
• High-throughput computing
• On-demand computing
• Data-intensive computing
• Collaborative computing
INTRODUCTION TO LCG 
• The LHC Computing Grid (LCG) was approved by the CERN Council on 20 September 2001 to develop, build and maintain a distributed computing infrastructure for the storage and analysis of data from the four LHC experiments.
• The project was defined with two distinct phases.
• In Phase 1 (2002-2005), the required software and services would be developed and prototyped.
• In Phase 2 (2006-2008), the initial services for the first beams from the LHC machine would be constructed and brought into operation.
WLCG 
• Worldwide LHC Computing Grid: the distributed computing infrastructure for the LHC experiments.
• Links three distributed infrastructures:
  – OSG: Open Science Grid (US)
  – EGI: European Grid Infrastructure
  – NDGF: Nordic Data Grid Facility
• Links more than 300 computing centres.
• Provides more than 340,000 cores.
• Serves more than 2,000 active users.
• Moves ~10 GB/s for each experiment.
• Archives 15 PB per year.
THE LHC WITH GRID COMPUTING
• The Large Hadron Collider (LHC), starting to operate in 2007, will produce roughly 15 petabytes (15 million GB) of data annually (see the rate check below).
• The mission of the LHC Computing Grid (LCG) project is to build and maintain a data storage and analysis infrastructure for the entire high-energy physics community that will use the LHC.
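A rough back-of-envelope check (my own arithmetic, not from the slides) translates that annual volume into an average sustained data rate:

# Back-of-envelope check of the "15 PB per year" figure.
SECONDS_PER_YEAR = 365 * 24 * 3600        # ~3.15e7 seconds
annual_gb = 15 * 1_000_000                # 15 PB = 15 million GB, as stated
avg_rate = annual_gb / SECONDS_PER_YEAR
print(f"average sustained rate: ~{avg_rate:.2f} GB/s")   # ~0.48 GB/s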
CONTD. 
• In the case of the LHC, a novel globally distributed model for data storage and analysis, a computing grid, was chosen rather than centralizing all of this capacity at one location near the experiments.
• The LCG project will implement a grid to support the computing models of the experiments using a distributed four-tier model.
TIER-0 
• The original raw data emerging from the data acquisition systems will be recorded at the Tier-0 centre at CERN.
• The maximum aggregate bandwidth for raw-data recording for a single experiment (ALICE) is 1.25 GB/s (see the arithmetic below).
• ALICE (A Large Ion Collider Experiment) is one of the detectors at CERN's Large Hadron Collider; it relies on grid computing technologies to analyse ALICE data.
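As a quick illustration (again my own arithmetic, assuming the peak were sustained for a full day), that rate already implies serious storage needs:

# Hypothetical arithmetic: what 1.25 GB/s means for daily Tier-0 storage.
PEAK_GB_PER_S = 1.25                  # ALICE raw-data recording peak (from slide)
SECONDS_PER_DAY = 24 * 3600
daily_tb = PEAK_GB_PER_S * SECONDS_PER_DAY / 1000
print(f"~{daily_tb:.0f} TB of raw data per day at sustained peak")  # ~108 TB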
CONTD. 
• The first-pass reconstruction will take place at Tier-0, where a copy of the reconstructed data will be stored.
• Tier-0 will distribute a second copy of the raw data across the Tier-1 centres associated with the experiment.
• Additional copies of the reconstructed data will also be distributed across the Tier-1 centres according to the policies of each experiment (a small data-flow sketch follows).
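The sketch below models this raw/reconstructed data flow in plain Python; the site names and the replication policy are simplified placeholders, not the experiments' actual configuration.

from collections import defaultdict

storage = defaultdict(list)           # site name -> list of stored datasets

def record_at_tier0(event_id):
    """Tier-0 keeps the original raw copy and the first-pass reconstruction."""
    storage["Tier-0 (CERN)"].append(("raw", event_id))
    storage["Tier-0 (CERN)"].append(("reco", event_id))

def distribute(event_id, tier1_sites, reco_policy):
    """Spread the second raw copy over Tier-1; place reco copies per policy."""
    raw_site = tier1_sites[event_id % len(tier1_sites)]
    storage[raw_site].append(("raw", event_id))
    for site in reco_policy:          # experiment-defined placement
        storage[site].append(("reco", event_id))

tier1_sites = ["Tier-1 A", "Tier-1 B", "Tier-1 C"]   # placeholder names
for event_id in range(5):
    record_at_tier0(event_id)
    distribute(event_id, tier1_sites, reco_policy=["Tier-1 A", "Tier-1 C"])

for site, datasets in sorted(storage.items()):
    print(site, datasets)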
TIER-1 
• The role of the Tier-1 centres varies according to the experiment, but in general they have prime responsibility for managing the permanent data storage: raw, simulated and processed data.
• They provide computational capacity for reprocessing and for analysis processes that require access to large amounts of data.
• At present, 11 Tier-1 centres have been defined, most of them serving several experiments.
TIER-2 
• The role of the Tier-2 centres is to provide computational capacity and appropriate storage services for Monte Carlo event simulation and for end-user analysis.
• The Tier-2 centres will obtain data as required from Tier-1 centres, and data generated at Tier-2 will be sent back to Tier-1 for permanent storage.
• More than 100 Tier-2 centres have been identified.
TIER-3 
• Other computing facilities in universities and laboratories will take part in the processing and analysis of LHC data as Tier-3 facilities.
• These lie outside the scope of the LCG project, although they must be provided with access to the data and analysis facilities as decided by the experiments.
As part of the WLCG there are two Tier-2 sites in India: TIFR in Mumbai and VECC in Kolkata.
KOLKATA TIER-2
SUMMARY 
• Grid computing and the WLCG have proven themselves during the first year of LHC data-taking.
• Grid computing works for the WLCG community and has a future.
• Long-term sustainability will be a challenge.
FUTURE
• Stable WANs will provide excellent performance and enable a move to a less hierarchical model.
• Virtualization and cloud computing.
• Moving towards standards.
• Integrating new technology.
REFERENCES
• https://openlab-mu-internal.web.cern.ch
• cg-archive.web.cern.ch/lcg.../lhcgridfest/.../Robertson_Les_GridFest.pdf
• scientific-journals.org/journalofsystemsandsoftware/.../vol2no5_4.pdf
• edutechwiki.unige.ch/en/Grid_computing
• www.redbooks.ibm.com/redbooks/pdfs/sg246778.pdf
• Distributed Systems: Principles and Paradigms, Andrew S. Tanenbaum and Maarten Van Steen