High Energy Physics Data
Management using
CLOUD Computing
ANALYSIS OF THE FAMOUS BABAR EXPERIMENT DATA HANDLING
PAPER BY: ABHISHEK DEY, CSE 2nd Year | DIYA GHOSH, CSE 2nd Year |
Mr. SOMENATH ROY CHOWDHURY
Contents
 Motivation
 HEP Legacy Project
 CANFAR Astronomical Research Facility
 System Architecture
 Operational Experience
 Summary
5/25/2013
What exactly is BaBar?
 Its design was motivated by the investigation
of CP violation.
 It was set up to understand the disparity between
the matter and antimatter content of the universe
by measuring CP violation.
 BaBar focuses on the study of CP violation in the B
meson system.
 The name comes from the nomenclature for the B meson (symbol B) and
its antiparticle (symbol B̄, pronounced "B bar").
BaBar: Data Point of View
 9.5 million lines of C++
and Fortran
 Compiled size is 30 GB
 Significant amount of
manpower is required to
maintain the software
 Each installation must be
validated before
generated results will be
accepted.
CANFAR is a partnership between:
– University of Victoria
– University of British Columbia
– National Research Council, Canadian Astronomy
Data Centre
– Herzberg Institute of Astrophysics
 It helps provide infrastructure for VMs.
Need for Cloud Computing:
 Jobs are embarrassingly parallel, much
like HEP.
 Each of these surveys requires a different
processing environment, which requires:
 A specific version of a Linux
distribution.
 A specific compiler version.
 Specific libraries
 Applications have little documentation.
 These environments are evolving rapidly
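
"Embarrassingly parallel" means that each job can run without talking to any other job. A minimal Python sketch of that property follows; `process_chunk` and `run_all` are hypothetical names, and the work inside `process_chunk` is only a stand-in for a real analysis job. Threads are used here just to show that the jobs are independent, not to demonstrate a speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def process_chunk(chunk_id: int) -> int:
    """Stand-in for one independent job; it reads no shared state."""
    start = chunk_id * 1000
    return sum(i * i for i in range(start, start + 1000))


def run_all(chunk_ids, workers: int = 4):
    """Run every chunk independently and return results in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_chunk, chunk_ids))


if __name__ == "__main__":
    serial = [process_chunk(c) for c in range(8)]
    parallel = run_all(range(8))
    # Because no job depends on another, both orders of execution agree.
    print(parallel == serial)
```

Since the jobs share nothing, the same pattern carries over to running each chunk on a separate VM.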
DATA is precious,
too precious..
We need Infrastructure,
which comes easily as a
Service
A word about Cloud Computing:
IaaS: What next?
 With IaaS, we can easily create
many instances of a VM image
 How do we manage the VMs
once booted?
 How do we get jobs to the
VMs?
Our Solution: Cloud Scheduler + Condor
 Users create a VM with their
experiment software installed.
 A basic VM is created by one group,
and users add on their analysis or
processing software to create their
custom VM.
 Users then create batch jobs as they
would on a regular cluster, but they
specify which VM image should run their
jobs.
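
The idea of matching queued jobs to VM images can be sketched as below. This is only an illustration, not Cloud Scheduler's or Condor's actual algorithm: the `Job` class, the `vm_image` field, the `vms_to_boot` function, and the one-VM-per-job rule are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    job_id: int
    vm_image: str  # hypothetical field: the VM image the user asked for


def vms_to_boot(queued_jobs, running_vms, max_vms):
    """Return {image: count} of extra VMs to start for the queued jobs.

    queued_jobs: list of Job objects waiting in the batch queue
    running_vms: dict mapping image name -> number of VMs already running
    max_vms:     total number of VMs the cloud lets us run (assumed limit)
    """
    demand = Counter(job.vm_image for job in queued_jobs)
    free_slots = max(0, max_vms - sum(running_vms.values()))
    plan = {}
    for image, wanted in sorted(demand.items()):
        # Assume one VM per job; only boot what is not already running.
        missing = max(0, wanted - running_vms.get(image, 0))
        start = min(missing, free_slots)
        if start:
            plan[image] = start
            free_slots -= start
    return plan


if __name__ == "__main__":
    jobs = [Job(1, "babar"), Job(2, "babar"), Job(3, "macho")]
    print(vms_to_boot(jobs, {"babar": 1}, max_vms=3))
```

With one "babar" VM already running and room for three VMs in total, the sketch asks for one more "babar" VM and one "macho" VM.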
Steps for the successful architecture
setup:
CANFAR: MAssive Compact Halo
Objects (MACHO)
 Detailed re-analysis of data from
the MACHO experiment's dark
matter search.
 Jobs perform a wget to retrieve the
input data (40 M) and have a 4-6
hour run time. Low I/O makes them
a good fit for clouds.
 Astronomers are happy with the
environment.
Data Handling in BaBar:
[Figure: Analysis Jobs; Event data (Real Data, Simulated Data); Configuration (BaBar Conditions Database)]
 The data is approximately 2 PB.
 The file system is hosted on a
cluster of six nodes: one
Management/Metadata
server (MGS/MDS) and
five Object Storage servers
(OSS).
 A single gigabit interface/VLAN is
used to communicate both internally
and externally.
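
To get a feel for these numbers, the short Python calculation below estimates how long 2 PB would take to cross one gigabit link. It is a back-of-the-envelope sketch: it assumes decimal units (1 PB = 10^15 bytes), a link running at its full line rate with no protocol overhead, and `transfer_days` is a hypothetical helper name. The slides do not say that the whole dataset travels over this link.

```python
PB = 10**15    # decimal petabyte in bytes (assumption)
GBIT = 10**9   # one gigabit per second, in bits per second


def transfer_days(data_bytes: float, link_bits_per_s: float) -> float:
    """Days needed to move data_bytes over a link at full line rate."""
    seconds = data_bytes * 8 / link_bits_per_s
    return seconds / 86_400  # seconds per day


if __name__ == "__main__":
    days = transfer_days(2 * PB, GBIT)
    print(f"{days:.0f} days")  # roughly 185 days under these assumptions
```

Under these idealised assumptions the transfer takes about half a year, which is why where the data sits relative to the compute resources matters so much.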
Xrootd: Need for Distributed Data
 Xrootd is a file server
providing byte-level access
and is used by many high
energy physics experiments.
 It provides access to the
distributed data.
 A read-ahead value of 1 MB
and a read-ahead cache size of
10 MB were set on each
Xrootd client.
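
The effect of a read-ahead setting can be shown with a toy client in Python. This is not Xrootd's implementation: `ReadAheadReader` is a hypothetical class, the "remote file" is just an in-memory byte string, and the toy keeps a single read-ahead window rather than a larger cache.

```python
class ReadAheadReader:
    """Toy client that fetches `read_ahead` bytes per remote request."""

    def __init__(self, data: bytes, read_ahead: int):
        if read_ahead <= 0:
            raise ValueError("read_ahead must be positive")
        self._data = data            # stands in for the remote file
        self._read_ahead = read_ahead
        self._buf_start = 0          # file offset of the buffered window
        self._buf = b""              # currently buffered bytes
        self.remote_requests = 0     # how many times we went to the server

    def read(self, offset: int, size: int) -> bytes:
        out = bytearray()
        end = min(offset + size, len(self._data))
        pos = offset
        while pos < end:
            buf_end = self._buf_start + len(self._buf)
            if not (self._buf_start <= pos < buf_end):
                # Miss: fetch a whole read-ahead window starting at pos.
                self._buf_start = pos
                self._buf = self._data[pos:pos + self._read_ahead]
                self.remote_requests += 1
                buf_end = self._buf_start + len(self._buf)
            take = min(end, buf_end) - pos
            rel = pos - self._buf_start
            out += self._buf[rel:rel + take]
            pos += take
        return bytes(out)


if __name__ == "__main__":
    data = bytes(range(256)) * 4          # 1024 bytes of sample data
    reader = ReadAheadReader(data, read_ahead=128)
    for off in range(0, len(data), 16):   # many small sequential reads
        reader.read(off, 16)
    print(reader.remote_requests)         # 8 requests instead of 64
```

Many small sequential reads are served from the buffered window, so the number of round trips to the server drops from one per read to one per window.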
How does a DFS work?
 Blocks are replicated across several
datanodes (usually 3).
 A single namenode stores metadata (file names,
block locations, etc.).
 Optimized for large files and sequential reads.
 Clients read from the closest replica available (note:
locality of reference).
 If the replication for a block drops below target, it
is automatically re-replicated.
[Figure: blocks 1 to 4 replicated across several datanodes, with a single namenode]
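
The replication and re-replication behaviour described on this slide can be sketched in Python as follows. This is a simplified model, not the code of any particular DFS: `place_blocks` and `re_replicate` are hypothetical names, replicas are placed at random, and the dictionary plays the role of the namenode's block-location metadata.

```python
import random


def place_blocks(num_blocks, datanodes, replication=3, seed=0):
    """Return {block: set of datanodes} with `replication` copies per block."""
    rng = random.Random(seed)
    return {b: set(rng.sample(datanodes, replication)) for b in range(num_blocks)}


def re_replicate(block_map, live_nodes, replication=3, seed=0):
    """Drop replicas on dead nodes, then copy blocks back up to the target."""
    rng = random.Random(seed)
    live = set(live_nodes)
    repaired = {}
    for block, holders in block_map.items():
        alive = holders & live
        if not alive:
            raise RuntimeError(f"block {block} lost: no surviving replica")
        # Nodes that are up and do not yet hold this block.
        candidates = sorted(live - alive)
        need = min(replication - len(alive), len(candidates))
        repaired[block] = alive | set(rng.sample(candidates, max(0, need)))
    return repaired


if __name__ == "__main__":
    nodes = ["dn1", "dn2", "dn3", "dn4"]
    blocks = place_blocks(4, nodes)
    # Simulate the failure of dn2 and let the "namenode" repair the map.
    fixed = re_replicate(blocks, ["dn1", "dn3", "dn4"])
    for block, holders in fixed.items():
        print(block, sorted(holders))
```

With four datanodes and three replicas per block, losing one node still leaves at least two copies of every block, so each block can be brought back to three replicas on the surviving nodes.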
Results and Analysis:
Fault tolerant model:
Acknowledgements
 A special word of appreciation and
thanks to Mr. Somenath Roy Chowdhury.
 My heartiest thanks to the entire team
who worked hard to build the cloud.
Questions Please?
 
Concept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.CompdfConcept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.Compdf
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptx
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon A
 
Class 11 Legal Studies Ch-1 Concept of State .pdf
Class 11 Legal Studies Ch-1 Concept of State .pdfClass 11 Legal Studies Ch-1 Concept of State .pdf
Class 11 Legal Studies Ch-1 Concept of State .pdf
 
9953330565 Low Rate Call Girls In Rohini Delhi NCR
9953330565 Low Rate Call Girls In Rohini  Delhi NCR9953330565 Low Rate Call Girls In Rohini  Delhi NCR
9953330565 Low Rate Call Girls In Rohini Delhi NCR
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory Inspection
 

Handling High Energy Physics Data using Cloud Computing

  • 1. High Energy Physics Data Management using Cloud Computing: Analysis of the Famous BaBar Experiment Data Handling. Paper by: Abhishek Dey, CSE 2nd Year | Diya Ghosh, CSE 2nd Year | Mr. Somenath Roy Chowdhury
  • 2. Contents
      - Motivation
      - HEP Legacy Project
      - CANFAR Astronomical Research Facility
      - System Architecture
      - Operational Experience
      - Summary
  • 3. What exactly is BaBar?
      - Its design was motivated by the investigation of CP violation.
      - It was set up to understand the disparity between the matter and antimatter content of the universe by measuring CP violation.
      - BaBar focuses on the study of CP violation in the B meson system.
      - The name comes from the nomenclature for the B meson (symbol B) and its antiparticle (symbol B̄, pronounced "B bar").
  • 4. BaBar: Data Point of View
      - 9.5 million lines of C++ and Fortran.
      - Compiled size is 30 GB.
      - A significant amount of manpower is required to maintain the software.
      - Each installation must be validated before generated results will be accepted.
      - CANFAR is a partnership between:
          – University of Victoria
          – University of British Columbia
          – National Research Council, Canadian Astronomy Data Centre
          – Herzberg Institute of Astrophysics
      - CANFAR helps provide infrastructure for VMs.
  • 5. Need for Cloud Computing:
      - Jobs are embarrassingly parallel, much like HEP.
      - Each of these surveys requires a different processing environment, which requires:
          - A specific version of a Linux distribution.
          - A specific compiler version.
          - Specific libraries.
      - Applications have little documentation.
      - These environments are evolving rapidly.
  • 6. Data is precious, too precious. We need Infrastructure, which comes easily as a Service.
  • 7. A word about Cloud Computing:
  • 8. IaaS: What next?
      - With IaaS, we can easily create many instances of a VM image.
      - How do we manage the VMs once booted?
      - How do we get jobs to the VMs?
  • 9. Our Solution: Cloud Scheduler + Condor
      - Users create a VM with their experiment software installed.
      - A basic VM is created by one group, and users add their analysis or processing software to it to create their custom VM.
      - Users then create batch jobs as they would on a regular cluster, but they specify which VM image should run their jobs.
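The workflow on slide 9 (jobs name the VM they need, and VMs are booted to serve them) can be illustrated with a minimal Python sketch. This is not the actual Cloud Scheduler or Condor code: the `Job` class, the `vms_to_boot` function, the image names, and the `max_vms` limit are all hypothetical, chosen only to show the matching idea.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    """A queued batch job that names the VM image it must run on."""
    job_id: int
    vm_image: str  # hypothetical image name chosen by the user


def vms_to_boot(queue, running, max_vms):
    """Decide how many VMs of each image to boot.

    queue   -- list of idle Job objects
    running -- dict mapping image name -> number of VMs already booted
    max_vms -- total VM slots available on the cloud (assumed limit)
    """
    demand = Counter(job.vm_image for job in queue)
    free_slots = max_vms - sum(running.values())
    plan = {}
    # Serve images in the order they first appear in the queue.
    for image in demand:
        if free_slots <= 0:
            break
        shortfall = demand[image] - running.get(image, 0)
        if shortfall > 0:
            count = min(shortfall, free_slots)
            plan[image] = count
            free_slots -= count
    return plan


queue = [Job(1, "babar-sl5"), Job(2, "babar-sl5"), Job(3, "macho-analysis")]
plan = vms_to_boot(queue, running={"babar-sl5": 1}, max_vms=3)
print(plan)  # {'babar-sl5': 1, 'macho-analysis': 1}
```

A real scheduler would also retire idle VMs and balance load across several clouds; the sketch only covers the step of turning queued jobs into boot requests.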
  • 10. Steps for the successful architecture setup:
  • 14. CANFAR: MAssive Compact Halo Objects (MACHO)
      - Detailed re-analysis of data from the MACHO experiment dark matter search.
      - Jobs perform a wget to retrieve the input data (40 M) and have a 4-6 hour run time. Low I/O is great for clouds.
      - Astronomers are happy with the environment.
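To see why slide 14 calls this workload "low I/O", a quick back-of-the-envelope calculation helps. The sketch below assumes that "40 M" means 40 megabytes (the slide does not state the unit) and averages the download over the whole run time; `average_input_rate` is a hypothetical helper name.

```python
def average_input_rate(input_bytes, runtime_hours):
    """Average input rate in bytes per second over the whole job."""
    return input_bytes / (runtime_hours * 3600)


INPUT_BYTES = 40 * 10**6  # assumption: "40 M" means 40 megabytes

for hours in (4, 6):
    rate = average_input_rate(INPUT_BYTES, hours)
    print(f"{hours} h run: {rate / 1000:.2f} kB/s")
```

Under that assumption the average input rate is only a few kilobytes per second per job, so the network link to the cloud is unlikely to be the bottleneck.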
  • 15. Data Handling in BaBar:
      - [Diagram labels: Analysis Jobs, Event data, Real Data, Simulated Data, Configuration, BaBar Conditions Database]
      - Data is approximately 2 PB.
      - The file system is hosted on a cluster of six nodes, consisting of one Management/Metadata server (MGS/MDS) and five Object Storage servers (OSS).
      - The cluster uses a single gigabit interface/VLAN to communicate both internally and externally.
  • 16. Xrootd: Need for Distributed Data
      - Xrootd is a file server providing byte-level access and is used by many high energy physics experiments.
      - It provides access to the distributed data.
      - A read-ahead value of 1 MB and a read-ahead cache size of 10 MB were set on each Xrootd client.
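The effect of the read-ahead settings on slide 16 can be sketched with a toy client in Python. This is not Xrootd's implementation or API: `ReadAheadClient` and `remote_read` are hypothetical names, the LRU eviction policy is an assumption, and 1 MB is taken here as 1024 × 1024 bytes. The sketch only shows how many small byte-level reads can be served by a single larger remote request.

```python
from collections import OrderedDict

READ_AHEAD = 1 * 1024 * 1024    # 1 MB fetched per remote request
CACHE_SIZE = 10 * 1024 * 1024   # 10 MB of cached data per client


class ReadAheadClient:
    """Toy client: serves small reads from 1 MB blocks held in an LRU cache."""

    def __init__(self, remote_read, read_ahead=READ_AHEAD, cache_size=CACHE_SIZE):
        self.remote_read = remote_read        # callable(offset, length) -> bytes
        self.read_ahead = read_ahead
        self.max_blocks = cache_size // read_ahead
        self.blocks = OrderedDict()           # block index -> bytes
        self.remote_requests = 0

    def _block(self, index):
        if index in self.blocks:
            self.blocks.move_to_end(index)    # mark as most recently used
        else:
            self.remote_requests += 1
            self.blocks[index] = self.remote_read(
                index * self.read_ahead, self.read_ahead
            )
            if len(self.blocks) > self.max_blocks:
                self.blocks.popitem(last=False)  # evict least recently used
        return self.blocks[index]

    def read(self, offset, length):
        out = bytearray()
        end = offset + length
        while offset < end:
            index, start = divmod(offset, self.read_ahead)
            chunk = self._block(index)[start:start + (end - offset)]
            if not chunk:                     # past the end of the file
                break
            out += chunk
            offset += len(chunk)
        return bytes(out)


data = bytes(range(256)) * 16384              # 4 MiB stand-in for a remote file
client = ReadAheadClient(lambda off, n: data[off:off + n])
for off in range(0, 64 * 1024, 4096):         # sixteen sequential 4 kB reads
    assert client.read(off, 4096) == data[off:off + 4096]
print(client.remote_requests)                 # 1
```

Sixteen small reads cost one remote round trip here, which is the benefit read-ahead is meant to provide for sequential access over a wide-area link.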
  • 17. How does a DFS work?
      - Blocks are replicated across several datanodes (usually 3).
      - A single namenode stores metadata (file names, block locations, etc.).
      - Optimized for large files and sequential reads.
      - Clients read from the closest replica available (note: locality of reference).
      - If the replication for a block drops below target, it is automatically re-replicated.
      - [Diagram: a namenode and datanodes holding replicas of blocks 1 to 4]
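The replication behaviour listed on slide 17 can be sketched in a few lines of Python. This is a toy model, not the code of any real distributed file system: the `Namenode` class, the datanode names, the random placement policy, and the `distance` map are assumptions made for illustration. Only the replication target of 3 comes from the slide.

```python
import random

REPLICATION = 3  # target replicas per block, as on the slide


class Namenode:
    """Toy namenode: tracks only which datanodes hold which block."""

    def __init__(self, datanodes, seed=0):
        self.datanodes = set(datanodes)
        self.locations = {}               # block id -> set of datanode names
        self.rng = random.Random(seed)    # seeded so runs are repeatable

    def add_block(self, block_id):
        chosen = self.rng.sample(sorted(self.datanodes), REPLICATION)
        self.locations[block_id] = set(chosen)

    def fail(self, node):
        """Remove a dead datanode, then restore the replication target."""
        self.datanodes.discard(node)
        for holders in self.locations.values():
            holders.discard(node)
            spare = sorted(self.datanodes - holders)
            missing = min(REPLICATION - len(holders), len(spare))
            if missing > 0:
                holders.update(self.rng.sample(spare, missing))

    def read_from(self, block_id, distance):
        """Pick the closest replica; `distance` maps datanode -> cost."""
        return min(sorted(self.locations[block_id]), key=lambda n: distance[n])


nn = Namenode(["dn1", "dn2", "dn3", "dn4", "dn5"])
for block in range(1, 5):
    nn.add_block(block)

nn.fail("dn2")  # every block that lost a replica is copied to a spare node
for block, holders in sorted(nn.locations.items()):
    print(block, sorted(holders))
```

After the failure every block is back at three replicas and none of them lives on the dead node; a client would then call `read_from` with its own distance map to choose the nearest copy.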
  • 20. Acknowledgements
      - A special word of appreciation and thanks to Mr. Somenath Roy Chowdhury.
      - My heartiest thanks to the entire team who worked hard to build the cloud.