Accelerating Science through Big Data Collaboration
1. Collaboration, Big Data and the search for the Higgs Boson
Intel European Research and Innovation Conference
October 23rd 2012
Andrzej Nowak, CERN openlab
Andrzej.Nowak@cern.ch
2. The European Particle Physics Laboratory based in
Geneva, Switzerland
Founded in 1954 by 12 countries for fundamental
physics research in a post-war Europe
In 2012, it is a global effort of 20 member countries
and scientists from 110 nationalities, working on the
world’s most ambitious physics experiments
~2’500 personnel, > 15’000 users
~1 billion CHF yearly budget
Andrzej Nowak - Collaboration, Big Data and the search for the Higgs Boson 2
3. • How can we explain that particles have mass?
• What is most of the universe made of?
• Why is there so little antimatter?
• What happened in the Big Bang?
4. Mont Blanc (4,808m)
Geneva (pop. 190’000)
Lake Geneva (310m deep)
5. The Large Hadron Collider
27 km underground
superconducting ring – possibly the
largest machine ever built by man
40 million collisions per second
150-200 MW power consumption
8. Data flow from the LHC detectors
[Diagram labels]
• Online triggering and filtering in detectors
• Raw data (100%)
• Reconstruction
• Event reprocessing
• Event summary data (10%)
• Selection and reconstruction
• Event simulation
• Batch physics analysis
• Analysis objects (1%)
• Analysis
• Processed data
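The percentages in this diagram can be turned into a quick volume estimate. The sketch below is illustrative only: it assumes the 10% and 1% figures are fractions of the raw data volume, and the 1,000 TB raw volume and the names STAGE_FRACTIONS and stage_volumes are hypothetical, not taken from the talk.

```python
# Fraction of the raw data volume kept at each stage, as read from the slide.
# Assumption: both percentages are relative to the raw data, not to each other.
STAGE_FRACTIONS = {
    "raw data": 1.00,
    "event summary data": 0.10,
    "analysis objects": 0.01,
}


def stage_volumes(raw_volume_tb):
    """Return the estimated volume (in TB) held at each stage."""
    return {stage: raw_volume_tb * fraction
            for stage, fraction in STAGE_FRACTIONS.items()}


# Hypothetical example: 1,000 TB (1 PB) of raw data.
volumes = stage_volumes(1000.0)
for stage, volume_tb in volumes.items():
    print(f"{stage}: {volume_tb:.0f} TB")
```

Under these assumptions, each step down the chain shrinks the working set by a factor of ten, which is what makes end-user analysis feasible on far smaller resources than raw-data reconstruction.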
9. Big Data
[Chart: tape usage (1 TB to 100 PB) and number of files (100 k to 1 G), logarithmic scale, 2003 to 2012. Approximate, smoothed values.]
10. The LHC Computing Grid
INSERT WORKLOAD HERE
11. Collaboration on big data and computing
The Worldwide LHC Computing Grid
Tier-0 (CERN): data recording, reconstruction and distribution
Tier-1: permanent storage, re-processing, analysis
Tier-2: simulation, end-user analysis
Nearly 160 sites
~250’000 cores
173 PB of storage
> 2 million jobs/day
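The grid figures on this slide imply a rough per-core workload. The sketch below is a back-of-the-envelope calculation only: it drops the "~" and ">" qualifiers from the slide, and it assumes every core is busy around the clock and every job uses a single core, neither of which the talk states. The names WLCG_2012, jobs_per_core_per_day and max_core_hours_per_job are hypothetical.

```python
# Approximate 2012 figures as quoted on the slide (qualifiers dropped).
WLCG_2012 = {
    "cores": 250_000,
    "jobs_per_day": 2_000_000,
}


def jobs_per_core_per_day(grid):
    """Average number of jobs each core handles per day."""
    return grid["jobs_per_day"] / grid["cores"]


def max_core_hours_per_job(grid):
    """Upper bound on average job length in core-hours.

    Assumption: all cores are fully used and each job occupies one core.
    """
    return 24 * grid["cores"] / grid["jobs_per_day"]


print(f"jobs per core per day: {jobs_per_core_per_day(WLCG_2012):.1f}")
print(f"max core-hours per job: {max_core_hours_per_job(WLCG_2012):.1f}")
```

Since the slide gives the job count as a lower bound, the real average job is likely shorter than this estimate suggests.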
12. Cutting edge science
• Accelerating Science and Innovation
13. It would have been impossible to release physics results so quickly without
the outstanding performance of the Grid (including the CERN Tier-0)
[Chart: number of concurrent ATLAS jobs, Jan-July 2012; axis mark at 100 k]
Includes MC production, user and group analysis at CERN, 10 Tier-1s, ~70 Tier-2 federations, > 80 sites
> 1500 distinct ATLAS users do analysis on the Grid
Available resources fully used/stressed (beyond pledges in some cases)
Massive production of 8 TeV Monte Carlo samples
Very effective and flexible Computing Model and Operation team accommodate high
trigger rates and pile-up, intense MC simulation, analysis demands from worldwide
users (through e.g. dynamic data placement)
14. A wealth of knowledge
• Academic Training program
• Summer Student program
• Technical Training program
• CERN Teacher schools
• Physics and computing schools
• Outreach programs
• EU FP7 programs
15. Innovation in science
Medical Applications as an Example of Particle Physics Spin-off
Accelerating particle beams
• ~30’000 accelerators worldwide
• ~17’000 used for medicine
Hadron Therapy
• Leadership in Ion Beam Therapy now in Europe and Japan
• >70’000 patients treated worldwide (30 facilities)
• >21’000 patients treated in Europe (9 facilities)
• [Figure labels: tumour target; protons, light ions; X-ray, protons]
Detecting particles
• Imaging: PET Scanner
• Clinical trial in Portugal for new breast imaging system (ClearPEM)
From F. Hemmer
16. Innovation in computing
1989: First high bandwidth transatlantic links
1991: The World Wide Web is born at CERN
1999: The Grid vision materializes
2001: CERN wins Computerworld’s 21st Century Achievement Award for SHIFT
2003: Several Internet2 land speed records
2008: The WLCG is the world’s largest grid
2012: LHC delivering intense data challenges
17. The CERN openlab
A unique research partnership between CERN and industry
Objective: The advancement of cutting-edge computing
solutions to be used by the worldwide LHC community
• Partners support manpower and equipment in dedicated
competence centers
• openlab delivers published research and evaluations based
on partners’ solutions – in a very challenging setting
• Created robust hands-on training program in various
computing topics, including international computing
schools; Summer Student program
• Past involvement: Enterasys Networks, IBM, Voltaire, F-Secure,
Stonesoft, EDS; Future involvement: Huawei
• Now in phase IV: 2012-2014
http://cern.ch/openlab
18. A European Cloud Computing Partnership:
big science teams up with big business
Strategic Plan
• Establish multi-tenant, multi-provider cloud infrastructure
• Identify and adopt policies for trust, security and privacy
• Create governance structure
• Define funding schemes
Use cases
• To support the computing capacity needs for the ATLAS experiment
• Setting up a new service to simplify analysis of large genomes, for a deeper insight into evolution and biodiversity
• To create an Earth Observation platform, focusing on earthquake and volcano research
From B. Jones
19. Big(ger) data
Data rates at the LHC to increase by ~100x
Raw data: an exabyte per second?
Exabytes stored yearly?
Millions of computing cores?
“Sustainable computing”
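The questions on this slide can be checked against a naive projection. The sketch below scales the 2012 grid figures quoted earlier in the talk (~250’000 cores, 173 PB of storage) by the ~100x factor. It assumes that resources must grow in direct proportion to the data rate, which is a simplification and not a claim made in the talk; the names WLCG_2012, DATA_RATE_FACTOR and scale_linearly are hypothetical.

```python
# 2012 figures quoted on the Worldwide LHC Computing Grid slide.
WLCG_2012 = {"cores": 250_000, "storage_pb": 173}

# "~100x" increase in LHC data rates, from this slide.
DATA_RATE_FACTOR = 100


def scale_linearly(resources, factor):
    """Scale every resource by the same factor.

    Assumption: resource needs grow in direct proportion to the data rate.
    """
    return {name: value * factor for name, value in resources.items()}


projected = scale_linearly(WLCG_2012, DATA_RATE_FACTOR)
print(f"cores: {projected['cores'] / 1e6:.0f} million")
print(f"storage: {projected['storage_pb'] / 1000:.1f} EB")
```

Even this crude scaling lands in the range of millions of cores and exabytes of storage, which is consistent with the orders of magnitude the slide asks about.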
20. Future directions in computing
• Software replacing hardware
– Programmability replaces rigid
structures
• Intensive compute
– Local farms must have much higher
processing capacity
• Accelerators
– Experiments with Intel MIC and GPUs
• Silicon photonics
21. Accelerating Science and
Innovation
Continued support of the worldwide
physics community and the European
population
Great science and engineering + great
partners = great innovation