1. ROOT Analysis and Implications for the Analysis Model in ATLAS
Akira Shibata, New York University
@ ACAT 08 in Erice
Nov 05, 2008
2. Are we ready to face data from LHC collisions?
Grid computing? Do we have enough CPU? Tape? Disks?
RAM? Do we need T1? T2? T3? AF? Do we need backdoor
access? Are the machines maintained? Is it scary? Are they
online? Do we have enough bandwidth? Can we copy data
across the world? Can we reach the data we need? Can we
reduce the data size? ESD? AOD? D1PD? D2PD? D3PD? Can
we download them? Do we need interactive access? How
do we write an analysis? How fast do they run? Do we need
to buy more disk? How big is my ntuple? Do we need to buy
more CPU? Disks? RAM? Are we up to date? Do I look cool
if I buy a mac? Is virtual machine useful? Why do we use
ROOT? What is PROOF? Is python fast enough? Is it easy
to code? How often will I need to process my data? How
fast will my analysis run? What can I do to get faster? What
are the options? What is the future technology?
ACAT - November 5, 2008 akira.shibata@nyu.edu
3. Analysis in the Era of Grid Computing
[Diagram: tiered computing model.
T1: ESD, ~500 kB/evt, central, ROOT native; AOD/DPD making.
T2: AOD (~100 kB/evt) and D1PD, ROOT + POOL, Grid CPU; DPD making.
T3: D1PD / D2PD / D3PD at 30-80 / 10-50 / 1-10 kB/evt, read via ROOT / ARA, delivered on request.
Desktop: users run ntuple analysis (~1 kB/evt) and fill local ROOT histograms.]
A tiered computing model: a leveled approach is needed to optimize the
system. Above all, how well does it work from the physicists' point of
view?
4. Derived Physics Data
• DPDs are created using the following operations:
• Skimming: selecting the events one needs
• Thinning: selecting the objects one needs
• Slimming: removing information from objects.
• ESDs hold the full information from reconstruction. AODs and
DnPDs are derived from them with increasing levels of reduction.
• The primary purpose of the D1PD is to give access to parts of
the ESD information that are otherwise difficult to get to.
• D1/2PDs are in POOL format; D3PD refers to any DPD that
is in ntuple format.
• ESD, AOD, D1PD contents are defined by groups. Several
types of D1PD are defined by performance groups. D2PD
and D3PD are defined by users.
• First-level analysis (variable calculation, object
reconstruction etc.) may be done when D2/3PDs are created.
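The three reduction operations can be sketched in plain Python, with dicts standing in for events and objects; all field names here (`electrons`, `pt`, `shower_shape`) are hypothetical stand-ins, not the ATLAS EDM.

```python
# Illustrative sketch of the three DPD reduction operations on plain
# Python dicts standing in for events; field names are hypothetical.

def skim(events, keep_event):
    """Skimming: keep only the events one needs."""
    return [evt for evt in events if keep_event(evt)]

def thin(events, keep_electron):
    """Thinning: keep only the objects one needs within each event."""
    return [{**evt, "electrons": [el for el in evt["electrons"]
                                  if keep_electron(el)]}
            for evt in events]

def slim(events, fields=("pt", "eta")):
    """Slimming: drop information (attributes) from each object."""
    return [{**evt, "electrons": [{k: el[k] for k in fields}
                                  for el in evt["electrons"]]}
            for evt in events]

events = [
    {"electrons": [{"pt": 40.0, "eta": 0.5, "shower_shape": 0.1}]},
    {"electrons": []},   # fails the skim: no electrons
]
dpd = slim(thin(skim(events, lambda e: e["electrons"]),
                lambda el: el["pt"] > 20.0))
```

Applied in sequence, the event count, then the object count, then the per-object payload all shrink, which is exactly why D3PDs can be orders of magnitude smaller than ESDs.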
5. Motivation for Profiling ROOT Analysis
• The primary use of the Grid is event reconstruction, storage
and production of reduced data, done using the ATLAS software
framework, Athena. Some analysis happens here too.
• However, post-Grid (non-Athena) ROOT analysis is the
main stage for physics analysis.
• Mostly a user-level decision due to the private nature of
physics analysis but:
• the situation is becoming more complex due to
availability of new technology;
• no good summary exists comparing the available
options;
• it is an important ingredient for an efficient analysis
model;
• it is needed for estimating resource requirements.
• Technical discussions do not always answer practical
questions. This study benchmarks analysis “modes” in
realistic settings based on wall-time measurements.
6. “Flat” vs POOL Persistency
• Much of the complexity in the current situation is due to
the POOL technology (an additional layer on top of the ROOT
persistency technology) used in ATLAS. POOL supports:
• Metadata lookup - used by TAG to access events in
large files without having to read the full contents.
• More flexibility in writing out complex objects, with its
own way of T/P (transient/persistent) separation and schema evolution.
• When the decision was made, ROOT persistency was not as
capable as it is now:
• problems writing out STL objects;
• problems referring to objects in different trees/files.
• ROOT persistency has since improved and now has fewer
such issues.
• ARA enables reading POOL objects from ROOT by
calling POOL converters on demand (P→T conversion), at the
cost of extra read time.
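The on-demand conversion idea can be sketched as a lazy wrapper. Everything below is illustrative (the class, the packed tuple, the dict schema are all made up); real ARA dispatches to the registered POOL converters for each persistent class.

```python
# Sketch of on-demand persistent->transient (P->T) conversion, the idea
# behind ARA's reading of POOL objects from ROOT. All names here are
# hypothetical; real ARA calls registered POOL converters.

class LazyBranch:
    """Holds the persistent form and converts only on first access."""
    def __init__(self, persistent, converter):
        self._persistent = persistent
        self._converter = converter
        self._transient = None

    def get(self):
        if self._transient is None:
            # Pay the conversion cost once, on demand -- this is the
            # "extra read time" relative to a native ROOT branch.
            self._transient = self._converter(self._persistent)
        return self._transient

# Persistent form: a packed tuple; transient form: a dict (made-up schema).
def electron_p_to_t(packed):
    pt, eta, phi = packed
    return {"pt": pt, "eta": eta, "phi": phi}

branch = LazyBranch((40.0, 0.5, 1.2), electron_p_to_t)
electron = branch.get()   # conversion happens here, not at file open
```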
7. Summary of Existing Analysis Modes
Mode    | Compiled/Interpreted | Language     | Additional packages
--------|----------------------|--------------|--------------------------
Draw    | Interpreted          | (C++)--      | -
CINT    | Interpreted          | C++          | MakeClass / MakeSelector
ACLiC   | Compiled             | C++          | -
PyRoot  | Interpreted          | Python       | SPyRoot
g++     | Compiled             | C++          | SFrame / AMA
Athena  | Both                 | C++ / Python | Athena components
Inputs cover both Ntuple and POOL; the modes range from interactive
use to a standard development environment.
The most common options were implemented. All code is available in
ATLAS CVS: users/ashibata/RootBenchmark
8. Benchmark Analysis Contents
• A simple Z→ee reconstruction analysis implemented for
every mode:
1. Access the electron container (POOL) / electron
kinematics branches (Ntuple)
2. Select electrons using isEM, pT and charge
3. Fill histograms with electron kinematics (pT and
multiplicity)
4. Combine electrons to reconstruct the Z
5. Fill a histogram with the Z mass
6. Write histograms out in finalize
• Repeated the above 10 times.
• Not complex enough for a real analysis, but not entirely
trivial.
• For Draw, plot electrons after the isEM/pT/charge selection;
no four-vector arithmetic.
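Steps 2, 4 and 5 of the benchmark can be sketched as follows, assuming a toy event layout (plain dicts with `pt`/`eta`/`phi`/`charge`/`isEM` fields, not the ATLAS EDM) and massless electrons.

```python
import math

# Minimal stand-in for the benchmark's per-event logic; the event
# layout and isEM convention here are hypothetical.

def invariant_mass(e1, e2):
    """Invariant mass of two (assumed massless) electrons from pt/eta/phi."""
    px = lambda e: e["pt"] * math.cos(e["phi"])
    py = lambda e: e["pt"] * math.sin(e["phi"])
    pz = lambda e: e["pt"] * math.sinh(e["eta"])
    E  = lambda e: e["pt"] * math.cosh(e["eta"])
    return math.sqrt(max(0.0,
        (E(e1) + E(e2))**2 - (px(e1) + px(e2))**2
        - (py(e1) + py(e2))**2 - (pz(e1) + pz(e2))**2))

def select(electrons, pt_cut=20.0):
    """Step 2: isEM and pT selection (charge pairing happens below)."""
    return [e for e in electrons if e["isEM"] and e["pt"] > pt_cut]

def reconstruct_z(electrons):
    """Steps 4-5: combine opposite-charge pairs into Z candidate masses."""
    sel = select(electrons)
    return [invariant_mass(a, b)
            for i, a in enumerate(sel) for b in sel[i + 1:]
            if a["charge"] * b["charge"] < 0]

ele = [{"pt": 45.0, "eta": 0.1, "phi": 0.0, "charge": +1, "isEM": True},
       {"pt": 44.0, "eta": -0.2, "phi": 3.1, "charge": -1, "isEM": True}]
masses = reconstruct_z(ele)   # one roughly back-to-back pair near the Z mass
```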
9. Obtaining Reliable Results
• Using POSIX measurements as much as possible;
use wall time from the time module.
• Avoiding the somewhat unstable measurements
from TStopwatch.
• Measurements are affected by other activity on
the machine; overcome by multiple
measurements.
• Machine: Acas (BNL) node with normal load,
3.34 GB memory, 2-core Xeon @ 2.00 GHz, data
on NFS.
• Disk cache leads to misleading results: CPU
time = wall time once the data is in memory.
• Force disk reads by flushing RAM; do not re-read
a file until all other files have been read.
Alternate between AOD and ntuple analyses.
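The measurement strategy, wall time from the `time` module averaged over repeated runs, can be sketched as below; `workload` is a made-up stand-in for one analysis pass.

```python
import time
import statistics

# Wall-time measurement as described above: POSIX wall clock from the
# `time` module, repeated to average out other activity on the machine.

def workload(n_events):
    """Hypothetical stand-in for one pass of the analysis."""
    total = 0.0
    for i in range(n_events):
        total += (i % 97) * 1e-3   # trivial per-event "analysis"
    return total

def timed(n_events, repeats=5):
    """Return mean and stdev of wall time over several identical runs."""
    samples = []
    for _ in range(repeats):
        t0 = time.time()           # wall clock, not CPU time
        workload(n_events)
        samples.append(time.time() - t0)
    return statistics.mean(samples), statistics.stdev(samples)

mean_t, sigma_t = timed(50_000)
```

Note that this sketch cannot reproduce the cache-flushing step; that has to be done outside the process, e.g. by reading enough other files to evict the inputs from the page cache.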
10. Methodology
[Plot: wall time (s) vs number of events (0-50,000) for the AOD
analysis, one fitted straight line per mode. Fitted initialization
overhead and event rate:
gpp: init 66.4 s, rate 535 Hz
SFrame: init 36.2 s, rate 315 Hz
Draw: init 46.2 s, rate 125 Hz
PyAthena: init 27.4 s, rate 96.5 Hz
Athena: init 30.8 s, rate 68.6 Hz
CINT: init 52.5 s, rate 18.5 Hz
PyRoot: init 2.50 s, rate 12.4 Hz]
1. Measure the time taken to process an increasing number of events.
2. Repeat the measurements and take the average for each point.
3. Fit a straight line to obtain the overhead (offset) and rate
(evt/sec).
4. Calculate errors from the standard deviation.
Only the rate is used in comparing the modes; the overhead varies
between a fraction of a second and tens of seconds.
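Step 3 amounts to an ordinary least-squares line fit. A sketch with fabricated sample points (chosen to roughly resemble the gpp line, init ~66 s and ~535 Hz):

```python
import statistics

# Least-squares straight-line fit wall_time = overhead + n_events/rate,
# as in step 3 of the methodology. The sample points are made up.

def fit_line(xs, ys):
    """Return (offset, slope) of the least-squares line y = offset + slope*x."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return my - slope * mx, slope

n_events  = [10_000, 20_000, 30_000, 40_000, 50_000]
wall_time = [85.0, 104.0, 122.0, 141.0, 160.0]   # fabricated example points

overhead, slope = fit_line(n_events, wall_time)
rate_hz = 1.0 / slope      # events per second
```

The offset is the initialization overhead and the reciprocal of the slope is the event rate; in a real run each `wall_time` entry would itself be an average over repeated measurements, with the error taken from their standard deviation.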
11. Data and Format
Contents                                  | POOL                       | Ntuple
------------------------------------------|----------------------------|---------------------------
Full contents                             | AOD, 144.22 kB/evt         | CBNT? (not tried)
DPD contents (Trigger/Jets/Leptons etc)   | TopD1PD, 31.42 kB/evt      | TopD3PD, 4.87 kB/evt
Small DPD contents (Tracks + Electrons)   | SmallD2PD, 18.74 kB/evt    | SmallD3PD, 0.71 kB/evt
Very small DPD (Electrons)                | VerySmallD2PD, 1.06 kB/evt | VerySmallD3PD, 0.37 kB/evt
All derived from FDR2 AODs; all produced on PanDA (except the AOD
and D1PD). Around 10,000 events per file; the total sample size for
one data type ranges between 1 GB and 100 GB. A use-case-driven
comparison: input file sizes are different.
12. AOD Analysis Results
AOD input (rate, error):
gpp        535 Hz   (3%)
SFrame     321 Hz  (13%)
Draw       138 Hz  (35%)
Athena      98 Hz   (8%)
PyAthena    95 Hz  (11%)
CINT        21 Hz  (15%)
TSelector   19 Hz   (2%)
PyRoot      17 Hz  (18%)
Compiled non-framework analysis is the fastest. There is only a
small difference between C++ and Python in Athena. CINT is by far
the slowest of the C++ modes; PyRoot seems to be reading all
containers in the files.
17. Summary
• Very clear performance advantage for the ROOT-native ntuple
format: an order of magnitude difference. Ballpark figures:
thousands of evts/sec vs hundreds of Hz. These numbers should
be taken as upper limits; real analyses would be more complex.
• Compiled mode is ~two orders of magnitude faster than the
non-compiled options.
• Use of frameworks, even quite simple ones, can slow things
down; though, any realistic analysis requires some
infrastructure. Choose/write frameworks wisely!
• With Athena, the overhead of the framework seems large,
though typical DPD jobs can be highly CPU intensive.
• The effect of file caching by the system ties the input file
size to the execution rate (regardless of the actual read-out).
Above 20 kB/evt the analysis is bound by this effect; this is a
very tight slimming/thinning requirement for D1/2PD. It may be
possible to improve this with high-performance disks.
18. Acknowledgements
I have bothered a lot of people with this project,
including (in random order):
Scott Snyder, Wim Lavrijsen, Sebastien Binet,
Emil Obreshkov, David Quarrie, Kyle Cranmer,
David Adams, Sven Menke, Shuwei Ye, Sergey
Panitkin, Stephanie Majewski, Hong Ma, Tadashi
Maeno, Attila Krasznahorkay, Jim Cochran,
roottalk, Paolo Calafiura
Many thanks.