
Ulrik Egede "Distributed analysis in LHCb"


Seminar “Using modern information technologies to solve modern problems in particle physics” at Yandex's Moscow office, 3 July 2012

Ulrik Egede, Imperial College

Published in: Technology


  1. Ganga: A tool for distributed analysis. Ulrik Egede, Imperial College London. Yandex meeting, July 3, 2012
  2. The issues involved
     ● Analysis is in some sense a long data reduction
       ● Trigger
       ● Creation of “group” datasets
       ● Data selection
       ● Fitting
       ● Paper!
  3. The issues involved
     ● Analysis is in some sense a long data reduction
       ● Trigger
       ● Creation of “group” datasets
       ● Data selection
       ● Fitting
       ● Paper!
     ● Callout on the diagram: “This is the tricky one”
  4. Distributed analysis
     ● All the LHC experiments rely on a distributed analysis model
     ● Data available for analysis will be located at either Tier 1 or Tier 2 sites across the globe
     ● Grid tools are used for performing analysis
       ● Only a single “sign-on” with a Grid certificate
       ● No remote logins
  5. The Ganga User Interface
     A fully programmable interface for the processing of data
     ● Debug code locally, progress to a small analysis on batch farms, run the full analysis on the Grid
       ● All done by a one-line change of the job specification
     Configure once – run anywhere
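The “configure once, run anywhere” idea can be sketched in a few lines of plain Python. This is a toy model under stated assumptions: the `Backend`, `Local`, `Batch`, `Grid` and `Job` classes below are hypothetical stand-ins written for this example and are not Ganga's own classes, and every toy backend simply runs the task in-process so the sketch works without a batch system or Grid access.

```python
from dataclasses import dataclass, field
from typing import Callable


class Backend:
    """Hypothetical execution backend; NOT part of Ganga's API."""
    name = "abstract"

    def run(self, task: Callable[[], str]) -> str:
        # A real backend would ship the task elsewhere; here we only tag the result.
        return f"[{self.name}] {task()}"


class Local(Backend):
    name = "local"


class Batch(Backend):
    name = "batch"


class Grid(Backend):
    name = "grid"


@dataclass
class Job:
    """Toy job: the task is configured once, the backend is swappable."""
    task: Callable[[], str]
    backend: Backend = field(default_factory=Local)

    def submit(self) -> str:
        return self.backend.run(self.task)


def analysis() -> str:
    # Placeholder for the user's analysis code.
    return "analysis done"


job = Job(task=analysis)
results = [job.submit()]       # debug locally

job.backend = Batch()          # the one-line change
results.append(job.submit())   # small analysis on a batch farm

job.backend = Grid()           # the same one-line change again
results.append(job.submit())   # full analysis on the Grid

for line in results:
    print(line)
```

The point of the design is that the analysis definition never changes; only the `backend` attribute does.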
  6. Centralised or individualised?
     ● Centralised data reduction or full-scale analysis in “analysis trains”
       ● Easy to deal with from an execution point of view
       ● Takes time to organise
       ● Potential waiting time for physicists
     ● Individual physicists submit jobs in a “chaotic” manner
       ● Many, often inexperienced, users attempt large-scale data processing
       ● Physicists are in charge of when and how to perform analysis
       ● Risk of lower efficiency
     ● LHCb relies on a combination of the above
       ● Ganga deals (mainly) with the individual aspect
  7. A typical work flow for an analysis
     ● Develop user code
     ● Check architecture
     ● Build code
     ● Find data
     ● Copy test data locally
     ● Divide up data
     ● Debug test code
     ● Submit to sites X, Y, Z
     ● Keep track of it
     ● Extract physics
     Days of effort lost just in keeping track of things
  8. An optimised work flow for analysis
     ● Develop user code
     ● Check architecture
     ● Build code
     ● Find data
     ● Copy test data locally
     ● Divide up data
     ● Debug test code
     ● Submit to sites X, Y, Z
     ● Keep track of it
     ● Extract physics
     Ganga automates as much as possible
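Three of the work flow steps above (“Divide up data”, “Submit to sites X, Y, Z”, “Keep track of it”) are pure bookkeeping, which is why they lend themselves to automation. The following is a minimal sketch of that bookkeeping, not Ganga's implementation: the function names, the file names and the round-robin site assignment are all assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, List


def split_dataset(files: List[str], files_per_job: int) -> List[List[str]]:
    """Divide a list of input files into chunks, one chunk per subjob."""
    if files_per_job < 1:
        raise ValueError("files_per_job must be at least 1")
    return [files[i:i + files_per_job]
            for i in range(0, len(files), files_per_job)]


def submit(subjobs: List[List[str]], sites: List[str]) -> List[dict]:
    """Assign subjobs to sites round-robin and record one entry per subjob."""
    registry = []
    for index, chunk in enumerate(subjobs):
        registry.append({
            "id": index,
            "site": sites[index % len(sites)],
            "files": chunk,
            "status": "submitted",
        })
    return registry


def summarise(registry: List[dict]) -> Dict[str, int]:
    """Count subjobs per status so progress is visible at a glance."""
    return dict(Counter(job["status"] for job in registry))


# Hypothetical input file names, invented for this example.
files = [f"data_{n:03d}.dst" for n in range(10)]

subjobs = split_dataset(files, files_per_job=4)
registry = submit(subjobs, sites=["X", "Y", "Z"])

# Pretend the first subjob has finished.
registry[0]["status"] = "completed"
print(summarise(registry))
```

Keeping every subjob in one registry is what replaces the manual tracking that the previous slide describes as costing days of effort.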
  9. The Ganga User Interface
     ● An analysis process is defined through a set of building blocks forming a “job”
     ● All building blocks are provided as plugins
       ● Easy to write your own
     ● Programmable through integration with the Python language

         # Sum the luminosity of completed jobs and resubmit the failed ones
         lumi = 0
         for j in jobs.select(name="Higgs"):
             if j.status == "completed":
                 lumi += j.luminosity()
             elif j.status == "failed":
                 j.resubmit()
         print("Processed %d fb^-1 so far" % lumi)
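The loop on this slide only runs inside a Ganga session, where `jobs` already exists. To try the same logic without Ganga, here is a self-contained sketch that uses mock objects: `MockJob` and `MockRegistry` are hypothetical stand-ins written for this example, and the luminosity values are invented.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class MockJob:
    """Hypothetical stand-in for a job object; not Ganga's Job class."""
    name: str
    status: str
    lumi: float = 0.0  # integrated luminosity in fb^-1 (invented numbers)

    def luminosity(self) -> float:
        return self.lumi

    def resubmit(self) -> None:
        # A real resubmission would contact the backend; here we only flip the status.
        self.status = "submitted"


class MockRegistry:
    """Hypothetical stand-in for the `jobs` object used on the slide."""

    def __init__(self, jobs: Iterable[MockJob]) -> None:
        self._jobs: List[MockJob] = list(jobs)

    def select(self, name: str) -> List[MockJob]:
        return [j for j in self._jobs if j.name == name]


def processed_luminosity(jobs: MockRegistry, name: str) -> float:
    """Sum luminosity over completed jobs and resubmit the failed ones."""
    lumi = 0.0
    for j in jobs.select(name=name):
        if j.status == "completed":
            lumi += j.luminosity()
        elif j.status == "failed":
            j.resubmit()
    return lumi


failed_job = MockJob("Higgs", "failed")
jobs = MockRegistry([
    MockJob("Higgs", "completed", 0.5),
    MockJob("Higgs", "completed", 0.25),
    failed_job,
    MockJob("Other", "completed", 1.0),  # ignored: different name
])

total = processed_luminosity(jobs, "Higgs")
print("Processed %.2f fb^-1 so far" % total)
```

Because the interface is ordinary Python, this kind of housekeeping can be scripted rather than done by hand.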
  10. Usage outside LHCb
      Less information is available about external usage
      ● Many other smaller HEP projects
        ● Super B-factory, BES-III collaboration, Lattice QCD, SNO, T2K
      ● Other science
        ● Cryptography, flu virus searches, water table modelling, ...
      ● Some commercial projects funded initially from research council schemes
        ● Image classification, protein folding, Amazon EC2 usage
  11. Ganga developments
      ● Ganga has since its inception been a GridPP-controlled project
      ● The future requires
        ● Follow-up on developments in usage within core areas
        ● User support
        ● Inclusion of new paradigms
        ● Efficient usage for analysis on multi-core machines
        ● Abstraction of data input and output in the same way as already done for CPU resources
      ● Continued outreach and documented scientific output
        ● Current standard documentation is “Ganga: J. Moscicki et al., Comp. Phys. Comm., 180:11 (2009)”
      ● Ganga is released under the GPL to maximise impact
