
Big Data Analysis Made Easy: Playing with Public Data from the Large Hadron Collider

COSCUP 2013 talk slides



  1. Big Data Analysis Made Easy: Playing with Public Data from the Large Hadron Collider. Yuan CHAO (趙元) (National Taiwan University, Taipei, Taiwan). COSCUP 2013/08/03-04
  2. Who am I? Yuan CHAO (John) YChao ...
  3. Researcher in high-energy physics; uses OSS for research ...
  4. Worldwide LHC Computing Grid (WLCG): how is the data processed and analyzed? https://cdsweb.cern.ch/record/1541893 https://www.youtube.com/watch?v=jDC3-QSiLB4
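  5. Where CERN, the European particle physics laboratory, is located
  6. Near Geneva, Switzerland, straddling the Swiss-French border
  7. The LHC: 27 km in circumference, 50 to 150 m underground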
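  8. Protons are accelerated in stages and collide at high energy at close to the speed of light. Experiments run at four collision points: the general-purpose ATLAS and CMS, and the special-purpose ALICE and LHCb. The experiment I work on: http://cms.web.cern.ch/org/cms-public http://zh.wikipedia.org/wiki/%E7%B7%8A%E6%B9%8A%E7%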
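  9. A general-purpose detector wraps around the beam pipe like a barrel
  10. Particles produced in a collision pass through the detector and leave tracks or energy deposits as electronic signals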
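  11. Proton bunches cross 40 million times per second (40 MHz), with about 15 collisions per crossing on average
  12. Only about one collision in a million is really interesting
  13. High-speed hardware logic circuits first select one event in ten thousand
  14. A dedicated, extremely fast network ships these events to the "online" computing cluster
  15. Software then makes a coarse selection of one event in a hundred; this stage can be re-optimized at any time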
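  16. The data selected by each experiment are sent to the central Tier-0 data center for storage, continuously (24/7) while the experiments are running
  17. Event reconstruction; long-term archiving on tape
  18. The data are distributed across 11 Tier-1 data centers
  19. Tier-2 data centers let experimentalists run simulations and analyze data. Currently the only Tier-1 data center in Asia: the Academia Sinica Grid Computing Centre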
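  20. LHC public data, provided mainly for educational use http://cms.web.cern.ch/content/cms-public-data
  21. CMS HEP Tutorial: a one-week course for undergraduates, providing about 1/500 of the real data collected so far, plus the corresponding simulated events http://ippog.web.cern.ch/resources/2012/cms-hep-tutorial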
  22. Standard Model (figure labels: hadrons, leptons, mediator particles, "cannot exist in isolation", "make up"; credit: pingooo@FNAL). The "God-damned" particle! Today we are not looking for the Higgs boson. http://atlas.kek.jp/sub/photos/Physics/PhotoPhysicsSM.html
  23. No physics today ... I will only tell you what to look for
  24. Top quark event: http://en.wikipedia.org/wiki/File:Top_antitop_quark_event.svg
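  25. Adult, adult, adult, adult, child, pet: the model family
  26. Adult, adult, adult, adult, child, pet: the model family (the steps in between do not matter)
  27. Translation table: Jet = adult; Electron = boy; Muon = girl; MET = pet; pt = body weight; btag = senior ...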
  28. ROOT: Object-Oriented Toolkit. Data analysis tool. Written in C++ (millions of lines). Open source. Integrated C++ interpreter. File formats, I/O handling, graphics, plotting, math, histogram binning, event display, geometric navigation. Powerful fitting (RooFit) and statistical (RooStats) packages. In use by most HEP experiments. Standard tool for producing physics results at the LHC. New tools for model creation and combinations. http://root.cern.ch/drupal/
  29. ROOT Sample Format: particles reconstructed and stored in ROOT Trees. Monte Carlo
  30. TMVA: multi-variate analysis tool-kit. Based on supervised learning. Embedded in ROOT. Easy training and testing. Provides various classifiers: Linear Discriminant (LD), Artificial Neural Networks (NN), Boosted Decision Trees (BDT), ... http://tmva.sourceforge.net/
  31. Live DEMO. Basic ROOT operations: make plots, change style, export to files and macro. Flatten data: analysis class generator, dump into a new tree. Import to TMVA: event weight, input variables, pre-cuts. TMVA output: performance plots, MVA class and parameters. https://github.com/yuanchao/HEPTutorial http://ippog.web.cern.ch/sites/ippog.web.cern.ch/files/HEPTutorial.tar
      Sample       Events     Luminosity
      Real data    ~500 K     ~50 pb^-1
      ttbar        ~380 K     ~100 pb^-1
      W + jets     ~70 K      ~100 pb^-1
      Drell-Yan    ~100 K     ~100 pb^-1
      QCD          ~100       ~100 pb^-1
  32. TMVA Inputs: raw input variables
  33. TMVA Inputs: PCA transform
  34. TMVA Inputs: de-correlated
  35. Correlation Matrix
  36. TMVA Outputs
  37. TMVA Outputs
  38. TMVA Outputs
  39. TMVA Outputs: by default TMVA takes ½ of the sample for training and the other ½ for performance tests.
  40. Open Data, Open Access, Open Source. Open access to research results: taken from the people, shared with the people
  41. You should know what you are doing... BE AWARE! http://arstechnica.com/tech-policy/2013/04/microsoft-excel-the-ruiner-of-global-economies/
  42. That's all
  43. Thank you for your attention
  44. Thank you
  45. Installing ROOT. Get the ROOT binary for Ubuntu from http://sourceforge.net/projects/cernrootdebs/ and download the i386 package: click on "Files" → "32bits!" → "root_5.32.00_i386.deb". Open a terminal and type in the following commands:
      $ cd Download/
      $ sudo dpkg -i root_5.32.00_i386.deb    ← use guest passwd!
      $ sudo apt-get install libssl0.9.8
      $ sudo apt-get install libjpeg62
      $ source /opt/root/bin/thisroot.sh      ← you can put this in ~/.bashrc
      You can run root now:
      $ root -l                               ← "-l" means no splash window
      root [0] TBrowser t                     ← make sure there are no error messages
  46. LHC: a new particle compatible with the Higgs boson has been discovered ... no microscopic black holes or supersymmetry have been found ... http://cdsweb.cern.ch/record/1428128?ln=en
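
The trigger chain on slides 11 to 15 and the sample table on slide 31 both come down to simple arithmetic. The sketch below is not from the talk. It multiplies the quoted rates through and computes a luminosity scale factor for the simulated samples. All function and variable names are made up for illustration. It takes the "one in ten thousand" and "one in a hundred" figures at face value, assumes that an "event" means one bunch crossing, and assumes that simulation is weighted by the ratio of luminosities, which is a common convention but not necessarily what the tutorial code does.

```python
# Illustrative arithmetic only; every name here is hypothetical.

CROSSING_RATE_HZ = 40_000_000   # bunch crossings per second (slide 11)
COLLISIONS_PER_CROSSING = 15    # average collisions per crossing (slide 11)
HARDWARE_ONE_IN = 10_000        # hardware trigger keeps 1 in 10,000 (slide 13)
SOFTWARE_ONE_IN = 100           # software trigger keeps 1 in 100 (slide 15)


def trigger_rates(crossing_rate_hz, hardware_one_in, software_one_in):
    """Return the event rate in Hz after the hardware and software stages.

    Assumes each stage keeps a fixed fraction of the events it receives.
    """
    after_hardware = crossing_rate_hz / hardware_one_in
    after_software = after_hardware / software_one_in
    return after_hardware, after_software


def luminosity_scale(data_lumi_pb, mc_lumi_pb):
    """Per-event weight that scales a simulated sample to the data luminosity."""
    if mc_lumi_pb <= 0:
        raise ValueError("simulated luminosity must be positive")
    return data_lumi_pb / mc_lumi_pb


collisions_per_second = CROSSING_RATE_HZ * COLLISIONS_PER_CROSSING
hw_rate, sw_rate = trigger_rates(
    CROSSING_RATE_HZ, HARDWARE_ONE_IN, SOFTWARE_ONE_IN
)
# ~50 pb^-1 of real data against ~100 pb^-1 of simulation (slide 31)
mc_weight = luminosity_scale(50.0, 100.0)

print(f"collisions per second:      {collisions_per_second}")
print(f"after hardware trigger:     {hw_rate:.0f} Hz")
print(f"after software trigger:     {sw_rate:.0f} Hz")
print(f"weight per simulated event: {mc_weight}")
```

With the slide numbers this gives 600 million collisions per second, 4 kHz after the hardware stage and 40 Hz after the software stage, and each simulated event would carry a weight of 0.5 when compared with the real data.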
