A Taste of LHCb Data Analysis (淺嚐 LHCb 數據分析的滋味): Play around the LHCb Data on Kaggle with SK-Learn and MatPlotLib

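The LHC experiments are at the cutting edge of particle physics today, and the 2012 discovery of the Higgs boson was a landmark event for the physics community. After the ATLAS experiment launched the Higgs challenge on Kaggle, LHCb, another LHC experiment, also brought its data to the Kaggle platform. This talk briefly introduces the experiment behind the data, then uses the LHCb data for multi-dimensional data analysis with SciKit-Learn and visualisation with MatPlotLib.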


  1. A Taste of LHCb Data Analysis: Play around the LHCb Data on Kaggle with SciKit-Learn and MatPlotLib. Yuan CHAO (趙元) (National Taiwan University, Taipei, Taiwan). PyCon2017, 2017/06/09-11
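  2. Who am I? Yuan CHAO (John) YChao ...
  3. Researcher. High-energy physics. Uses OSS for research ...
  4. Worldwide LHC Computing Grid (WLCG). How are the data processed and analysed? https://cdsweb.cern.ch/record/1541893 https://www.youtube.com/watch?v=jDC3-QSiLB4
  5. The geographic location of CERN, the European particle physics laboratory
  6. Near Geneva, Switzerland, straddling the Swiss-French border
  7. The LHC is 27 km in circumference and lies 50 to 150 metres underground
  8. Protons are accelerated in stages and collide at high energy, close to the speed of light. Experiments run at four collision points: the general-purpose ATLAS and CMS, and the special-purpose ALICE and LHCb. The experiment I take part in: http://cms.web.cern.ch/org/cms-public http://zh.wikipedia.org/wiki/%E7%B7%
  9. Particles produced in the collisions pass through the detector, leaving tracks or energy deposits that are read out as electronic signals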
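  10. Proton bunches cross 40 million times per second (40 MHz), with 15 collisions per crossing on average
  11. Only about one collision in a million is truly meaningful
  12. High-speed hardware logic circuits first select one event in ten thousand
  13. A dedicated ultra-high-speed network sends these events to the "online" computing cluster
  14. Software then coarsely selects one event in a hundred; this stage can be re-optimised at any time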
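The selection chain on slides 10 to 14 can be followed with a little arithmetic. The sketch below uses only the round numbers quoted on the slides and assumes that an "event" means one bunch crossing; the function and key names are made up for illustration, and the results are not actual trigger rates of any experiment.

```python
# Back-of-the-envelope rates from the round numbers on the slides.
# Assumption: one "event" corresponds to one bunch crossing.
BUNCH_CROSSING_RATE_HZ = 40_000_000  # 40 MHz bunch crossings
COLLISIONS_PER_CROSSING = 15         # average collisions per crossing
HARDWARE_REDUCTION = 10_000          # hardware keeps 1 event in 10,000
SOFTWARE_REDUCTION = 100             # software keeps 1 event in 100


def trigger_rates():
    """Return the event rate (Hz) after each selection stage."""
    after_hardware = BUNCH_CROSSING_RATE_HZ // HARDWARE_REDUCTION
    after_software = after_hardware // SOFTWARE_REDUCTION
    return {
        "collisions_per_second": BUNCH_CROSSING_RATE_HZ * COLLISIONS_PER_CROSSING,
        "after_hardware_hz": after_hardware,
        "after_software_hz": after_software,
        "overall_reduction": HARDWARE_REDUCTION * SOFTWARE_REDUCTION,
    }


if __name__ == "__main__":
    for name, value in trigger_rates().items():
        print(f"{name}: {value:,}")
```

The two stages together keep one event in a million, which matches the "one in a million" figure quoted on slide 11.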
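  15. The data selected by each experiment are sent to the Tier-0 data centre for storage, around the clock (7 x 24) during data taking
  16. Event reconstruction; long-term preservation on magnetic tape
  17. The data are distributed across 13 Tier-1 data centres
  18. Tier-2 data centres let experimentalists simulate and analyse data. The Academia Sinica Grid Computing Centre was (formerly) the only Tier-1 data centre in Asia
  19. Researcher. High-energy physics. Uses OSS for research ... Member of CMS Experiment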
  20. The search for the Higgs boson: Atlas Higgs ML Challenge https://www.kaggle.com/c/higgs-boson $13,000 & 876 teams
  21. A taste of flavour physics: Search for charged lepton flavour violation https://www.kaggle.com/c/flavours-of-physics Search for new physics on lepton-flavour violation $15,000 & 673 teams
  22. τ, the tau? μ, the muon?? Flavour physics? Lepton flavour violation? Please give me five minutes
  23. Standard Model. The Scale of the Universe: http://htwins.net/scale2/ (from ~10^-18 m to ~10^-1 m; nanometre = 10^-9 m). Force carriers: gluon (strong force), photon (electromagnetic force), W/Z bosons (weak force), graviton (gravity). Matter particles: quarks and leptons
  24. The Big Bang: The Origin of the Universe
  25. The Four Questions. LHC was built for the following purposes: (1) Origin of mass: to find the origin of mass... the Higgs boson. (2) Dark matter and dark energy: looking for the unification.. Super-symmetry as well as other candidates of Dark Matter & Dark energy. (3) Disappearance of antimatter: investigate the mystery of anti-matter disappearance. (4) State of the early universe: physics at the early stage of the universe: Heavy Ion Collisions and Quark-Gluon Plasma. Courtesy of the European Organization for Nuclear Research (CERN), Geneva, Switzerland.
  26. Symmetry & Flavor Physics. People think the universe is symmetric? E = mc². Parity violation (parity non-conservation) proposed by T.D. Lee (李政道) and C.N. Yang (楊振寧) in 1956. Parity violation seen in a β decay by C.S. Wu (吳健雄) in 1957. Nobel prize for Lee & Yang. CP violation (charge-parity non-conservation) discovered in the Kaon system in 1964. M. Kobayashi and T. Maskawa introduced CP violation in the Standard Model in 1973. Sanda and Carter pointed out the possibility of CP violation in the B meson system in 1980.
  27. Prof. Wu's experiment in 1956. Prof. Lee and Yang got the Nobel Prize in 1957. http://de.wikipedia.org/wiki/Wu-Experiment
  28. Symmetry & Flavor Physics. The KTeV experiment at FNAL established direct CP violation in the Kaon system, confirmed by NA48 at CERN in 1999. Belle and BaBar observed indirect CP violation in the B meson system in 2002. Belle observed direct CP violation in B → ππ in 2004, but this was not confirmed by BaBar. Belle and BaBar presented evidence of direct CP violation in B → Kππ in 2004. M. Kobayashi (小林誠) and T. Maskawa (益川敏英) shared the Nobel Prize in 2008 with Y. Nambu (南部陽一郎). CP violation can't fully explain the baryon asymmetry problem. → People continue searching for new physics (NP)
  29. Machine Learning is nothing new in HEP. People in Tevatron, B-factories, LEP and LHC experiments more or less use MVA in their studies! (LL, LD → BDT, NN, .. → DL?)
  30. Era of analog ~1980 ↓ Digital Processing
  31. That's it for the physics ...
  32. The Kaggle Challenge. τ→3μ breaks lepton flavour conservation. Outline: Basic data operations (input variables; signal vs. background; correlations; K-S test, CvM test; ROC and AUC); Machine learning algorithms (event weight; training and testing; AUC score calculation); Summary. https://www.kaggle.com/c/flavours-of-physics/data Sample events: training.csv (mixed MC & data, τ→3μ); test.csv (mixed MC & data, τ→3μ); check_agreement.csv (mixed MC & data, Ds→φ(μμ)π); check_correlation.csv (real background data)
  33. The Goal. Look for the rare events of τ→3μ. The classifier should not depend too much on the difference between MC and data. The classifier should not depend too much on the τ mass. The score is the weighted area under the ROC curve (AUC)
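As a reminder of what the AUC measures, here is a minimal pure-Python sketch of the plain (unweighted) area under the ROC curve, computed as the probability that a randomly chosen signal event scores higher than a randomly chosen background event. The weighting scheme used by the challenge is not described on the slides, so it is not reproduced here; the function name `roc_auc` is chosen only for illustration.

```python
def roc_auc(labels, scores):
    """Unweighted ROC AUC via pairwise comparison (ties count as half).

    labels: iterable of 0 (background) or 1 (signal)
    scores: classifier outputs, higher means more signal-like
    """
    signal = [s for y, s in zip(labels, scores) if y == 1]
    background = [s for y, s in zip(labels, scores) if y == 0]
    if not signal or not background:
        raise ValueError("need at least one signal and one background event")

    wins = 0.0
    for s in signal:
        for b in background:
            if s > b:
                wins += 1.0
            elif s == b:
                wins += 0.5
    return wins / (len(signal) * len(background))


if __name__ == "__main__":
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

The pairwise loop is quadratic in the number of events, so it is only suitable for small samples; it is written this way to keep the definition visible.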
  34. The K-S Test. The τ→3μ process has not yet been observed, so the signal is made with MC simulation while the background comes from real data. The classifier should not pick up the difference between simulation and data. A control channel, Ds→φ(μμ)π, is used to check the similarity. The Kolmogorov-Smirnov (KS) test is used to evaluate the difference, KS = max |F_MC − F_data|, requiring KS < 0.09, where F_MC and F_data are the cumulative distribution functions of the classifier output for MC and real data
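The KS distance described on this slide is the largest gap between two empirical cumulative distribution functions. The sketch below computes it for two lists of classifier outputs in pure Python. It ignores event weights, which is a simplifying assumption, and the names `ks_statistic` and `passes_agreement` are hypothetical; only the 0.09 threshold comes from the slide.

```python
from bisect import bisect_right

KS_THRESHOLD = 0.09  # requirement quoted on the slide


def ks_statistic(sample_a, sample_b):
    """Largest absolute gap between the two empirical CDFs (unweighted)."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a = sorted(sample_a)
    b = sorted(sample_b)
    max_gap = 0.0
    # The gap can only change at observed values, so check each of them.
    for x in sorted(set(a) | set(b)):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap


def passes_agreement(mc_predictions, data_predictions):
    """True if the MC and data predictions are similar enough."""
    return ks_statistic(mc_predictions, data_predictions) < KS_THRESHOLD


if __name__ == "__main__":
    mc = [0.1, 0.4, 0.7]
    data = [0.2, 0.5, 0.9]
    print(ks_statistic(mc, data), passes_agreement(mc, data))
```

In the challenge setting the two samples would be the classifier outputs on simulated and real control-channel events.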
  35. The CvM Test. The provided background events are not τ-free. The classifier should not depend too much on the τ mass, because the τ-mass distribution could be used to extract the number of signal events. The Cramer-von Mises (CvM) test is used to test the correlation, requiring a CvM value < 0.002. Here the F are the cumulative distribution functions of the predictions for all data and for data in a given mass interval
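One way to picture the CvM check is to compare the prediction CDF inside each mass interval with the CDF over all events, and to average the squared difference. The sketch below does this in pure Python under loud assumptions: the mass intervals are equal-population bins of the mass-sorted events, the integral is approximated by an average over all events, and the name `cvm_flatness` and the parameter `n_bins` are invented for illustration. It is not the official evaluation code.

```python
from bisect import bisect_right


def cvm_flatness(predictions, masses, n_bins=3):
    """Mean squared gap between local and global prediction CDFs.

    Assumption: mass intervals are equal-population bins of the
    mass-sorted events; the challenge may define them differently.
    """
    n = len(predictions)
    if n != len(masses):
        raise ValueError("predictions and masses must have the same length")
    if n_bins < 1 or n < n_bins:
        raise ValueError("need at least one event per mass bin")

    global_sorted = sorted(predictions)
    by_mass = sorted(range(n), key=lambda i: masses[i])

    total = 0.0
    for b in range(n_bins):
        start, stop = b * n // n_bins, (b + 1) * n // n_bins
        local_sorted = sorted(predictions[i] for i in by_mass[start:stop])
        squared_gap = 0.0
        # Averaging over all events approximates integrating over dF_global.
        for x in global_sorted:
            f_global = bisect_right(global_sorted, x) / n
            f_local = bisect_right(local_sorted, x) / len(local_sorted)
            squared_gap += (f_global - f_local) ** 2
        total += squared_gap / n
    return total / n_bins


if __name__ == "__main__":
    masses = [1, 2, 3, 4, 5, 6]
    print(cvm_flatness([0.2, 0.8, 0.2, 0.8, 0.2, 0.8], masses))  # mass-independent
    print(cvm_flatness([0.1, 0.2, 0.3, 0.4, 0.5, 0.6], masses))  # follows the mass
```

A classifier whose output has the same distribution in every mass bin gives a value near zero, while one whose output tracks the mass gives a clearly larger value.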
  36. “Rules… Some of them can be bent, others are to be broken” – Morpheus https://artistotleonline.wordpress.com/category/climax/
  37. Live DEMO with Jupyter NB. Forked from the Kaggle challenge package: https://github.com/yandexdataschool/flavours-of-physics-start Now following my derived Jupyter notebook: https://github.com/yuanchao/flavours-of-physics-start/blob/master/my_baseline.ipynb
  38. Related URLs. LHC computing grid (LCG) and CERN overview video: https://cds.cern.ch/record/2020780 "Higgs ML" Kaggle Challenge: https://www.kaggle.com/c/higgs-boson "Flavours of Physics" Kaggle Challenge: https://www.kaggle.com/c/flavours-of-physics The Scale of the Universe: http://htwins.net/scale2/ Heavy Flavour Data Mining workshop: https://indico.cern.ch/event/433556/ Official jupyter NB: https://github.com/yandexdataschool/flavours-of-physics-start My derived jupyter NB: https://github.com/yuanchao/flavours-of-physics-start
  39. That's all. Thank YOU! 謝謝! Merci de votre attention!
  40. Installing Jupyter & SciPy. Set up a virtual environment (you need Python installed beforehand). Using pip: $ pip3 install virtualenv (you can also use easy_install or apt-get instead). Open a terminal and type in the following commands:
      $ virtualenv -p python3 .scienv
      $ source .scienv/bin/activate      ← activate the environment!
      $ pip3 install --upgrade pip
      $ pip3 install jupyter
      $ pip3 install scipy pandas scikit-learn      ← you get all packages
      Then start the jupyter notebook server:
      $ jupyter notebook      ← a web page will be loaded automatically
      Here we go!
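After the installation, a quick check from inside the activated environment can confirm that the packages are importable. This is a small sketch using only the standard library; the function name `check_environment` and the list of module names are assumptions made for this example (scikit-learn is imported as `sklearn`, and `matplotlib` is included because the talk uses it for plotting).

```python
from importlib.util import find_spec

# Import names to look for; this list is an assumption for the example.
REQUIRED_MODULES = ("notebook", "scipy", "pandas", "sklearn", "matplotlib")


def check_environment(modules=REQUIRED_MODULES):
    """Map each module name to True if it can be found, else False."""
    return {name: find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    for name, found in check_environment().items():
        print(f"{name:12s} {'ok' if found else 'MISSING'}")
```

Any module reported as missing can be installed with pip3 inside the same virtual environment.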
