Slides from my talk at Tokyo Web Mining #45.
Abstract:
Experimental particle physics has long been among the most data-intensive of all scientific fields, because it analyzes the enormous volumes of data produced by high-energy particle collisions at accelerators. At the LHC (Large Hadron Collider), the latest experiment at CERN in Switzerland, roughly 1 PB (petabyte) of data was generated in the first two years of running, and part of it was made public last year. This talk introduces how the LHC's big data was analyzed, from both the infrastructure and the application perspective. At the application level in particular, ROOT, a statistical analysis library developed in-house, is in widespread use; through this talk I would like to discuss with the audience where ROOT sits in today's data-analysis paradigm.
Talk given by Akira Shibata at Developer's Summit 2016, one of the largest conferences for software developers in Japan. Akira, a Data Scientist at DataRobot, Inc., talked about the evolution of machine learning techniques, most notably recent developments in DataRobot and TensorFlow.
The EventView framework provides a modular approach to physics analysis using common event objects. It defines an EventView (EV) container to hold the objects and information needed for analysis. Various tools operate on the EventView to perform tasks like object calibration, reconstruction, and calculation of event variables. Tools are independent modules that build up the EventView incrementally. An EVToolLooper schedules the tools and propagates the analysis information through the EventView. This framework aims to divide analysis into reusable and generalizable components.
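The container-plus-tools pattern described above can be sketched in a few lines. This is a minimal Python illustration of the idea, not the actual framework: EventView is C++ code in the Athena software stack, and every class and field name below is invented for this sketch.

```python
# Minimal sketch of the EventView pattern: an event container that tools
# build up incrementally, scheduled by a simple looper.
# All names are illustrative; the real framework is C++ in Athena.

class EventView:
    """Holds the objects and derived quantities for one event."""
    def __init__(self, raw_event):
        self.raw = raw_event   # input event record
        self.objects = {}      # reconstructed/calibrated objects
        self.user_data = {}    # analysis-level variables

class CalibrationTool:
    """Applies a (dummy) energy-scale correction to jets."""
    def execute(self, view):
        view.objects["jets"] = [e * 1.02 for e in view.raw["jet_energies"]]

class EventVariableTool:
    """Computes an event-level variable from the calibrated objects."""
    def execute(self, view):
        view.user_data["ht"] = sum(view.objects["jets"])

class EVToolLooper:
    """Runs each tool in order, propagating the shared EventView."""
    def __init__(self, tools):
        self.tools = tools
    def process(self, raw_event):
        view = EventView(raw_event)
        for tool in self.tools:
            tool.execute(view)
        return view

looper = EVToolLooper([CalibrationTool(), EventVariableTool()])
view = looper.process({"jet_energies": [100.0, 50.0]})
print(round(view.user_data["ht"], 1))  # 153.0
```

The key property this models is that each tool only sees the shared EventView, so tools can be reordered, swapped, or reused across analyses.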
This document summarizes the opportunities for studying the top quark using data from the first year of the LHC. It discusses several key areas of focus:
1. Precisely measuring top quark properties like mass, cross section, and couplings to search for signs of new physics.
2. Looking for resonant top quark production that could indicate new particles decaying to top quark pairs. Precision is needed to search for masses above 1 TeV.
3. Studying top quark production mechanisms and decay modes to search for non-standard interactions and measure Standard Model predictions with high precision.
The document emphasizes that a huge amount of work is underway to maximize the potential of early LHC data to study the top quark.
This document discusses molecular evolution at the sequence level. It provides context on molecular evolution and defines key terms like purifying selection, neutral theory, and positive selection. It describes how the genetic code works, including synonymous and nonsynonymous substitutions. Methods for estimating substitution rates and codon usage biases are introduced. Applications of molecular evolution analysis to subjects like human/primate relationships and disease origins are also mentioned.
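The distinction between synonymous and nonsynonymous substitutions mentioned above can be made concrete with a toy example. This sketch uses only a small excerpt of the standard genetic code, chosen for illustration:

```python
# Toy illustration of synonymous vs nonsynonymous substitutions:
# compare two codons through the genetic code. The table below is a
# small excerpt of the standard code, not the full 64-codon table.

CODON_TABLE = {
    "TTT": "Phe", "TTC": "Phe",
    "CTT": "Leu", "CTC": "Leu", "CTA": "Leu", "CTG": "Leu",
    "GAA": "Glu", "GAG": "Glu", "GAT": "Asp", "GAC": "Asp",
}

def classify_substitution(codon_a, codon_b):
    """Return 'synonymous' if both codons encode the same amino acid,
    'nonsynonymous' otherwise."""
    aa_a, aa_b = CODON_TABLE[codon_a], CODON_TABLE[codon_b]
    return "synonymous" if aa_a == aa_b else "nonsynonymous"

print(classify_substitution("GAA", "GAG"))  # synonymous (both Glu)
print(classify_substitution("GAA", "GAT"))  # nonsynonymous (Glu -> Asp)
```

Counting substitutions of each kind along a sequence alignment is the starting point for the substitution-rate estimates (e.g. dN/dS-style comparisons) the talk describes.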
This document discusses using Logstash to collect, process, and store application logs. It begins by describing different types of logs that are generated by applications and services. It then introduces the ELK stack, consisting of Elasticsearch, Logstash, and Kibana, to centralize, index, and visualize log data. Specific examples are provided on using the Monolog PHP logging library to instrument applications and leverage Logstash's processing pipeline to parse, enrich, and output logs to Elasticsearch.
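To make the pipeline concrete, here is a minimal sketch of the kind of one-line JSON log record that Logstash can ingest without heavy grok parsing. Monolog's JSON formatter produces records of a similar shape in PHP; this sketch uses Python's standard `logging` module instead, and the field names are illustrative, not a required schema:

```python
# Emit structured one-line JSON log records, the kind of input a
# Logstash pipeline can parse and forward to Elasticsearch directly.
import json
import logging

class JsonFormatter(logging.Formatter):
    def format(self, record):
        return json.dumps({
            "@timestamp": self.formatTime(record),
            "level": record.levelname,
            "channel": record.name,
            "message": record.getMessage(),
        })

logger = logging.getLogger("app")
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger.addHandler(handler)

logger.warning("cache miss for user %s", 42)
# emits one JSON line, e.g.:
# {"@timestamp": "...", "level": "WARNING", "channel": "app", "message": "cache miss for user 42"}
```

Emitting JSON at the source shifts work out of Logstash's filter stage: the pipeline only needs a `json` codec or filter rather than per-format grok patterns.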
The Large Hadron Collider (LHC) and ATLAS detector:
- The LHC is a large particle accelerator that collides beams of protons in a ring roughly 4.3 km in radius (about 27 km in circumference) to study particle physics.
- ATLAS is one of the main detectors at the LHC, measuring 46m long and weighing 7,000 tonnes.
- The LHC and ATLAS involve thousands of physicists from 34 countries and will collect 1 petabyte of collision data per year over 10 years of operation to study rare particles like the top quark.
Deck used for my talk at PyData NYC, in which I described how we improved thumbnail cropping in our news app, Kamelio. We used deep-learning object detection to identify the interesting regions of each image, which were subsequently fed into the image-cropping logic.
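The cropping step after detection can be sketched simply. This is a generic illustration, not the Kamelio implementation: the detector is out of scope, so the bounding box is hardcoded, and the function name and logic are invented for this sketch.

```python
# Given an "interesting region" bounding box from an object detector,
# expand it to the target thumbnail aspect ratio and clamp the crop
# to the image bounds.

def crop_to_aspect(img_w, img_h, box, target_ratio):
    """box = (x, y, w, h); return a crop (x, y, w, h) with the target
    width/height ratio, centered on the box, clipped to the image."""
    x, y, w, h = box
    cx, cy = x + w / 2, y + h / 2
    # grow the smaller dimension to reach the target aspect ratio
    if w / h < target_ratio:
        w = h * target_ratio
    else:
        h = w / target_ratio
    # clamp the centered crop to the image
    x0 = min(max(cx - w / 2, 0), img_w - w)
    y0 = min(max(cy - h / 2, 0), img_h - h)
    return (x0, y0, min(w, img_w), min(h, img_h))

# 16:9 thumbnail around a detected region at (100, 80, 200, 300)
print(crop_to_aspect(1280, 720, (100, 80, 200, 300), 16 / 9))
```

Centering on the detected region rather than the image center is the point of the whole pipeline: the thumbnail keeps the subject instead of cropping it away.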
A quick report on attending ICRA 2018 (IEEE International Conference on Robotics and Automation; https://icra2018.org/ ).
This deck covers the following topics:
・Overview of ICRA 2018
・Trends and observations at ICRA 2018
・Key technologies / key papers at ICRA?
・Papers involving AIST
・Future directions
・Paper summaries (100 papers)
Slides from my talk at the second PyData Tokyo meetup, where I reported highlights from PyData NYC, held a week earlier.
The document discusses analysis models for processing Large Hadron Collider (LHC) collision data using grid computing resources. It presents benchmark timing results for different analysis modes in ROOT like using C++, Python, and Athena. Processing derived data products like D1PD, D2PD and D3PD files with a C++ compiled analysis provides the best performance, being up to an order of magnitude faster than other modes. The document aims to help physicists optimize their analysis setup by comparing available options and estimating resource requirements.
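The interpreted-versus-compiled gap behind those benchmark numbers can be illustrated generically. This is not the ROOT/Athena benchmark itself, just a Python sketch of why a per-event interpreted loop loses to a C-backed one; absolute timings vary by machine.

```python
# Sum a per-event quantity two ways: an interpreted per-event Python
# loop vs the C-implemented builtin sum(). The builtin stands in for
# any compiled event loop; the gap is what the benchmark measures.
import time

energies = [0.5] * 1_000_000  # dummy per-event values

t0 = time.perf_counter()
total_loop = 0.0
for e in energies:            # interpreted per-event loop
    total_loop += e
t_loop = time.perf_counter() - t0

t0 = time.perf_counter()
total_builtin = sum(energies)  # compiled (C) loop
t_builtin = time.perf_counter() - t0

print(f"python loop: {t_loop:.4f}s, builtin sum: {t_builtin:.4f}s")
```

Both paths compute the same total; only the per-event overhead differs, which is the same effect that makes compiled C++ analysis the fastest mode in the document's benchmarks.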
The document discusses top quark physics that can be studied at the Large Hadron Collider (LHC). It outlines several measurements that could be made with early LHC data, including the observation of top quark production which would indicate the detectors are functioning properly. With 10 inverse picobarns of data, the top quark production cross section could be measured to around 10% precision using dilepton and semileptonic decay channels. The document also discusses issues that may affect early measurements and techniques for improving the purity of the top quark signal in kinematic selections.
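The cross-section measurement described above is, at its simplest, counting-experiment arithmetic: sigma = (N_obs - N_bkg) / (efficiency x luminosity). The numbers in this sketch are made up to show the mechanics and are not ATLAS results:

```python
# Illustrative counting-experiment cross-section measurement:
# sigma = (N_obs - N_bkg) / (efficiency * integrated luminosity).
# All input numbers below are invented for illustration.

def cross_section(n_obs, n_bkg, efficiency, luminosity_pb):
    """Return (cross section in pb, Poisson statistical error in pb)."""
    n_sig = n_obs - n_bkg
    sigma = n_sig / (efficiency * luminosity_pb)
    stat_err = n_obs ** 0.5 / (efficiency * luminosity_pb)
    return sigma, stat_err

# e.g. 10 pb^-1 of data, 5% selection efficiency,
# 250 observed events, 50 expected background events
sigma, err = cross_section(250, 50, 0.05, 10.0)
print(f"sigma = {sigma:.0f} +/- {err:.0f} pb")  # sigma = 400 +/- 32 pb
```

With these invented inputs the statistical error alone is around 8%, in the same ballpark as the ~10% precision the document quotes for 10 inverse picobarns; in practice background and efficiency systematics add to that.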