Eamonn Maguire, CERN
hepdata.net
What is it?
HEP scattering experiments going back to the 1950s.
Each group of scientists will analyse particular signals by processing large numbers of collisions.
The resulting analysis will be published as a paper.
But where does the processed data go?
RE P P --> X
SQRT(S) IN GEV    SIG IN MB
7000              95.35 ± 0.38 (stat) ± 1.25 (sys,experimental) ± 0.37 (sys,extrapolation)
What is it?
HEPDATA
Physics paper
Table description
RE P P --> X
SQRT(S) IN GEV    SIG IN MB
7000              95.35 ± 0.38 (stat) ± 1.25 (sys,experimental) ± 0.37 (sys,extrapolation)
Table 1
Table 2 (F1)
HEPData is the go-to place for physicists to access the data underlying the plots and tables in a publication. It also links to resources used in the analysis, for instance scripts and ROOT files, to support reproducibility.
What are we doing?
Redesigning the interface from its current old-school style
Creating a new system based on Invenio
Building an interactive data visualization component
Supporting a more streamlined data submission process
Submission Archive (submission.zip)
submission.yaml
data records: {JSON} Data Record, {JSON} Data Record
External data files & links: ROOT, Python, C++
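The archive layout above can be sketched in a few lines of Python. This is only an illustration using the standard library: the record field names, the manifest keys, and the file name table1.json are assumptions made for this example, not the official HEPData submission schema.

```python
import io
import json
import zipfile

# Hypothetical data record; the field names are assumptions for illustration.
data_record = {
    "reaction": "P P --> X",
    "x": {"name": "SQRT(S)", "units": "GEV", "value": 7000},
    "y": {
        "name": "SIG",
        "units": "MB",
        "value": 95.35,
        "errors": [
            {"label": "stat", "value": 0.38},
            {"label": "sys,experimental", "value": 1.25},
            {"label": "sys,extrapolation", "value": 0.37},
        ],
    },
}

# Hypothetical manifest text; the real submission.yaml has its own schema.
submission_yaml = "name: Table 1\ndata_file: table1.json\n"


def build_archive(manifest_text, records):
    """Return the bytes of a zip holding submission.yaml plus one file per record."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("submission.yaml", manifest_text)
        for filename, record in records.items():
            archive.writestr(filename, json.dumps(record, indent=2))
    return buffer.getvalue()


archive_bytes = build_archive(submission_yaml, {"table1.json": data_record})

# Re-open the archive to list what it contains.
with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
    names = archive.namelist()
print(names)
```

Building the archive in memory keeps the sketch free of side effects; writing `archive_bytes` to a file named submission.zip would give the on-disk form.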
HEPData submission archive
RE P P --> X
SQRT(S) IN GEV    SIG IN MB
7000              95.35 ± 0.38 (stat) ± 1.25 (sys,experimental) ± 0.37 (sys,extrapolation)
HEPDATA
Table 1
{JSON}
Tables and plots
Processes the YAML file, inserts records into the database, and links the publication record with data and files.
Web Server
Table description
Plots rendered automatically using a custom library built upon D3.js
Tables rendered from JSON
Download
Scripts
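As a rough illustration of rendering a table from JSON, the sketch below turns one JSON record into the text row shown on the slides. The JSON structure and the helper name `format_row` are hypothetical, chosen for this example; they are not HEPData's actual record format or rendering code.

```python
import json

# Hypothetical JSON record; the structure is an assumption for illustration.
record_json = """
{
  "x": 7000,
  "y": 95.35,
  "errors": [
    {"label": "stat", "value": 0.38},
    {"label": "sys,experimental", "value": 1.25},
    {"label": "sys,extrapolation", "value": 0.37}
  ]
}
"""


def format_row(record):
    """Format one record as 'x y ± err (label) ...' for display in a table."""
    errors = " ".join(
        f"± {error['value']} ({error['label']})" for error in record["errors"]
    )
    return f"{record['x']} {record['y']} {errors}"


record = json.loads(record_json)
print(format_row(record))
```

Keeping each uncertainty as a labelled entry, rather than a single combined number, lets the rendered row list the statistical and systematic contributions separately.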
The System - Demo
hepdata.net
Comprehensive Review System
Dashboard for Submission Management
Interactive Plotting Library
Versioning
Sandbox
Getting Data in, getting data out…
Converter
Convert from YAML to ROOT, YODA, CSV
Install via pip, use as a web service, and contribute more conversions!
Validator
Validate the YAML input to ensure a stress-free submission
Install via pip, easy-to-use API.
Conversion to many formats
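The validate-then-convert idea can be sketched with the Python standard library alone. This is not the API of the HEPData converter or validator packages: the table layout, the required keys, and the functions `validate` and `to_csv` are all assumptions made for this example.

```python
import csv
import io

# Hypothetical in-memory table; the keys are assumptions for illustration.
table = {
    "x_header": "SQRT(S) [GEV]",
    "y_header": "SIG [MB]",
    "rows": [
        {
            "x": 7000,
            "y": 95.35,
            "errors": [
                {"label": "stat", "value": 0.38},
                {"label": "sys,experimental", "value": 1.25},
                {"label": "sys,extrapolation", "value": 0.37},
            ],
        },
    ],
}


def validate(table):
    """Return a list of problems; an empty list means the table looks usable."""
    problems = [
        f"missing key: {key}"
        for key in ("x_header", "y_header", "rows")
        if key not in table
    ]
    for index, row in enumerate(table.get("rows", [])):
        for field in ("x", "y"):
            if not isinstance(row.get(field), (int, float)):
                problems.append(f"row {index}: '{field}' must be a number")
    return problems


def to_csv(table):
    """Write one CSV column per header plus one column per error label."""
    labels = []
    for row in table["rows"]:
        for error in row.get("errors", []):
            if error["label"] not in labels:
                labels.append(error["label"])
    output = io.StringIO()
    writer = csv.writer(output, lineterminator="\n")
    writer.writerow([table["x_header"], table["y_header"]] + labels)
    for row in table["rows"]:
        by_label = {e["label"]: e["value"] for e in row.get("errors", [])}
        writer.writerow(
            [row["x"], row["y"]] + [by_label.get(label, "") for label in labels]
        )
    return output.getvalue()


problems = validate(table)
if problems:
    print("\n".join(problems))
else:
    print(to_csv(table))
```

Running validation first means the conversion step can assume well-formed input, which is the point of checking a submission before it is uploaded.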
Everything on GitHub! http://www.github.com/hepdata
Acknowledgements
HEPData @ CERN: Eamonn Maguire, Jan Stypka, Salvatore Mele
HEPData @ Durham: Graeme Watt, Michael Whalley, Frank Krauss
HEPData @ NYU: Lukas Heinrich, Kyle Cranmer
Alumni: Laura Rueda-Garcia, Michal Szoziak (Summer Student)
and all the Inspire team, including Javier Martin Montull, Jan Åge Lavik, and Samuele Kaplun, for their help!
Questions?
