SlideShare a Scribd company logo
© 2020 MLPerf
MLPerf: An Industry Standard
Benchmark Suite for Machine
Learning
Carole-Jean Wu
carolejeanwu@fb.com
Facebook, ASU
MLPerf Inference Chair and
Recommendation Benchmark Advisory Board Chair
(Work by Many People in the MLPerf Community)
© 2020 MLPerf
Deep Learning is Fueling the HW Renaissance
2
Survey and Benchmarking of AI Accelerators. Reuther et al. MIT Lincoln Lab Supercomputing Center. Arxiv-2019
ML hardware is projected to be a ~$60B industry in 2025.
(Tractica.com $66.3B, Marketsandmarkets.com: $59.2B)
© 2020 MLPerf
How Do We Compare AI Hardware?
3
Systems
What task?
What model?
What dataset?
What batch size?
What quantization?
What software libraries?
…
© 2020 MLPerf
We Need a Benchmark for ML
4
“What get measured, gets improved.”
— Peter Drucker
Benchmarking aligns research with development,
engineering with marketing,
and competitors across the industry in
pursuit of a clear objective
© 2020 MLPerf
Agenda
5
• Why ML needs a benchmark suite?
• Are there lessons we can borrow?
• What is MLPerf?
• How does MLPerf curate a benchmark?
• What is the “science” behind the curation?
• What comes next for MLPerf?
• How can we contribute to MLPerf?
© 2020 MLPerf
Are There Lessons We Can
Borrow?
6
© 2020 MLPerf
Successful History in Benchmarks
7
© 2020 MLPerf
SPEC Impact
8
• Settled arguments in the marketplace (grow the pie)
• Resolved internal engineering debates (better investments)
• Cooperative nonprofit corporation with 22 members
• Universities join at modest cost and help drive innovation
• Became standard in marketplace, papers, and textbooks
• Needed to revise regularly to maintain usefulness:
SPEC89, SPEC92, SPEC95, SPEC2000, SPEC2006, SPEC2017
⇒ Fueled the golden age of microprocessor design
© 2020 MLPerf
How Do We Start a New
Golden Age for ML Systems?
9
© 2020 MLPerf
Agenda
10
• Why ML needs a benchmark suite?
• Are there lessons we can borrow?
• What is MLPerf?
• How does MLPerf curate a benchmark?
• Training
• Inference
• What comes next for MLPerf?
• How can we contribute to MLPerf?
© 2020 MLPerf
MLPerf is an ML Performance Benchmarking Effort with Wide
Industry and Academic Support
11
Researchers from
© 2020 MLPerf
ML Benchmark Design Overview
12
Big Questions Training Inference
1. Benchmark definition What is a benchmark task?
2. Benchmark selection Which benchmark tasks?
3. Metric definition What is performance?
4. Implementation equivalence
How do submitters run on very different hardware/software
systems?
5. Issues specific to training or
inference
Which hyperparameters can
submitters tune? Quantization, calibration,
and/or retraining?
Reduce result variance?
6. Presentation Do we normalize and/or summarize results?
© 2020 MLPerf
Training Benchmark Definition
13
Train a
model 75.9%
Target Quality1
Dataset
1. Target quality set by experts in area, raised as SOTA improves
Do we specify the model?
© 2020 MLPerf
MLPerf Training Benchmark Suite (v0.7)
14
Criteo Terabyte CTR
DLRM
(Naumov et al., 2019)
0.8025 AUC
75.9% Top-1 accuracy
23.0% mAP
0.712 Mask-LM accuracy
50% win rate
NLP
Go
(19 x 19 Board)
Wikipedia
01/01/2020
BERT
© 2020 MLPerf
Training Metric: Time to Reach Quality Target
• Quality target is specific for each benchmark and close to state-
of-the-art
• Updated w/ each release to keep up with the SOTA
• Time includes preprocessing and validation over of
N runs
• MLPerf provides the reference implementations that achieve
quality target
15
© 2020 MLPerf
Agenda
16
• Why ML needs a benchmark suite?
• Are there lessons we can borrow?
• What is MLPerf?
• How does MLPerf curate a benchmark?
• Training
• Inference
• What comes next for MLPerf?
• How can we contribute to MLPerf?
© 2020 MLPerf
Inference Benchmark Definition
17
“Leopard”
Result
(with required quality, e.g. 75.1%)
Input
Trained
ResNet
Do you specify the model? Closed Division does, Open Division does not.
© 2020 MLPerf
But How is Inference Really Used?
18
Single stream
(e.g. cell phone augmented vision)
Multiple stream
(e.g. multiple camera driving assistance)
Server
(e.g. translation app)
Offline
(e.g. photo sorting app)
© 2020 MLPerf
MLPerf Inference Benchmark Suite (v0.5)
19
Area Task Model Dataset
Vision Image classification Resnet50-v1.5 ImageNet (224x224)
Vision Image classification MobileNets-v1 224 ImageNet (224x224)
Vision Object detection SSD-ResNet34 COCO (1200x1200)
Vision Object detection SSD-MobileNets-v1 COCO (300x300)
Language Machine translation GNMT WMT16
Minimum-viable-benchmark, maximize submitters, reflect real use cases
© 2020 MLPerf
Inference Metric: One Metric for Each Scenario
20
Single stream
(e.g. cell phone
augmented vision)
Offline
(e.g. photo sorting app)
Multiple stream
(e.g. multiple camera
driving assistance)
Server
(e.g. translation app)
Latency
# of streams
subject to latency
bound
QPS
subject to latency
bound
Throughput
© 2020 MLPerf
Inference Benchmark Suite v0.5
21
Area Task Model Dataset
Vision Image classification Resnet50-v1.5 ImageNet (224x224)
Vision Image classification MobileNets-v1 224 ImageNet (224x224)
Vision Object detection SSD-ResNet34 COCO (1200x1200)
Vision Object detection SSD-MobileNets-v1 COCO (300x300)
Language Machine translation GNMT WMT16
Minimum-viable-benchmark, maximize submitters, reflect real use cases
© 2020 MLPerf
MLPerf Inference Benchmark Suite v0.7
22
Neural Network
Vision
ResNet-50 v1.5
SSD ResNet-34
SSD MobileNet v1 (edge)
3D UNET
Speech RNN-T
Language BERT Large
Commerce DLRM (datacenter)
MOBILE Neural Network
Vision
MobileNet EdgeTPU
SSD-MobileNet v2
DeepLabv3
Language Mobile-BERT
© 2020 MLPerf
MLPerf Training and Inference Call for Submissions
23
Closed division submissions
• Enables apples-to-apples comparison
• Requires using the specified model
• Limits overfitting
• Simplifies work for HW groups
Open division submissions
• Encourages innovation
• Open division allows using any model
• Ensures Closed division does not
stagnate
© 2020 MLPerf
Timeline for Benchmark Result Submission
Benchmark
Freeze
Code &
Rule Freeze
Submission
Deadline
Result
Publication
Week 0
Week -12 Week -6 Week 4
ATTEND WEEKLY SUBMITTER WORKING GROUP MEETINGS
• Discuss details on rules and
models and implementations
• Develop the SW stack
• Clarify rules
• Fine-tune SW for HW
• Pass compliance check
• Sign CLA
• Release SW
• Review submission results
• Prepare result release
© 2020 MLPerf
MLPerf Inference Results
https://mlperf.org/inference-results/
25
Benchmarks
Scenarios
Submissions
Systems
Inference
Scores
© 2020 MLPerf
MLPerf Inference v0.5 Presented Almost 600 Results Across a
Wide Range of Platform Scales
26
Normalized performance on log10 scale for models and scenarios. Results are normalized to the slowest system and
show up to a 10,000X range in performance.
MobileNet
(SS)
ResNet50
(SS)
SSD-MobileNet
(SS)
GNMT
(SS)
GNMT
(O)
GNMT
(S)
Inference
Performance
© 2020 MLPerf
Agenda
27
• Why ML needs a benchmark suite?
• Are there lessons we can borrow?
• What is MLPerf?
• How does MLPerf curate a benchmark?
• Training
• Inference
• What comes next for MLPerf?
• How can we contribute to MLPerf?
© 2020 MLPerf
MLPerf Focuses for 2020
28
Evolve the benchmark suites fairly
Improve efficiency information
Reduce result sparsity
Reduce benchmarking cost
Inference (Datacenter/Edge)
Inference (Mobile)
© 2020 MLPerf
Conclusion
29
• Benchmarking ML Systems is hard
due to the fragmented ecosystem
• MLPerf is a community-driven ML
benchmark for the HW/SW industry
• The benchmark suite helps level the
playing field, enabling ML system comparison
• Defines Tasks, Scenarios, Datasets, Methods
• Establish clear set of metrics and divisions
• Allows for hardware/software flexibility
© 2020 MLPerf
MLPerf is the Work of Many
30
Aaron Zhong David Patterson Jared Duke Peter Bailis
Abid Muslim Debajyoti Pal Jeff Jiao Peter Baldwin
Andrew Hock Debo Dutta Jeffery Liao Peter Mattson
Ankur Ankur Deepak Narayanan Jonah Alben Ramesh Chukka
Anton Lokhmotov Dehao Chen Jonathan Cohen Sachin Idgunji
Arun Rajan Dilip Sequeira Kim Hazelwood Sam Davis
Ashish Sirasao Ephrem Wu Koichi Yamada Sarah Bird
Atsushi Ike Fei Sun Lillian Pentecost Sergey Serebryakov
Bill Jia Francisco Massa Lingjie Xu Steve Farrell
Bing Yu Frank Wei Mark Charlebois Taylor Robie
Brian Anderson Gennady Pekhimenko Masafumi Yamazaki Tayo Oguntebi
Carole-Jean Wu George Yuan Matei Zaharia Thomas B. Jablin
Christine Cheng Greg Diamos Maximilien Breughe Tom St. John
Cliff Young Gu-Yeon Wei Michael Thomson Tsuguchika Tabaru
Cody Coleman Guenther Schmuelling Naveen Kumar Udit Gupta
Colin Osborne Guokai Ma Pan Deng Victor Bittorf
Daniel Kang Hanlin Tang Pankaj Kanwar Vijay Janapa Reddi
Dave Fick Itay Hubara Paulius Micikevicious William Chou
David Brooks J. Scott Gardner Peizhao Zhang Xinyuan Huang
David Kanter Jacob Balma Peng Meng Yuchen Zhou
© 2020 Your Company Name
© 2020 MLPerf
MLPerf Needs Your Help!
31
• Join the discussion community at mlperf.org
• Help us by joining a working group
• Cloud training and inference
• Mobile Inference, HPC
• On-premises scale, submitters, special topics
• Help us design submission criteria, to include the data you want
• Propose new benchmarks and data sets
• Address challenging “special topic” issues
• Submit your benchmark results!
© 2020 MLPerf
More at MLPerf.org
info@mlperf.org
32
Latest Inference Results
October 19, 2020
© 2020 MLPerf
Thank You
33
MLPerf: An Industry Standard Benchmark Suite for
Machine Learning
Carole-Jean Wu
carolejeanwu@fb.com
MLPerf Inference Chair
Recommendation Benchmark Advisory Board Chair

More Related Content

Similar to “MLPerf: An Industry Standard Performance Benchmark Suite for Machine Learning,” a Presentation from Facebook and Arizona State University

DevOpsGuys FutureDecoded 2016 - is DevOps the Answer
DevOpsGuys FutureDecoded 2016 - is DevOps the AnswerDevOpsGuys FutureDecoded 2016 - is DevOps the Answer
DevOpsGuys FutureDecoded 2016 - is DevOps the Answer
DevOpsGroup
 
Devops Strategy Roadmap Lifecycle Ppt Powerpoint Presentation Slides Complete...
Devops Strategy Roadmap Lifecycle Ppt Powerpoint Presentation Slides Complete...Devops Strategy Roadmap Lifecycle Ppt Powerpoint Presentation Slides Complete...
Devops Strategy Roadmap Lifecycle Ppt Powerpoint Presentation Slides Complete...
SlideTeam
 
CIAB Febraban - Michael Wagner
CIAB Febraban - Michael Wagner CIAB Febraban - Michael Wagner
CIAB Febraban - Michael Wagner
CNseg
 
Comparative Recommender System Evaluation: Benchmarking Recommendation Frame...
Comparative Recommender System Evaluation: Benchmarking Recommendation Frame...Comparative Recommender System Evaluation: Benchmarking Recommendation Frame...
Comparative Recommender System Evaluation: Benchmarking Recommendation Frame...
Alan Said
 
Cloud and Network Transformation using DevOps methodology : Cisco Live 2015
Cloud and Network Transformation using DevOps methodology : Cisco Live 2015Cloud and Network Transformation using DevOps methodology : Cisco Live 2015
Cloud and Network Transformation using DevOps methodology : Cisco Live 2015
Vimal Suba
 
Limited Budget but Effective End to End MLOps Practices (Machine Learning Mod...
Limited Budget but Effective End to End MLOps Practices (Machine Learning Mod...Limited Budget but Effective End to End MLOps Practices (Machine Learning Mod...
Limited Budget but Effective End to End MLOps Practices (Machine Learning Mod...
IRJET Journal
 
Aditya Bhattacharya - Enterprise DL - Accelerating Deep Learning Solutions to...
Aditya Bhattacharya - Enterprise DL - Accelerating Deep Learning Solutions to...Aditya Bhattacharya - Enterprise DL - Accelerating Deep Learning Solutions to...
Aditya Bhattacharya - Enterprise DL - Accelerating Deep Learning Solutions to...
Aditya Bhattacharya
 
Network and IT Ops Series: Build Production Solutions
Network and IT Ops Series: Build Production Solutions Network and IT Ops Series: Build Production Solutions
Network and IT Ops Series: Build Production Solutions
Neo4j
 
DevOps 2021 Research
DevOps 2021 ResearchDevOps 2021 Research
DevOps 2021 Research
Enterprise Management Associates
 
Innovate2011 DevOps TSRM RTC
Innovate2011 DevOps TSRM RTCInnovate2011 DevOps TSRM RTC
Innovate2011 DevOps TSRM RTC
Steve Speicher
 
.NET Fest 2019. Оля Гавриш. Машинное обучение для .NET программистов
.NET Fest 2019. Оля Гавриш. Машинное обучение для .NET программистов.NET Fest 2019. Оля Гавриш. Машинное обучение для .NET программистов
.NET Fest 2019. Оля Гавриш. Машинное обучение для .NET программистов
NETFest
 
Experimentation to Industrialization: Implementing MLOps
Experimentation to Industrialization: Implementing MLOpsExperimentation to Industrialization: Implementing MLOps
Experimentation to Industrialization: Implementing MLOps
Databricks
 
ICLR 2020 Recap
ICLR 2020 RecapICLR 2020 Recap
ICLR 2020 Recap
Sri Ambati
 
Microsoft Office 365
Microsoft Office 365Microsoft Office 365
Microsoft Office 365
Dipayan Das
 
PureApp Hybrid Cloud - Mark Willemse ING Presentation 11th September 2014
PureApp Hybrid Cloud - Mark Willemse ING Presentation 11th September 2014PureApp Hybrid Cloud - Mark Willemse ING Presentation 11th September 2014
PureApp Hybrid Cloud - Mark Willemse ING Presentation 11th September 2014
IBM Systems UKI
 
Сергей Баранов. Enterprise DevOps
Сергей Баранов. Enterprise DevOpsСергей Баранов. Enterprise DevOps
Сергей Баранов. Enterprise DevOps
ScrumTrek
 
Microsoft Technical Lead Resume (1)
Microsoft Technical Lead Resume (1)Microsoft Technical Lead Resume (1)
Microsoft Technical Lead Resume (1)Ritanshu Barnwal
 
MSR 2022 Foundational Contribution Award Talk: Software Analytics: Reflection...
MSR 2022 Foundational Contribution Award Talk: Software Analytics: Reflection...MSR 2022 Foundational Contribution Award Talk: Software Analytics: Reflection...
MSR 2022 Foundational Contribution Award Talk: Software Analytics: Reflection...
Tao Xie
 

Similar to “MLPerf: An Industry Standard Performance Benchmark Suite for Machine Learning,” a Presentation from Facebook and Arizona State University (20)

DevOpsGuys FutureDecoded 2016 - is DevOps the Answer
DevOpsGuys FutureDecoded 2016 - is DevOps the AnswerDevOpsGuys FutureDecoded 2016 - is DevOps the Answer
DevOpsGuys FutureDecoded 2016 - is DevOps the Answer
 
Devops Strategy Roadmap Lifecycle Ppt Powerpoint Presentation Slides Complete...
Devops Strategy Roadmap Lifecycle Ppt Powerpoint Presentation Slides Complete...Devops Strategy Roadmap Lifecycle Ppt Powerpoint Presentation Slides Complete...
Devops Strategy Roadmap Lifecycle Ppt Powerpoint Presentation Slides Complete...
 
CIAB Febraban - Michael Wagner
CIAB Febraban - Michael Wagner CIAB Febraban - Michael Wagner
CIAB Febraban - Michael Wagner
 
Comparative Recommender System Evaluation: Benchmarking Recommendation Frame...
Comparative Recommender System Evaluation: Benchmarking Recommendation Frame...Comparative Recommender System Evaluation: Benchmarking Recommendation Frame...
Comparative Recommender System Evaluation: Benchmarking Recommendation Frame...
 
Cloud and Network Transformation using DevOps methodology : Cisco Live 2015
Cloud and Network Transformation using DevOps methodology : Cisco Live 2015Cloud and Network Transformation using DevOps methodology : Cisco Live 2015
Cloud and Network Transformation using DevOps methodology : Cisco Live 2015
 
Limited Budget but Effective End to End MLOps Practices (Machine Learning Mod...
Limited Budget but Effective End to End MLOps Practices (Machine Learning Mod...Limited Budget but Effective End to End MLOps Practices (Machine Learning Mod...
Limited Budget but Effective End to End MLOps Practices (Machine Learning Mod...
 
Aditya Bhattacharya - Enterprise DL - Accelerating Deep Learning Solutions to...
Aditya Bhattacharya - Enterprise DL - Accelerating Deep Learning Solutions to...Aditya Bhattacharya - Enterprise DL - Accelerating Deep Learning Solutions to...
Aditya Bhattacharya - Enterprise DL - Accelerating Deep Learning Solutions to...
 
Network and IT Ops Series: Build Production Solutions
Network and IT Ops Series: Build Production Solutions Network and IT Ops Series: Build Production Solutions
Network and IT Ops Series: Build Production Solutions
 
DevOps 2021 Research
DevOps 2021 ResearchDevOps 2021 Research
DevOps 2021 Research
 
Innovate2011 DevOps TSRM RTC
Innovate2011 DevOps TSRM RTCInnovate2011 DevOps TSRM RTC
Innovate2011 DevOps TSRM RTC
 
.NET Fest 2019. Оля Гавриш. Машинное обучение для .NET программистов
.NET Fest 2019. Оля Гавриш. Машинное обучение для .NET программистов.NET Fest 2019. Оля Гавриш. Машинное обучение для .NET программистов
.NET Fest 2019. Оля Гавриш. Машинное обучение для .NET программистов
 
Experimentation to Industrialization: Implementing MLOps
Experimentation to Industrialization: Implementing MLOpsExperimentation to Industrialization: Implementing MLOps
Experimentation to Industrialization: Implementing MLOps
 
ICLR 2020 Recap
ICLR 2020 RecapICLR 2020 Recap
ICLR 2020 Recap
 
Microsoft Office 365
Microsoft Office 365Microsoft Office 365
Microsoft Office 365
 
Corporate Mailer
Corporate MailerCorporate Mailer
Corporate Mailer
 
PureApp Hybrid Cloud - Mark Willemse ING Presentation 11th September 2014
PureApp Hybrid Cloud - Mark Willemse ING Presentation 11th September 2014PureApp Hybrid Cloud - Mark Willemse ING Presentation 11th September 2014
PureApp Hybrid Cloud - Mark Willemse ING Presentation 11th September 2014
 
Сергей Баранов. Enterprise DevOps
Сергей Баранов. Enterprise DevOpsСергей Баранов. Enterprise DevOps
Сергей Баранов. Enterprise DevOps
 
Ravindra Prasad
Ravindra PrasadRavindra Prasad
Ravindra Prasad
 
Microsoft Technical Lead Resume (1)
Microsoft Technical Lead Resume (1)Microsoft Technical Lead Resume (1)
Microsoft Technical Lead Resume (1)
 
MSR 2022 Foundational Contribution Award Talk: Software Analytics: Reflection...
MSR 2022 Foundational Contribution Award Talk: Software Analytics: Reflection...MSR 2022 Foundational Contribution Award Talk: Software Analytics: Reflection...
MSR 2022 Foundational Contribution Award Talk: Software Analytics: Reflection...
 

More from Edge AI and Vision Alliance

“Deploying Large Models on the Edge: Success Stories and Challenges,” a Prese...
“Deploying Large Models on the Edge: Success Stories and Challenges,” a Prese...“Deploying Large Models on the Edge: Success Stories and Challenges,” a Prese...
“Deploying Large Models on the Edge: Success Stories and Challenges,” a Prese...
Edge AI and Vision Alliance
 
“Scaling Vision-based Edge AI Solutions: From Prototype to Global Deployment,...
“Scaling Vision-based Edge AI Solutions: From Prototype to Global Deployment,...“Scaling Vision-based Edge AI Solutions: From Prototype to Global Deployment,...
“Scaling Vision-based Edge AI Solutions: From Prototype to Global Deployment,...
Edge AI and Vision Alliance
 
“What’s Next in On-device Generative AI,” a Presentation from Qualcomm
“What’s Next in On-device Generative AI,” a Presentation from Qualcomm“What’s Next in On-device Generative AI,” a Presentation from Qualcomm
“What’s Next in On-device Generative AI,” a Presentation from Qualcomm
Edge AI and Vision Alliance
 
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
Edge AI and Vision Alliance
 
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
Edge AI and Vision Alliance
 
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
Edge AI and Vision Alliance
 
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
Edge AI and Vision Alliance
 
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
Edge AI and Vision Alliance
 
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
Edge AI and Vision Alliance
 
“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...
Edge AI and Vision Alliance
 
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
Edge AI and Vision Alliance
 
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
Edge AI and Vision Alliance
 
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
Edge AI and Vision Alliance
 
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
Edge AI and Vision Alliance
 
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
Edge AI and Vision Alliance
 
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
Edge AI and Vision Alliance
 
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
Edge AI and Vision Alliance
 
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
Edge AI and Vision Alliance
 
“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara
Edge AI and Vision Alliance
 
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
Edge AI and Vision Alliance
 

More from Edge AI and Vision Alliance (20)

“Deploying Large Models on the Edge: Success Stories and Challenges,” a Prese...
“Deploying Large Models on the Edge: Success Stories and Challenges,” a Prese...“Deploying Large Models on the Edge: Success Stories and Challenges,” a Prese...
“Deploying Large Models on the Edge: Success Stories and Challenges,” a Prese...
 
“Scaling Vision-based Edge AI Solutions: From Prototype to Global Deployment,...
“Scaling Vision-based Edge AI Solutions: From Prototype to Global Deployment,...“Scaling Vision-based Edge AI Solutions: From Prototype to Global Deployment,...
“Scaling Vision-based Edge AI Solutions: From Prototype to Global Deployment,...
 
“What’s Next in On-device Generative AI,” a Presentation from Qualcomm
“What’s Next in On-device Generative AI,” a Presentation from Qualcomm“What’s Next in On-device Generative AI,” a Presentation from Qualcomm
“What’s Next in On-device Generative AI,” a Presentation from Qualcomm
 
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
 
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
 
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
 
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
 
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
 
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
 
“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...
 
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
 
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
 
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
 
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
 
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
 
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
 
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
 
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
 
“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara
 
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
 

Recently uploaded

PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
Ralf Eggert
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Jeffrey Haguewood
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
OnBoard
 
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptxIOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
Abida Shariff
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
Product School
 
"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
Fwdays
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPathCommunity
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Tobias Schneck
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
ThousandEyes
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
Product School
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
Elena Simperl
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Ramesh Iyer
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
91mobiles
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
Product School
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
Paul Groth
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
Search and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical Futures
Bhaskar Mitra
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Inflectra
 

Recently uploaded (20)

PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptxIOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
Search and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical Futures
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 

“MLPerf: An Industry Standard Performance Benchmark Suite for Machine Learning,” a Presentation from Facebook and Arizona State University

  • 1. © 2020 MLPerf MLPerf: An Industry Standard Benchmark Suite for Machine Learning Carole-Jean Wu carolejeanwu@fb.com Facebook, ASU MLPerf Inference Chair and Recommendation Benchmark Advisory Board Chair (Work by Many People in the MLPerf Community)
  • 2. © 2020 MLPerf Deep Learning is Fueling the HW Renaissance 2 Survey and Benchmarking of AI Accelerators. Reuther et al. MIT Lincoln Lab Supercomputing Center. Arxiv-2019 ML hardware is projected to be a ~$60B industry in 2025. (Tractica.com $66.3B, Marketsandmarkets.com: $59.2B)
  • 3. © 2020 MLPerf How Do We Compare AI Hardware? 3 Systems What task? What model? What dataset? What batch size? What quantization? What software libraries? …
  • 4. © 2020 MLPerf We Need a Benchmark for ML 4 “What get measured, gets improved.” — Peter Drucker Benchmarking aligns research with development, engineering with marketing, and competitors across the industry in pursuit of a clear objective
  • 5. © 2020 MLPerf Agenda 5 • Why ML needs a benchmark suite? • Are there lessons we can borrow? • What is MLPerf? • How does MLPerf curate a benchmark? • What is the “science” behind the curation? • What comes next for MLPerf? • How can we contribute to MLPerf?
  • 6. © 2020 MLPerf Are There Lessons We Can Borrow? 6
  • 7. © 2020 MLPerf Successful History in Benchmarks 7
  • 8. © 2020 MLPerf SPEC Impact 8 • Settled arguments in the marketplace (grow the pie) • Resolved internal engineering debates (better investments) • Cooperative nonprofit corporation with 22 members • Universities join at modest cost and help drive innovation • Became standard in marketplace, papers, and textbooks • Needed to revise regularly to maintain usefulness: SPEC89, SPEC92, SPEC95, SPEC2000, SPEC2006, SPEC2017 ⇒ Fueled the golden age of microprocessor design
  • 9. © 2020 MLPerf How Do We Start a New Golden Age for ML Systems? 9
  • 10. © 2020 MLPerf Agenda 10 • Why ML needs a benchmark suite? • Are there lessons we can borrow? • What is MLPerf? • How does MLPerf curate a benchmark? • Training • Inference • What comes next for MLPerf? • How can we contribute to MLPerf?
  • 11. © 2020 MLPerf MLPerf is an ML Performance Benchmarking Effort with Wide Industry and Academic Support 11 Researchers from
  • 12. © 2020 MLPerf ML Benchmark Design Overview 12 Big Questions Training Inference 1. Benchmark definition What is a benchmark task? 2. Benchmark selection Which benchmark tasks? 3. Metric definition What is performance? 4. Implementation equivalence How do submitters run on very different hardware/software systems? 5. Issues specific to training or inference Which hyperparameters can submitters tune? Quantization, calibration, and/or retraining? Reduce result variance? 6. Presentation Do we normalize and/or summarize results?
  • 13. © 2020 MLPerf Training Benchmark Definition 13 Train a model 75.9% Target Quality1 Dataset 1. Target quality set by experts in area, raised as SOTA improves Do we specify the model?
  • 14. © 2020 MLPerf MLPerf Training Benchmark Suite (v0.7) 14 Criteo Terabyte CTR DLRM (Naumov et al., 2019) 0.8025 AUC 75.9% Top-1 accuracy 23.0% mAP 0.712 Mask-LM accuracy 50% win rate NLP Go (19 x 19 Board) Wikipedia 01/01/2020 BERT
  • 15. © 2020 MLPerf Training Metric: Time to Reach Quality Target • Quality target is specific for each benchmark and close to state- of-the-art • Updated w/ each release to keep up with the SOTA • Time includes preprocessing and validation over of N runs • MLPerf provides the reference implementations that achieve quality target 15
  • 16. © 2020 MLPerf Agenda 16 • Why ML needs a benchmark suite? • Are there lessons we can borrow? • What is MLPerf? • How does MLPerf curate a benchmark? • Training • Inference • What comes next for MLPerf? • How can we contribute to MLPerf?
  • 17. © 2020 MLPerf Inference Benchmark Definition 17 “Leopard” Result (with required quality, e.g. 75.1%) Input Trained ResNet Do you specify the model? Closed Division does, Open Division does not.
  • 18. © 2020 MLPerf But How is Inference Really Used? 18 Single stream (e.g. cell phone augmented vision) Multiple stream (e.g. multiple camera driving assistance) Server (e.g. translation app) Offline (e.g. photo sorting app)
  • 19. © 2020 MLPerf MLPerf Inference Benchmark Suite (v0.5) 19 Area Task Model Dataset Vision Image classification Resnet50-v1.5 ImageNet (224x224) Vision Image classification MobileNets-v1 224 ImageNet (224x224) Vision Object detection SSD-ResNet34 COCO (1200x1200) Vision Object detection SSD-MobileNets-v1 COCO (300x300) Language Machine translation GNMT WMT16 Minimum-viable-benchmark, maximize submitters, reflect real use cases
  • 20. © 2020 MLPerf Inference Metric: One Metric for Each Scenario 20 Single stream (e.g. cell phone augmented vision) Offline (e.g. photo sorting app) Multiple stream (e.g. multiple camera driving assistance) Server (e.g. translation app) Latency # of streams subject to latency bound QPS subject to latency bound Throughput
  • 21. © 2020 MLPerf Inference Benchmark Suite v0.5 21 Area Task Model Dataset Vision Image classification Resnet50-v1.5 ImageNet (224x224) Vision Image classification MobileNets-v1 224 ImageNet (224x224) Vision Object detection SSD-ResNet34 COCO (1200x1200) Vision Object detection SSD-MobileNets-v1 COCO (300x300) Language Machine translation GNMT WMT16 Minimum-viable-benchmark, maximize submitters, reflect real use cases
  • 22. © 2020 MLPerf MLPerf Inference Benchmark Suite v0.7 22 Neural Network Vision ResNet-50 v1.5 SSD ResNet-34 SSD MobileNet v1 (edge) 3D UNET Speech RNN-T Language BERT Large Commerce DLRM (datacenter) MOBILE Neural Network Vision MobileNet EdgeTPU SSD-MobileNet v2 DeepLabv3 Language Mobile-BERT
  • 23. © 2020 MLPerf MLPerf Training and Inference Call for Submissions 23 Closed division submissions • Enables apples-to-apples comparison • Requires using the specified model • Limits overfitting • Simplifies work for HW groups Open division submissions • Encourages innovation • Open division allows using any model • Ensures Closed division does not stagnate
  • 24. © 2020 MLPerf Timeline for Benchmark Result Submission Benchmark Freeze Code & Rule Freeze Submission Deadline Result Publication Week 0 Week -12 Week -6 Week 4 ATTEND WEEKLY SUBMITTER WORKING GROUP MEETINGS • Discuss details on rules and models and implementations • Develop the SW stack • Clarify rules • Fine-tune SW for HW • Pass compliance check • Sign CLA • Release SW • Review submission results • Prepare result release
  • 25. © 2020 MLPerf MLPerf Inference Results https://mlperf.org/inference-results/ 25 Benchmarks Scenarios Submissions Systems Inference Scores
  • 26. © 2020 MLPerf MLPerf Inference v0.5 Presented Almost 600 Results Across a Wide Range of Platform Scales 26 Normalized performance on log10 scale for models and scenarios. Results are normalized to the slowest system and show up to a 10,000X range in performance. MobileNet (SS) ResNet50 (SS) SSD-MobileNet (SS) GNMT (SS) GNMT (O) GNMT (S) Inference Performance
  • 27. © 2020 MLPerf Agenda 27 • Why ML needs a benchmark suite? • Are there lessons we can borrow? • What is MLPerf? • How does MLPerf curate a benchmark? • Training • Inference • What comes next for MLPerf? • How can we contribute to MLPerf?
  • 28. © 2020 MLPerf MLPerf Focuses for 2020 28 Evolve the benchmark suites fairly Improve efficiency information Reduce result sparsity Reduce benchmarking cost Inference (Datacenter/Edge) Inference (Mobile)
  • 29. © 2020 MLPerf Conclusion 29 • Benchmarking ML Systems is hard due to the fragmented ecosystem • MLPerf is a community-driven ML benchmark for the HW/SW industry • The benchmark suite helps level the playing field, enabling ML system comparison • Defines Tasks, Scenarios, Datasets, Methods • Establish clear set of metrics and divisions • Allows for hardware/software flexibility
  • 30. © 2020 MLPerf MLPerf is the Work of Many 30 Aaron Zhong David Patterson Jared Duke Peter Bailis Abid Muslim Debajyoti Pal Jeff Jiao Peter Baldwin Andrew Hock Debo Dutta Jeffery Liao Peter Mattson Ankur Ankur Deepak Narayanan Jonah Alben Ramesh Chukka Anton Lokhmotov Dehao Chen Jonathan Cohen Sachin Idgunji Arun Rajan Dilip Sequeira Kim Hazelwood Sam Davis Ashish Sirasao Ephrem Wu Koichi Yamada Sarah Bird Atsushi Ike Fei Sun Lillian Pentecost Sergey Serebryakov Bill Jia Francisco Massa Lingjie Xu Steve Farrell Bing Yu Frank Wei Mark Charlebois Taylor Robie Brian Anderson Gennady Pekhimenko Masafumi Yamazaki Tayo Oguntebi Carole-Jean Wu George Yuan Matei Zaharia Thomas B. Jablin Christine Cheng Greg Diamos Maximilien Breughe Tom St. John Cliff Young Gu-Yeon Wei Michael Thomson Tsuguchika Tabaru Cody Coleman Guenther Schmuelling Naveen Kumar Udit Gupta Colin Osborne Guokai Ma Pan Deng Victor Bittorf Daniel Kang Hanlin Tang Pankaj Kanwar Vijay Janapa Reddi Dave Fick Itay Hubara Paulius Micikevicious William Chou David Brooks J. Scott Gardner Peizhao Zhang Xinyuan Huang David Kanter Jacob Balma Peng Meng Yuchen Zhou © 2020 Your Company Name
  • 31. © 2020 MLPerf MLPerf Needs Your Help! 31 • Join the discussion community at mlperf.org • Help us by joining a working group • Cloud training and inference • Mobile Inference, HPC • On-premises scale, submitters, special topics • Help us design submission criteria, to include the data you want • Propose new benchmarks and data sets • Address challenging “special topic” issues • Submit your benchmark results!
  • 32. © 2020 MLPerf More at MLPerf.org info@mlperf.org 32 Latest Inference Results October 19, 2020
  • 33. © 2020 MLPerf Thank You 33 MLPerf: An Industry Standard Benchmark Suite for Machine Learning Carole-Jean Wu carolejeanwu@fb.com MLPerf Inference Chair Recommendation Benchmark Advisory Board Chair