Data Generation with PROSPECT: A Probability Specification Tool
Alan Ismaiel, Ivan Ruchkin, Jason Shu, Oleg Sokolsky, Insup Lee
University of Pennsylvania Computer and Information Science Department
Winter Simulation Conference 2021
December 14th, 2021
Motivating Scenario: Autonomous Car
• An engineering team is building an autonomous cleaning vehicle
• The team intends to simulate the vehicle in desired conditions:
  • Time of day is determined by the cleaning schedule
  • Lane occupancy is determined by the parking ticket history
  • Obstacle detection rate differs by time of day
How should they simulate the conditions?
Motivating Scenario: Network Latency
• A network monitor estimates latency based on the N latest ping delays
• Need simple synthetic data to test the monitor
• Goal: quickly generate a simple dataset of
  • Observed ping delays
  • Underlying network latencies
• Desirable properties of the dataset:
  • Low latency on average
  • Ping delays change occasionally over time
  • High latency sometimes leads to high ping delays
How to generate this dataset?
Automated Data Generation
• Increasingly important: testing complex systems, deep learning
• Obtaining real data is often infeasible or impractical
• Many information sources: requirements, common-sense constraints, intuition, known statistics
Current data generation tools:
• Tailored to a specific model
• Imperative sampling
• Little support for arbitrary constraints
• High complexity
Problem
Declaratively specify and automatically sample a discrete temporal distribution under known constraints:
● Algebraic constraints on marginal/joint/conditional probabilities
● Conditional/unconditional independence
● Temporal relations
Intractable in general
Approach Overview
1) Define tractable cases with a shared underlying model
2) The user specifies a distribution in our high-level declarative language
3) The specification is translated into polynomial equations
4) The system of polynomial equations is solved algebraically
5) If the solution defines a unique distribution, we sample it
Discrete Time Markov Chains (DTMCs)
DTMC: A discrete stochastic process that adheres to the Markov Property,
where conditional probabilities of future states of the process depend only
on the present state.
Arbitrary DTMCs are difficult to specify
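As a rough illustration of the Markov property, the sketch below samples a trajectory from a small DTMC using only the Python standard library. The two states, the probabilities, and the helper name `sample_dtmc` are hypothetical and chosen for illustration; they are not taken from PROSPECT.

```python
import random

def sample_dtmc(initial, transition, steps, rng):
    """Sample a state trajectory of length `steps` from a DTMC.

    initial:    dict mapping state -> probability of starting there
    transition: dict mapping state -> dict of next-state probabilities
    """
    states = list(initial)
    state = rng.choices(states, weights=[initial[s] for s in states])[0]
    trajectory = [state]
    for _ in range(steps - 1):
        row = transition[state]
        successors = list(row)
        # The next state depends only on the current state (Markov property).
        state = rng.choices(successors, weights=[row[s] for s in successors])[0]
        trajectory.append(state)
    return trajectory

# Hypothetical two-state latency chain; the numbers are illustrative only.
initial = {"low": 0.9, "high": 0.1}
transition = {
    "low": {"low": 0.95, "high": 0.05},
    "high": {"low": 0.60, "high": 0.40},
}
trajectory = sample_dtmc(initial, transition, steps=20, rng=random.Random(0))
print(trajectory)
```

Even this tiny chain needs a full initial distribution and a full transition table to be written out by hand, which hints at why arbitrary DTMCs are laborious to specify.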
Three Case Types (I)
Static Case: Time is irrelevant, sampling is conducted i.i.d.
Time-Invariant Case: Sampling is not independent, but the temporal
distributions don’t change over time.
Time-Variant Case: Sampling is not independent, and the temporal
distributions change over time.
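One way to read the three cases is as three kinds of per-step sampling rules behind a common interface. The sketch below assumes a hypothetical rule signature `step_dist(t, prev)` and made-up probabilities; it illustrates the distinction and is not PROSPECT's implementation.

```python
import random

def draw(dist, rng):
    """Draw one value from a dict mapping value -> probability."""
    values = list(dist)
    return rng.choices(values, weights=[dist[v] for v in values])[0]

def sample(step_dist, steps, rng):
    """Sample `steps` values; step_dist(t, prev) gives the distribution at time t."""
    out, prev = [], None
    for t in range(steps):
        prev = draw(step_dist(t, prev), rng)
        out.append(prev)
    return out

def other(value):
    return "night" if value == "day" else "day"

# Static: ignores both the time step and the previous sample (i.i.d.).
def static(t, prev):
    return {"day": 0.7, "night": 0.3}

# Time-invariant: depends on the previous sample, but the rule never changes.
def time_invariant(t, prev):
    if prev is None:
        return {"day": 0.7, "night": 0.3}
    return {prev: 0.9, other(prev): 0.1}

# Time-variant: depends on the previous sample and on the time step itself.
def time_variant(t, prev):
    if prev is None:
        return {"day": 0.7, "night": 0.3}
    stay = max(0.5, 0.9 - 0.05 * t)  # persistence decays as t grows
    return {prev: stay, other(prev): 1.0 - stay}

rng = random.Random(0)
for rule in (static, time_invariant, time_variant):
    print(rule.__name__, sample(rule, 8, rng))
```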
Three Case Types (II)
Specification Language: Scenario
Specification: Case and Variables
[Figure: DSL listing]
Specification: Independence
[Figure: DSL listing]
Specification: Probability Constraints
[Figure: DSL listing]
Parameterizing Specifications
Goal: Parameterize all the probability specifications into algebraic equations
We define O-Parameters to represent the probabilities of elementary events over the user-defined sample space.
Every syntax element can be expressed with O-Parameters:
• Parameterize conditional and unconditional event probabilities
• Parameterize conditional and unconditional independence
• Parameterize the stationary assumption (time invariant case)
• Parameterize recursive probability specifications (time variant case)
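As a small worked example (not from the talk), assume a hypothetical static specification over two binary variables A and B, and write o_{ab} for the O-Parameter P(A=a, B=b). The constraint values 0.3 and 0.8 are made up for illustration. Each statement then becomes a polynomial equation in the four O-Parameters:

```latex
\begin{align*}
o_{00} + o_{01} + o_{10} + o_{11} &= 1
  && \text{probabilities sum to one} \\
o_{10} + o_{11} &= 0.3
  && P(A{=}1) = 0.3 \\
o_{11} &= 0.8\,(o_{10} + o_{11})
  && P(B{=}1 \mid A{=}1) = 0.8 \\
o_{11} &= (o_{10} + o_{11})(o_{01} + o_{11})
  && A \text{ and } B \text{ independent}
\end{align*}
```

Solved together, these four equations have exactly one solution that is a valid distribution: o_{00} = 0.14, o_{01} = 0.56, o_{10} = 0.06, o_{11} = 0.24.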
Motivating Scenario Parameters (1)
Motivating Scenario Parameters (2)
Motivating Scenario Parameters (3)
Solving a System of Equations
Goal: Find unique distribution parameters satisfying the equations from step 3
Relies on Buchberger’s Algorithm and Cylindrical Algebraic Decomposition, solving algorithms that are guaranteed to terminate.
Our implementation, based on Wolfram Mathematica, automatically picks an appropriate solving algorithm and returns solutions over the complex numbers
• We only consider solutions that define valid probability distributions
Solving for Unique Solutions
Solving algorithm returns a set S of probability distributions.
● |S| = 1: the distribution is unique
● |S| > 1: the distribution is underspecified (not enough constraints for a unique distribution)
● |S| = 0: the distribution is overspecified (conflicting constraints, no distribution possible)
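The filtering and classification above can be sketched as follows, assuming the solver has already returned a finite list of candidate parameter vectors, possibly complex. The function names and the tolerance are hypothetical. An underspecified system may in practice have infinitely many solutions, which this simplified sketch does not model.

```python
def is_distribution(params, tol=1e-9):
    """True if all params are real, lie in [0, 1], and sum to 1 (within tol)."""
    values = [complex(p) for p in params]
    if any(abs(v.imag) > tol for v in values):
        return False
    reals = [v.real for v in values]
    in_range = all(-tol <= r <= 1 + tol for r in reals)
    return in_range and abs(sum(reals) - 1) <= tol

def classify(candidates):
    """Classify a finite candidate list by how many valid distributions it holds."""
    valid = [c for c in candidates if is_distribution(c)]
    if len(valid) == 1:
        return "unique", valid
    if len(valid) > 1:
        return "underspecified", valid
    return "overspecified", valid

# Illustrative candidates: only the first is a valid probability distribution.
candidates = [
    (0.3, 0.7),
    (1.3, -0.3),                # sums to 1 but leaves the [0, 1] range
    (0.5 + 0.5j, 0.5 - 0.5j),   # sums to 1 but is not real-valued
]
status, valid = classify(candidates)
print(status, valid)
```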
Solving the Motivating Scenario
PROSPECT
PROSPECT is a software tool that takes an input written in the specification language and outputs generated data.
https://prospect.precise.seas.upenn.edu
https://github.com/bisc/prospect
PROSPECT (Pre-Recorded Demo)
Evaluation
Goal: Compare the required manual effort, length of code/specification, and accuracy of data generation between the PROSPECT approach and a probabilistic programming language (PPL) baseline
For each scenario, we wrote two data generation programs in the PPL Pyro v1.5.1:
1. Accurate solution: correctly interprets specifications and assumptions, manually inferring the intended distribution
2. Naive solution: demonstrates plausible errors by ignoring implicit dependencies between variables
Evaluation: Length of Code/Spec
PROSPECT specifications were substantially more succinct than probabilistic programs, achieving a 2-3x reduction in line count
Evaluation: Sampling Accuracy
[Figure: sampling accuracy of the naive baseline, the accurate baseline, and PROSPECT]
The accurate baseline and PROSPECT were statistically indistinguishable on a full sample of 10000 points; both obtained more accurate results than the naive baseline
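The slides do not say which statistical comparison was used. As one illustrative way to check sampling accuracy, the sketch below measures the total variation distance between the empirical frequencies of a sample and a target distribution. The target probabilities and the helper name `total_variation` are assumptions made for this example.

```python
import random
from collections import Counter

def total_variation(samples, target):
    """Total variation distance between empirical frequencies and a target."""
    counts = Counter(samples)
    n = len(samples)
    support = set(target) | set(counts)
    return 0.5 * sum(abs(counts[v] / n - target.get(v, 0.0)) for v in support)

# Hypothetical target distribution; 10000 points mirrors the sample size above.
target = {"low": 0.9, "high": 0.1}
rng = random.Random(0)
values = list(target)
samples = rng.choices(values, weights=[target[v] for v in values], k=10000)
tv = total_variation(samples, target)
print(f"total variation distance: {tv:.4f}")
```

A distance near 0 means the sample tracks the target closely; a generator that ignores a dependency between variables would show a larger distance on the affected joint probabilities.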
Future Work
• Syntax extensions for broader sampling settings:
  • Continuous parametric distributions
  • Probabilities constrained by variable values
• Semantic extensions for under-specified distributions:
  • Resolving ambiguity with meta-models
  • Tuning to available data
• Tool extensions for usability:
  • Conditional termination of sampling
  • When over-specified, return the minimal conflicting sub-spec
Conclusion
Our contributions:
1. A specification language for discrete distributions
2. An algebraic inference approach for distributions from the specifications
3. A software tool PROSPECT that implements the language and interface
4. An evaluation of PROSPECT on 3 case studies
We believe this approach can be used for simulation, probabilistic reasoning,
design and analysis, and other tasks that require probabilistic specifications.
https://prospect.precise.seas.upenn.edu
Related Works
DSLs: SESSL, NEDL build deterministic designs with predefined patterns
• PROSPECT samples potentially complex random designs and complements this work
Simulators: CARLA, Udacity, X-Plane, AirSim
• Focus on specific discrete domains, whereas PROSPECT works at any level of abstraction
Graphical Models: Markov/Bayesian Networks
• PROSPECT represents a domain-agnostic approach that can create graphical models
PPLs: Pyro, Scenic
• Stronger focus on inferring a model given a program and dataset, whereas PROSPECT relies on explicit declarative specifications
Copulas: ARTA, NORTA, VARTA; Stochastic Programming: SAMPL
• Focus on continuous rather than discrete distributions and require knowledge/data to choose models; PROSPECT does not

Towards a Formal Framework for Hybrid Planning in Self-AdaptationTowards a Formal Framework for Hybrid Planning in Self-Adaptation
Towards a Formal Framework for Hybrid Planning in Self-Adaptation
 



Data Generation with PROSPECT: a Probability Specification Tool

  • 1. Data Generation with PROSPECT: A Probability Specification Tool Alan Ismaiel, Ivan Ruchkin, Jason Shu, Oleg Sokolsky, Insup Lee University of Pennsylvania Computer and Information Science Department Winter Simulation Conference 2021 December 14th, 2021 1
  • 2. Motivating Scenario: Autonomous Car • An engineering team is building an autonomous cleaning vehicle • The team intends to simulate the vehicle in desired conditions: • Time of day is determined by the cleaning schedule • Lane occupancy is determined by the parking ticket history • Obstacle detection rate differs by time of day How should they simulate the conditions?
  • 3. Motivating Scenario: Network Latency • A network monitor estimates latency based on the N latest ping delays • Need simple synthetic data to test the monitor • Goal: quickly generate a simple dataset of • Observed ping delays • Underlying network latencies • Desirable properties of the dataset: • Low latency on average • Ping delays change occasionally over time • High latency sometimes leads to high ping delays How to generate this dataset?
  • 4. Automated Data Generation • Increasingly important: testing complex systems, deep learning • Obtaining real data is often infeasible or impractical • Many information sources: requirements, common-sense constraints, intuition, known statistics Current data generation tools: • Tailored to a specific model • Imperative sampling • Little support for arbitrary constraints • High complexity
  • 5. Problem Declaratively specify and automatically sample a discrete temporal distribution under known constraints: • Algebraic constraints on marginal/joint/conditional probabilities • Conditional/unconditional independence • Temporal relations This problem is intractable in general.
  • 6. Approach Overview 1) Define tractable cases with a shared underlying model 2) The user specifies a distribution in our high-level declarative language 3) The specification is translated into polynomial equations 4) The system of polynomial equations is solved algebraically 5) If the solution defines a unique distribution, we sample it 6
  • 8. Discrete Time Markov Chains (DTMCs) DTMC: A discrete-time stochastic process that satisfies the Markov property: the conditional probabilities of future states depend only on the present state. Arbitrary DTMCs are difficult to specify.
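
The Markov property can be illustrated with a minimal Python sketch. This is not PROSPECT code: the two-state latency model, its probabilities, and the helper name `sample_dtmc` are all assumptions chosen for illustration.

```python
import random

# Hypothetical two-state latency model; the probabilities are illustrative only.
TRANSITIONS = {
    "low":  {"low": 0.9, "high": 0.1},
    "high": {"low": 0.5, "high": 0.5},
}

def sample_dtmc(transitions, start, steps, rng):
    """Return a trajectory of `steps` states, beginning at `start`."""
    state = start
    path = [state]
    for _ in range(steps - 1):
        successors = list(transitions[state])
        weights = [transitions[state][s] for s in successors]
        # The next state is drawn using only the current state (Markov property).
        state = rng.choices(successors, weights=weights, k=1)[0]
        path.append(state)
    return path

path = sample_dtmc(TRANSITIONS, "low", 10, random.Random(0))
```

Even this small chain needs a full transition table written out by hand, which hints at why arbitrary DTMCs are hard to specify directly.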
  • 9. Three Case Types (I) Static Case: Time is irrelevant, sampling is conducted i.i.d. Time-Invariant Case: Sampling is not independent, but the temporal distributions don’t change over time. Time-Variant Case: Sampling is not independent, and the temporal distributions change over time. 9
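
As a rough sketch of how the three cases differ, the Python below samples a single binary variable under each of them. The function names and all probabilities are hypothetical and are not taken from PROSPECT; the only point is which arguments each rule is allowed to depend on.

```python
import random

def sample_binary(prob_one, steps, rng, initial=0):
    """Sample a 0/1 sequence; prob_one(t, prev) gives P(X_t = 1 | X_{t-1} = prev)."""
    prev, seq = initial, []
    for t in range(steps):
        prev = 1 if rng.random() < prob_one(t, prev) else 0
        seq.append(prev)
    return seq

def static(t, prev):
    # Static case: ignores both time and the previous value (i.i.d. draws).
    return 0.2

def time_invariant(t, prev):
    # Time-invariant case: depends on the previous value, same rule at every step.
    return 0.7 if prev == 1 else 0.1

def time_variant(t, prev):
    # Time-variant case: the rule itself changes with the step index t.
    return 0.7 if prev == 1 else min(1.0, 0.1 + 0.05 * t)

rng = random.Random(42)
samples = {f.__name__: sample_binary(f, 20, rng)
           for f in (static, time_invariant, time_variant)}
```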
  • 10. Three Case Types (II) 10
  • 11. Approach Overview 1) Define tractable cases with a shared underlying model 2) The user specifies a distribution in our high-level declarative language 3) The specification is translated into polynomial equations 4) The system of polynomial equations is solved algebraically 5) If the solution defines a unique distribution, we sample it 11
  • 13. Specification: Case and Variables DSL 13
  • 16. Approach Overview 1) Define tractable cases with a shared underlying model 2) The user specifies a distribution in our high-level declarative language 3) The specification is translated into polynomial equations 4) The system of polynomial equations is solved algebraically 5) If the solution defines a unique distribution, we sample it 16
  • 17. Parameterizing Specifications Goal: Parameterize all the probability specifications into algebraic equations We define O-Parameters to represent the probabilities of elementary events over the user’s defined sample space. Every syntax element can be expressed with O-Parameters: • Parameterize conditional and unconditional event probabilities • Parameterize conditional and unconditional independence • Parameterize the stationary assumption (time invariant case) • Parameterize recursive probability specifications (time variant case) 17
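
A small worked example may make the parameterization concrete. It assumes a static case with two binary variables A and B; the symbol o_{ab} for the probability of the elementary event (A = a, B = b) and the three example specifications are illustrative choices, not taken from the slides.

```latex
\begin{aligned}
&\text{Axioms:} && o_{00} + o_{01} + o_{10} + o_{11} = 1, \qquad o_{ab} \ge 0 \\
&P(A{=}1) = 0.3: && o_{10} + o_{11} = 0.3 \\
&P(B{=}1 \mid A{=}1) = 0.5: && o_{11} = 0.5\,(o_{10} + o_{11}) \\
&A \perp B: && o_{11} = (o_{10} + o_{11})(o_{01} + o_{11}) \\[4pt]
&\text{Solution:} && o_{11} = 0.15,\quad o_{10} = 0.15,\quad o_{01} = 0.35,\quad o_{00} = 0.35
\end{aligned}
```

The conditional probability becomes linear once both sides are multiplied by the probability of the condition, while independence yields a degree-two polynomial, which is why the resulting system is polynomial rather than linear. In this example the four equations admit exactly one valid solution.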
  • 21. Approach Overview 1) Define tractable cases with a shared underlying model 2) The user specifies a distribution in our high-level declarative language 3) The specification is translated into polynomial equations 4) The system of polynomial equations is solved algebraically 5) If the solution defines a unique distribution, we sample it 21
  • 22. Solving a System of Equations Goal: Find unique distribution parameters satisfying the equations from step 3. Relies on Buchberger’s Algorithm and Cylindrical Algebraic Decomposition, solving algorithms that are guaranteed to terminate. Our implementation, based on Wolfram Mathematica, automatically picks an appropriate solving algorithm and returns solutions over the complex numbers. • We only consider solutions that define valid probability distributions
  • 23. Solving for Unique Solutions The solving algorithm returns a set S of probability distributions. • |S| = 1: the distribution is unique • |S| > 1: the distribution is underspecified (not enough constraints for a unique distribution) • |S| = 0: the distribution is overspecified (conflicting constraints, no distribution possible)
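
The filtering and classification steps can be sketched in Python as follows. This is an assumption-laden illustration, not the Mathematica-based implementation: it treats the solver output as a finite list of candidate solutions (an underspecified system would typically yield an infinite family instead), and the helper names `is_distribution` and `classify` are hypothetical.

```python
def is_distribution(candidate, tol=1e-9):
    """True if all values are numerically real, lie in [0, 1], and sum to 1."""
    values = [complex(v) for v in candidate.values()]
    if any(abs(v.imag) > tol for v in values):
        return False
    reals = [v.real for v in values]
    in_range = all(-tol <= r <= 1 + tol for r in reals)
    return in_range and abs(sum(reals) - 1) <= tol

def classify(candidates):
    """Keep only valid distributions, then label the set by its size."""
    valid = [c for c in candidates if is_distribution(c)]
    if len(valid) == 1:
        return "unique", valid
    if not valid:
        return "overspecified", valid
    return "underspecified", valid

# Hypothetical solver output: each candidate maps an elementary event to a value.
candidates = [
    {"x=0": 0.3, "x=1": 0.7},                # valid distribution
    {"x=0": 1.3, "x=1": -0.3},               # real, but outside [0, 1]
    {"x=0": 0.5 + 0.5j, "x=1": 0.5 - 0.5j},  # complex-valued
]
label, valid = classify(candidates)
```

Here only the first candidate survives the filter, so the specification would be treated as defining a unique distribution.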
  • 24. Solving the Motivating Scenario 24
  • 25. Approach Overview 1) Define tractable cases with a shared underlying model 2) The user specifies a distribution in our high-level declarative language 3) The specification is translated into polynomial equations 4) The system of polynomial equations is solved algebraically 5) If the solution defines a unique distribution, we sample it 25
  • 26. PROSPECT PROSPECT is a software tool that takes an input written in the specification language and outputs generated data. https://prospect.precise.seas.upenn.edu https://github.com/bisc/prospect
  • 28. Evaluation Goal: Compare the required manual effort, length of code/specification, and accuracy of data generation between the PROSPECT approach and a probabilistic programming language (PPL) baseline. For each scenario, we wrote two data generation programs in the PPL Pyro v1.5.1: 1. Accurate solution: correctly interprets specifications and assumptions, manually inferring the intended distribution 2. Naive solution: demonstrates plausible errors by ignoring implicit dependencies between variables
  • 29. Evaluation: Length of Code/Spec PROSPECT specifications were substantially more succinct than the probabilistic programs, achieving a 2-3x reduction in line count.
  • 30. Evaluation: Sampling Accuracy [Figure panels: Naive Baseline, Accurate Baseline, PROSPECT] The accurate baseline and PROSPECT were statistically indistinguishable on a full sample of 10,000 points; both obtained more accurate results than the naive baseline.
  • 31. Future Work • Syntax extensions for broader sampling settings: • Continuous parametric distributions • Probabilities constrained by variable values • Semantic extensions for under-specified distributions: • Resolving ambiguity with meta-models • Tuning to available data • Tool extensions for usability: • Conditional termination of sampling • When over-specified, return the minimal conflicting sub-spec 31
  • 32. Conclusion Our contributions: 1. A specification language for discrete distributions 2. An algebraic inference approach for distributions from the specifications 3. A software tool PROSPECT that implements the language and interface 4. An evaluation of PROSPECT on 3 case studies We believe this approach can be used for simulation, probabilistic reasoning, design and analysis, and other tasks that require probabilistic specifications. https://prospect.precise.seas.upenn.edu 32
  • 33. Related Works DSLs: SESSL, NEDL, build deterministic designs with predefined patterns • PROSPECT samples potentially complex random designs and complements this work Simulators: CARLA, Udacity, X-Plane, AirSim • Focus on specific discrete domains, whereas PROSPECT operates at any level of abstraction Graphical Models: Markov/Bayesian Networks • PROSPECT is a domain-agnostic approach that can create graphical models PPLs: Pyro, Scenic • Stronger focus on inferring a model given a program and dataset, whereas PROSPECT relies on explicit declarative specifications Copulas: ARTA, NORTA, VARTA; Stochastic Programming: SAMPL • Focus on continuous rather than discrete distributions and require knowledge/data to choose models; PROSPECT does not