Quantifying the Future

Improving Decision-Making Support
by Linking Database results to Simulations

Gio Wiederhold
Stanford University
July 2011

Gio Wiederhold SimQL 1

Problem: Mismatch
Database Technology should support Decision-Making

• What does database technology do?
o Databases provide information about past events
» Consistent
» Reliable
» Fast

• What does a decision-maker do?
o Guess how decisions will affect the future
» Multiple possibilities
» Uncertainty
» Slow, manual, multiple tools
8/17/2012 Gio: SimQL

Information Systems should also
Project into the Future

past now future
time

Support of decision-making requires dealing with the future,
as well as the past
• Databases deal well with the past
• Sensors can provide current status
• Spreadsheets, simulations deal with the likely futures
Information systems should be able to combine all three

Decision-making (DM)

Analyze Alternatives
• Current Capabilities
• Future Expectations
• Planning for them
now future

Process tasks:
• List resources
• Enumerate alternatives
• Prune alternatives
• Compare alternatives

Current Processes

• Data collection
• Data validation
• Data integration
• Information selection
• Data reduction & summarization
• File generation for analysts
• Data conversion to files for spreadsheets
• Model building and testing by analysts
• Planning for likely future scenarios
• Recording expected results
• Comparing many scenarios
• Finding the best plans
• Advising the actual decision maker

Progress in Data Integration

Information Integration has progressed in supporting
Decision Making
1. Integrate data from distributed sources
o Issues: inconsistency of scope and timing
2. Capture new relationships
o Often requires expert inter-domain knowledge

3. Include current sensor data
o Select streaming data
4. Include predictions about future courses
******* A new, potentially major topic *******

DM support is disjoint,
does not interoperate

Extensions to move to networked support
are also disjoint

[Figure labels: Planning Science, Distribution]

Current state of DM Support
[Figure: time axis from past through now to future.]
• Past: organized support
o Databases (distributed, heterogeneous)
o Data integration
• Future: disjointed support
o Intuition +
» Spreadsheets
» Resource allocations
» Explicit simulations
» Various point assessments

Prediction Requires Tools

E-mail this book,
Alfred Knopf, 1997


Requirements for DM
• Ubiquitous access to simulations
of a wide variety of types
• Rapid response to parameter changes
o Access to up-to-date facts
o May need High-Performance recomputation

• Model, scenario, and choice retention
o Analysts’ planning to be reused
» But updatable


How to merge 2 disciplines
• Databases
o High-level languages
» Data descriptions
o Drive detailed processes
o Intentional
• Simulations & spreadsheets
o High-level languages
» Model descriptions
o Parameter driven
o Extensional

Integration concept
• Enable intentional simulation access
o Follow database model
» Similar to data description
o Provide interfaces
» To support needed processes

Create SimQL similar to SQL
schema & links to access procedures


Transform Data to Information
[Figure: two paths from data to information. Database path labels: Collection, Design, Schema, SQL, Data, Reports, middle-management user. Model path labels: Modeling, Design, value-added services, Plans, data-driven decision-makers.]


Language implementation
The Stanford experiment uses an existing SQL parser:
1. Replace the SELECT verb with ESTIMATE;
2. Remove the UPDATE statement: nothing persists;
3. Replace CREATE DATABASE with CREATE MODEL;
4. Add to the CREATE attributes IN, OUT, and INOUT;
5. Add a REGISTER statement to identify resources;
6. Replace SQL's code generators, which access
stored data, with functions that
a. Deliver the query's IN parameters to the various simulations
b. Collect the data specified as OUT parameters
c. Return the result.
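
Step 6 can be illustrated with a minimal Python sketch. It is not the Stanford implementation: the names REGISTRY, register, estimate, and toy_weather, and the constant weather values, are assumptions made up for this illustration.

```python
from typing import Callable, Dict, List

# Hypothetical registry: model name -> callable that runs one simulation.
# It stands in for whatever the REGISTER statement records.
Simulation = Callable[[Dict[str, object]], Dict[str, object]]
REGISTRY: Dict[str, Simulation] = {}


def register(model: str, simulation: Simulation) -> None:
    """Make a simulation reachable under a model name."""
    REGISTRY[model] = simulation


def estimate(model: str, in_params: Dict[str, object],
             out_names: List[str]) -> Dict[str, object]:
    """Deliver IN parameters, collect OUT parameters, return the result."""
    simulation = REGISTRY[model]
    raw = simulation(in_params)                 # (a) deliver IN parameters
    missing = [name for name in out_names if name not in raw]
    if missing:
        raise KeyError(f"model {model!r} does not produce {missing}")
    return {name: raw[name] for name in out_names}  # (b) collect, (c) return


def toy_weather(params: Dict[str, object]) -> Dict[str, object]:
    """Stand-in simulation that returns made-up constant outputs."""
    return {"Temperature": 21.0, "Cloudcover": 0.4,
            "Windspeed": 12.0, "Winddirection": 270}


register("WeatherSimulation", toy_weather)
result = estimate("WeatherSimulation",
                  {"Date": "tomorrow", "Location": "ORD"},
                  ["Temperature", "Windspeed"])
```

Nothing is stored by estimate, which mirrors the removal of UPDATE: every call recomputes from the simulation.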

Examples
SQL:
SELECT Temperature, Cloudcover, Windspeed, Winddirection
FROM WeatherDB
WHERE Date = 'yesterday' AND Location = 'ORD';

SimQL:
ESTIMATE Temperature, Cloudcover, Windspeed, Winddirection
FROM WeatherSimulation
WHERE Date = 'tomorrow' AND Location = 'ORD';
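
To show how little separates the two statements, here is a hedged Python sketch that splits an ESTIMATE statement into OUT names, model name, and IN conditions. The Stanford experiment reused an existing SQL parser; this regular-expression version is only an assumption for illustration, handles nothing beyond equality conditions joined by AND, and the name parse_estimate is hypothetical.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical grammar: ESTIMATE <outs> FROM <model> [WHERE a = 'x' AND ...]
STATEMENT = re.compile(
    r"^\s*ESTIMATE\s+(?P<outs>.+?)\s+FROM\s+(?P<model>\w+)"
    r"(?:\s+WHERE\s+(?P<conds>.+?))?\s*;?\s*$",
    re.IGNORECASE | re.DOTALL,
)
CONDITION = re.compile(r"^\s*(\w+)\s*=\s*'([^']*)'\s*$")


def parse_estimate(text: str) -> Tuple[List[str], str, Dict[str, str]]:
    """Return (OUT names, model name, IN conditions) for one statement."""
    match = STATEMENT.match(text)
    if match is None:
        raise ValueError("not an ESTIMATE statement")
    outs = [name.strip() for name in match.group("outs").split(",")]
    conds: Dict[str, str] = {}
    if match.group("conds"):
        parts = re.split(r"\s+AND\s+", match.group("conds"),
                         flags=re.IGNORECASE)
        for part in parts:
            cond = CONDITION.match(part)
            if cond is None:
                raise ValueError(f"unsupported condition: {part!r}")
            conds[cond.group(1)] = cond.group(2)
    return outs, match.group("model"), conds


outs, model, conds = parse_estimate(
    "ESTIMATE Temperature, Cloudcover, Windspeed, Winddirection\n"
    "FROM WeatherSimulation\n"
    "WHERE Date = 'tomorrow' AND Location = 'ORD';"
)
```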

Available Functions
1. Continuously executing: weather prediction
o SimQL result reports best-match samples

2. Execution specific to query: spreadsheet what-if assessment
o may require HPC power for adequate response

3. Past simulations with results collected in a database: materials
o performs inter- or extrapolations to match query parameters

4. Combinations, e.g., 2 + 3: top-layer simulation using stored
partial lower-level results: weapon performance in a new setting
5. Human-in-the-loop: wrapper for Amazon's Mechanical Turk
Note
• A simulation service program can be written in any language
• A simulation service must comply with the interface specification.
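
Function type 3 can be sketched as follows, assuming stored results are simple (parameter, result) pairs with distinct parameters and that linear inter- or extrapolation is acceptable. The function name and the sample numbers are made up for illustration; a real materials service would likely use a domain-specific method.

```python
from bisect import bisect_left
from typing import List, Tuple


def interpolate(stored: List[Tuple[float, float]], x: float) -> float:
    """Linearly inter- or extrapolate stored results to query parameter x.

    `stored` holds (parameter, result) pairs sorted by distinct parameter.
    """
    if len(stored) < 2:
        raise ValueError("need at least two stored results")
    xs = [param for param, _ in stored]
    # Right end of the segment to use. Clamping makes queries outside the
    # stored range reuse the nearest segment, which is extrapolation.
    right = min(max(bisect_left(xs, x), 1), len(stored) - 1)
    (x0, y0), (x1, y1) = stored[right - 1], stored[right]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Made-up stored results from earlier simulation runs.
stored_runs = [(0.0, 0.0), (10.0, 100.0), (20.0, 150.0)]
inside = interpolate(stored_runs, 15.0)    # between two stored runs
outside = interpolate(stored_runs, 30.0)   # beyond the last stored run
```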

System Concept Layout


Interfaces enable integration:
SimQL to access Simulations

[Figure: time axis from past through now to future.]
• Past: databases, accessed via SQL or XML- and CORBA-compliant wrappers
• Now: message systems, sensors
• Future: simulations, accessed via SimQL and compliant wrappers

Current State of SimQL research
[Figure: a GUI (to collect language requirements) and a test application reach three simulations, Spreadsheets, Weather, and Engineering, each through its own wrapper.]

Stanford Experiment Models

[Figure: a logistics application and a manufacturing application use SimQL access and SQL access. Each resource sits behind a wrapper: Weather (short-, long-term), Engineering, Spreadsheets, and Test Data.]

More to be done
• Stanford experiment only produced point results.
• A decision maker would estimate multiple scenarios
1. Collect results identified with parameters
2. Provide search functions to compare results
o Consider time lines for result synchronization
3. Support pruning of low-value results
4. Deliver only high-value results to decision-maker
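
The four steps can be sketched in Python under simple assumptions: each scenario result is reduced to a single value, pruning is a threshold, and delivery is a top-N cut. The class and function names and the sample scenarios are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ScenarioResult:
    params: Dict[str, object]  # parameters that identify this scenario
    value: float               # assessed value of its outcome


def prune(results: List[ScenarioResult],
          min_value: float) -> List[ScenarioResult]:
    """Drop low-value results (step 3)."""
    return [r for r in results if r.value >= min_value]


def top_results(results: List[ScenarioResult],
                count: int) -> List[ScenarioResult]:
    """Compare results and keep the highest-valued ones (steps 2 and 4)."""
    return sorted(results, key=lambda r: r.value, reverse=True)[:count]


# Step 1: results collected together with their parameters (made-up data).
collected = [
    ScenarioResult({"route": "north"}, 1200.0),
    ScenarioResult({"route": "south"}, -400.0),
    ScenarioResult({"route": "east"}, 600.0),
]
shortlist = top_results(prune(collected, min_value=0.0), count=2)
```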


Use of Simulation Results
[Figure: tree of alternative courses of action along a time axis, with a probability on each branch.]

Simulation results can be composed for
alternative courses of action
Composition should include computation
and recomputation of likelihoods
Likelihoods change as now moves forward
and eliminates earlier alternatives.
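
One plausible reading of "recomputation of likelihoods" is renormalization: once time has ruled out some alternatives, the remaining probabilities are rescaled to sum to 1. The slides do not specify the method, so the sketch below is an assumption, and the alternative names are made up.

```python
from typing import Dict, Set


def renormalize(likelihoods: Dict[str, float],
                eliminated: Set[str]) -> Dict[str, float]:
    """Rescale the likelihoods of the alternatives that are still open."""
    remaining = {name: p for name, p in likelihoods.items()
                 if name not in eliminated}
    total = sum(remaining.values())
    if total <= 0:
        raise ValueError("no alternative with positive likelihood remains")
    return {name: p / total for name, p in remaining.items()}


# Made-up alternatives: "A" did not happen, so "B" and "C" share the mass.
before = {"A": 0.5, "B": 0.25, "C": 0.25}
after = renormalize(before, eliminated={"A"})
```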

Estimates have probabilities
• p=30% chance of rain
• Flight p=91% likely to arrive within 15 min of ETA
• Interest rate p=50% same, p=25% 1% higher, … .
• Employee p=50% returns to work in a week, … .
• Project p=10% completed in time, …
• Spreadsheets can compute alternative values
with such data provided by the model builder,
not the SimQL user.


The branches can be labeled with probabilities,
then assessed using the values of the outcomes

[Figure: tree of next-period and subsequent-period alternatives on a past / now / future time axis; each branch carries a probability and each outcome a value.]

Integrating data & planning support will make
our data reusable and much more valuable
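
A common way to assess such a tree is to roll expected values back from the leaves: each node's value is the probability-weighted sum of its branches. The sketch below assumes that method; the Node class and the numbers are made up and are not read off the slide's figure.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    value: float = 0.0  # outcome value, used only at a leaf
    # Each branch is (probability, child node); empty for a leaf.
    branches: List[Tuple[float, "Node"]] = field(default_factory=list)


def expected_value(node: Node) -> float:
    """Probability-weighted value of a node, computed recursively."""
    if not node.branches:
        return node.value
    return sum(p * expected_value(child) for p, child in node.branches)


# Two periods: a certain-value branch and a branch that splits again.
tree = Node(branches=[
    (0.5, Node(value=1000.0)),
    (0.5, Node(branches=[(0.25, Node(value=2000.0)),
                         (0.75, Node(value=0.0))])),
])
assessment = expected_value(tree)
```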

A Pruned Bush
Re-assess as time marches forward!

[Figure: the same tree after pruning, on a past / now / future time axis; some values are now marked "?". Inputs: databases, messages, sensors, spreadsheets, . . . other simulations.]

Even the present needs SimQL
point-in-time for situational assessment
• last recorded observations
• simple simulations to extrapolate data

past now future
time

Not all data are current:
• Is the delivery truck in X?
• Is the right stuff on the truck?
• Will the crew be at X?
• Will the forces be ready to accept delivery?

Use of Simulation Results

Simulation results can be composed for
alternative courses of action
Composition should be seamless, elegant,
with computation and recomputation of
likelihoods
Results change as now moves forwards and
eliminates earlier alternatives.

Summary
Databases
• serve clients via SQL by
o Sharing a model (the schema)
o A query language over the model
• the SQL interface enables
o independence of application development and DBMS technology development
o reuse of infrastructure
Today
• most new systems use a DBMS for data storage,
even with less performance and inability to handle all problems,
but enough of them well enough.

Simulations should
• serve clients via SimQL by
o Sharing a model (research question)
o A query language over the model
• a SimQL interface enables
o independence of application development and simulation technology development
o reuse of infrastructure
Objective
• build information systems combining DBMS and simulations,
even with less performance and inability to handle all problems,
but enough of them . . .

Further research questions

• How to move seamlessly from the past to the future?
• How can multiple futures be managed (indexed)?
• How can multiple futures be compared, selected?
• How should joint uncertainty be computed?
• How can the NOW point be moved automatically?


Future information systems
Combine data from the past, with current data,
knowledge, and predictions into the future


Assessment of the
values of alternative
possible outcomes


SimQL research questions

• How little of the model needs to be exposed?
• How can defaults be set rationally?
• How should expected execution cost be reported?
• How should uncertainty be reported?
• Are there differences among application areas that
require different language structures?
• Are there differences among application areas that
require different language features?
• How will the language interface support effective
partitioning and distribution?

Moving to a Service Paradigm

• Server is an independent contractor, defines service
• Client selects service, and specifies parameters
• Server’s success depends on value provided
• Some form of payment received for services


Databases are a current example.
Simulations have the same potential.

Summary of SimQL
A new service for Decision Making:
• follows database paradigm
– (by about 25 years)
• coherence in prediction
– displacement of ad-hoc practices
• seamless information integration
– single paradigm for decision makers
• simulation industry infrastructure
– investment has a potential market
– should follow the database industry model:
Interfaces promote new industries

Publications
Gio Wiederhold: "Information Systems that Really
Support Decision-making"; 11th International
Symposium on Methodologies for Intelligent Systems
(ISMIS), Warsaw, Poland, June 1999; in Ras & Skowron:
Foundations for Intelligent Systems, Springer LNAI
1609, pages 56-66.
Gio Wiederhold and Rushan Jiang: "Augmenting
Information Systems with Access to Predictive Tools";
http://infolab.stanford.edu/pub/gio/2000/VLDB2000-1.htm

The specifics of the language as implemented are at
http://www-db.stanford.edu/LIC/SimQL.html

