Whenever decision makers lament the problem with metrics, it's not really a problem with the metrics; the problem lies with the design. Here's our take on how to build a framework from the beginning that informs, and is informed by, how you will assess the expenditure of resources.
1. FROM QUALITATIVE TO QUANTITATIVE AND BACK AGAIN
Optimizing Resource-Informed Metrics for the Strategic Decision Maker
MAJ Kristin Saling
Maj David Kerns
USPACOM J83, Strategic Assessments
Camp Smith, HI
This briefing is: UNCLASSIFIED
3. AGENDA
• WHAT IS ASSESSMENT?
• PROBLEMS WITH ASSESSMENT
• ELEMENTS OF THE PROBLEM
• ADDRESSING THE PROBLEM
– DEFINING THE WIN
– DEVELOPING THE FRAMEWORK
– ANALYZING THE FRAMEWORK
– LEADERSHIP ENGAGEMENT
• RESOURCES
• DISCUSSION
4. WHAT IS ASSESSMENT?
Assessment is a process that evaluates changes
in the environment and measures progress of
the joint force toward mission accomplishment.
-Joint Pub 3-0
5. FUNCTIONS OF ASSESSMENT
• Provide a picture of the current state of the
operational environment
• Provide context for stakeholder decisions
involving the most effective application of
forces, resources and policy
• Provide insight on whether current policies,
forces, and resources are effective
6. PROBLEMS WITH ASSESSMENT
• Not providing the right level of insight and
context to the commander’s decision cycle
• Measuring too many things
• Measuring the wrong things
• Showing too many numbers
• Not showing enough background numbers
• Numbers without proper context
• False appearance of mathematical rigor
• Lack of linkage between objectives & metrics
• Failure to integrate performance measures
7. PROBLEMS WITH ASSESSMENT
• Guidance for assessment stops at telling staffs to conduct assessment and differentiating between MOEs and MOPs
• We have a problem with metrics, but the metrics
are not the only problem
• If we don’t better codify how to create a total
campaign framework, tying in environmental
indicators, capabilities, and resources needed, we
will not have a good assessment
8. ELEMENTS OF THE PROBLEM
• Leadership: Contradictory guidance and interest from leaders on what they expect the assessment to deliver
• Defining the Win: Incomplete mission analysis prevents creation of linkages between objectives, metrics, and success
• Framework: Current framework focuses on the wrong elements, colors and numbers vs. insights and analysis
9. LEADERSHIP
• Every paper highlighting the failure of assessment
indicates the critical need for command emphasis
• Commands and agencies need to integrate
assessment training and discussions into
commander training and staff orientation
• Assessment teams need to develop their own
calendars of key staff engagements to provide the
commander and key staff with necessary
assessment familiarization
10. DEFINING THE WIN
• Metric development begins in mission analysis
• We cannot measure the achievement of an
objective or attainment of an endstate if we
cannot define what it means to successfully
achieve it
• How do we know we are winning if we don’t
know how to define winning?
11. MODELING THE SYSTEM
[Diagram: the environment as a semi-black-box system. System inputs (operations, actions, activities) and system process controls act on the beginning state of the system; exogenous variables (error variables, random actions) also act on it; system outputs are the measurable outcomes and indicators that indicate the state of the system.]
Modeling the environment as a semi-black box system prevents analysts from drawing unnecessary conclusions from actions and ensures they focus on measurable outcomes indicative of the state of the system.
The Integration Definition for Function Modeling (IDEF0) model is a way to understand the relationship between the inputs and outputs of a system and other things that may impact them.
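A minimal Python sketch (not from the briefing; the field and indicator names are hypothetical) of the semi-black-box framing: the assessed state is derived only from measurable outputs, never inferred directly from the actions taken.

```python
from dataclasses import dataclass


@dataclass
class SystemSnapshot:
    inputs: dict      # operations, actions, activities applied this period
    controls: dict    # system process controls / policy constraints
    exogenous: dict   # error variables, random actions, other actors
    outputs: dict     # measurable outcomes and indicators of system state


def assess(snapshot: SystemSnapshot) -> dict:
    """Derive the assessed state only from measurable outputs."""
    return dict(snapshot.outputs)


period = SystemSnapshot(
    inputs={"key_leader_engagements": 4},
    controls={"policy": "current campaign plan"},
    exogenous={"partner_elections": True},
    outputs={"partner_interoperability_rating": 3.5},
)
print(assess(period))   # {'partner_interoperability_rating': 3.5}
```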
12. ADDRESSING THE PROBLEM
• Focus groups and interviews with analysts,
leaders, staff members, and components
• Going “back to basics” with the science of
systems analysis
• Developing documents and procedures to train analysts and staff members on, and to codify, better campaign framework development procedures
• Developing the “right” metrics for assessment
• Defining the “win” throughout the process
• Engaging key leaders on assessment
13. BODIES OF ANALYSIS THAT CAN HELP
• Current assessment processes have many echoes
in technical operations analysis
• By reexamining the parent theories, we can see
what is missing in current assessment practices
• Relevant disciplines: complexity science, systems analysis, decision analysis, quantitative value modeling, sensitivity analysis
14. STARTING WITH MISSION ANALYSIS
[Diagram: mission analysis outputs (higher HQ mission; intelligence preparation of the battlefield; specified, implied, and essential tasks; review of assets; constraints; facts and assumptions; risk assessment; CCIR and EEFI; ISR; timeline; mission and tasks; approval) inform the current and necessary environmental conditions (objectives + effects), the current and necessary performance conditions (objectives + effects), and the success and threshold criteria for objectives.]
Once success criteria are defined for the objectives, defining the necessary conditions and metrics to measure them is fairly straightforward.
15. SUCCESS CRITERIA
• During mission analysis, the planners and analysts define the critical conditions that must be met for the objective to be considered achieved
• From these success criteria, planners and analysts derive sub-objectives, necessary conditions, or effects
• Repeat the same procedure with conditions or sub-objectives for metric development
16. ENVIRONMENT VS. PERFORMANCE
• Analysts generally measure
environmental indicators as
performance outputs
• There are also performance
indicators that show success in
terms of an output
– Force capability
– Capacity
– Posture
• There is a difference between
task assessment and
performance assessment
[Diagram: within the strategic environment, operations, actions, and activities transform the initial environment + capabilities into the end environment + capabilities; the actions of allies, partners, and other actors, as well as natural events, also act on the environment.]
17. BASIC CAMPAIGN FRAMEWORK
• Higher Objectives or Endstates: Clearly defined, decisive, and attainable goal toward which every military operation should be directed
• Intermediate Military Objectives
• Critical Conditions (Environment): Environmental conditions necessary for the success of the objective
• Critical Conditions (Performance): DoD conditions / requirements necessary for the achievement of the objective
• Environmental Metrics: Measurable items that indicate the presence of environmental conditions necessary for, or indicative of, the objective's success; can be direct or proxy
• Tasks / Performance Metrics: Measurable items that indicate the achievement of capabilities / resources necessary for, or indicative of, the objective's success; generally direct
• Operations, Actions, and Activities (OAAs): Efforts and actions by OPRs with stated achievable and measurable objectives to support the accomplishment of key (strategic-level) tasks, the improvement of environmental indicators, or the application of resources toward service-specific objectives
18. DEVELOPING THE FRAMEWORK
The value hierarchy is a pictorial representation of a value model, starting with the fundamental objective or endstate at the top and decomposing the system into sub-systems or sub-functions, subordinate objectives for those functions, and associated value measures.
[Diagram: example value hierarchy]
FUNDAMENTAL OBJECTIVE
  FUNCTION 1
    OBJECTIVE 1.1
      VALUE MEASURE 1.1.1
      VALUE MEASURE 1.1.2
    OBJECTIVE 1.2
  FUNCTION 2
    OBJECTIVE 2.1
    OBJECTIVE 2.2
  FUNCTION 3
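One way to make the hierarchy concrete is to encode it as a nested structure and roll value-measure scores up through objective and function weights. This is an illustrative sketch only; the names, weights, and scores are placeholders, not the command's model.

```python
# Hypothetical value hierarchy: functions -> objectives -> value measures,
# with local weights at each level summing to 1.
hierarchy = {
    "Function 1": {"weight": 0.5, "objectives": {
        "Objective 1.1": {"weight": 0.6, "measures": {"VM 1.1.1": 0.7, "VM 1.1.2": 0.3}},
        "Objective 1.2": {"weight": 0.4, "measures": {"VM 1.2.1": 1.0}},
    }},
    "Function 2": {"weight": 0.5, "objectives": {
        "Objective 2.1": {"weight": 0.5, "measures": {"VM 2.1.1": 1.0}},
        "Objective 2.2": {"weight": 0.5, "measures": {"VM 2.2.1": 1.0}},
    }},
}


def roll_up(scores: dict) -> float:
    """Roll value-measure scores (0-1) up to the fundamental objective."""
    total = 0.0
    for func in hierarchy.values():
        func_score = 0.0
        for obj in func["objectives"].values():
            obj_score = sum(w * scores[m] for m, w in obj["measures"].items())
            func_score += obj["weight"] * obj_score
        total += func["weight"] * func_score
    return total


scores = {"VM 1.1.1": 0.8, "VM 1.1.2": 0.4, "VM 1.2.1": 0.6,
          "VM 2.1.1": 0.5, "VM 2.2.1": 0.9}
print(round(roll_up(scores), 3))   # 0.674 for these placeholder inputs
```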
19. DEVELOPING THE METRIC
• What is a “bad” metric?
• A bad metric
– Does not provide context for
objective completion
– Is overly vague
– Is unnecessarily precise
– Does not link to conditions and
objectives
– Is measured just for the sake of
measuring something
• What makes a “good” metric?
• A good metric
– Allows data collectors or
subject matter experts to
answer questions relating to
the accomplishment of an
objective
– Can be objective or subjective
(Objective metrics may require
additional metrics to provide
context for objective
accomplishment)
– May have strong links to
decision triggers, CCIR, or other
important decision factors
20. QUALITATIVE AND QUANTITATIVE
• Analysts should be comparing subjective
measurements throughout the assessment
• Objective metrics provide good data, but not an
assessment – they provide no context
• Objective metrics can be given subjective context
either through an additional calculation against a
set standard or by obtaining subject matter
expertise
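As a hypothetical illustration of the "calculation against a set standard" approach, a raw count only becomes an assessment once a standard and judgment bins are supplied; the standard and bin values below are placeholders.

```python
def contextualize(observed: float, standard: float) -> str:
    """Convert a raw count into a stoplight-style judgment against a standard."""
    ratio = observed / standard
    if ratio >= 0.9:
        return "GREEN"
    if ratio >= 0.6:
        return "YELLOW"
    return "RED"


# 7 partner exercises conducted against a planned standard of 10
print(contextualize(observed=7, standard=10))   # YELLOW
```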
21. VALUE FUNCTIONS
• Current assessment strategy
assumes a linear return to scale
(RTS) where all responses are
valued equally
• Value functions measure return to
scale on the value measure
• These are useful in determining
points of diminishing returns or
losses
• Value functions can also be
discrete, with value given for
certain ratings and none for others
[Chart: example value functions with linear, decreasing, and increasing returns to scale]
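A small sketch of what linear, decreasing, and increasing return-to-scale value functions might look like over a 1-5 rating; the exponential form and constants are illustrative choices, not the command's actual functions.

```python
import math


def linear_value(x, lo=1, hi=5):
    """Constant return to scale: every step up the rating is worth the same."""
    return (x - lo) / (hi - lo)


def decreasing_rts(x, lo=1, hi=5, rho=1.5):
    """Early gains are worth more than later ones (diminishing returns)."""
    t = (x - lo) / (hi - lo)
    return (1 - math.exp(-t / rho)) / (1 - math.exp(-1 / rho))


def increasing_rts(x, lo=1, hi=5, rho=1.5):
    """Later gains are worth more than early ones."""
    return 1 - decreasing_rts(hi - (x - lo), lo, hi, rho)


for rating in range(1, 6):
    print(rating,
          round(linear_value(rating), 2),
          round(decreasing_rts(rating), 2),
          round(increasing_rts(rating), 2))
```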
22. RATINGS WITH AND WITHOUT VALUE FUNCTIONS
The change in average created by value functions is not always as significant as changes in the individual rating, but it can account for a more accurate description of how a stakeholder assigns value and priority.

Discrete value function:
RATING  POINTS
1       8
2       7.5
3       6
4       3
5       1

Sample ratings and assigned values:
RATING  VALUE
5       1
1       8
4       3
5       1
1       8
4       3
2       7.5

AVG (no value function): 5.714286
AVG (with value function): 4.5
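A minimal sketch of averaging a set of ratings with and without the discrete value function shown above; the ratings list here is illustrative rather than the briefing's sample.

```python
# Discrete value function from the table above: rating -> points
value_function = {1: 8, 2: 7.5, 3: 6, 4: 3, 5: 1}

ratings = [5, 3, 4, 2, 4]   # illustrative respondent ratings

avg_raw = sum(ratings) / len(ratings)                            # every step treated equally
avg_vf = sum(value_function[r] for r in ratings) / len(ratings)  # stakeholder-valued average

print(f"average rating (no VF):  {avg_raw:.2f}")
print(f"average value (with VF): {avg_vf:.2f}")
```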
23. WHAT DOES “GREEN” MEAN?
The "stoplight" method has been used extremely ineffectively, but it can be made effective by defining success, partial success, or failure for each metric.
GREEN: Conditions range from the ideal state to the lower limit of the commander's risk tolerance; no additional resources or policy changes are required at this time to improve progress toward the objective; maintain.
YELLOW: Conditions are outside the commander's risk tolerance but not at a state deemed critical or dangerous; additional resources or policy changes may be required and can be addressed by amendment to the current plan.
RED: Conditions are at a state deemed critical or dangerous; branch or contingency plans may need to be enacted, and additional resources and policy changes are needed to address the environment.
24. INPUT OPTIONS
• What options
provide information
in the best context to
a decision maker?
• What options
provide the best
context and clarity to
the subject matter
expert?
• How much
“precision” do you
need on a subjective
measurement?
5-Point Scale Options:
SCALE  A. 5 POINT        B. MIX
5      Met               Exceeded
4      Favorable         Met
3      Concerns          Concerns
2      Serious Concerns  Did Not Meet
1      Did Not Meet      Failed

[Table: 10-range scale options (0-10) comparing five labelings (3 bins, thresholds, bins and ends, an alternate 5-point scale, and an alternate mix) built from labels such as Exceeded, Met, Favorable, Some Concerns, Concerns, Serious Concerns, Will Not Meet, Did Not Meet, and Failed.]

In strategic assessment, which is inherently subjective, the number and its precision are only as important as what it can communicate to a decision maker and how intuitive it is to the respondent.
Confounded Option:
SCALE  CONFIDENCE       BIN
9      Strongly Agree   Favorable
8      Agree            Favorable
7      Somewhat Agree   Favorable
6      Strongly Agree   Concerns
5      Agree            Concerns
4      Somewhat Agree   Concerns
3      Somewhat Agree   Serious Concerns
2      Agree            Serious Concerns
1      Strongly Agree   Serious Concerns
25. RATING SYSTEMS AND THRESHOLDS
• Defining an intuitive rating system that allows subject
matter experts to best answer questions is integral
• It can sometimes be difficult to translate a more
detailed rating system into three color stoplight bins
• Two separate rating systems can be used in concert
with the right thresholds established
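A hypothetical sketch of using a detailed respondent scale in concert with three-color reporting bins; the threshold values are placeholders an assessment team would set for itself.

```python
def to_stoplight(score_0_to_10: float,
                 green_at: float = 7.0,
                 red_below: float = 4.0) -> str:
    """Map a detailed 0-10 respondent score into a three-color reporting bin."""
    if score_0_to_10 >= green_at:
        return "GREEN"
    if score_0_to_10 < red_below:
        return "RED"
    return "YELLOW"


for score in (9, 6.5, 3):
    print(score, to_stoplight(score))   # GREEN, YELLOW, RED
```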
26. STEPS TO IDENTIFYING THRESHOLDS
• STEP 1: Obtain
Sample Data
• STEP 2: Enter
Subjective
Assessment
• STEP 3: Create
Averages
• STEP 4: Sort and
Identify Natural
Thresholds
Planner Executor Intel Client PA AVE Result
5 5 5 2 5 4.4 G
3 4 3 4 4 3.6 G
4 4.5 4 3 2 3.5 G
4 4 4 4 1 3.4 G
4 4 4 3 2 3.4 G
3 4 3 4 3 3.4 Y
2 4 3 3 4.5 3.3 Y
3 4 3 4 2 3.2 Y
2.5 3.5 3 3 4 3.2 Y
3 4 3.5 4 1 3.1 G
3 4 3 4 1 3.0 Y
1 4 2 3 4 2.8 Y
2 2 2.5 5 2 2.7 R
2 2 2 5 2 2.6 R
2 3 1 4 3 2.6 Y
2 2 1 5 3 2.6 R
3 2.5 3 2 2 2.5 R
2 2 1 4.5 3 2.5 R
3 2 3 2 2 2.4 R
[Chart: sorted case averages with natural threshold breaks at approximately 3.4 and 2.75]
A good assessment should average stakeholder data to a
value that makes intuitive sense to subject matter
experts and leaders.
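The four steps above can be run in a few lines; this sketch averages a subset of the sample cases, sorts them, and leaves the analyst to eyeball the natural breaks.

```python
# Subset of the sample table above; column order is Planner, Executor,
# Intel, Client, PA.
cases = [
    (5, 5, 5, 2, 5),
    (3, 4, 3, 4, 4),
    (4, 4.5, 4, 3, 2),
    (2, 4, 3, 3, 4.5),
    (1, 4, 2, 3, 4),
    (2, 2, 2.5, 5, 2),
    (3, 2, 3, 2, 2),
]

# Step 3: create averages; Step 4: sort and look for natural thresholds.
averages = sorted((sum(case) / len(case) for case in cases), reverse=True)
for avg in averages:
    print(round(avg, 1))
```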
27. REFINE AND ANALYZE
• Colors get you close. Discussions add quality and clarity.
– For Case 1-2: Why did PA perceive more failure in the metrics?
– For Case 3-4: Do the Planner and Intel reps know something the others
do not?
– For Case 5-6: If the Client is happy, is that all that matters?
CASE  Planner  Executor  Intel  Client  PA   AVE  Result
1     4        4.5       4      3       2    3.5  G
2     4        4         4      4       1    3.4  G
3     3        4         3      4       3    3.4  Y
4     1        4         2      3       4    2.8  Y
5     2        3         1      4       3    2.6  R
6     2        2         2      5       2    2.6  R
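One illustrative way to decide which cases merit that follow-up discussion is to flag those where the spread between raters is large; the 2-point trigger below is a placeholder, not a doctrinal value.

```python
# Cases from the table above: case number -> [Planner, Executor, Intel, Client, PA]
cases = {
    1: [4, 4.5, 4, 3, 2],
    2: [4, 4, 4, 4, 1],
    3: [3, 4, 3, 4, 3],
    4: [1, 4, 2, 3, 4],
    5: [2, 3, 1, 4, 3],
    6: [2, 2, 2, 5, 2],
}

for case, ratings in cases.items():
    spread = max(ratings) - min(ratings)
    if spread >= 2.0:
        print(f"Case {case}: rater spread {spread} - ask the raters why")
```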
28. WEIGHTING METRICS + RATERS
• Most metrics in assessment are
currently prioritized equally,
but not all measures have or
should have an equal effect on
an outcome
• Weighting is a method of
discriminating between metrics
in terms of priority
• We can determine the impact
of weighting metrics and
respondents on the outcome of
the objective’s rating through
the use of sensitivity analysis
RATER  SCORE  POINTS  WEIGHT  RESULT
1      5      1       0.028   0.139
2      5      3       0.083   0.417
3      3      2       0.056   0.167
4      4      8       0.222   0.889
5      3      7       0.194   0.583
6      5      4       0.111   0.556
7      5      6       0.167   0.833
8      4      5       0.139   0.556
Weighted score: 4.139
Unweighted: 4.25
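A short sketch reproducing the weighted calculation in the table above: priority points are normalized into weights that sum to one, then applied to each rater's score.

```python
scores = [5, 5, 3, 4, 3, 5, 5, 4]   # rater scores
points = [1, 3, 2, 8, 7, 4, 6, 5]   # priority points assigned to each rater

total_points = sum(points)
weights = [p / total_points for p in points]   # normalize so weights sum to 1

weighted = sum(w * s for w, s in zip(weights, scores))
unweighted = sum(scores) / len(scores)

print(f"weighted score:   {weighted:.3f}")   # ~4.139
print(f"unweighted score: {unweighted:.2f}") # 4.25
```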
29. SENSITIVITY ANALYSIS
• Determine the impact
of weighting metrics
and raters with
sensitivity analysis
• This can be done
either using simulation
or rough calculations
in Excel
RATER 1 WEIGHT  SCORE
0.1             4.22
0.2             4.31
0.3             4.40
0.4             4.48
0.5             4.57
0.6             4.66
0.7             4.74
0.8             4.83
0.9             4.91
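A rough Excel-style sweep can be done just as easily in a few lines of Python; this sketch varies Rater 1's weight and, as a simplifying assumption, splits the remaining weight equally among the other raters, which gives results close to the table above.

```python
scores = [5, 5, 3, 4, 3, 5, 5, 4]   # same eight raters as the weighting example

others_mean = sum(scores[1:]) / len(scores[1:])   # remaining raters weighted equally

for w1 in [round(0.1 * i, 1) for i in range(1, 10)]:
    result = w1 * scores[0] + (1 - w1) * others_mean
    print(w1, round(result, 2))   # score rises toward 5 as Rater 1 dominates
```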
30. APPLYING METRICS TO RESOURCES
• Tracking resources and applying them against subjective ratings is a start, but, as with the ratings, it is only useful when subject matter experts provide context
• Often, this provides a starting point for questions and discussions
with experts as to whether the resources spent are necessary to
maintain/hold ground or whether the effort is ineffective
RATINGS    QTR 1  QTR 2  QTR 3  QTR 4
OBJ 1      2      2      2      3
OBJ 2      1      2      1      1
OBJ 3      3      4      3.5    4
OBJ 4      4      4      4      4

DOLLARS
IN $M      QTR 1  QTR 2  QTR 3  QTR 4
OBJ 1      $4.00  $4.00  $4.00  $4.00
OBJ 2      $2.00  $2.00  $2.00  $2.50
OBJ 3      $2.00  $2.00  $2.00  $2.00
OBJ 4      $4.00  $3.00  $3.00  $3.00

[Chart: objective ratings by quarter]
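As a starting point for those discussions, the two tables can be joined to show, for each objective, the year's rating change against the dollars applied; this sketch uses the figures above and is illustrative only.

```python
ratings = {"OBJ 1": [2, 2, 2, 3], "OBJ 2": [1, 2, 1, 1],
           "OBJ 3": [3, 4, 3.5, 4], "OBJ 4": [4, 4, 4, 4]}
dollars = {"OBJ 1": [4.0, 4.0, 4.0, 4.0], "OBJ 2": [2.0, 2.0, 2.0, 2.5],
           "OBJ 3": [2.0, 2.0, 2.0, 2.0], "OBJ 4": [4.0, 3.0, 3.0, 3.0]}

for obj in ratings:
    change = ratings[obj][-1] - ratings[obj][0]   # rating movement over the year
    spent = sum(dollars[obj])                     # total dollars applied ($M)
    print(f"{obj}: rating change {change:+.1f}, ${spent:.1f}M applied")
```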
31. CREATE A LASTING NARRATIVE
• The most useful tool in an analyst’s
assessment arsenal is a lasting narrative that
codifies the following
– State of the system/objective
– Expert opinion and analysis as to the reason
– Resources applied to changing the system
– Recommended changes to forces, posture, policy,
or resource application
32. FUTURE WORK
• Finalize development of campaign plan structure to include
incorporation of performance and capability elements and
performance metrics (resources)
• Incorporate multi-objective decision analysis methods and analyze
tradeoffs
• Upgrade existing systems in use to incorporate more robust Gantt
and resource tracking functions for more thorough analysis
• Incorporate more focus groups and follow-on analysis time into the assessment; the assessment only begins with a data call and finishes with thorough analysis of the data
33. REFERENCES
• Armstrong, J. Scott (2001) Principles of Forecasting:
A Handbook for Researchers and Practitioners.
Springer: New York
• Campbell, Jason, Michael O’Hanlon, and Jeremy
Shapiro (2009). “How to Measure the War.” Policy
Review, n. 157, 15-30.
• Downes-Martin, Stephen (2011). “Operations
Assessment in Afghanistan is Broken: What is to be
Done?” Naval War College Review, Autumn, 103-
125.
• Kilcullen, David (2010). “Measuring Progress in
Afghanistan.” Counterinsurgency, 51-83.
• Kramlich, Gary (2013). “Assessment vs. Decision
Support: Crafting Assessments the Commander
Needs.” White Paper.
www.milsuite.mil/book/broups/fa49-orsa.
• Parnell, Gregory, Patrick J. Driscoll, Dale L.
Henderson (2011). Decision Making in Systems
Engineering and Management, 2d ed. Wiley:
Hoboken, NJ.
• Schroden, Jonathan (2013). “A New Paradigm for
Assessment in Counter-Insurgency.” Military
Operations Research, v. 18, n. 3, 5-20.
• Schroden, Jonathan (2013). “Why Operations
Assessments Fail: It’s Not Just the Metrics.” Naval
War College Review, Autumn, 89-102.
• US Joint Staff (2011). Commander’s Handbook for
Assessment Planning and Execution, Version 1.0.
• US Joint Staff (2011). Joint Publication 3-0, Joint
Operations. www.dtic.mil
• US Joint Staff (2011). Joint Publication 5-0, Joint
Operations Planning. www.dtic.mil