A Method for Predicting Future Trainer Costs via
Analysis of Historical Data
Zachary Forrest
13312 Thomasville Circle #54 D, Tampa, FL, 33617; zachary9@mail.usf.edu; (813) 438-3297; Naval Air
Warfare Center Training Systems Division, Cost Department, Code 4.2; University of South Florida;
M.A. in Mathematics
Abstract
This paper discusses the motivations behind abstracting training systems’ cost data into statistical
models and the difficulties present in such efforts. Current methods of generating trainer costs
require a significant investment in terms of both time and effort. Special attention is given to data
taken from a particular database; and all efforts described within this paper - although discussed in
abstract terms - were applied to the problem of generating initial work towards a parametric evalua-
tion method for trainer costs. The paper covers several analytical methods that were evaluated; and
introduces one method of analyzing cost data for use in future prediction, with the intention of
providing an approach which is grounded in empirical data. Finally, the paper identifies some
difficulties present in the method; provides some comments for the purposes of improving this and
future models; and discusses some of the goals which are being set for predictive analyses of training systems.
The views expressed herein are those of the author and do not necessarily reflect the official position
of the Department of Defense or its components.
Motivations
The NAVAIR Cost Department provides cost and scheduling support for training systems utilized in
the training of warfighters; and as such, estimation of trainer costs plays a vital role in all tasks handled
within the department. Such estimates require the use of historical contract costs for the purposes
of determining viable statistical models - empirical data naturally drives all cost generation methods.
Due, however, to considerations of cost threshold, contract type, and program risk, the data available
for such uses is exceptionally limited. Within the Training Systems Division, the primary source of
data for this analysis is the manually generated Trainer Estimating Resource Network (TERN), which
contains 245 data points describing device costs, system sub-costs, and device
information pertinent to the contract - among other information. From this data emerge several
questions: (1) is it possible to reliably generate future estimates of trainer cost data under such
limiting conditions?; (2) in what manner would such estimates be generated?; and (3) in the event
that such a tool does indeed exist, can it be abstracted to a form which could potentially apply to
other cost data? As this paper will endeavor to show, the answer to all of these questions is (within
certain limitations), “yes.”
The possibility of constructing statistical models for estimating training system costs is a desirable
one. Such a tool would provide a quantitative, mathematical means of representing contract costs
and trends in contracts; and it would even provide an empirical framework on which to ground
skepticism with regard to contractor bids and to aid decision-making pertaining to those bids.
All of these are true and useful benefits of a good statistical model; however, the intrinsic value of
such a tool extends even further.
Currently, many (if not all) cost predictions generated by the Training Systems Division for trainers
entail a time-consuming process involving careful scrutiny of all information relevant to the specific
trainer; and while there is certainly no dearth of historical information at the CLIN -level on purchases
of trainers, very little of this is in a format which is ideal for use in cost estimation. Although
experienced cost analysts may develop a sense for which costs are likely and unlikely, it remains true
that, for the majority of contract costs, an ad hoc approach1 for determining accurate estimates is
necessary. Similarly, while some suggestions have been made regarding comparisons of sub-costs to
base costs of trainers, proposed estimation factors often lack testing or support from empirical data.
Such circumstances create sub-optimal conditions for providing accurate training
system cost estimates for our nation’s warfighters. If we cannot produce accurate predictions swiftly and
effectively (in a repeatable manner), we shall necessarily incur increased costs both in terms of money
and man-hours spent pursuing an estimate; and if unchecked, such costs could potentially inhibit our
financial capability to acquire and maintain training systems.
If a good statistical model for training system costs is a necessary tool, several questions are of
immediate concern in the effort to build such a model. Specifically, these questions are: (1) what do
we mean by a “good statistical model”?; and (2) what are key details to look for in a good model?
The first question seems to have a fairly obvious (if somewhat vague) answer: a good statistical model
is any model which accurately and reliably produces predictions regarding the quantitative details of a
given subject matter and, moreover, is simpler and more time-efficient to apply to the subject
matter than an ad hoc analysis. The second question, however, requires a little more thought. Clearly,
an important criterion is the ability to apply a proposed methodology across any recent cost data with
impunity; that is, without fear that such a technique may succeed with regard to certain cost data
and yet fail with regard to other data. And since we wish to predict future events in addition to
describing past events, we must also restrict our considerations to statistical techniques that provide
such a capability. What other criteria, then, are important for our model?
Another important point to be considered here is that there are many different variants of training
systems - even for each platform. Moreover, training systems from one platform may not be comparable
to the same variety of training systems for a different platform. (e.g. A flight simulator built for an
F/A-18C platform is almost certainly distinct from a flight simulator built for an MH-60R.) Whatever
method we adopt, it must be capable of separating (or partitioning) data so that similar data remains
categorized together apart from non-similar data. Thought should also be given to the notion of
unusual costs. Certainly such costs do exist (e.g. in first units, where certain non-recurring costs are
commonly found) and our method must be capable of recognizing these outliers; recording the extent
of their deviation from typical data points; and making use of the unusual data to further predictive
capabilities. The approach should be applicable to new data sets - in the sense of either entirely
new data sets or old data sets with new data included - without undue difficulty. Finally, to be of
true benefit, our method of choice must be capable of summary in some easily read format so that
cost analysts and decision-makers alike may make swift use of results. With these criteria firmly in
mind, we are now ready to turn our attention to questions of detail.
Initial Attempts
In developing the details of the final analytical method, primary attention was initially given to finding
a uniform approach to partitioning the TERN database. (We will write C to denote TERN cost data.)
Table 1: Breakdown of C
Full-task Trainers: 134
Part-task Trainers: 105
Desktop Trainers: 6
Total Number of Devices: 245
1 More commonly referred to as a Bottoms Up or Technical Assessment approach.
As mentioned above, it is not necessarily true that any two arbitrary devices (even if classified
with identical device types) may be considered together meaningfully in an analysis; and so initial
statistical tests were run on multiple partitionings of C for the purpose of determining in what manner
similarity could be guaranteed amongst data - i.e. to ensure data homogeneity. These tests included
computing summary statistics - means, standard deviations, and correlation coefficients - taken over
various partitionings of subsets of C; and these partitions were subsequently
tested again under the approach that we shall presently discuss. From these minor statistical tests
- and, indeed, even through use of our proposed methodology - a rather striking fact was quickly
deduced: namely, that few partitionings of C would support general predictive analysis due to the
wide variation in the data present in C.
Few similarities were seen when devices were partitioned by platform, contract year, contractor,
or even device type - for example.2 In all cases explored, it became quite apparent that there was not
significant similarity between members of partitions in the sense that the difference between the cost
of members - and the difference between members and the mean of those members - was sufficiently
large to guarantee large standard deviations. As a result, the predictive tools utilized within the scope
of this project - which will be discussed in the following section - were incapable of generating useful
cost estimates. After some experimentation, it became clear that the only method of partitioning in
which any meaningful similarity could be observed was in dividing the data between new training
systems and upgrades of existing training systems.
Further experimentation and observation of trends suggested that a second-level partition of devices
- this time by device type - was necessary for continued analysis; and subsequent, similarly executed
work suggested further such continuations of partitioning. The resultant partition called for devices
to be partitioned in the following manner: first by device type; second by platform; third by whether
the product was new or an upgrade of a previous product; fourth (for upgrade products) by whether
an upgrade was a modification or a “tech refresh”; and fifth, products were divided into full-task,
part-task, and desktop training devices.
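For concreteness, the following is a minimal sketch (in Python with pandas, not part of the original analysis, which was carried out in Excel) of how such a hierarchical partition might be formed. The column names device_type, platform, new_or_upgrade, upgrade_kind, task_level, and cost are hypothetical stand-ins for the corresponding TERN fields, which are not specified in the paper.

```python
import pandas as pd

# Hypothetical stand-ins for the TERN fields described in the text.
PARTITION_KEYS = ["device_type", "platform", "new_or_upgrade",
                  "upgrade_kind", "task_level"]


def partition_cost_data(df: pd.DataFrame) -> dict:
    """Split the cost data set C into subsets A_1, ..., A_r, one per
    combination of the five partition keys described above."""
    # dropna=False keeps groups where a key is blank, e.g. upgrade_kind
    # is not applicable to new (non-upgrade) devices.
    grouped = df.groupby(PARTITION_KEYS, dropna=False)
    return {key: group["cost"].to_numpy() for key, group in grouped}
```

Because the keys are grouped simultaneously, the order in which the levels are listed does not change the resulting subsets.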
At this point in the analysis, thought was finally turned to the question of describing the data
present in C in a fashion amenable to prediction. As with the determination of the method by which C
was to be partitioned, multiple approaches were considered and discarded; and an approach’s efficacy
was judged on whether the data it produced could be used by a cost analyst. From these proceedings,
the CERPA analysis method was created.
The CERPA Analysis Method
Terminology and Definitions
In order to discuss the Cost Estimating Resource for Predictive Analysis (CERPA) method,
it is first necessary to consider technical details regarding notation and some definitions. If A is a
subset of C (written A ⊆ C), then the sample mean and sample standard deviation of cost data in A
are written as the symbols ¯xA and sA respectively. (Note that for our purposes, we will never consider
the situation A = C.) By a prediction interval for a subset A, we refer to all cost data x so that
|x − ¯xA| ≤ λ with λ defined as
$$\lambda := t_{n,\alpha/2} \cdot s_A \cdot \sqrt{1 + \tfrac{1}{n}}, \qquad (1)$$
where t_{n,α/2} is a Student-t value defined for n - the number of elements in A, which we assume is
at least 3 - and α := 0.20.3 Note that, given A, a prediction interval generated on A is constructed
to predict individual point-data of subsequent samples drawn from the same population of data; and,
from the given choice of α, there is an 80% chance that any new data to be included in A will
fall between the values ¯xA − λ and ¯xA + λ.

2 More experimentation with a less limited data set is required.
3 It is important to stress that (1) forms predictions for a future point of observation and does not predict future measures
of central tendency; in this way, it differs from tools like confidence intervals, which are commonly used in hypothesis
testing.

Finally, the following is presented in order to formalize a
definition for “unusual” data in C:
Definition: Let A ⊆ C with A := {x1, x2, . . . , xn}. Supposing that y is a point of A (written
y ∈ A), we say that y is a cost outlier for A provided that either y < ¯xA − sA or y > ¯xA + sA. If
y1, y2, . . . , ym are the cost outliers of A then, writing B := A ∼ {y1, . . . , ym} (the subset of A which
contains no cost outliers), we define the modification ˆyj of yj (j = 1, . . . , m) to be

$$\hat{y}_j := \begin{cases} y_j + \bigl(|y_j - \bar{x}_B| - s_B\bigr) & \text{if } y_j < \bar{x}_A - s_A,\\ y_j - \bigl(|y_j - \bar{x}_B| - s_B\bigr) & \text{if } y_j > \bar{x}_A + s_A, \end{cases} \qquad (2)$$

and write ˆA to mean the set A with each yj replaced by ˆyj.
Before proceeding, it is crucial that we understand the meaning of this definition and the value in
(2). The points singled out as being unusual in the above definition are those which fail to fall within
the “middle” 68% of a normal distribution with mean ¯xA and standard deviation sA; the definition is
thus a direct appeal to the Empirical Rule of normal distributions. The value ˆyj can be thought of as a
“horizontal translation of yj to the nearest extremal value of the distribution of non-outlier points.”
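As an illustration only (the author's implementation, described later, is an Excel workbook), here is a minimal Python sketch of the quantities just defined for a single subset A: the modified set ˆA from (2) and the prediction interval from (1). It assumes the square-root form of (1), and it follows the text in using n as the Student-t parameter; a textbook prediction interval would instead use n − 1 degrees of freedom.

```python
import numpy as np
from scipy import stats

ALPHA = 0.20  # the text's choice of alpha, giving an 80% prediction interval


def modified_set(a):
    """Return A-hat per equation (2): each cost outlier (a point more than one
    sample standard deviation from the mean of A) is translated toward the
    non-outlier subset B - up by |y - xbar_B| - s_B for low outliers, down by
    the same amount for high outliers. Assumes A has at least 3 points."""
    a = np.asarray(a, dtype=float)
    xbar_a, s_a = a.mean(), a.std(ddof=1)
    low = a < xbar_a - s_a
    high = a > xbar_a + s_a
    b = a[~(low | high)]                      # B: the non-outlier points
    xbar_b, s_b = b.mean(), b.std(ddof=1)
    a_hat = a.copy()
    a_hat[low] = a[low] + (np.abs(a[low] - xbar_b) - s_b)
    a_hat[high] = a[high] - (np.abs(a[high] - xbar_b) - s_b)
    return a_hat


def prediction_interval(a, alpha=ALPHA):
    """80% prediction interval (xbar_A - lambda, xbar_A + lambda) per (1)."""
    a = np.asarray(a, dtype=float)
    n = len(a)
    xbar_a, s_a = a.mean(), a.std(ddof=1)
    lam = stats.t.ppf(1 - alpha / 2, df=n) * s_a * np.sqrt(1 + 1 / n)
    return xbar_a - lam, xbar_a + lam
```

Applied to a partition Ai, the interval would be computed on the modified set, i.e. prediction_interval(modified_set(a_i)), in keeping with the workflow described in the following section.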
At this juncture, it may be prudent to briefly discuss some assumptions and decisions made re-
garding the definition above - and to clarify what is important to the CERPA methodology. From
empirical observations made on the set C, it is convenient to call a point unusual when it falls more
than one standard deviation from the mean - although, without knowledge of C, readers may find
this choice somewhat arbitrary. It
may be that with a greater amount of data, it will be more convenient to define some other regions of
a normal curve as containing unusual values; however, within the context of the set C, this particular
choice of definition is both reasonable and natural. It also may seem presumptuous to assume that
we may apply properties of a normal distribution to A when A may not be normally distributed; but
the implicit claim to be understood is not that A is normally distributed: rather, that the population
from which A is drawn is normally distributed. (Indeed, such an assumption is valid if for no reason
other than the truth of the Central Limit Theorem of statistics.) In order to make use of the
cost outliers yj, it is necessary to replace each such value with a value more typical of the distribution
implied by B; and in order to maintain the relationships between “low” and “high” cost outliers,
the modification defined by the value ˆyj above has been chosen as being best capable of fulfilling
all necessary considerations - as opposed to mapping each yj to some randomly generated value, for
example.
Finally, it should be noted that the partitioning discussed in previous sections - while chosen for use
in the analysis discussed herein - is not an essential requirement of the CERPA methodology.
Rather, it is simply the best empirically-backed manner in which to guarantee data homogeneity; and
on some other set of cost data (or other data), it is prudent to invoke the CERPA method only after
discerning a partitioning best suited to the set in question. We are now ready to discuss the CERPA
approach.
Details of CERPA
Although CERPA is a methodology, it was also implemented as a Microsoft Excel workbook which
performs all necessary calculations. Thus, in our discussion of the approach, we will appeal to the
layout of CERPA as an Excel workbook for reasons of simplicity and clarity.
Taking the cost data set C, we populate a “Normed Non-Aggregated” (NNA) worksheet using the
partitioning discussed above, which splits C into the subsets A1, A2, . . . , Ar; we then calculate,
for each index i = 1, 2, . . . , r, the outliers of Ai and their modified values. These modified values
are carried into a “Breakdown” worksheet, in which prediction intervals are calculated and displayed.
Finally, the information represented in “Breakdown” is used to populate a “Cost Estimation” sheet,
in which summary-level values are displayed in a format which is meant to maximize the ease with
which the data can be interpreted. “Cost Estimation” also displays values referred to
as Outlier Adjustment Values (OAVs), which are given as a means of handling unusual data within
each partition-set Ai. Referring to the definition in the previous section, this value is defined as
max_{j≤m} |yj − ¯xB|; and OAVs are used to modify the upper and lower values of the prediction intervals
displayed in “Cost Estimation” for unusual products (e.g., first products). As an example, suppose
that CERPA generates a predicted minimum of 1.2 million and a predicted maximum of 2.6 million
for a certain partition, with an associated OAV of 0.7 million. Then the modified predicted minimum
and maximum are 0.5 million and 3.3 million, respectively.
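The OAV computation and the adjustment step amount to the following sketch (again illustrative rather than the workbook itself); the final line mirrors the worked example above.

```python
import numpy as np


def outlier_adjustment_value(a):
    """OAV = max over the cost outliers y_j of |y_j - xbar_B|; 0.0 if A has none."""
    a = np.asarray(a, dtype=float)
    xbar_a, s_a = a.mean(), a.std(ddof=1)
    outlier = (a < xbar_a - s_a) | (a > xbar_a + s_a)
    if not outlier.any():
        return 0.0
    xbar_b = a[~outlier].mean()          # mean of B, the non-outlier subset
    return float(np.max(np.abs(a[outlier] - xbar_b)))


def adjust_for_unusual_product(lower, upper, oav):
    """Widen a partition's prediction interval by its OAV (e.g. for first products)."""
    return lower - oav, upper + oav


# Mirrors the worked example (values in $M): (1.2, 2.6) with OAV 0.7 -> roughly (0.5, 3.3).
print(adjust_for_unusual_product(1.2, 2.6, 0.7))
```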
Generating values in this fashion, the CERPA methodology is able to produce prediction intervals
for each relevant partition of C in which both lower and upper values are strictly larger than zero.
These prediction intervals have been given to cost analysts; and it is hoped that, after testing, CERPA
will aid in the creation of a mathematical tool-box which can be used for estimating the costs of
future training systems.
Issues and Potential Improvements
Despite the positive tone of the comments above, it is necessary to point out some flaws in the CERPA
approach as it currently stands. CERPA is, after all, a first step in a new direction; and it is almost
inevitable that it would suffer some defects. First, and most seriously, there was an insufficient
amount of data available either for strengthening the methodology or for performing statistical tests
(e.g. hypothesis tests) which might provide more insight into cost analysis efforts and the CERPA
itself; and furthermore, the lack of data-points limits which partitions may be considered under the
methodology. (For additional comments, consider the previous sections of this paper.) In the case of
partitions containing precisely 2 data points, the maximum and minimum cost data were substituted
for the CERPA approach; and singleton partitions were ignored completely. Another flaw is that
CERPA is, by construction, capable of only a broad analysis of costs; and is
fundamentally incapable (in its current iteration) of answering questions regarding information which
pertains to the definition of trainers at the subsystem-level of specificity.
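A sketch of the small-partition fallback just described, reusing prediction_interval and modified_set from the earlier sketch; the three-point minimum follows the assumption stated alongside equation (1).

```python
def interval_for_partition(costs):
    """Return a (lower, upper) cost range for one partition, or None.

    n >= 3: CERPA prediction interval on the modified set A-hat;
    n == 2: substitute the observed minimum and maximum, per the text;
    n == 1: singleton partitions are ignored.
    """
    costs = list(costs)
    if len(costs) >= 3:
        return prediction_interval(modified_set(costs))
    if len(costs) == 2:
        return min(costs), max(costs)
    return None
```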
Yet another point to be considered concerns the intricacy present in the approach to modifying
cost outlier values. It should be noted that the proposed calculations involve three separate means
and standard deviation values; and that the values used to modify unusual points specifically exclude
those unusual points. It may seem more logical and reasonable to make use of fewer such calculations;
to make use, perhaps, of merely two such calculation pairs, and to utilize the mean and standard
deviation taken over the entire partition Ai (i = 1, . . . , r) to modify outlier values. But while this
approach may be intuitively appealing for its simplicity, and while it is certainly the place of this
paper to propose (and indeed encourage) the exploration of such changes to the CERPA methodology,
it must be noted that, within the context of C, this change failed to produce meaningful data.
Until such time as greater quantities of contract data containing greater amounts of detail are
readily available for use in testing and expanding the CERPA approach, it is likely that propositions
of this nature will meet similar difficulties.
However, the above are not fatal flaws within the approach. In fact, all of the critiques mentioned
can be seen as originating from the same essential problem: a lack of information (in terms of both
quantity and depth) in contracts paired with a lack of contract data to analyze. It should be recalled
that efforts expended upon the CERPA method were meant to determine if a predictive method could
be generated from the limited amount of data available within C; and in view of this, the efforts
described here are to be considered a success and a step forward. From the analyses performed,
it was shown to be possible to generate predictive values and predictive intervals directly from the
data in C. With the knowledge that such results do exist and are attainable, it is possible to either
refine CERPA or develop a more appropriate analytical tool as new data points are made available to
TERN. Although such tasks are time-consuming and tedious to perform, such an effort will produce
invaluable assets for the purposes of cost analysis; and moreover, such a tool may even harbor the
capacity to produce further, more powerful mathematical tools for assessing training system cost
data. Therefore, it is highly recommended that thought be given to the task of working with and
improving CERPA.
References
[1] Larson, Ron, and Betsy Farber. Elementary Statistics: Picturing the World 4th Edition. Upper
Saddle River: Prentice Hall, 2008. Print.
[2] Ramachandran, K. M., and Chris P. Tsokos. Mathematical Statistics with Applications. Burlington:
Academic Press, 2009. Print.
[3] Turner, Bryan. Trainer Estimating Resource Network (TERN) Master. 2012. Microsoft Excel file.
Information from the coursebook associated with the Defense Acquisition University course In-
termediate Cost Analysis (BCF 204) was also used in the development of material pertinent to this
paper.