The document discusses the importance of cost functions in machine learning models for production forecasting. It begins by defining what a cost function is and how it quantifies the goodness of fit of a model to data. It then discusses alternatives to the typical least squares cost function, including different norm formulations. The document applies these concepts to nonlinear regression problems and forecasting production data from an oil well. It emphasizes that the choice of cost function can significantly impact the regression results.
February 2, 2019 - It's Not Hip to be Square - The Importance of Cost Functions in Production Forecasting
1. IT'S (NOT) HIP TO BE SQUARE — THE IMPORTANCE OF COST FUNCTIONS IN PRODUCTION FORECASTING
APPLIED DATA ANALYTICS: UPSTREAM
MARCH 18–21, 2019
HOUSTON, TEXAS, USA
DAVID S. FULFORD
DATA ENGINEERING & ANALYTICS — SUBSURFACE ANALYTICS
APACHE CORPORATION
2. INTRODUCTION
3. INTRODUCTION
[Figure: diagram relating "Valid Model", "Data Transform", "Robust Regression", and "Is this a Valid Model?"]
4. INTRODUCTION
5. OUTLINE
What's a cost function?
Alternative Norms
Regression of Non-linear Problems
Applications to Production Forecasting
Conclusions
6. WHAT'S A COST FUNCTION?
To fit a model to data, we require a quantification of "goodness of fit" – a cost function
Applies to any machine learning (ML) algorithm
In general, we can write the process of model fitting as regression
We desire to map predictor variables to response variables
7. WHAT'S A COST FUNCTION?
The simplest case of regression would be a linear model
Holding that the assumptions of a linear model hold true for other models, viz.:
$\mathrm{E}[e] = 0$
$e$ are homoscedastic and uncorrelated
If we have a linear model:
$\mathbf{Y} = \mathbf{X}\beta + \mathbf{e}$
We predict $\mathbf{Y}$ with:
$\hat{\mathbf{Y}} = \mathrm{map}(f(\hat{\beta}, x), \mathbf{X}) + \hat{\mathbf{e}}$
Then the cost function is:
$\mathrm{J}(\hat{\beta}) = \|\mathbf{Y} - \hat{\beta}\mathbf{X}\|^2 \equiv \|\mathbf{Y} - \mathrm{map}(f(\hat{\beta}, x), \mathbf{X})\|^2$
And we regress the model by minimizing $\mathrm{J}(\hat{\beta})$:
$\hat{\beta} \leftarrow \operatorname{argmin}\, \mathrm{J}(\hat{\beta})$
($\hat{\mathbf{z}}$ denotes an estimator of $\mathbf{z}$)
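As an illustration of the definitions above, here is a minimal Python sketch (not from the deck; the data and function names are illustrative) that evaluates the least-squares cost $\mathrm{J}(\hat{\beta})$ and recovers $\hat{\beta}$ from the closed-form normal equations:

```python
import numpy as np

def least_squares_cost(beta_hat, X, y):
    """Least-squares cost: J(beta_hat) = ||y - X @ beta_hat||^2."""
    residuals = y - X @ beta_hat
    return residuals @ residuals

# Synthetic data: y = 1.0 + 2.0*x + noise
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
X = np.column_stack([np.ones_like(x), x])   # design matrix with intercept
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.5, x.size)

# Closed-form minimizer: beta_hat = (X^T X)^-1 X^T y
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat, least_squares_cost(beta_hat, X, y))
```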
8. WHAT'S A COST FUNCTION?
Minimization of $\mathrm{J}(\hat{\beta})$ is equivalent to setting the gradient of $\mathrm{J}(\hat{\beta})$ to zero:
$\operatorname{argmin}\, \mathrm{J}(\hat{\beta}) \equiv \frac{\partial \mathrm{J}}{\partial \hat{\beta}} \to 0$
$\frac{\partial \mathrm{J}}{\partial \hat{\beta}} = -2\mathbf{X}\left(\mathbf{Y} - \mathrm{map}(f(\hat{\beta}, x), \mathbf{X})\right)$
If $\mathbf{X}$ is:
$\mathbf{X} = [1, 1, 1, \ldots, 1]$
Then the gradient becomes:
$\frac{\partial \mathrm{J}}{\partial \hat{\beta}} = -2\left(\mathbf{Y} - \hat{\beta}\right) := 0$
And it's obvious that a value of $\hat{\beta} = \bar{y}$ satisfies the equation
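A quick numeric check of this result (an added illustration, not from the deck): with an all-ones design, the $\hat{\beta}$ that zeroes the gradient is exactly the arithmetic mean.

```python
import numpy as np

y = np.array([3.0, 5.0, 7.0, 100.0])   # note the large value; see the next slides

def grad_J(beta_hat, y):
    """Gradient of J = sum((y - beta_hat)^2) with respect to beta_hat."""
    return -2.0 * np.sum(y - beta_hat)

print(grad_J(y.mean(), y))   # ~0.0: the mean zeroes the gradient
```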
9. WHAT'S A COST FUNCTION?
Summarizing:
The arithmetic mean as a best estimator of a parameter value, and
Squared errors as the cost function of choice to regress a model
... are consequences of the estimator's linear properties
$\hat{\beta}$ is a fixed linear combination of $\mathbf{Y}$
e.g. $\hat{\beta} = \mathbf{a}^{\mathrm{T}}\mathbf{Y}$ for some $\mathbf{a}$ such as $\mathbf{a} = (\mathbf{X}^{\mathrm{T}}\mathbf{X})^{-1}\mathbf{X}^{\mathrm{T}}$
$\mathrm{E}[\hat{\beta}] = \beta$
Most problems in which we're interested are not linear!
e.g. production data
more on this later...
10. WHAT'S A COST FUNCTION?
Additionally, means are not robust:
$\bar{x} = \frac{1}{n}\sum \mathbf{X} = \frac{1}{n}\sum_{i=1}^{n} x_i$
With some manipulation we can show that:
$\bar{x} = \frac{n-1}{n}\,\bar{x}_{n-1} + \frac{1}{n}\,x_n$
Indicating that any single value, if large enough, can dominate $\bar{x}$
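A small sketch of this non-robustness (illustrative values): one wild observation drags the mean arbitrarily far, while the median, the minimizer of the L1 cost introduced on the following slides, barely moves.

```python
import numpy as np

x = np.array([9.0, 10.0, 11.0, 10.0, 9.5])
x_outlier = np.append(x, 1.0e4)   # a single wild value

print(np.mean(x), np.median(x))                  # 9.9, 10.0
print(np.mean(x_outlier), np.median(x_outlier))  # ~1674.9, 10.0
```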
11. WHAT'S A COST FUNCTION?
12. ALTERNATIVE NORMS
We can write the least-squares cost function as the Euclidean distance between two points:
$\boldsymbol{\alpha} = \mathbf{Y} - \mathrm{map}(f(\hat{\beta}, x), \mathbf{X})$
$\|\boldsymbol{\alpha}\|_2 = \left(\sum \boldsymbol{\alpha}^2\right)^{1/2}$
Generally, we can have any level of distance, which we call a norm:
$\|\boldsymbol{\alpha}\|_n = \left(\sum \boldsymbol{\alpha}^n\right)^{1/n}$
13. ALTERNATIVE NORMS
L2, L1, and L0 norms:
$\|\boldsymbol{\alpha}\|_2 = \left(\sum \boldsymbol{\alpha}^2\right)^{1/2}$
$\|\boldsymbol{\alpha}\|_1 = \sum |\boldsymbol{\alpha}|$
$\|\boldsymbol{\alpha}\|_0 = \begin{cases} 0 & \text{if } x = 0 \\ 1 & \text{else} \end{cases}$
We can even have an $L_\infty$ norm:
$\|\boldsymbol{\alpha}\|_\infty = \max |\boldsymbol{\alpha}|$
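For reference, a short sketch computing these norms on a residual vector; NumPy's np.linalg.norm takes the order as its second argument:

```python
import numpy as np

alpha = np.array([3.0, -4.0, 0.0, 1.0])   # residual vector

print(np.linalg.norm(alpha, 2))        # L2: sqrt(9 + 16 + 0 + 1) ~ 5.10
print(np.linalg.norm(alpha, 1))        # L1: 3 + 4 + 0 + 1 = 8.0
print(np.linalg.norm(alpha, 0))        # L0: count of nonzero entries = 3.0
print(np.linalg.norm(alpha, np.inf))   # Linf: max |alpha_i| = 4.0
```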
14. ALTERNATIVE NORMS
However, not all norms provide a closed-form analytic solution for regression
If we draw the L2 norm between two points, we find a unique solution
15. ALTERNATIVE NORMS
Adding some L1 norms, we find multiple non-unique solutions
[Figure: L2 is the by-flight distance; L1 is the taxicab distance]
16. ALTERNATIVE NORMS
17. ALTERNATIVE NORMS
18. ALTERNATIVE NORMS
$\nabla \mathrm{J}_H = \begin{cases} \delta \operatorname{sign}(\alpha) & \text{if } |\alpha| \ge \delta \\ \alpha & \text{else} \end{cases}$
19. ALTERNATIVE NORMS
$\mathrm{J}_H = \begin{cases} \delta |\alpha| - \tfrac{1}{2}\delta^2 & \text{if } |\alpha| \ge \delta \\ \tfrac{1}{2}\alpha^2 & \text{else} \end{cases}$
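This piecewise cost $\mathrm{J}_H$ has the form of the Huber loss. A minimal sketch, assuming the reconstruction above: quadratic (L2-like) near zero and linear (L1-like) in the tails, so any single large residual has a bounded pull on the fit.

```python
import numpy as np

def huber_cost(alpha, delta=1.0):
    """J_H: 0.5*a^2 inside |a| < delta, linear growth outside."""
    a = np.abs(alpha)
    return np.where(a >= delta, delta * a - 0.5 * delta ** 2, 0.5 * alpha ** 2)

def huber_grad(alpha, delta=1.0):
    """Gradient clipped at +/- delta, so outliers cannot dominate."""
    return np.where(np.abs(alpha) >= delta, delta * np.sign(alpha), alpha)

residuals = np.array([-0.2, 0.5, 3.0, -10.0])
print(huber_cost(residuals))   # large residuals grow linearly, not quadratically
print(huber_grad(residuals))   # saturates at +/- 1.0
```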
20. ALTERNATIVE NORMS
21. ALTERNATIVE NORMS
22. ALTERNATIVE NORMS
Use stochastic gradient descent (SGD) to minimize cost functions:
1. while $\mathrm{J}(\hat{\beta})_{i-1} - \mathrm{J}(\hat{\beta})_i > \varepsilon$:
2. $\mathrm{J}(\hat{\beta}) = \|\mathbf{Y} - \mathrm{map}(f(\hat{\beta}, x), \mathbf{X})\|^2$
3. $\hat{\beta} \leftarrow \hat{\beta} - \eta\,\nabla \mathrm{J}(\hat{\beta})$
where:
$\varepsilon$ = precision threshold
$\eta$ = learn rate
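A runnable sketch of this loop (plain batch gradient descent rather than the mini-batch stochastic variant, to keep it short), fitting a one-parameter model $y \approx \beta x$ under the L2 cost:

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(1.0, 10.0, 100)
y = 3.0 * x + rng.normal(0.0, 0.5, x.size)   # true beta = 3.0

def cost(beta):
    return np.sum((y - beta * x) ** 2)

def grad(beta):
    return -2.0 * np.sum(x * (y - beta * x))

beta, eta, eps = 0.0, 1.0e-4, 1.0e-10   # initial guess, learn rate, precision
prev = np.inf
while prev - cost(beta) > eps:          # stop when the cost no longer improves
    prev = cost(beta)
    beta -= eta * grad(beta)

print(beta)   # ~3.0
```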
23. ALTERNATIVE NORMS
24. ALTERNATIVE NORMS
We can also regularize based upon norms:
LASSO Regression → L2 norm of data, L1 norm of $\hat{\beta}$:
$\mathrm{J}(\hat{\beta}) = \|\mathbf{Y} - \mathrm{map}(f(\hat{\beta}, x), \mathbf{X})\|^2 + \lambda |\hat{\beta}|$
Ridge Regression → L2 norm of data, L2 norm of $\hat{\beta}$:
$\mathrm{J}(\hat{\beta}) = \|\mathbf{Y} - \mathrm{map}(f(\hat{\beta}, x), \mathbf{X})\|^2 + \lambda \hat{\beta}^2$
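A small sketch of these two penalized objectives (a library such as scikit-learn implements them directly, but the cost functions themselves are one-liners):

```python
import numpy as np

def lasso_cost(beta, X, y, lam):
    """L2 cost against data plus an L1 penalty on the coefficients."""
    r = y - X @ beta
    return r @ r + lam * np.sum(np.abs(beta))

def ridge_cost(beta, X, y, lam):
    """L2 cost against data plus an L2 penalty on the coefficients."""
    r = y - X @ beta
    return r @ r + lam * np.sum(beta ** 2)

beta = np.array([1.0, 0.0, 2.5])
X, y = np.eye(3), np.array([1.0, 0.5, 2.0])
print(lasso_cost(beta, X, y, 0.1), ridge_cost(beta, X, y, 0.1))
```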
25. REGRESSION OF NONLINEAR PROBLEMS
26. REGRESSION OF NONLINEAR PROBLEMS
27. REGRESSION OF NONLINEAR PROBLEMS
28. APPLICATIONS TO PRODUCTION FORECASTING
Example well from the Eagle Ford
≈3.5 years of production history
29. APPLICATIONS TO PRODUCTION FORECASTING
If we use a hyperbolic model, we believe that:
$q = q_i \left(1 + D_i b t\right)^{-1/b}$
$q \approx q_i \left(D_i b\right)^{-1/b} t^{-1/b}$
$q \approx \frac{\alpha}{t^{1/b}}$
Meaning, a power-law function is the base functional relationship, and we must log-transform the data
URTeC 2903036 (Fulford) – A Model-Based Diagnostic Workflow, 2018
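A sketch of the hyperbolic decline model and its late-time power-law approximation; the $q_i$, $D_i$, and $b$ values are illustrative, not fitted to the example well:

```python
import numpy as np

def hyperbolic_rate(t, qi, Di, b):
    """Hyperbolic decline: q = qi * (1 + Di*b*t)^(-1/b)."""
    return qi * (1.0 + Di * b * t) ** (-1.0 / b)

def power_law_approx(t, qi, Di, b):
    """Late-time approximation: q ~ qi * (Di*b)^(-1/b) * t^(-1/b)."""
    return qi * (Di * b) ** (-1.0 / b) * t ** (-1.0 / b)

t = np.array([1.0, 12.0, 42.0])   # months of production, say
print(hyperbolic_rate(t, 1000.0, 0.10, 1.4))
print(power_law_approx(t, 1000.0, 0.10, 1.4))   # approaches the hyperbolic at large t
```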
30. APPLICATIONS TO PRODUCTION FORECASTING
Regression on data... neither is good
31. APPLICATIONS TO PRODUCTION FORECASTING
Point-by-point cost; note the scale on the colorbars
32. APPLICATIONS TO PRODUCTION FORECASTING
33. APPLICATIONS TO PRODUCTION FORECASTING
34. APPLICATIONS TO PRODUCTION FORECASTING
Is this unique?
Minimize the cost function for $q_i = 200, 300, \ldots, 2000$ (an interval of 100)
35. APPLICATIONS TO PRODUCTION FORECASTING
Plot the cost function over a mesh with:
$x = q_i$
$y = D_i$
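A sketch of this diagnostic (the data, grid ranges, and the fixed $b$ are placeholders): evaluate the L2 cost of the hyperbolic model over a $(q_i, D_i)$ mesh and look for a trough of near-equal minima rather than one sharp minimum.

```python
import numpy as np

# Placeholder "observed" data generated from a known hyperbolic decline (b fixed)
b = 1.4
t = np.arange(1.0, 43.0)
q_obs = 1000.0 * (1.0 + 0.10 * b * t) ** (-1.0 / b)

qi_grid = np.linspace(200.0, 2000.0, 50)
Di_grid = np.linspace(0.01, 0.50, 50)

cost = np.empty((Di_grid.size, qi_grid.size))
for i, Di in enumerate(Di_grid):
    for j, qi in enumerate(qi_grid):
        q_hat = qi * (1.0 + Di * b * t) ** (-1.0 / b)
        cost[i, j] = np.sum((q_obs - q_hat) ** 2)

imin, jmin = np.unravel_index(np.argmin(cost), cost.shape)
print(qi_grid[jmin], Di_grid[imin])   # mesh point of minimum cost
```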
36. APPLICATIONS TO PRODUCTION FORECASTING
Plot the gradient...
$\varepsilon = 1 \times 10^{-40}$ (precision)
... yet the algorithm still did not find $\nabla \mathrm{J} = 0$ for each forecast in the list
37. APPLICATIONS TO PRODUCTION FORECASTING
Extract the minimum cost from the mesh at each row/column
Which is the "correct" forecast?
38. APPLICATIONS TO PRODUCTION FORECASTING
We're not limited to pre-defined cost functions... make your own!
SPE 174784-PA proposed the following:
$\mathrm{J}_F = \left(\frac{\frac{1}{n}\sum e - \epsilon_{\min}}{\sigma_\epsilon}\right)^2 + \left(\frac{\frac{1}{n}\sum\left(e - \frac{1}{n}\sum e\right)^2 - \varepsilon_{\min}}{\sigma_\varepsilon}\right)^2$
Which is more clearly written as:
$\mathrm{J}_F = \left(\frac{\mathrm{E}[e] - \epsilon_{\min}}{\sigma_\epsilon}\right)^2 + \lambda_\varepsilon \left(\mathrm{Var}[e] - \varepsilon_{\min}\right)^2$
The features of this cost function are:
L1 cost against data
L2 cost against best-fit model
L2 regularization for min. variance (generalizes to noisy data / outliers)
SPE 174784-PA (Fulford, Bowie, Berry, Bowen, Turk) – Machine Learning as a Reliable Technology, 2016
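A sketch of evaluating this composite cost on a residual vector, under the reconstruction above; the targets $\epsilon_{\min}$ and $\varepsilon_{\min}$, the scale $\sigma_\epsilon$, and $\lambda$ are placeholders that the paper defines:

```python
import numpy as np

def j_f(e, eps_min=0.0, var_min=0.0, sigma_eps=1.0, lam=1.0):
    """Squared deviation of the mean residual from a target, plus a
    penalty on how far the residual variance sits from its target."""
    mean_term = ((np.mean(e) - eps_min) / sigma_eps) ** 2
    var_term = lam * (np.var(e) - var_min) ** 2
    return mean_term + var_term

residuals = np.array([0.1, -0.3, 0.2, 0.05, -0.1])
print(j_f(residuals))
```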
39. APPLICATIONS TO PRODUCTION FORECASTING
Regularization on variance limits the range of $D_i$ and the $b$-parameter
L2 cost against the best fit penalizes spread through the data
40. CONCLUSIONS
The choice of cost function(s) may impact regression results as much as the choice of predictive model
Understanding the base expectation of data & model functional forms gives insight into the appropriate choice of cost function
Uncertainty is a fundamental characteristic of modeling
A best-fit is not the same as a best forecast
It does not mean only one unique set of model parameters exists!
41. APPENDIX
42. WHAT'S A COST FUNCTION?
$\mathbf{X}(\mathbf{X}^{\mathrm{T}}\mathbf{X})^{-1}\mathbf{X}^{\mathrm{T}}$ is the projection matrix of $\mathbf{Y}$ to $\hat{\mathbf{Y}}$:
$\hat{\mathbf{Y}} = \mathbf{X}(\mathbf{X}^{\mathrm{T}}\mathbf{X})^{-1}\mathbf{X}^{\mathrm{T}}\mathbf{Y} = \mathbf{X}\hat{\beta}$
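A quick numeric check (illustrative) that the projection matrix reproduces the fitted values given by the normal equations:

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(0.0, 5.0, 20)
X = np.column_stack([np.ones_like(x), x])
y = 2.0 + 0.5 * x + rng.normal(0.0, 0.1, x.size)

H = X @ np.linalg.inv(X.T @ X) @ X.T         # projection ("hat") matrix
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

print(np.allclose(H @ y, X @ beta_hat))      # True: H @ Y equals X @ beta_hat
```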
43. DISTRIBUTION OF FORECASTS
Possible fits of data + uncertainty of future performance
[Figure: "Actual vs. MCMC Forecasts"; rate (y) vs. time (x)]
44. REVISIONS
How much should I expect to revise forecasts from month to month with this approach?
On average, zero!
[Figure: change in EUR from prior month]
Clifford and Torres (2017)