Mining Data from Reservoir Simulation Result

Mining Data from Reservoir Simulation Results
using R
(to be presented at ICIPEG ’10)

Akmal Aulia, Tham Boon Keat, M. Sanif Maulut,
Dr. Noaman El-Khatib, Mazuin Jasamai

EOR Centre, UT PETRONAS
Supervisor: Prof. Dr. Noaman El-Khatib

June 9th, 2010

Introduction to Association Rules

Market Basket Analysis - imagine a set of transactions


"Does a person who purchases milk and eggs tend to buy
bread?"


Math-wise: how likely is the frequent itemset S,
S = {milk, eggs, bread}, where
A = {milk, eggs},
B = {bread}?
Thus, A ⇒ B, with A ∪ B ⊆ S and A ∩ B = ∅


A ⇒ B is called a "rule"


Association rules at Amazon.com:
"Customers who bought this item also bought..."

Introduction to Association Rules: A Simple Example

Table: Transactional Data Sample

Transaction ID   Items
1                milk, eggs
2                eggs, butter
3                peanut
4                milk, eggs, bread
5                eggs, bread

Support of A = {milk, eggs} = 2/5 = 0.4 = 40% (transactions 1 and 4)

Support of B = {bread} = 2/5 = 0.4 = 40% (transactions 4 and 5)

Support of A ⇒ B = 1/5 = 0.2 = 20% (only transaction 4 contains milk, eggs and bread)

Confidence of A ⇒ B = (Support of A ⇒ B) / (Support of A) = 0.2 / 0.4 = 0.5 = 50%

Lift of A ⇒ B = (Support of A ⇒ B) / ((Support of A)(Support of B)) = 0.2 / ((0.4)(0.4)) = 1.25
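The three measures can be recomputed directly from the five transactions in the table. The following is a minimal sketch in Python (used here purely for illustration; the talk itself works in R), and the function names `support`, `confidence` and `lift` are hypothetical helpers, not part of any package.

```python
# Toy transactions copied from the "Transactional Data Sample" table.
transactions = [
    {"milk", "eggs"},
    {"eggs", "butter"},
    {"peanut"},
    {"milk", "eggs", "bread"},
    {"eggs", "bread"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Support of the rule divided by the support of its antecedent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


def lift(antecedent, consequent, transactions):
    """Confidence of the rule divided by the support of its consequent."""
    return confidence(antecedent, consequent, transactions) / support(consequent, transactions)


A = {"milk", "eggs"}
B = {"bread"}
rule_support = support(A | B, transactions)
rule_confidence = confidence(A, B, transactions)
rule_lift = lift(A, B, transactions)
print(f"support={rule_support:.2f} confidence={rule_confidence:.2f} lift={rule_lift:.2f}")
```

Counting set inclusion per transaction keeps the sketch close to the hand calculation; it is not meant to scale to large databases.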

Association Rules: Formal Definition

$$\mathrm{Support}(A \Rightarrow B) = P(A \cup B) \tag{1}$$

$$\mathrm{Confidence}(A \Rightarrow B) = P(B \mid A) = \frac{P(A \cup B)}{P(A)} \tag{2}$$

$$\mathrm{Lift}(A \Rightarrow B) = \frac{P(B \mid A)}{P(B)} = \frac{P(A \cup B)}{P(A)\,P(B)} \tag{3}$$

Reliable Rule: Large Confidence, Large Support, and Lift > 1
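Why Lift > 1 is the cut-off follows from equations (2) and (3). Here P(A ∪ B) is read as in the simple example: the probability that a transaction contains every item of both A and B. A short worked step, offered as a reading aid rather than as part of the original slides:

```latex
\begin{align*}
A,\ B \text{ independent}
  &\;\Rightarrow\; P(A \cup B) = P(A)\,P(B)
   \;\Rightarrow\; \mathrm{Lift}(A \Rightarrow B) = 1, \\
\mathrm{Lift}(A \Rightarrow B) > 1
  &\iff P(B \mid A) > P(B), \\
\mathrm{Lift}(A \Rightarrow B) < 1
  &\iff P(B \mid A) < P(B).
\end{align*}
```

So a lift above 1 means that observing A raises the chance of B above its baseline, while a lift below 1 means that A makes B less likely.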

Implementation using R

A language for statistical computing and graphics


GNU General Public License ⇒ FREE!!


Over 2416 contributed packages - ARULES, GA, ANN, etc


Over 106 books published - Bayesian, Monte Carlo, Chemistry


Parallel Computation


One injection well at (1,1), one production well at (5,5)

Let the reservoir simulation parameters be Xi, with i ∈ {1, 2, ..., 8}.

Table: Description of Parameters

Parameter   Description                                          Units
X1          Surface rate at the injection well                   stb/day
X2          Bottom-hole pressure limit at the injection well     psia
X3          Liquid rate at the production well                   stb/day
X4          Bottom-hole pressure limit at the production well    psia
X5          Bottom-hole pressure datum at the production well    ft
X6          Bottom-hole pressure datum at the injection well     ft
X7          Inner diameter of the production well                ft
X8          Inner diameter of the injection well                 ft
T           Final oil recovery (recovery factor)                 (OIP_t0 − OIP_t) / OIP_t0

Dataset Construction
Use Excel to generate random numbers for each parameter Xi:
ROUND(RAND() * (max(Xi) - min(Xi)) + min(Xi), 0)

Figure: Dataset Formation
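The Excel formula draws a uniform random value between min(Xi) and max(Xi) and rounds it to a whole number. A rough equivalent, sketched in Python for illustration only: the `RANGES` values below are hypothetical placeholders (the slides do not list the actual bounds), and `generate_dataset` is an assumed helper name. Python's `round` breaks exact ties differently from Excel's `ROUND`, which matters only when a draw lands exactly on .5.

```python
import random

# Hypothetical (min, max) bounds per parameter; the real bounds are not given in the slides.
RANGES = {
    "X1": (11000, 14000),
    "X2": (5000, 13000),
    "X7": (3, 7),
}


def generate_dataset(ranges, n_rows, seed=0):
    """Draw n_rows samples, each value mimicking ROUND(RAND()*(max-min)+min, 0)."""
    rng = random.Random(seed)  # seeded so that the sketch is reproducible
    rows = []
    for _ in range(n_rows):
        row = {
            name: round(rng.random() * (hi - lo) + lo)
            for name, (lo, hi) in ranges.items()
        }
        rows.append(row)
    return rows


# The row count here is arbitrary.
dataset = generate_dataset(RANGES, n_rows=10)
for row in dataset[:3]:
    print(row)
```

Each generated row would then be fed to the reservoir simulator to obtain the recovery factor T for that parameter combination.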

Data Pre-processing

Table: Dataset
X1 X2 X3 X4 X5 X6 X7 X8 T
13087 9267 9774 3320 8042 8101 6 5 0.413
12082 6192 5943 3844 8058 8030 5 5 0.397
13789 5532 4941 2987 8083 8115 4 6 0.372
11671 12197 4718 2543 8080 8038 4 6 0.178
13182 6055 9507 2989 8057 8040 3 3 0.492
11810 7252 7597 4480 8036 8036 6 5 0.421
11070 10849 4887 2028 8088 8100 3 5 0.246
11861 10220 1545 3723 8117 8045 6 5 0.124
12877 6557 8863 3766 8089 8102 4 4 0.467
13905 7904 1270 4279 8027 8084 7 3 0.117
... ... ... ... ... ... ... ... ...

Data Pre-processing

Association Rules analyzes Categorical Data. ⇒ Convert it!

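Split each parameter at some threshold $X_{ik}$, where
$X_i = \{X_{i1}, X_{i2}, \ldots, X_{ik}, \ldots, X_{i8}\}$. $X_{ik}$ can be either

$$X_{ik} = \operatorname{mean}(X_i) \tag{4}$$

$$X_{ik} = \operatorname{median}(X_i) \tag{5}$$

Thus, $\forall X_i$,

$$X_{i,\,k' \neq k} =
\begin{cases}
\text{High}\,(\Uparrow), & \text{for } X_{i,\,k' \neq k} > X_{ik} \\
\text{Low}\,(\Downarrow), & \text{for } X_{i,\,k' \neq k} \le X_{ik}
\end{cases}$$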
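The High/Low split can be sketched in a few lines. This is an illustrative Python version (the talk performs this step in R); `discretize` is a hypothetical helper name, and the sample values are the first ten X7 entries from the dataset table.

```python
from statistics import mean, median


def discretize(values, split="mean"):
    """Label each value HIGH if it exceeds the split point, otherwise LOW."""
    if split == "mean":
        threshold = mean(values)
    elif split == "median":
        threshold = median(values)
    else:
        raise ValueError("split must be 'mean' or 'median'")
    return ["HIGH" if v > threshold else "LOW" for v in values]


# First ten X7 values (inner diameter of the production well) from the dataset table.
x7 = [6, 5, 4, 4, 3, 6, 3, 6, 4, 7]
x7_by_mean = discretize(x7, "mean")
x7_by_median = discretize(x7, "median")
print(x7_by_mean)
print(x7_by_median)
```

Values equal to the split point are labelled LOW, matching the "≤" branch of the definition. Applying the same function to every column, including T, produces the categorical dataset.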

Data Pre-processing

The result looks something like this (use R to do this):

Table: Obtained Categorical Dataset

X1 X2 X3 X4 X5 X6 X7 X8 T
HIGH LOW LOW LOW HIGH LOW LOW LOW HIGH
LOW HIGH LOW LOW LOW HIGH LOW LOW LOW
... ... ... ... ... ... ... ... ...

The ARULES package

Use R’s ARULES package

Apriori algorithm:
  i = 1
  D_1 = {G : G is an itemset of size 1}
  while D_i is not empty do
    database pass:
      for each set in D_i, test whether it is frequent
      let F_i be the collection of frequent sets from D_i
    candidate formation:
      let D_{i+1} be those sets of size i + 1 all of whose
      subsets of size i are frequent (i.e. are in F_i)
    i = i + 1
  end while
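The level-wise loop above can be written out as a small runnable sketch. This Python version is for illustration only and is not the ARULES implementation; the function name `apriori`, the toy transactions (taken from the market-basket example) and the 0.4 support threshold are assumptions made for the demonstration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every frequent itemset, found level by level."""
    n = len(transactions)
    # D_1: every itemset of size 1.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    size = 1
    while candidates:
        # Database pass: keep the candidates that reach the support threshold (F_i).
        level = {}
        for c in candidates:
            s = sum(1 for t in transactions if c <= t) / n
            if s >= min_support:
                level[c] = s
        frequent.update(level)
        # Candidate formation: sets of size i + 1 whose size-i subsets are all in F_i.
        items = sorted({item for c in level for item in c})
        candidates = {
            frozenset(combo)
            for combo in combinations(items, size + 1)
            if all(frozenset(sub) in level for sub in combinations(combo, size))
        }
        size += 1
    return frequent


transactions = [
    {"milk", "eggs"},
    {"eggs", "butter"},
    {"peanut"},
    {"milk", "eggs", "bread"},
    {"eggs", "bread"},
]
frequent = apriori(transactions, min_support=0.4)
for itemset, s in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

Pruning candidates whose subsets are not all frequent is what keeps the search tractable: {milk, eggs, bread} is never counted against the database because {milk, bread} already failed the threshold.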

Results and Discussion

Table: Limits for the Apriori algorithm's parameters

Lift   Confidence
1.5    0.9

⇒ Generated some 24098 rules (for mean-based splitting)
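To make the filtering step concrete, the sketch below searches a categorical table for rules whose right-hand side is a chosen target level and keeps those that pass the lift and confidence limits. It is a brute-force Python illustration, not the ARULES routine used in the study; the four rows are made-up toy data, `target_rules` is a hypothetical helper, and treating 1.5 and 0.9 as minimum thresholds is an assumption.

```python
from itertools import combinations


def target_rules(rows, target_key, target_value, min_confidence, min_lift, max_len=2):
    """Brute-force search for rules {param=level, ...} => {target_key=target_value}."""
    n = len(rows)
    p_target = sum(1 for r in rows if r[target_key] == target_value) / n
    if p_target == 0:
        return []
    items = sorted({(k, v) for r in rows for k, v in r.items() if k != target_key})
    rules = []
    for size in range(1, max_len + 1):
        for antecedent in combinations(items, size):
            if len({k for k, _ in antecedent}) < size:
                continue  # a parameter cannot be HIGH and LOW at once
            matched = [r for r in rows if all(r[k] == v for k, v in antecedent)]
            if not matched:
                continue
            both = sum(1 for r in matched if r[target_key] == target_value)
            conf = both / len(matched)
            lift = conf / p_target
            if conf >= min_confidence and lift >= min_lift:
                rules.append({"antecedent": antecedent, "support": both / n,
                              "confidence": conf, "lift": lift})
    return rules


# Made-up categorical rows, for demonstration only.
rows = [
    {"X2": "LOW", "X3": "HIGH", "T": "HIGH"},
    {"X2": "LOW", "X3": "HIGH", "T": "HIGH"},
    {"X2": "HIGH", "X3": "LOW", "T": "LOW"},
    {"X2": "LOW", "X3": "LOW", "T": "LOW"},
]
high_rules = target_rules(rows, "T", "HIGH", min_confidence=0.9, min_lift=1.5)
for rule in high_rules:
    print(rule)
```

Because every rule that survives with confidence 1.00 has lift equal to 1 / P(target), all rules in one results table share the same lift value.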

Table: Mean-Based Rules Yielding the Low Target (Low Oil Recovery)

No.  Parameter/Value     Support  Confidence  Lift
1    X2 ⇑, X5 ⇑          0.148    1.00        1.80
2    X2 ⇑, X7 ⇓          0.185    1.00        1.80
3    X1 ⇓, X2 ⇑          0.222    1.00        1.80
4    X2 ⇑, X4 ⇑          0.222    1.00        1.80
5    X2 ⇑, X3 ⇓          0.185    1.00        1.80
6    X5 ⇑, X6 ⇓          0.185    1.00        1.80
7    X3 ⇓, X7 ⇑          0.222    1.00        1.80
8    X2 ⇑, X5 ⇑, X8 ⇓    0.037    1.00        1.80

Table: Mean-Based Rules Yielding the High Target (High Oil Recovery)

No.  Parameter/Value     Support  Confidence  Lift
1    X4 ⇓, X8 ⇓          0.111    1.00        2.25
2    X3 ⇑, X4 ⇓          0.185    1.00        2.25
3    X3 ⇑, X6 ⇓          0.185    1.00        2.25
4    X2 ⇓, X3 ⇑          0.259    1.00        2.25
5    X5 ⇓, X6 ⇓          0.259    1.00        2.25
6    X2 ⇑, X4 ⇓, X8 ⇓    0.037    1.00        2.25
7    X2 ⇑, X6 ⇓, X8 ⇓    0.037    1.00        2.25
8    X2 ⇑, X3 ⇑, X4 ⇓    0.074    1.00        2.25

Table: Median-Based Rules Yielding the Low Target (Low Oil Recovery)

No.  Parameter/Value     Support  Confidence  Lift
1    X2 ⇑, X8 ⇑          0.074    1.00        1.93
2    X5 ⇑, X7 ⇑          0.037    1.00        1.93
3    X3 ⇓, X7 ⇑          0.111    1.00        1.93
4    X2 ⇑, X4 ⇑          0.222    1.00        1.93
5    X2 ⇑, X3 ⇓          0.259    1.00        1.93
6    X3 ⇓, X6 ⇑          0.259    1.00        1.93
7    X2 ⇑, X5 ⇑, X8 ⇑    0.037    1.00        1.93
8    X3 ⇓, X5 ⇑, X8 ⇑    0.074    1.00        1.93

Table: Median-Based Rules Yielding the High Target (High Oil Recovery)

No.  Parameter/Value     Support  Confidence  Lift
1    X1 ⇑, X8 ⇑          0.074    1.00        2.08
2    X4 ⇑, X8 ⇑          0.037    1.00        2.08
3    X2 ⇓, X7 ⇑          0.111    1.00        2.08
4    X4 ⇓, X7 ⇑          0.222    1.00        2.08
5    X2 ⇓, X3 ⇑          0.259    1.00        2.08
6    X5 ⇓, X6 ⇓          0.259    1.00        2.08
7    X2 ⇓, X5 ⇓          0.037    1.00        2.08
8    X3 ⇑, X5 ⇑, X8 ⇑    0.074    1.00        2.08

Summary

X2 (BHP limit, injector) and X3 (liquid rate, producer) appear
frequently in the rules: a clue to higher recovery!

More parameters and more wells would make for a more rigorous study.
