A high decline rate and frequent low and off events are challenges for oil and gas
companies that operate mature oil fields. This research was conducted to provide a
way to obtain information about the dominant factors that cause low and off.
A data warehouse can be a solution for analyzing data and obtaining information about
these dominant factors faster and more accurately. This research uses data from the
existing operational database and daily production reports. The method used is the
nine-step methodology designed by Kimball and Ross, a systematic model for building a
data warehouse. The output of this research is a prototype data warehouse that can be
used for analysis to help the management team make decisions in the production
planning process. The conclusion is that a data warehouse can be a solution for an oil
and gas company to obtain valuable information from historical data or from other
available data sources that are not yet fully utilized, such as information on the
dominant factors causing low and off.
2. Sofyan and Abba Suganda Girsang
http://www.iaeme.com/IJMET/index.asp 222 editor@iaeme.com
data on the operational database to find information on the dominant factors causing low and
off. A data warehouse can be a solution, considering that Pertamina EP provides operational
databases containing production data, low and off data, and daily production reports as data
sources. Data warehouses can store historical data from operational databases for
decision-making purposes [2]. A data warehouse can also analyze historical data, provide
dynamic reports that can be viewed from various dimensions, display reports quickly and
precisely [3], and separate the analysis workload from the operational workload [4]. Table 1
lists the causes of low and off at Pertamina EP, grouped by the location where low and off
occurs.
Table 1 List of Causes of Low and Off
No | Sub Surface | Surface
1 | Decline of pressure | Well head, flow line, trunk line, transfer pump
2 | Water cut increased | Surface artificial lift (gas lift, ESP, SR, HPU, HJP, PCP)
3 | Gas Oil Ratio (GOR) increased | Production facilities
4 | Scale-up | Gas lift compressor, gas lift injector well
5 | Sand problem | Power supply (electrical, gas engine, diesel engine)
6 | Flow line production problem | Water injection system
7 | Bean down | Well program (BHP survey, stimulation, PES, well reparation, drilling)
8 | Artificial lift (gas lift, ESP, SRP, HPU, HJP, PCP) | Non-technical (theft, demonstration, floods, access to locations, sabotage, etc.)
2. RELATED WORKS
2.1. Structure and Architecture of Data Warehouse
A data warehouse is a relational database designed to query and analyze historical data from
different sources. It is subject-oriented, integrated, time-variant, and non-volatile. The data
warehouse is a repository of an organization's data that aims to facilitate analysis and
reporting, so that trends can be analyzed for strategic planning based on the long-term data
stored [5]. The data warehouse architecture is composed of several layers: the back-end tier,
the data warehouse tier, the OLAP tier, and the front-end tier.
Back-End Tier. In the back-end layer, three processes must be carried out: extraction,
transformation, and loading (the ETL process). The ETL process is responsible for data
extraction, cleansing, customization, and finally loading the data into the data warehouse.
Source data can come from inside the organization (the operational database), from outside
the organization, or from a data staging area [6].
Data Warehouse Tier. The data warehouse layer consists of the enterprise data warehouse, data
marts, and metadata. The enterprise data warehouse is kept centralized and includes data from
all areas or departments within an organization. A data mart is specific to a particular
function or department within an organization. Metadata is defined as "data about data" [6].
OLAP Tier. In the OLAP layer, OLAP servers display data from the data warehouse in
multidimensional form.
Front-End Tier. The front-end tier contains tools that make it easier for users to explore
the content of the data warehouse. The tools can be:
3. Production Controlling for Matured Field of Oil and Gas Company
OLAP tools. These tools help users explore content from the data warehouse interactively,
using complex query formulations involving large amounts of data.
Reporting tools. These tools help users manage reports, which can be issued as paper-based
reports or as interactive web-based reports.
Statistical tools. These tools are used to analyze and visualize cube data using statistical
methods.
Data mining tools. These tools allow users to analyze data to gain valuable knowledge in the
form of patterns and trends.
2.2. Data Warehouse Schema
The data warehouse schema consists of two elements: facts and dimensions. Facts hold the
measures, or measured data, and dimensions are used to analyze the measures of the facts
through aggregation operations such as COUNT, SUM, and AVERAGE [7]. Facts and dimensions are
arranged in tables called fact tables and dimension tables. A fact table generally contains
measurable data about a particular subject, known as the measures, and has two or more
foreign keys connected to the primary keys of the dimension tables [8]. A dimension table
contains data that gives a perspective on an entity and is identified by a single primary
key.
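As a minimal sketch of how a dimension key is used to aggregate a measure, the following Python snippet groups a few hypothetical fact rows by well key and computes COUNT, SUM, and AVERAGE. The row layout, the sample values, and the function name aggregate_by are assumptions made for illustration and do not come from the case study.

```python
from collections import defaultdict

# Hypothetical fact rows: (date_key, well_key, oil_volume). Values are illustrative only.
fact_rows = [
    (20170101, 1, 120.0),
    (20170101, 2, 80.0),
    (20170102, 1, 100.0),
]


def aggregate_by(rows, key_index, measure_index):
    """Return {key: (count, total, average)} for one measure grouped by one dimension key."""
    totals = defaultdict(lambda: [0, 0.0])
    for row in rows:
        entry = totals[row[key_index]]
        entry[0] += 1                      # COUNT
        entry[1] += row[measure_index]     # SUM
    # AVERAGE is derived from the two running values.
    return {key: (count, total, total / count) for key, (count, total) in totals.items()}


by_well = aggregate_by(fact_rows, key_index=1, measure_index=2)
```

Grouping by index 0 instead of index 1 would aggregate the same measure along the date dimension.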
A data warehouse schema is a model for multidimensional data that describes the relationships
between the dimension tables, the fact table, and the measures used. There are three kinds of
data warehouse schema:
Star schema. A logical structure with a fact table in the center, which holds the factual
data, surrounded by several dimension tables containing reference data.
Snowflake schema. A variation of the star schema. The difference lies in the normalization of
the dimension tables, so that some dimension tables are not directly related to the fact
table but relate to other dimension tables.
Fact constellation schema. Consists of several fact tables that share one or more dimension
tables.
2.3. Online Analytical Processing (OLAP)
OLAP (Online Analytical Processing) is one of the data warehouse tools that can be used
to analyze data and provide information based on a multidimensional data model [8]. An
OLAP system enables users to run heavy queries and automatically aggregates data from the
data warehouse. OLAP focuses on analytical queries that analyze data to support decision
making [9].
3. PROPOSED METHOD
3.1. Determining the Process
This stage determines which business process is selected from the many business processes in
Pertamina EP. Based on the defined scope of the case study, the production process is
selected as the business process, because the information generated by the production data
warehouse is closely related to the production process.
3.2. Choosing the Grain
At this stage, we select the data that will become the facts to be analyzed; the selected
grain will be represented in the fact table [9]. To help determine an accurate grain for this
case study, daily production reports and daily low and off reports are used as the sources of
data to be analyzed. Table 2 shows the source tables taken from the operational database used
in this case study.
Table 2 Source Tables
No | Table Name | Description
1 | Well | Contains well information
2 | WellProd | Contains production data
3 | WellProblem | Contains data on problems that occur in the well
4 | LowOffGain | Contains information about the low-off-gain occurring in the well
3.3. Identifying and Conforming the Dimensions
This step identifies the dimensions and makes sure that the selected dimensions match the
needs of the data warehouse to be developed. If a dimension is used in more than one business
process, its attributes must be conformed so that the dimension can be shared.
Table 3 lists the dimensions that were identified.
Table 3 Dimension Table Selected
No | Process | Dimension | Dimension Table
1 | Production | Date | Dim_date
1 | Production | Well | Dim_well
2 | Low and Off | Date | Dim_date
2 | Low and Off | Well | Dim_well
2 | Low and Off | Problem | Dim_wellproblem
3.4. Determining the Facts
Facts are based on the predetermined business processes and grains. The facts should be
additive, which means that they can be summed across all dimensions [10]. Based on the
business processes, grains, and dimensions set in steps one to three, the gross, oil, and gas
variables are the facts in the Facts of Production table, as shown in Table 4, and the low
and off variables are the facts in the Facts of Low and Off table, as shown in Table 5.
Table 4 Facts of Production
No | Facts | Description
1 | Gross | Volume of fluid produced
2 | Oil | Volume of oil produced
3 | Gas | Volume of gas produced
Table 5 Facts of Low and Off
No | Facts | Description
1 | Low | Potential oil production loss because the well failed to work for less than 24 hours
2 | Off | Potential oil production loss because the well failed to work for more than 24 hours
3.5. Storing Pre-calculation in the Fact Table
Here, the pre-defined facts are reviewed to see whether new facts can be calculated from the
facts already in the fact table. This review resulted in a new fact, the aggregation of the
low and off variables. The total low and off is shown in Table 6.
Table 6 Pre-calculation
No | Pre-calculation | Description
1 | Total Low and Off | Total low and off oil
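The pre-calculated fact can be illustrated with a short Python sketch. It assumes that the total is the simple sum of the low and off measures of each fact row, which is consistent with both being additive; the field names and sample values are hypothetical.

```python
def total_low_and_off(low, off):
    """Derived fact: sum of the two additive loss measures of one fact row."""
    return low + off


# Hypothetical fact rows (volumes are illustrative, not case-study data).
rows = [
    {"low": 12.5, "off": 0.0},
    {"low": 0.0, "off": 40.0},
]

# Store the pre-calculated value alongside the original measures.
for row in rows:
    row["total_low_off"] = total_low_and_off(row["low"], row["off"])
```

Storing the derived value at load time avoids recomputing it in every report query.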
3.6. Rounding Out the Dimension Table
Rounding out the dimension tables is done by describing or defining the attributes in each
dimension table, so that the description is complete and easily understood by the user [11],
as shown in Table 7, Table 8, and Table 9.
Table 7 Date Dimension Attributes
Dimension | Attribute | Data Type | Description
Date | sk_date | Integer | Surrogate key of date dimension table
Date | Date | Date | yyyy:mm:dd
Date | DayOfDate | Integer | [1..31]
Date | MonthOfDate | Integer | [1..12]
Date | YearOfDate | Integer | yyyy
Table 8 Well Dimension Attributes
Dimension | Attribute | Data Type | Description
Well | Sk_well | Integer | Surrogate key of Well dimension table
Well | Well_name | Varchar | Well name
Well | Field_name | Varchar | Field name
Well | Asset_name | Varchar | Asset name
Table 9 Wellproblem Dimension Attributes
Dimension | Attribute | Data Type | Description
Wellproblem | Sk_problem | Integer | Surrogate key of Wellproblem dimension table
Wellproblem | Problem_group | Varchar | Problem groups
Wellproblem | Problem_type | Varchar | Problem types
3.7. Choosing the Duration of the Database
In this research, the data loaded into the data warehouse cover the period from January 1st,
2016 to March 31st, 2017.
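To make the chosen duration concrete, the sketch below generates one date dimension row per calendar day in that period, using the attribute names from Table 7. The sequential surrogate key and the function name build_dim_date are assumptions; the paper builds this table with its ETL tool, not with this code.

```python
from datetime import date, timedelta


def build_dim_date(start, end):
    """Return one row per calendar day from start to end, inclusive."""
    rows = []
    current = start
    surrogate_key = 1  # assumed: simple sequential integer key
    while current <= end:
        rows.append({
            "sk_date": surrogate_key,
            "Date": current,
            "DayOfDate": current.day,
            "MonthOfDate": current.month,
            "YearOfDate": current.year,
        })
        surrogate_key += 1
        current += timedelta(days=1)
    return rows


dim_date = build_dim_date(date(2016, 1, 1), date(2017, 3, 31))
```

Because 2016 is a leap year, the period spans 366 + 90 = 456 days, so the table holds 456 rows.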
3.8. Tracking Slowly Changing Dimensions
This stage determines the response when the value of a record in a dimension table changes.
There are three types of response to such value changes [12], as shown in Table 10.
Table 10 Slowly Changing Dimension (SCD) Types
Dimension | SCD Type | Dimension Table Action | Impact on Fact Analysis
Dim_date | 0 | Static, no change to attribute value | Retain original, no impact
Dim_well | 1 | Overwrite attribute value | Change the value of the records
Dim_wellproblem | 2 | Add new dimension record | Facts associated with attribute value in effect when fact occurred
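The difference between the Type 1 and Type 2 responses can be sketched in a few lines of Python. This is an illustrative model only: the natural_key and is_current columns, the function names, and the sample values are assumptions that do not appear in the dimension tables of the case study.

```python
def apply_scd1(dim_rows, natural_key, new_attrs):
    """Type 1: overwrite the attribute values in place; history is not kept."""
    for row in dim_rows:
        if row["natural_key"] == natural_key:
            row.update(new_attrs)
    return dim_rows


def apply_scd2(dim_rows, natural_key, new_attrs):
    """Type 2: expire the current row and append a new row with a new surrogate key."""
    next_sk = max((row["sk"] for row in dim_rows), default=0) + 1
    current = None
    for row in dim_rows:
        if row["natural_key"] == natural_key and row["is_current"]:
            row["is_current"] = False  # old facts keep pointing at this row
            current = row
    new_row = dict(current) if current else {"natural_key": natural_key}
    new_row.update(new_attrs)
    new_row["sk"] = next_sk
    new_row["is_current"] = True
    dim_rows.append(new_row)
    return dim_rows


# Hypothetical dimension rows.
dim_well = [{"sk": 1, "natural_key": "W-01", "field_name": "Field A", "is_current": True}]
apply_scd1(dim_well, "W-01", {"field_name": "Field B"})

dim_problem = [{"sk": 1, "natural_key": "P-01", "problem_group": "Surface", "is_current": True}]
apply_scd2(dim_problem, "P-01", {"problem_group": "Sub Surface"})
```

After these calls, dim_well still has one row with the overwritten value, while dim_problem has two rows, so facts recorded before the change remain linked to the old attribute value.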
3.9. Deciding the Physical Design
The next step is to make the physical design of the data warehouse to be built. Administrative
procedures, backup procedures, and security for the data warehouse are not discussed at this
stage, because the scope of the research focuses only on developing a data warehouse that can
help Pertamina EP analyze existing data to obtain information on production achievements, the
level of low and off, and the dominant factors causing low and off.
Figure 1. Star Schema Production
Dim_Well: Sk_Well (PK), Well_name, Field_name, Asset_name
Fact_Production: Sk_Wellprod (PK), Sk_Date (FK), Sk_Well (FK), Gross, Oil, Gas
Dim_Date: Sk_Date (PK), Date, DayOfDate, MonthOfDate, YearOfDate
Figure 2. Star Schema Low and Off
Dim_Well: Sk_Well (PK), Well_Name, Field_name, Asset_Name
Fact_LowOffGain: Sk_LowOffGain (PK), Sk_Date (FK), Sk_Well (FK), Sk_Wellproblem (FK), Potency, Off, Low
Dim_Date: Sk_Date (PK), Date, DayOfDate, MonthOfDate, YearOfDate
Dim_WellProblem: Sk_WellProblem (PK), Problem_group, Problem_type
Figure 3. Fact Constellation Schema Production and Low and Off
Fact_Production and Fact_LowOffGain share Dim_Well and Dim_Date; Fact_LowOffGain also
references Dim_WellProblem.
Using the nine-step design methodology to design the data warehouse, we obtain two star
schemas, shown in Figure 1 and Figure 2, and one fact constellation schema that combines
the two, shown in Figure 3.
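The resulting fact constellation can be sketched as table definitions in an in-memory SQLite database using Python's sqlite3 module. The table and column names follow the schema figures in lowercase; the column types, the choice of SQLite, and the sample rows are assumptions, since the paper does not state the target database or its physical types.

```python
import sqlite3

DDL = """
CREATE TABLE dim_date (
    sk_date INTEGER PRIMARY KEY, date TEXT,
    day_of_date INTEGER, month_of_date INTEGER, year_of_date INTEGER);
CREATE TABLE dim_well (
    sk_well INTEGER PRIMARY KEY, well_name TEXT, field_name TEXT, asset_name TEXT);
CREATE TABLE dim_wellproblem (
    sk_wellproblem INTEGER PRIMARY KEY, problem_group TEXT, problem_type TEXT);
CREATE TABLE fact_production (
    sk_wellprod INTEGER PRIMARY KEY,
    sk_date INTEGER REFERENCES dim_date(sk_date),
    sk_well INTEGER REFERENCES dim_well(sk_well),
    gross REAL, oil REAL, gas REAL);
CREATE TABLE fact_lowoffgain (
    sk_lowoffgain INTEGER PRIMARY KEY,
    sk_date INTEGER REFERENCES dim_date(sk_date),
    sk_well INTEGER REFERENCES dim_well(sk_well),
    sk_wellproblem INTEGER REFERENCES dim_wellproblem(sk_wellproblem),
    potency REAL, "off" REAL, low REAL);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(DDL)

# Hypothetical sample rows, only to show a fact-to-dimension join.
conn.execute("INSERT INTO dim_date VALUES (1, '2017-01-01', 1, 1, 2017)")
conn.execute("INSERT INTO dim_well VALUES (1, 'W-01', 'Field A', 'Asset 1')")
conn.execute("INSERT INTO fact_production VALUES (1, 1, 1, 200.0, 120.0, 15.0)")

tables = sorted(
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
oil_by_field = conn.execute(
    "SELECT w.field_name, SUM(f.oil) FROM fact_production f "
    "JOIN dim_well w ON w.sk_well = f.sk_well GROUP BY w.field_name"
).fetchall()
```

Both fact tables reference the same dim_date and dim_well tables, which is what makes the combined design a fact constellation rather than two independent stars.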
4. RESULTS AND DISCUSSIONS
4.1. Extract, Transform and Load (ETL)
Through the ETL process, On-Line Transactional Processing (OLTP) data are transformed into
On-Line Analytical Processing (OLAP) data [13]. The ETL process extracts, transforms, and
loads the necessary data into each of the planned dimension tables and fact tables.
Pentaho Data Integration community edition is used for the ETL process. Figure 4
summarizes the ETL process in this case study.
Figure 4. ETL Process Sequence
Time dimension ETL process (dim_date). This process forms the time variables year, month,
and day, which represent the duration of the database to be used. The sequence of the
process is shown in Figure 5.
Figure 5. Time Dimension ETL Process
Well dimension ETL process. The purpose of the well dimension ETL process is to form the
dim_well table containing well property information.
Figure 6. ETL Well Dimension Process
Problem dimension ETL process. The purpose of the problem dimension ETL process is to
form the dim_problem dimension table in the data warehouse by transforming data from the
wellproblem table in the operational database. The ETL process for dim_problem can be seen
in Figure 7.
Figure 7. Problem Dimension ETL Process
Fact production ETL process. The ETL process at this stage forms the fact table
fact_production, which stores production facts such as the amount of gross production, the
amount of oil production, and the amount of gas production. The ETL process for the
fact_production table is illustrated in Figure 8.
Figure 8. Fact_production ETL Process
Fact lowandoff ETL process. The fact_lowandoff ETL process sorts and loads the low, off,
and problem facts from the operational database into the fact table in the data warehouse.
The ETL process for the fact_lowandoff table is illustrated in Figure 9.
Figure 9. Fact_lowandoff ETL Process
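The core transformation in a fact table load, replacing source identifiers with dimension surrogate keys, can be sketched in plain Python as follows. This is not the Pentaho Data Integration transformation used in the case study; the function name, the source field names, and the lookup dictionaries are hypothetical.

```python
def load_fact_lowoff(source_rows, date_keys, well_keys, problem_keys):
    """Swap source identifiers for surrogate keys; set aside rows that cannot be resolved."""
    facts, rejected = [], []
    for row in source_rows:
        try:
            facts.append({
                "sk_date": date_keys[row["date"]],
                "sk_well": well_keys[row["well_name"]],
                "sk_wellproblem": problem_keys[row["problem_type"]],
                "low": row["low"],
                "off": row["off"],
            })
        except KeyError:
            # A dimension member is missing; keep the row for review instead of loading it.
            rejected.append(row)
    return facts, rejected


# Hypothetical lookups built from already loaded dimension tables.
date_keys = {"2017-01-01": 1}
well_keys = {"W-01": 1}
problem_keys = {"ESP": 1}

source_rows = [
    {"date": "2017-01-01", "well_name": "W-01", "problem_type": "ESP", "low": 5.0, "off": 0.0},
    {"date": "2017-01-01", "well_name": "W-99", "problem_type": "ESP", "low": 0.0, "off": 30.0},
]
facts, rejected = load_fact_lowoff(source_rows, date_keys, well_keys, problem_keys)
```

The second source row refers to a well that is not in the dimension, so it ends up in rejected, which shows why dimensions are loaded before facts and why source consistency matters for the load.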
4.2. Designing OLAP Cube Schema
After the ETL process is done, the next stage is to analyze the historical data stored in the
data warehouse multidimensionally, to obtain information about production achievements and
the dominant causes of low and off. This requires an OLAP cube schema that describes the
relationships between the dimension tables, the fact tables, and the measures used.
In this case study, the OLAP cube schema is made using Mondrian Schema Workbench. The OLAP
results are then analyzed further using Pivot4J Analytics from Pentaho Business Analytics.
The schema consists of two cubes: the production cube and the low and off cube. A cube is
used to model the data so that it can be viewed multidimensionally.
The production cube is a multidimensional data model used for the analysis of oil, gas, and
fluid production. The fact table used to form the production cube is the fact_production
table, and the dimension tables used are the time dimension table and the well dimension
table, as shown in Figure 10.
Figure 10. Cube Production
The lowandoff cube is a multidimensional data model used for low and off analysis. The fact
table used to form the lowandoff cube is the fact_lowandoff table, and the dimension tables
used are the dim_date, dim_well, and dim_problem tables, as shown in Figure 11.
Figure 11. Cube Lowandoff
4.3. Dimensional Hierarchy
The well dimension hierarchy (dim_well) is used in forming the production cube and the
lowandoff cube. The well dimension is of the standard dimension type. The hierarchy in this
dimension consists of three levels: the asset level as the highest level, then the field
level, and the well level as the lowest level, as shown in Figure 12.
Figure 12. Well Dimension Hierarchy
With the well dimension, users can drill down or roll up production data or low and off data
by asset, field, and well.
The time dimension hierarchy (dim_date) is used in forming the production cube and the
lowandoff cube. This dimension is of the time dimension type and consists of three levels:
the year level as the highest level, the month level, and the day level as the lowest level,
as shown in Figure 13. With the time dimension, users can drill down or roll up production
data and low and off data on a yearly, monthly, and daily basis.
Figure 13. Time Dimension Hierarchy
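Rolling up along the time hierarchy amounts to summing a measure over a shorter prefix of the (year, month, day) key. The sketch below illustrates the idea in plain Python; it is not how the OLAP server implements it, and the row layout, the values, and the roll_up function are hypothetical.

```python
from collections import defaultdict

# Depth of each level in the year > month > day hierarchy.
LEVELS = {"year": 1, "month": 2, "day": 3}


def roll_up(fact_rows, level):
    """Sum the oil measure by the time hierarchy truncated at the given level."""
    depth = LEVELS[level]
    totals = defaultdict(float)
    for year, month, day, oil in fact_rows:
        totals[(year, month, day)[:depth]] += oil
    return dict(totals)


# Hypothetical rows: (year, month, day, oil volume).
rows = [
    (2017, 1, 1, 100.0),
    (2017, 1, 2, 50.0),
    (2017, 2, 1, 25.0),
]
monthly = roll_up(rows, "month")
yearly = roll_up(rows, "year")
```

Drilling down is the reverse step: requesting the "day" level returns one total per individual date.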
The problem dimension hierarchy (dim_problem), shown in Figure 14, is only used in forming
the lowandoff cube. The problem dimension is of the standard dimension type and consists of
two levels: the problem group level as the highest level and the problem type level. The
problem dimension allows users to drill down or roll up low and off data by problem group
and by the type of problem that occurred.
Figure 14. Problem Dimension Hierarchy
4.4. OLAP Analysis Design Results
The OLAP analysis created with Pivot4J Analytics helps analysts explore production data and
low and off data from multiple dimensions. Analysts can drill down or roll up the data to
assist data analysis as part of the production planning process. The OLAP analysis of
production based on the well dimension and the time dimension gives the analyst more detailed
information, in the form of total oil, total gas, and total gross, at a chosen well dimension
level and time dimension level at the same time, as shown in Figure 15.
Figure 15. OLAP Analysis Production Result Based on Time Dimension and Well Dimension
Figure 16. OLAP Analysis Low and Off Result Based on Time Dimension, Well Dimension and
Problem Dimension
The OLAP analysis of low and off based on the time dimension, problem dimension, and well
dimension yields information on which factors are the most dominant causes of low and off in
an oil field, based on time span, problem type, and the amount of production potential lost
to each well problem type, as shown in Figure 16.
4.5. Low and Off Report
The Low and Off Report is created to provide a summary of the information resulting from the
OLAP analysis of low and off based on the time, well, and problem dimensions. The Low and Off
Report is built using Pentaho Report Designer (PRD) and is integrated with Pentaho Business
Analytics, so that users can access the report from the Pentaho Business Analytics portal.
Figure 17 presents a summary of low and off information for an oil field in January 2017.
In January 2017, the potential loss of oil production due to low and off incidents was 4,596
barrels, and the number of low and off incidents was 659. The dominant cause of low and off
in January 2017 was the artificial lift system (ESP / SRP / HPU / HJP / PCP), which
contributed 66% of the total 4,596 barrels of potential oil loss due to low and off in
January 2017.
Figure 17. Sample Low and Off Report
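The way a report can single out the dominant cause and its share of the total loss is sketched below in Python. The function name and the loss figures are hypothetical and chosen only to make the arithmetic easy to follow; they are not the January 2017 figures from the report.

```python
def dominant_cause(losses_by_problem):
    """Return the problem with the largest loss and its share of the total loss."""
    total = sum(losses_by_problem.values())
    if total <= 0:
        raise ValueError("total loss must be positive to compute a share")
    cause = max(losses_by_problem, key=losses_by_problem.get)
    return cause, losses_by_problem[cause] / total


# Hypothetical potential oil losses per problem group, in barrels.
losses = {
    "Artificial lift": 60.0,
    "Power supply": 30.0,
    "Sand problem": 10.0,
}
cause, share = dominant_cause(losses)
```

With these illustrative values the artificial lift group accounts for 60 of 100 barrels, so its share is 0.6.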
4.6. Production Dashboard
The Production Dashboard is created with Pentaho Community Dashboard Editor (CDE). This
dashboard provides graphical information about production achievements, as well as a summary
of the OLAP analysis results on low and off based on the time and well dimensions, in a
single web page accessed online from the Pentaho Business Analytics portal, as shown in
Figure 18.
Figure 18. Sample of Production Dashboard
To ensure that the solution performs well, performance tests are necessary. The processes
tested are updating the database, loading the main page, and displaying the reports and
dashboards. Each process is tested three times to obtain the maximum time required to run it.
The performance test results show that the total time required to run all the processes,
from updating the database to obtaining all required reports, is at most 7 minutes 35 seconds
11 milliseconds. Details of the performance test results for each process are shown in
Table 11.
Table 11 Performance Test Result (times in mm:ss:ms)
Process Name | Test I | Test II | Test III
Database Update | 05:41:65 | 05:12:41 | 05:20:66
Access to main page | 00:17:92 | 00:16:40 | 00:16:52
Report type 1 | 00:10:37 | 00:09:84 | 00:09:12
Report type 2 | 00:09:18 | 00:09:17 | 00:09:21
Report type 3 | 00:12:42 | 00:10:92 | 00:11:08
Report type 4 | 00:12:73 | 00:11:38 | 00:11:22
Dashboard type 1 | 00:13:00 | 00:12:36 | 00:12:80
Dashboard type 2 | 00:13:66 | 00:09:56 | 00:11:05
Dashboard type 3 | 00:12:42 | 00:10:92 | 00:11:08
Dashboard type 4 | 00:12:73 | 00:11:38 | 00:11:22
5. CONCLUSIONS
Based on the results of the data warehouse design, the conclusions can be summarized as
follows. A data warehouse developed from the existing operational database can be a solution
for providing the low and off information required by executives, in the form of
multidimensional reports that are obtained quickly and are accessible online. For the
company, using the data warehouse can provide advantages such as time efficiency and cost
efficiency. The consistency, completeness, and validity of the data sources are important
factors for the success of the ETL process. The Low and Off Report and the Production
Dashboard can be further developed to provide decline rate analysis in the production
planning process.