Recipes of Data Warehouse and 
Business Intelligence 
The Descriptions Management
Introduction 
The Descriptions Management 
The descriptions management is a subject little discussed within the Data Warehouse and Business Intelligence community. Little is said about it in books and articles, although it is a crucial issue because it affects the way we see the information. 
So, we'll talk about descriptions: in particular, the descriptions of the codes that we can find in the dimension tables of a Data Warehouse. I think it's important to show the complexity hidden behind what seems trivial. And it is necessary for everyone (end users, project managers, data architects and the other stakeholders) to know that the descriptions management is a particularly tricky topic. 
A real life example 
The best way to introduce this topic is to start with a real example. Suppose we observe an end user who is using a Business Intelligence interface. The user needs to extract numerical data, for example the operations of an account, and he wants to filter only those customers who have a particular profession, for example only employees. 
In order to meet this requirement, the Data Warehouse team has "just" prepared a menu with a list of values (that is, codes and descriptions) from which the end user can choose the line "Employee". The following figure shows this case. 
This figure shows the visual world. Now we enter the world of descriptions management, with a number of considerations that describe the logical and the physical world hidden behind the little word "just".
The Descriptions Management 
The vision of the end user 
The description "Employee" associated with the code "10" is the description wanted by the end user. It is very important to show a description that is in tune with the daily nomenclature of the end users. This need has a very important consequence. 
It is very unlikely that the code description coming from the source systems (we call it "external") is the same description the end user wants; in fact, usually, the two descriptions will be different. This is because that description belongs to the OLTP systems, not to the synthesis (reporting) systems. 
So you always need two descriptions in the Data Warehouse: the one of the OLTP systems, let's call it "External description", and the final one, let's call it "Business description". 
Having both is important because it is the only link between two systems that have different purposes. 
The extended description 
As seen in the example, the list of values must show not only the description, but also the code. Often this does not happen, and only the description is shown. My suggestion is to show both. 
We define this new description as "extended description", and it is obtained by concatenating the value of the code, a "-" (hyphen) and the description of the value. It is useful to do this because end users are not all equal. 
Some users think in terms of descriptions, but others think in terms of codes. Often the end users have worked for so many years in a certain environment that it is easier for them to find the customers with profession "10" rather than those with profession "Employee". Moreover, descriptions may vary over time, so the presence of the code is undoubtedly safer. The best thing is to show both code and description, to meet all types of users. 
So: "10 - Employee". 
Because we have an external description and a business description, we have to decide which of the two to use to build the extended description. In general, it is appropriate to use a single description, namely the business description, but having two extended descriptions can also be an acceptable choice. 
Let's now look at the physical world behind the list of values displayed to the end user, because, as we know, logical considerations always have physical consequences.
The Descriptions Management 
How to see the extended descriptions 
When a user requests a list of values on the Business Intelligence interface, the display of the list is almost always the result of a query generated on the Data Warehouse database. Generally, it is very likely that a "select distinct" of the values of the extended description is issued against the dimension table. 
The extended description can be implemented without taking up space, using virtual columns. Virtual columns are available in Oracle, but you can always create views that expose such columns (see the sketch after the column list below). We will show later how to define them. 
Here's a very simplified example of the columns of a possible customers dimension table, which will be the base of the list of values: 
• CUST_KEY - Artificial (surrogate) key associated with the customer 
• CUST_COD - Natural key (customer code) 
• PROF_COD - Profession code 
• PROF_HDS - External Description (OLTP description) 
• PROF_BDS - Business description 
• PROF_XDS - Extended description. Obtained by concatenating the code with the business description 
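If virtual columns are not available (older Oracle versions or other databases), a view can expose the extended description instead. The following is only a minimal sketch under that assumption: the view and base table names are illustrative, and the base table is assumed to hold the code and the two descriptions but no physical or virtual PROF_XDS column. 

CREATE OR REPLACE VIEW CUST_DIM_V AS 
SELECT CUST_KEY 
      ,CUST_COD 
      ,PROF_COD 
      ,PROF_HDS 
      ,PROF_BDS 
      ,PROF_COD||' - '||PROF_BDS AS PROF_XDS  -- extended description computed on the fly 
FROM   CUST_DIM_BASE; 

The Business Intelligence tool then reads the list of values from the view exactly as it would from a table with a virtual column. 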
Where to find the extended descriptions 
We have seen that a simple "click" on a list of values hides various considerations. But the most surprising thing is that the above is just the tip of the iceberg. In this case, the simile is very apt: the tip is what the end user sees. But what happens below the surface? Is it so trivial to get the descriptions that come (maybe) from the source systems and insert them into the dimension tables? 
As always happens in a Data Warehouse, nothing is trivial and everything is more complicated than it seems. Indeed, in my experience, code descriptions can be located in seven different places. 
1. The code description is already in the staging table (which will become the dimension table) 
2. The code description is present in a table of descriptions, exclusively for that code 
3. The code description is present in a table of descriptions for all (or nearly all) of the codes 
4. The code description is present in an Excel/CSV sheet 
5. The code description is present in a Word, PDF or XML document 
6. The code description is in an email 
7. The code description is hard-coded in the OLTP forms, so, after a phone call to some referent of the OLTP systems, we are back to point 4, 5 or 6
Without thinking about where the descriptions are and how we will acquire them, how can we manage the descriptions in our Data Warehouse? My advice is to always have a metadata table like that of point 3 and to reduce all the other cases to it. But to do so, we need to introduce the concept of domain of values. 
The concept is very simple: just give a name to the set of values associated with a certain code. For example, if we know that the set of possible values of the profession code is: 
10 = Employee 
20 = Free Lance 
30 = Tradesman 
Associate the "PROF" name to this domain of values and insert this data into a metadata table that we call MEF_DOM_CFT (MEF Domains Configuration). 
The table can have a very simple structure with the following information: 
• DOM_COD - Domain code 
• COD_TXT - Value of the code 
• HDS_TXT - External description 
• BDS_TXT - Business description 
The Descriptions Management 
Descriptions as metadata 
The Naming Convention of the descriptions 
We adopt, as always, a naming convention. On this issue I have already written a lot on my blog; here we assume that all the descriptive code fields, i.e. codes for which it makes sense to have a description, must be named *_COD, the external descriptions *_HDS, the business descriptions *_BDS, and the extended descriptions *_XDS. 
Whoever has Data Warehouse experience knows that, unfortunately, if we wait for the end of the analysis of the domains, we lose a lot of time, so we must act preventively. 
My advice is to fill the domains table automatically with the values of the codes that we find in the staging tables, giving them a unique default name and a default description like "nd" (not defined). In a second step we can take action on the generic domains by changing their default name. 
The default domain name that we will adopt will be <source>.<column name>.
The Descriptions Management 
Other considerations 
Descriptions are names that rarely vary. It would be a waste of time to set up a daily loading/update of the domains table. Do an initial massive load and manage, when necessary, any changes and additions. 
Often you get, in the source files, both short and long descriptions. Obviously, if they are present in the source, you should load them, but you should always use the short description as the "external description". The trend nowadays is to shorten information (see the language of SMS messages), so, in my opinion, you can ignore long descriptions, which can create sizing problems on graphical interfaces (PCs, tablets and smartphones). 
The business description is necessarily manual: it will be the end user who decides which description he prefers to see. The initial loading, however, must initialize the business description equal to the external description, if and only if the business description is empty. Then, from time to time, we will apply the changes. 
Another point that should not be overlooked is the following: how reliable are the codes and descriptions that we find in points 2-6? 
As often happens, the documentation is not always up to date, so we risk not having the full domain of the code. On this point, we have to be proactive, in the sense that we cannot generically ask for an update of the documentation of the source systems. 
We must be the ones who ask precisely for the descriptions of the codes that we do not find in the documentation. 
The importance of the domains 
Finally, the main reason for the existence of the domains table: the reuse of the descriptions. In the tables of a Data Warehouse the same domain can be applied to many columns. 
The columns have different names, but refer to the same domain of values. 
For example, a currency code can be present in a table with the name CURTRAT (treatment currency) and CURTRA (trading currency), but both must have the same domain, let's call it "CUR", as in the sketch below.
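A minimal sketch of this reuse (the table name SOME_DIT and the column names CURTRAT_COD and CURTRA_COD are only illustrative, following the naming convention above): both columns resolve their description against the same "CUR" domain. 

SELECT a.CURTRAT_COD 
      ,t.BDS_TXT AS CURTRAT_BDS 
      ,a.CURTRA_COD 
      ,c.BDS_TXT AS CURTRA_BDS 
FROM   SOME_DIT a 
LEFT OUTER JOIN MEF_DOM_CFT t 
  ON (a.CURTRAT_COD = t.COD_TXT AND t.DOM_COD = 'CUR') 
LEFT OUTER JOIN MEF_DOM_CFT c 
  ON (a.CURTRA_COD = c.COD_TXT AND c.DOM_COD = 'CUR'); 

The "CUR" descriptions are defined once and reused by every column that shares the domain.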
The Descriptions Management - Stage 1 
The work plan 
At this point, we explore the descriptions management with a real simulation, obviously very simplified, after creating the simulation environment. The 
work plan is as follows. 
1. Create the environment 
2. Load the domains table 
3. Load the staging table 
4. Update the domains table on the basis of the codes found in the Staging Area table 
5. Load the dimension table 
6. Update the dimension table with descriptions 
Create the environment 
We prepare a simulation environment, creating the necessary tables with only the information we need: the staging area table, the dimension table and the domains table. As we can see, the dimension table contains both the external description and the business description. The extended description is built using the virtual column syntax. 
DROP TABLE CUST_STA CASCADE CONSTRAINTS; 
CREATE TABLE CUST_STA ( 
CUST_COD VARCHAR2(9), 
PROF_COD VARCHAR2(30) 
); 
DROP TABLE MEF_DOM_CFT CASCADE CONSTRAINTS; 
CREATE TABLE MEF_DOM_CFT ( 
DOM_COD VARCHAR2(61) NOT NULL 
,COD_TXT VARCHAR2(60) 
,HDS_TXT VARCHAR2(255) 
,BDS_TXT VARCHAR2(255) 
);
The Descriptions Management - Stage 1 
DROP TABLE CUST_DIT CASCADE CONSTRAINTS; 
CREATE TABLE CUST_DIT ( 
CUST_KEY NUMBER 
,CUST_COD VARCHAR2(9) 
,PROF_COD VARCHAR2(30) 
,PROF_HDS VARCHAR2(60) 
,PROF_BDS VARCHAR2(60) 
,PROF_XDS VARCHAR2(93) GENERATED ALWAYS AS (PROF_COD||' - '||PROF_BDS) VIRTUAL 
); 
Load the domains table 
The loading of this table will necessarily be a mix of automatic and manual loading. We refer to the 7 situations seen before. 
If the domains are already present in a specific table or in a common table, we will just use a SQL statement to load them. 
If they are in an Excel spreadsheet, convert it to a CSV file, point an Oracle external table at it and then load them with a SQL statement. 
Otherwise, you have to load them manually, one by one. 
Suppose we are in the latter case and load the domain of the profession code. Note the difference between the external description and the business description. 
If, as often happens, the feeding system is a mainframe, the descriptions will all arrive in uppercase and they will not be very user-friendly for management reporting. 
INSERT INTO MEF_DOM_CFT VALUES ('PROF','10','EMPL.' ,'Employee'); 
INSERT INTO MEF_DOM_CFT VALUES ('PROF','20','FREE L.' ,'Free Lance'); 
INSERT INTO MEF_DOM_CFT VALUES ('PROF','30','TRADESMAN','Tradesman'); 
COMMIT; 
After the insertion, the contents of the domains table will therefore be:
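DOM_COD  COD_TXT  HDS_TXT    BDS_TXT 
PROF     10       EMPL.      Employee 
PROF     20       FREE L.    Free Lance 
PROF     30       TRADESMAN  Tradesman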
The Descriptions Management - Stage 1 
Obviously, if we have not yet tackled the analysis of the domains, or we are still waiting for documentation, this step can be skipped. The table remains empty and will be loaded automatically in a second step. 
Load the staging table 
The staging table loading is a job that will be part of the ETL process. For simplicity we simulate its load with the following SQL statements. The '?' as code value substitutes NULL. 
From the methodological point of view, I consider it important to avoid NULL values in the Data Warehouse, and to replace them immediately in the Staging Area with a default value. 
Usually I use the question mark (for a detailed discussion refer to http://www.slideshare.net/jackbim/recipes-5-of-data-warehouse-the-null-values-management-in-the-etl-process). 
INSERT INTO CUST_STA VALUES ('Larry','10'); 
INSERT INTO CUST_STA VALUES ('Tom' ,'15'); 
INSERT INTO CUST_STA VALUES ('Brown','20'); 
INSERT INTO CUST_STA VALUES ('Steve','40'); 
INSERT INTO CUST_STA VALUES ('Jack' ,'?'); 
COMMIT;
The Descriptions Management - Stage 1 
After the insertion, the content of the staging table will be: 
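CUST_COD  PROF_COD 
Larry     10 
Tom       15 
Brown     20 
Steve     40 
Jack      ?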
Update the domains table 
As you can see from the staging area contents, I purposely simulated a situation where the domain is not up to date; in fact, not all the values that could arrive are present (the "15", the "40" and the "?" are missing). In order to keep the domains table updated, we run a procedure as post-processing of the Staging Area loading. The procedure is the following. 
create or replace procedure p_dom_prof(p_dom_cod varchar2) is 
begin 
  -- scan the distinct profession codes of the staging table, 
  -- looking up each of them in the domains table 
  for r in (select distinct p_dom_cod dom_cod 
                  ,a.prof_cod 
                  ,b.cod_txt 
                  ,nvl(b.hds_txt,'nd') hds_txt 
                  ,nvl(b.bds_txt,'nd') bds_txt 
            from cust_sta a 
            left outer join mef_dom_cft b 
              on (a.prof_cod = b.cod_txt 
              and b.dom_cod = p_dom_cod)) loop 
    -- the code is not yet in the domains table: insert it with the default 'nd' descriptions 
    if (r.cod_txt is null) then 
      insert into mef_dom_cft (dom_cod,cod_txt,hds_txt,bds_txt) 
      values(p_dom_cod,r.prof_cod,r.hds_txt,r.bds_txt); 
    end if; 
  end loop; 
  commit; 
end; 
/ 
sho errors 
exec p_dom_prof('PROF'); 
The Descriptions Management - Stage 1 
The procedure performs a simple loop on the values of the profession code in the staging table, joining them to the domains table using the domain name received as input. As we can see, the procedure fills the domains table even if it is initially empty (by means of the outer join). Moreover, it only performs inserts, so it does not alter or delete descriptions that have been entered manually. 
The procedure also inserts the missing codes with the default description "nd". If we now query the domains table, we will get:
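DOM_COD  COD_TXT  HDS_TXT    BDS_TXT 
PROF     10       EMPL.      Employee 
PROF     20       FREE L.    Free Lance 
PROF     30       TRADESMAN  Tradesman 
PROF     15       nd         nd 
PROF     40       nd         nd 
PROF     ?        nd         nd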
The Descriptions Management - Stage 1 
Load the dimension table 
The loading of the dimension table is one of the most challenging and delicate components of the entire ETL process. From my experience, almost all dimension tables should take into account the "Slowly Changing Dimension" of type 2, on which I will not go into detail now. Precisely because of this complexity, I preferred not to add complexity to complexity: I prefer to manage the descriptions as post-processing of the loading of the dimension tables. 
So, first I load the dimension table by taking the data from the staging table, then I update the descriptive components. We load the dimension table in a very simple way with the following SQL statement: 
INSERT INTO CUST_DIT(CUST_KEY,CUST_COD,PROF_COD) 
SELECT ROWNUM,CUST_COD,PROF_COD 
FROM CUST_STA; 
COMMIT; 
Situation after loading:
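CUST_KEY  CUST_COD  PROF_COD  PROF_HDS  PROF_BDS  PROF_XDS 
1         Larry     10        (null)    (null)    10 - 
2         Tom       15        (null)    (null)    15 - 
3         Brown     20        (null)    (null)    20 - 
4         Steve     40        (null)    (null)    40 - 
5         Jack      ?         (null)    (null)    ? - 
(the keys are assigned by ROWNUM in insertion order; the descriptions are still empty)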
The Descriptions Management - Stage 1 
Update the dimension table 
To update the dimension table, we create and execute the following procedure. 
create or replace procedure p_dit_prof(p_dom_cod varchar2) as 
begin 
  -- for each distinct profession code of the dimension table, 
  -- copy the descriptions from the domains table 
  for r in (select distinct prof_cod from cust_dit) loop 
    update cust_dit d 
    set (prof_hds,prof_bds) = (select hds_txt,bds_txt 
                               from mef_dom_cft s 
                               where d.prof_cod = s.cod_txt 
                               and s.dom_cod = p_dom_cod) 
    where prof_cod = r.prof_cod 
    and exists (select 1 
                from mef_dom_cft s 
                where d.prof_cod = s.cod_txt 
                and s.dom_cod = p_dom_cod); 
  end loop; 
  commit; 
end; 
/ 
sho errors 
exec p_dit_prof('PROF'); 
The logic of the procedure is very simple: for each value of the profession code, it updates the descriptions taken from the domains table. If we now query the dimension table we will get:
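CUST_KEY  CUST_COD  PROF_COD  PROF_HDS   PROF_BDS    PROF_XDS 
1         Larry     10        EMPL.      Employee    10 - Employee 
2         Tom       15        nd         nd          15 - nd 
3         Brown     20        FREE L.    Free Lance  20 - Free Lance 
4         Steve     40        nd         nd          40 - nd 
5         Jack      ?         nd         nd          ? - nd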
The Descriptions Management - Stage 1 
After the nightly loading of the Data Warehouse, we can query the domains table (or the dimension table): all the situations in which the description is "nd" must be brought to the attention of the people who are concerned with the quality control of the data. In practice, for each code that is "nd", we must send a request for a description to the feeding systems.
The Descriptions Management - Stage 2 
Phase 1 helped us become aware of the problems and difficulties associated with the descriptions management. The solution shown, although functional, is practically unusable in a real Data Warehouse. 
The procedures that update the domains table and the dimension table are specific to the profession code, and such procedures should be repeated, with appropriate modifications, for all the other codes that require a description. In a Data Warehouse we can have dozens of analytical dimensions, each with dozens of descriptive codes. It is not conceivable to write hundreds of procedures for the descriptions management. 
So let's see how to implement a single procedure that handles everything automatically. To achieve this result, we must necessarily get help from some configuration tables: in particular, a configuration table of the input streams, which tells us the name of the staging table and of the dimension table, and a configuration table of all the columns that need a description. 
Environment creation 
We enrich the staging table assuming the presence of three codes. The first, already seen in the previous phase, is the profession code. The second is the status code of the customer. The third is the type code of the customer. We also adapt the dimension table with the new codes. 
DROP TABLE CUST2_STA CASCADE CONSTRAINTS; 
CREATE TABLE CUST2_STA ( 
CUST2_COD VARCHAR2(9) 
,PROF_COD VARCHAR2(30) 
,STATUS_COD VARCHAR2(1) 
,TYPE_COD VARCHAR2(1) 
); 
DROP TABLE CUST2_DIT CASCADE CONSTRAINTS; 
CREATE TABLE CUST2_DIT ( 
CUST2_KEY NUMBER 
,CUST2_COD VARCHAR2(9) 
,PROF_COD VARCHAR2(30) 
,PROF_BDS VARCHAR2(60) 
,PROF_XDS VARCHAR2(93) GENERATED ALWAYS 
AS (PROF_COD||' - '||PROF_BDS) VIRTUAL
,STATUS_COD VARCHAR2(1) 
,STATUS_BDS VARCHAR2(15) 
,STATUS_XDS VARCHAR2(19) GENERATED ALWAYS 
AS (STATUS_COD||' - '||STATUS_BDS) VIRTUAL 
,TYPE_COD VARCHAR2(1) 
,TYPE_BDS VARCHAR2(15) 
,TYPE_XDS VARCHAR2(19) GENERATED ALWAYS 
AS (TYPE_COD||' - '||TYPE_BDS) VIRTUAL 
); 
We now introduce two new configuration tables: a configuration table of the input streams (MEF_IO2_CFT) and a configuration table of the columns of the staging tables (MEF_STA2_CFT), which will become columns of the dimension table. 
The MEF_IO2_CFT table contains, for each stream identifier (IO_COD), the name of the staging table (STT_COD) and the name of the dimension table (DIT_COD). We insert one configuration line. 
The MEF_STA2_CFT table contains, for each stream identifier (IO_COD), the name of the code column (COLUMN_COD) and the associated domain name (DOM_COD). 
This configuration table is very important, because it is the only place where we define, for each code, the name of its domain. This name will allow us to access the domains table to find the descriptions. For our example, we insert three lines, one for each code, assuming that the customer type has not yet been taken into account in the analysis of its domain, so its domain code remains NULL. 
DROP TABLE MEF_IO2_CFT CASCADE CONSTRAINTS; 
CREATE TABLE MEF_IO2_CFT ( 
IO_COD VARCHAR2(9) 
,STT_COD VARCHAR2(30) 
,DIT_COD VARCHAR2(30) 
); 
INSERT INTO MEF_IO2_CFT VALUES ('CUST2','CUST2_STA','CUST2_DIT'); 
DROP TABLE MEF_STA2_CFT CASCADE CONSTRAINTS; 
CREATE TABLE MEF_STA2_CFT ( 
IO_COD VARCHAR2(9) 
,COLUMN_COD VARCHAR2(30) 
,DOM_COD VARCHAR2(30) 
);
The Descriptions Management - Stage 2 
INSERT INTO MEF_STA2_CFT VALUES ('CUST2','PROF_COD','PROF'); 
INSERT INTO MEF_STA2_CFT VALUES ('CUST2','STATUS_COD','STATUS'); 
INSERT INTO MEF_STA2_CFT VALUES ('CUST2','TYPE_COD',null); 
COMMIT; 
Load the domains table 
We recreate and load this table, adding to the domain already seen the one for the status of the customer: 
DROP TABLE MEF_DOM2_CFT CASCADE CONSTRAINTS; 
CREATE TABLE MEF_DOM2_CFT ( 
DOM_COD VARCHAR2(61) NOT NULL 
,COD_TXT VARCHAR2(60) 
,HDS_TXT VARCHAR2(255) NOT NULL 
,BDS_TXT VARCHAR2(255) NOT NULL 
); 
INSERT INTO MEF_DOM2_CFT VALUES ('PROF','10','EMPL.' ,'Employee'); 
INSERT INTO MEF_DOM2_CFT VALUES ('PROF','20','FREE L.' ,'Free Lance'); 
INSERT INTO MEF_DOM2_CFT VALUES ('PROF','30','TRADESMAN','Tradesman'); 
INSERT INTO MEF_DOM2_CFT VALUES ('STATUS','0','NO','Not active'); 
INSERT INTO MEF_DOM2_CFT VALUES ('STATUS','1','YES' ,'Active'); 
COMMIT; 
The table content will be the following:
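DOM_COD  COD_TXT  HDS_TXT    BDS_TXT 
PROF     10       EMPL.      Employee 
PROF     20       FREE L.    Free Lance 
PROF     30       TRADESMAN  Tradesman 
STATUS   0        NO         Not active 
STATUS   1        YES        Active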
The Descriptions Management - Stage 2 
Load the staging table 
The loading of this table is similar to what we have already seen: 
INSERT INTO CUST2_STA VALUES ('Larry','10','0','P'); 
INSERT INTO CUST2_STA VALUES ('Tom' ,'15','1','P'); 
INSERT INTO CUST2_STA VALUES ('ACME' ,'20','1','L'); 
INSERT INTO CUST2_STA VALUES ('Steve','40','?','P'); 
INSERT INTO CUST2_STA VALUES ('Jack' ,'?','1','P'); 
COMMIT; 
Update the domains table 
We will use a single procedure which, using the information in the configuration tables, dynamically creates the SQL statements to do the update. Furthermore, in case some domains are not defined, it automatically associates the default domain code as already described. 
Because it can sometimes be useful to call a field "*_COD" even though it makes no sense to associate a description to it, we assume that, if we set the domain code equal to the word "NULL" (not the SQL NULL), this field will not be taken into consideration in the treatment of the domains.
The Descriptions Management - Stage 2 
create or replace procedure p_dom2(p_io_cod varchar2 default '%' 
,p_column_cod varchar2 default '%') is 
v_sql varchar2(4000); 
v_dom_cod varchar2(61); 
v_hds_txt varchar2(200); 
v_bds_txt varchar2(200); 
v_cod_txt varchar2(200); 
v_column_cod varchar2(200); 
type t_rc is ref cursor; 
v_cur t_rc; 
cr char(1) := chr(10); 
begin 
for r in (select a.STT_COD 
,b.column_cod 
,nvl(b.dom_cod,a.io_cod||'.'||b.column_cod) dom_cod 
from mef_io2_cft a 
inner join mef_sta2_cft b 
on (a.io_cod = b.io_cod 
and b.column_cod like p_column_cod 
and b.column_cod like '%_COD' 
and nvl(b.dom_cod,'*') <> 'NULL') 
where a.io_cod like p_io_cod) loop 
v_sql := 'select distinct '||''''||r.dom_cod||''''||cr|| 
' ,a.'||r.column_cod||cr|| 
' ,b.cod_txt'||cr|| 
' ,nvl(b.hds_txt,''nd'') hds_txt '||cr|| 
' ,nvl(b.bds_txt,''nd'') bds_txt '||cr|| 
'from '||r.STT_COD||' a '||cr|| 
'left outer join MEF_DOM2_CFT b '||cr|| 
'on (a.'||r.column_cod||' = b.cod_txt '||cr|| 
' and dom_cod='||''''||r.dom_cod||''''||')'; 
dbms_output.put_line(v_sql); 
open v_cur for v_sql;
loop 
fetch v_cur into v_dom_cod,v_column_cod,v_cod_txt,v_hds_txt,v_bds_txt; 
exit when v_cur%notfound; 
if (v_cod_txt is null) then 
insert into MEF_DOM2_CFT (dom_cod,cod_txt,hds_txt,bds_txt) 
values(v_dom_cod,v_column_cod,v_hds_txt,v_bds_txt); 
end if; 
end loop; 
close v_cur; 
end loop; 
commit; 
end; 
/ 
sho errors 
set serveroutput on 
exec p_dom2('CUST2'); 
Through the command "set serveroutput on" we can see an example of the SQL statement that is generated automatically for each source / column.
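For example, for the PROF_COD column of the CUST2 stream the generated statement is: 

select distinct 'PROF' 
      ,a.PROF_COD 
      ,b.cod_txt 
      ,nvl(b.hds_txt,'nd') hds_txt 
      ,nvl(b.bds_txt,'nd') bds_txt 
from CUST2_STA a 
left outer join MEF_DOM2_CFT b 
on (a.PROF_COD = b.cod_txt 
    and dom_cod='PROF') 

Similar statements are generated for STATUS_COD and TYPE_COD.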
The Descriptions Management - Stage 2 
Note the power of this procedure: although we have not yet thought about the domains management, if we call it without parameters, it will be able to initialize the domains table for all the code fields (that have been configured) throughout the Data Warehouse. 
After the procedure execution, the content of the table will be the following.
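DOM_COD         COD_TXT  HDS_TXT    BDS_TXT 
PROF            10       EMPL.      Employee 
PROF            20       FREE L.    Free Lance 
PROF            30       TRADESMAN  Tradesman 
PROF            15       nd         nd 
PROF            40       nd         nd 
PROF            ?        nd         nd 
STATUS          0        NO         Not active 
STATUS          1        YES        Active 
STATUS          ?        nd         nd 
CUST2.TYPE_COD  P        nd         nd 
CUST2.TYPE_COD  L        nd         nd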
The Descriptions Management - Stage 2 
Load the dimension table 
The loading is similar to that of stage 1 
INSERT INTO CUST2_DIT(CUST2_KEY,CUST2_COD,PROF_COD,STATUS_COD,TYPE_COD) 
SELECT ROWNUM,CUST2_COD,PROF_COD,STATUS_COD,TYPE_COD 
FROM CUST2_STA; 
COMMIT; 
Let's now look at the content of the dimension table before the next update. 
Update the dimension table 
Also in this case we have to use a dynamic procedure: 
create or replace procedure p_dit2( 
p_DIT_COD varchar2 default '%' 
,p_col varchar2 default '%' 
,p_force_cod number default 0) as 
type t_rc is ref cursor; 
v_cur t_rc;
v_sql varchar2(2000); 
v_value_cod varchar2(60); 
v_where varchar2(80); 
v_count number; 
cr char(1) := chr(10); 
begin 
for s in (select io_cod,DIT_COD 
from MEF_IO2_CFT 
where DIT_COD like p_DIT_COD) loop 
for r in (select b.column_cod 
,replace(b.column_cod,'_COD','_BDS') column_bds 
,nvl(dom_cod,s.io_cod||'.'||column_cod) dom_cod 
from cols a 
inner join MEF_STA2_CFT b 
on (a.column_name = b.column_cod 
and b.io_cod = s.io_cod) 
where a.table_name = s.DIT_COD 
and a.column_name like p_col 
and a.column_name like '%_COD' 
and nvl(b.dom_cod,'*') <> 'NULL') loop 
if (p_force_cod = 0) then 
v_where := ' where '||r.column_bds||' is null'; 
else 
v_where := null; 
end if; 
v_sql := 'select distinct '||r.column_cod||' value_cod'||cr|| 
' from '||s.DIT_COD||v_where; 
dbms_output.put_line(v_sql); 
open v_cur for v_sql; 
loop 
fetch v_cur into v_value_cod; 
exit when v_cur%notfound; 
-- use bind variables (USING) because, if the value contains quotes, the SQL would fail 
v_sql :=
'update '||s.DIT_COD||' d'||cr|| 
'set ('||r.column_bds||') = ('||cr||'select bds_txt '||cr|| 
'from MEF_DOM2_CFT s where d.'||r.column_cod||' = s.cod_txt '|| 
'and s.dom_cod = '||''''||r.dom_cod||''''||')'||cr|| 
'where '||r.column_cod||' = :1 '|| 
' and exists ('||cr|| 
'select 1 from MEF_DOM2_CFT s where d.'||r.column_cod||' = s.cod_txt '|| 
'and s.dom_cod = '||''''||r.dom_cod||''''||')'; 
dbms_output.put_line(v_sql); 
execute immediate v_sql using v_value_cod; 
end loop; 
close v_cur; 
end loop; 
end loop; 
commit; 
end; 
/ 
sho errors 
set serveroutput on 
exec p_dit2('CUST2_DIT'); 
Some notes on this update procedure. 
• The first loop is used to identify the names of all the dimension tables, if you do not provide a specific one as input. 
• The second loop is used to identify the values and descriptions associated with the domains. They will be the basis of the subsequent updates. 
• As we know, UPDATE operations are, in general, slow operations at the database level. But we perform the update only for the rows that do not have the description valued (see, for example, the clause ' where '||r.column_bds||' is null'). This greatly reduces the weight of the update operation. In fact, in the case of Slowly Changing Dimensions of type 2, we expect that new keys or new key versions will involve only between 5 and 10 percent of the total rows (this is why we call them slowly changing dimensions). Then, with appropriate indexing, the weight of the update should not be excessive. 
• We note again that the condition (on NULL) seen before is controlled by an input parameter of the procedure. So, in the face of a massive update of the descriptions in the domains table, it is always possible to force a massive update of all the dimension tables (off-line, of course). 
• We used the "using" clause in the dynamic execution because, if the value contains quotes, it could cause problems. 
The dynamic SQL used in the procedure will produce statements of this type:
The Descriptions Management - Stage 2 
select distinct STATUS_COD value_cod 
from CUST2_DIT where STATUS_BDS is null; 
update CUST2_DIT d 
set (STATUS_BDS) = ( 
select bds_txt 
from MEF_DOM2_CFT s where d.STATUS_COD = s.cod_txt and s.dom_cod = 'STATUS') 
where STATUS_COD = :1 and exists ( 
select 1 from MEF_DOM2_CFT s where d.STATUS_COD = s.cod_txt and s.dom_cod = 'STATUS') 
At the end of the update, the dimension table will be:
The Descriptions Management - Stage 3 
The previous phases have produced a fairly accurate descriptions management. Phase 2 has made it dynamic and reusable for all the dimension tables of a Data Warehouse. Unfortunately, we are still quite far from reality. 
We have not taken into account a very important factor: the dependence between codes. 
In a medium/large Data Warehouse, and especially in an Enterprise Data Warehouse, where the dependence between codes exists, phase 2 no longer works, because we have to deal with the following situations. 
Example 1 
Suppose we are in a bank that has acquired other banks. The Data Warehouse will contain the data of the whole group, so even the customer table must be unique, with a company code that differentiates the customers. We now apply this fact conceptually to our CUST_DIT example. Are we quite sure that the meaning "Employee" (as description) of the profession code "10" for bank B1 is the same for bank B2? 
The question is obviously rhetorical. The answer is NO. We cannot have this guarantee, nor can we hope that the data migration from the acquired bank to the acquiring bank has normalized all the tables and meanings. This is the problem of the dependence between codes: a code no longer has a unique description, but a different description depending on the value of another code. 
Example 2 
It must be clear that this problem does not occur only in the case of company mergers, but also in the tables of a single company. Another example is the reason code of a banking transaction. A reason code "01" may have the meaning "Descr XXX" if the movement takes place at the bank desk, but the same code "01" may have a different description "Descr YYY" if the transaction takes place via the Web. In this case, the reason code depends on the transmission channel code. 
Example 3 
When I speak of dependence between codes, I do not mean only between two codes, but also among many codes. An example is the table of the financial instruments. Because the instruments are very numerous, there is almost always a hierarchy that classifies the financial instruments, associating them to categories, groups and subgroups. 
As we can see from the example below, the subgroup with code "1" has three different descriptions depending on the group to which it belongs. In turn, the group code "2" has two different descriptions depending on the category to which it belongs. 
Category          Group                 Subgroup 
4 COMMON FOUNDS   3 FOR.INV.FOUNDS      1 LUSS. MOB. 
7 OPTIONS         2 PUT ITALY           1 EQUITIES 
3 WARRANTS        2 FOREIGN WARRANTS    1 ORD. STOCK BUY
The Descriptions Management - Stage 3 
So, if we have dependence between codes, the management of the descriptions shown in stage 2 does not work. We have to manage it in a different way. 
Change the configuration table 
The first thing to do is to change the configuration table of the columns, adding the information on the dependencies between codes. We assume no more than 5 dependencies. The new example table will be the dimension table of the financial instruments. 
DROP TABLE MEF_STA3_CFT CASCADE CONSTRAINTS; 
CREATE TABLE MEF_STA3_CFT ( 
IO_COD VARCHAR2(9) 
,COLUMN_COD VARCHAR2(30) 
,DOM_COD VARCHAR2(30) 
,DC1_COD VARCHAR2(30) 
,DC2_COD VARCHAR2(30) 
,DC3_COD VARCHAR2(30) 
,DC4_COD VARCHAR2(30) 
,DC5_COD VARCHAR2(30) 
); 
INSERT INTO MEF_STA3_CFT VALUES ('FIN','CAT_COD','CAT','','','','',''); 
INSERT INTO MEF_STA3_CFT VALUES ('FIN','GRP_COD','GRP','CAT_COD','','','',''); 
INSERT INTO MEF_STA3_CFT VALUES ('FIN','SUBGRP_COD','SUBGRP','GRP_COD','CAT_COD','','',''); 
Looking at its contents, we can clearly see the dependencies between codes described earlier: the subgroup depends on the group and the category, the group depends on the category, and the category has no dependencies.
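IO_COD  COLUMN_COD  DOM_COD  DC1_COD  DC2_COD 
FIN     CAT_COD     CAT 
FIN     GRP_COD     GRP      CAT_COD 
FIN     SUBGRP_COD  SUBGRP   GRP_COD  CAT_COD 
(DC3_COD, DC4_COD and DC5_COD are empty in this example)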
The Descriptions Management - Stage 3 
Also configure the new source/dimension 
DELETE MEF_IO2_CFT WHERE IO_COD='FIN'; 
INSERT INTO MEF_IO2_CFT VALUES ('FIN','FIN_STA','FIN_DIT'); 
Create and load the domains table 
The domains table must contain the values of the dependence codes. These dependencies must be studied and analyzed. As in the previous stage, even if they are not known, all the combinations of dependencies will be initialized automatically after the loading of the staging area. 
The DCLIST_TXT field is only informative: it is useful to know the names of the columns to which the DV* codes refer. 
DROP TABLE MEF_DOM3_CFT CASCADE CONSTRAINTS; 
CREATE TABLE MEF_DOM3_CFT ( 
DOM_COD VARCHAR2(61) NOT NULL 
,COD_TXT VARCHAR2(60) NOT NULL 
,HDS_TXT VARCHAR2(255) 
,BDS_TXT VARCHAR2(255) 
,DV1_COD VARCHAR2(30 BYTE) 
,DV2_COD VARCHAR2(30 BYTE) 
,DV3_COD VARCHAR2(30 BYTE) 
,DV4_COD VARCHAR2(30 BYTE) 
,DV5_COD VARCHAR2(30 BYTE) 
,DCLIST_TXT VARCHAR2(160 BYTE) 
); 
INSERT INTO MEF_DOM3_CFT VALUES ('SUBGRP','1','LUSS. MOB.','LUSS. MOB.','3','4','','','','GRP_COD,CAT_COD'); 
INSERT INTO MEF_DOM3_CFT VALUES ('SUBGRP','1','EQUITIES','EQUITIES','2','7','','','','GRP_COD,CAT_COD'); 
INSERT INTO MEF_DOM3_CFT VALUES ('SUBGRP','1','ORD. STOCK BUY','ORD. STOCK BUY','2','3','','','','GRP_COD,CAT_COD'); 
INSERT INTO MEF_DOM3_CFT VALUES ('GRP','3','FOR.INV.FOUNDS','FOR.INV.FOUNDS','4','','','','','CAT_COD'); 
INSERT INTO MEF_DOM3_CFT VALUES ('GRP','2','PUT ITALY','PUT ITALY','7','','','','','CAT_COD'); 
INSERT INTO MEF_DOM3_CFT VALUES ('GRP','2','FOREIGN WARRANTS','FOREIGN WARRANTS','3','','','','','CAT_COD'); 
INSERT INTO MEF_DOM3_CFT VALUES ('CAT','4','COMMON FOUNDS','COMMON FOUNDS','','','','','',''); 
INSERT INTO MEF_DOM3_CFT VALUES ('CAT','7','OPTIONS','OPTIONS','','','','','',''); 
INSERT INTO MEF_DOM3_CFT VALUES ('CAT','3','WARRANTS','WARRANTS','','','','','',''); 
COMMIT;
The Descriptions Management - Stage 3 
See the domains table: 
Load the staging table 
The new staging table of financial instruments will be: 
DROP TABLE FIN_STA CASCADE CONSTRAINTS; 
CREATE TABLE FIN_STA 
( 
FIN_COD VARCHAR2(12) 
,CAT_COD VARCHAR2(3) 
,GRP_COD VARCHAR2(3) 
,SUBGRP_COD VARCHAR2(3) 
); 
INSERT INTO FIN_STA VALUES ('LU0108803940','4','3','1'); 
INSERT INTO FIN_STA VALUES ('IT0002693197','7','2','1'); 
INSERT INTO FIN_STA VALUES ('CH0014852120','3','2','1'); 
INSERT INTO FIN_STA VALUES ('IT0001119848','3','1','1'); 
INSERT INTO FIN_STA VALUES ('IT0000566056','1','1','1'); 
COMMIT;
The Descriptions Management - Stage 3 
As we can see, there are two combinations (1,1,1 and 3,1,1) not present in the domains table. 
Update the domains table 
The update procedure of the domains is certainly more complicated than in the previous case. The code is written only for demonstration purposes, and can surely be implemented more efficiently. 
create or replace procedure p_dom3(p_io_cod varchar2 default '%' 
,p_column_cod varchar2 default '%') is 
v_sql varchar2(4000); 
v_dom_cod varchar2(61); 
v_hds_txt varchar2(200); 
v_bds_txt varchar2(200); 
v_cod_txt varchar2(200); 
v_column_cod varchar2(200); 
type t_rc is ref cursor; 
v_cur t_rc; 
cr char(1) := chr(10); 
v_list varchar2(200); 
v_list2 varchar2(200); 
v_c1 varchar2(200); 
v_c2 varchar2(200); 
v_c3 varchar2(200); 
v_c4 varchar2(200); 
v_c5 varchar2(200); 
v_count number; 
begin 
for r in (select a.STT_COD 
,b.column_cod 
,nvl(b.dom_cod,a.io_cod||'.'||b.column_cod) dom_cod 
,b.dc1_cod,b.dc2_cod,b.dc3_cod,b.dc4_cod,b.dc5_cod 
from mef_io2_cft a 
inner join mef_sta3_cft b 
on (a.io_cod = b.io_cod 
and column_cod like p_column_cod 
and b.column_cod like '%_COD' 
and nvl(b.dom_cod,'*') <> 'NULL') 
where a.io_cod like p_io_cod) loop 
v_list := null; 
v_list2 := null; 
v_count := 0;
if (r.dc1_cod is not null) then 
v_list := v_list||','||r.dc1_cod; 
v_list2 := v_list2||' and '||r.dc1_cod||' = dv1_cod'; 
v_count := v_count+1; 
end if; 
if (r.dc2_cod is not null) then 
v_list := v_list||','||r.dc2_cod; 
v_list2 := v_list2||' and '||r.dc2_cod||' = dv2_cod'; 
v_count := v_count+1; 
end if; 
if (r.dc3_cod is not null) then 
v_list := v_list||','||r.dc3_cod; 
v_list2 := v_list2||' and '||r.dc3_cod||' = dv3_cod'; 
v_count := v_count+1; 
end if; 
if (r.dc4_cod is not null) then 
v_list := v_list||','||r.dc4_cod; 
v_list2 := v_list2||' and '||r.dc4_cod||' = dv4_cod'; 
v_count := v_count+1; 
end if; 
if (r.dc5_cod is not null) then 
v_list := v_list||','||r.dc5_cod; 
v_list2 := v_list2||' and '||r.dc5_cod||' = dv5_cod'; 
v_count := v_count+1; 
end if; 
v_sql := 'select distinct '||''''||r.dom_cod||''''||cr|| 
' ,a.'||r.column_cod||cr|| 
' ,b.cod_txt'||cr|| 
' ,nvl(b.hds_txt,''nd'') hds_txt '||cr|| 
' ,nvl(b.bds_txt,''nd'') bds_txt '||cr|| 
v_list||cr|| 
'from '||r.STT_COD||' a '||cr|| 
'left outer join mef_dom3_cft b '||cr|| 
'on (a.'||r.column_cod||' = b.cod_txt '||cr|| 
' and dom_cod='||''''||r.dom_cod||''''||cr|| 
v_list2||cr|| 
')'; 
dbms_output.put_line(v_sql);
open v_cur for v_sql; 
loop 
if (v_count=0) then 
fetch v_cur into v_dom_cod,v_column_cod,v_cod_txt 
,v_hds_txt,v_bds_txt; 
elsif (v_count=1) then 
fetch v_cur into v_dom_cod,v_column_cod,v_cod_txt 
,v_hds_txt,v_bds_txt,v_c1; 
elsif (v_count=2) then 
fetch v_cur into v_dom_cod,v_column_cod,v_cod_txt 
,v_hds_txt,v_bds_txt,v_c1,v_c2; 
elsif (v_count=3) then 
fetch v_cur into v_dom_cod,v_column_cod,v_cod_txt 
,v_hds_txt,v_bds_txt,v_c1,v_c2,v_c3; 
elsif (v_count=4) then 
fetch v_cur into v_dom_cod,v_column_cod,v_cod_txt 
,v_hds_txt,v_bds_txt,v_c1,v_c2,v_c3,v_c4; 
else 
fetch v_cur into v_dom_cod,v_column_cod,v_cod_txt 
,v_hds_txt,v_bds_txt,v_c1,v_c2,v_c3,v_c4,v_c5; 
end if; 
exit when v_cur%notfound; 
if (v_cod_txt is null) then 
insert into mef_dom3_cft (dom_cod,cod_txt,hds_txt,bds_txt 
,dv1_cod, dv2_cod,dv3_cod, dv4_cod, dv5_cod,dclist_txt) 
values(v_dom_cod,v_column_cod,v_hds_txt,v_bds_txt 
,v_c1,v_c2,v_c3,v_c4,v_c5,substr(v_list,2)); 
end if; 
end loop; 
close v_cur; 
end loop; 
commit; 
end; 
/ 
sho errors 
set serveroutput on 
exec p_dom3('FIN');
The Descriptions Management - Stage 3 
Compared to the phase 2 version, we also need to acquire the names of the columns of the dependence codes. After the execution of the procedure (which, remember, is to be performed as post-processing of the Staging Area loading), the domains table will have the following values. 
Creation and load of the dimension table 
We create the new dimension table and load it from the staging table.
The Descriptions Management - Stage 3 
DROP TABLE FIN_DIT CASCADE CONSTRAINTS; 
CREATE TABLE FIN_DIT ( 
FIN_KEY NUMBER 
,FIN_COD VARCHAR2(12) 
,CAT_COD VARCHAR2(30) 
,CAT_BDS VARCHAR2(60) 
,CAT_XDS VARCHAR2(93) GENERATED ALWAYS 
AS (CAT_COD||' - '||CAT_BDS) VIRTUAL 
,GRP_COD VARCHAR2(30) 
,GRP_BDS VARCHAR2(60) 
,GRP_XDS VARCHAR2(93) GENERATED ALWAYS 
AS (GRP_COD||' - '||GRP_BDS) VIRTUAL 
,SUBGRP_COD VARCHAR2(30) 
,SUBGRP_BDS VARCHAR2(60) 
,SUBGRP_XDS VARCHAR2(93) GENERATED ALWAYS 
AS (SUBGRP_COD||' - '||SUBGRP_BDS) VIRTUAL 
); 
INSERT INTO FIN_DIT(FIN_KEY,FIN_COD,CAT_COD,GRP_COD,SUBGRP_COD) 
SELECT ROWNUM,FIN_COD,CAT_COD,GRP_COD,SUBGRP_COD 
FROM FIN_STA; 
COMMIT; 
Let's look at the dimension table before the update of the descriptions.
The Descriptions Management - Stage 3 
Update the dimension table 
The update procedure too will be more complex than the previous one, in order to manage the dependencies: 
create or replace procedure p_dit3( 
p_DIT_COD varchar2 default '%' 
,p_col varchar2 default '%' 
,p_force_cod number default 0) as 
type t_rc is ref cursor; 
v_cur t_rc; 
v_sql varchar2(2000); 
v_sql2 varchar2(2000); 
v_value_cod varchar2(60); 
v_column_cod varchar2(30); 
v_dom_cod varchar2(61); 
v_exec_cnt number; 
v_where varchar2(80); 
v_count number; 
cr char(1) := chr(10); 
v_c1 varchar2(200); 
v_c2 varchar2(200); 
v_c3 varchar2(200); 
v_c4 varchar2(200); 
v_c5 varchar2(200); 
v_ditc1 varchar2(200); 
v_ditc2 varchar2(200); 
v_ditc3 varchar2(200); 
v_ditc4 varchar2(200); 
v_ditc5 varchar2(200); 
v_list varchar2(2000); 
v_list2 varchar2(2000); 
begin 
for s in (select io_cod,DIT_COD 
from MEF_IO2_CFT 
where DIT_COD like p_DIT_COD) loop 
for r in (select b.column_cod 
,replace(b.column_cod,'_COD','_BDS') column_bds 
,nvl(dom_cod,s.io_cod||'.'||column_cod) dom_cod 
,dc1_cod 
,dc2_cod 
,dc3_cod 
,dc4_cod 
,dc5_cod
from cols a 
inner join MEF_STA3_CFT b 
on (a.column_name = b.column_cod 
and b.io_cod = s.io_cod 
and b.column_cod like '%_COD' 
and nvl(b.dom_cod,'*') <> 'NULL') 
where a.table_name = s.DIT_COD 
and a.column_name like p_col 
) loop 
v_list := null; 
v_list2 := null; 
v_count := 0; 
dbms_output.put_line(r.column_bds); 
if (r.dc1_cod is not null) then 
v_list := v_list||','||r.dc1_cod; 
v_count := v_count+1; 
end if; 
if (r.dc2_cod is not null) then 
v_list := v_list||','||r.dc2_cod; 
v_count := v_count+1; 
end if; 
if (r.dc3_cod is not null) then 
v_list := v_list||','||r.dc3_cod; 
v_count := v_count+1; 
end if; 
if (r.dc4_cod is not null) then 
v_list := v_list||','||r.dc4_cod; 
v_count := v_count+1; 
end if; 
if (r.dc5_cod is not null) then 
v_list := v_list||','||r.dc5_cod; 
v_count := v_count+1; 
end if; 
if (p_force_cod = 0) then 
v_where := ' where '||r.column_bds||' is null'; 
else 
v_where := null; 
end if; 
dbms_output.put_line(v_count); 
v_sql := 'select distinct '||r.column_cod||' value_cod'||cr|| 
v_list||' from '||s.DIT_COD||v_where; 
dbms_output.put_line(v_sql); 
open v_cur for v_sql; 
loop 
if (v_count=0) then 
fetch v_cur into v_value_cod;
v_list2 := null; 
elsif (v_count=1) then 
fetch v_cur into v_value_cod,v_ditc1; 
v_list2 := ' and d.'||r.dc1_cod||' = s.dv1_cod'; 
elsif (v_count=2) then 
fetch v_cur into v_value_cod,v_ditc1,v_ditc2; 
v_list2 := ' and d.'||r.dc1_cod||' = s.dv1_cod'|| 
' and d.'||r.dc2_cod||' = s.dv2_cod'; 
elsif (v_count=3) then 
fetch v_cur into v_value_cod,v_ditc1,v_ditc2,v_ditc3; 
v_list2 := ' and d.'||r.dc1_cod||' = s.dv1_cod'|| 
' and d.'||r.dc2_cod||' = s.dv2_cod'|| 
' and d.'||r.dc3_cod||' = s.dv3_cod'; 
elsif (v_count=4) then 
fetch v_cur into v_value_cod,v_ditc1,v_ditc2,v_ditc3,v_ditc4; 
v_list2 := ' and d.'||r.dc1_cod||' = s.dv1_cod'|| 
' and d.'||r.dc2_cod||' = s.dv2_cod'|| 
' and d.'||r.dc3_cod||' = s.dv3_cod'|| 
' and d.'||r.dc4_cod||' = s.dv4_cod'; 
else 
fetch v_cur into v_value_cod,v_ditc1,v_ditc2,v_ditc3,v_ditc4,v_ditc5; 
v_list2 := ' and d.'||r.dc1_cod||' = s.dv1_cod'|| 
' and d.'||r.dc2_cod||' = s.dv2_cod'|| 
' and d.'||r.dc3_cod||' = s.dv3_cod'|| 
' and d.'||r.dc4_cod||' = s.dv4_cod'|| 
' and d.'||r.dc5_cod||' = s.dv5_cod'; 
end if; 
exit when v_cur%notfound; 
v_sql2 := 
'update '||s.DIT_COD||' d'||cr|| 
'set ('||r.column_bds||') = ('||cr||'select bds_txt '||cr|| 
'from mef_dom3_cft s where s.dom_cod = '||''''||r.dom_cod||''''|| 
' and s.cod_txt = :1 '|| 
v_list2||cr|| 
')'||cr|| 
'where '||r.column_cod||' = :2 '||cr|| 
' and exists (select 1 from MEF_DOM3_CFT s '|| 
'where s.dom_cod = '||''''||r.dom_cod||''''||' and s.cod_txt = :3 '|| 
v_list2||')'; 
dbms_output.put_line(v_sql2); 
execute immediate v_sql2 using v_value_cod,v_value_cod,v_value_cod; 
end loop; 
close v_cur; 
end loop; 
end loop;
commit; 
end; 
/ 
sho errors 
set serveroutput on 
exec p_dit3('FIN_DIT'); 
After the execution of the procedure, the dimension table of the financial instruments will be: 
The first three rows have combinations of codes that are present in the domains table, so the corresponding descriptions have been updated for all the codes. 
The fourth row has only the description of the category updated, because the category has no dependencies on other codes. 
The fifth row has all the descriptions undefined, because it is a combination of unknown dependencies.
The Descriptions Management - Stage 4 
Many of you will be amazed to see a further stage 4. It seems incredible, but we have still not said everything about the descriptions management. Let's summarize the analysis we have done so far. 
• In Phase 1 we highlighted the problem, we introduced the domains table and a configuration table necessary to operate, and we defined a naming convention. 
• In Phase 2 we automated the process and introduced other configuration tables in support of a dynamic management of the descriptions. We showed how to pre-load the domains table before receiving that information from the source systems. 
• In Phase 3 we handled the dependence between codes. 
What is still missing? 
The answer depends on the international character of your Data Warehouse. If you are sure that the Data Warehouse is internal and used only by users of your native language, you can stop at stage 3; otherwise you have to take into account the multilanguage management. 
I will not make a complete example as in the other phases. The change is a modification of the domains table, adding the language code field (and some technical columns that are always useful to have). You then have to add the language code as a key field in the procedures that have been implemented. With the multilanguage management, the domains table will grow in volume as many times as the languages that you add. 
The new domains table 
The domains table that we see below can be considered complete, and I advise you to create it immediately with the language code even though at the moment it will not be used. 
DROP TABLE MEF_DOM4_CFT CASCADE CONSTRAINTS; 
CREATE TABLE MEF_DOM4_CFT ( 
DOM_COD VARCHAR2(61) NOT NULL 
,COD_TXT VARCHAR2(60) NOT NULL 
,HDS_TXT VARCHAR2(255) 
,BDS_TXT VARCHAR2(255) 
,DV1_COD VARCHAR2(30 BYTE) 
,DV2_COD VARCHAR2(30 BYTE) 
,DV3_COD VARCHAR2(30 BYTE) 
,DV4_COD VARCHAR2(30 BYTE) 
,DV5_COD VARCHAR2(30 BYTE) 
,DCLIST_TXT VARCHAR2(160 BYTE) 
,LANG_COD VARCHAR2(30 BYTE) 
,INS_DTS DATE 
,STAMP_DTS DATE DEFAULT SYSDATE 
,CTRL_TXT VARCHAR2(30) 
);
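A minimal usage sketch of the multilanguage lookup (the 'EN' language code and the fact that LANG_COD has been loaded are assumptions, not part of the script above): 

SELECT COD_TXT, BDS_TXT 
FROM   MEF_DOM4_CFT 
WHERE  DOM_COD  = 'PROF' 
AND    LANG_COD = 'EN'; 

In the same way, the procedures of the previous stages would add LANG_COD to their join and key conditions.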
Conclusion 
As we have seen, the descriptions management is not a simple topic: it has many implications, it involves many people, and it is extremely important because it is not a technical detail, but affects the way the end user sees the data, a determining factor for the success of the Data Warehouse. 
I hope that what I have written can help all the people involved in the design of a Data Warehouse to become aware of the problem. The "descriptions management" activity must become a fixed point in the Gantt chart of the project.

ata Warehouse and Business Intelligence - Recipe 7 - A messaging system for O...Massimo Cenci
 
Data Warehouse and Business Intelligence - Recipe 7 - A messaging system for ...
Data Warehouse and Business Intelligence - Recipe 7 - A messaging system for ...Data Warehouse and Business Intelligence - Recipe 7 - A messaging system for ...
Data Warehouse and Business Intelligence - Recipe 7 - A messaging system for ...Massimo Cenci
 
Data Warehouse e Business Intelligence in ambiente Oracle - Il sistema di mes...
Data Warehouse e Business Intelligence in ambiente Oracle - Il sistema di mes...Data Warehouse e Business Intelligence in ambiente Oracle - Il sistema di mes...
Data Warehouse e Business Intelligence in ambiente Oracle - Il sistema di mes...Massimo Cenci
 
Data Warehouse e Business Intelligence in ambiente Oracle - Il sistema di mes...
Data Warehouse e Business Intelligence in ambiente Oracle - Il sistema di mes...Data Warehouse e Business Intelligence in ambiente Oracle - Il sistema di mes...
Data Warehouse e Business Intelligence in ambiente Oracle - Il sistema di mes...Massimo Cenci
 
Oracle All-in-One - how to send mail with attach using oracle pl/sql
Oracle All-in-One - how to send mail with attach using oracle pl/sqlOracle All-in-One - how to send mail with attach using oracle pl/sql
Oracle All-in-One - how to send mail with attach using oracle pl/sqlMassimo Cenci
 
Note di Data Warehouse e Business Intelligence - Le Dimensioni di analisi (pa...
Note di Data Warehouse e Business Intelligence - Le Dimensioni di analisi (pa...Note di Data Warehouse e Business Intelligence - Le Dimensioni di analisi (pa...
Note di Data Warehouse e Business Intelligence - Le Dimensioni di analisi (pa...Massimo Cenci
 
Recipes 6 of Data Warehouse and Business Intelligence - Naming convention tec...
Recipes 6 of Data Warehouse and Business Intelligence - Naming convention tec...Recipes 6 of Data Warehouse and Business Intelligence - Naming convention tec...
Recipes 6 of Data Warehouse and Business Intelligence - Naming convention tec...Massimo Cenci
 

More from Massimo Cenci (20)

Recipe 16 of Data Warehouse and Business Intelligence - The monitoring of the...
Recipe 16 of Data Warehouse and Business Intelligence - The monitoring of the...Recipe 16 of Data Warehouse and Business Intelligence - The monitoring of the...
Recipe 16 of Data Warehouse and Business Intelligence - The monitoring of the...
 
Il controllo temporale dei data file in staging area
Il controllo temporale dei data file in staging areaIl controllo temporale dei data file in staging area
Il controllo temporale dei data file in staging area
 
Recipe 14 - Build a Staging Area for an Oracle Data Warehouse (2)
Recipe 14 - Build a Staging Area for an Oracle Data Warehouse (2)Recipe 14 - Build a Staging Area for an Oracle Data Warehouse (2)
Recipe 14 - Build a Staging Area for an Oracle Data Warehouse (2)
 
Recipe 14 of Data Warehouse and Business Intelligence - Build a Staging Area ...
Recipe 14 of Data Warehouse and Business Intelligence - Build a Staging Area ...Recipe 14 of Data Warehouse and Business Intelligence - Build a Staging Area ...
Recipe 14 of Data Warehouse and Business Intelligence - Build a Staging Area ...
 
Tecniche di progettazione della staging area in un processo etl
Tecniche di progettazione della staging area in un processo etlTecniche di progettazione della staging area in un processo etl
Tecniche di progettazione della staging area in un processo etl
 
Note di Data Warehouse e Business Intelligence - Il giorno di riferimento dei...
Note di Data Warehouse e Business Intelligence - Il giorno di riferimento dei...Note di Data Warehouse e Business Intelligence - Il giorno di riferimento dei...
Note di Data Warehouse e Business Intelligence - Il giorno di riferimento dei...
 
Note di Data Warehouse e Business Intelligence - Pensare "Agile"
Note di Data Warehouse e Business Intelligence - Pensare "Agile"Note di Data Warehouse e Business Intelligence - Pensare "Agile"
Note di Data Warehouse e Business Intelligence - Pensare "Agile"
 
Note di Data Warehouse e Business Intelligence - La gestione delle descrizioni
Note di Data Warehouse e Business Intelligence - La gestione delle descrizioniNote di Data Warehouse e Business Intelligence - La gestione delle descrizioni
Note di Data Warehouse e Business Intelligence - La gestione delle descrizioni
 
Note di Data Warehouse e Business Intelligence - Tecniche di Naming Conventio...
Note di Data Warehouse e Business Intelligence - Tecniche di Naming Conventio...Note di Data Warehouse e Business Intelligence - Tecniche di Naming Conventio...
Note di Data Warehouse e Business Intelligence - Tecniche di Naming Conventio...
 
Data Warehouse - What you know about etl process is wrong
Data Warehouse - What you know about etl process is wrongData Warehouse - What you know about etl process is wrong
Data Warehouse - What you know about etl process is wrong
 
Letter to a programmer
Letter to a programmerLetter to a programmer
Letter to a programmer
 
Recipes 9 of Data Warehouse and Business Intelligence - Techniques to control...
Recipes 9 of Data Warehouse and Business Intelligence - Techniques to control...Recipes 9 of Data Warehouse and Business Intelligence - Techniques to control...
Recipes 9 of Data Warehouse and Business Intelligence - Techniques to control...
 
Note di Data Warehouse e Business Intelligence - Tecniche di Naming Conventio...
Note di Data Warehouse e Business Intelligence - Tecniche di Naming Conventio...Note di Data Warehouse e Business Intelligence - Tecniche di Naming Conventio...
Note di Data Warehouse e Business Intelligence - Tecniche di Naming Conventio...
 
ata Warehouse and Business Intelligence - Recipe 7 - A messaging system for O...
ata Warehouse and Business Intelligence - Recipe 7 - A messaging system for O...ata Warehouse and Business Intelligence - Recipe 7 - A messaging system for O...
ata Warehouse and Business Intelligence - Recipe 7 - A messaging system for O...
 
Data Warehouse and Business Intelligence - Recipe 7 - A messaging system for ...
Data Warehouse and Business Intelligence - Recipe 7 - A messaging system for ...Data Warehouse and Business Intelligence - Recipe 7 - A messaging system for ...
Data Warehouse and Business Intelligence - Recipe 7 - A messaging system for ...
 
Data Warehouse e Business Intelligence in ambiente Oracle - Il sistema di mes...
Data Warehouse e Business Intelligence in ambiente Oracle - Il sistema di mes...Data Warehouse e Business Intelligence in ambiente Oracle - Il sistema di mes...
Data Warehouse e Business Intelligence in ambiente Oracle - Il sistema di mes...
 
Data Warehouse e Business Intelligence in ambiente Oracle - Il sistema di mes...
Data Warehouse e Business Intelligence in ambiente Oracle - Il sistema di mes...Data Warehouse e Business Intelligence in ambiente Oracle - Il sistema di mes...
Data Warehouse e Business Intelligence in ambiente Oracle - Il sistema di mes...
 
Oracle All-in-One - how to send mail with attach using oracle pl/sql
Oracle All-in-One - how to send mail with attach using oracle pl/sqlOracle All-in-One - how to send mail with attach using oracle pl/sql
Oracle All-in-One - how to send mail with attach using oracle pl/sql
 
Note di Data Warehouse e Business Intelligence - Le Dimensioni di analisi (pa...
Note di Data Warehouse e Business Intelligence - Le Dimensioni di analisi (pa...Note di Data Warehouse e Business Intelligence - Le Dimensioni di analisi (pa...
Note di Data Warehouse e Business Intelligence - Le Dimensioni di analisi (pa...
 
Recipes 6 of Data Warehouse and Business Intelligence - Naming convention tec...
Recipes 6 of Data Warehouse and Business Intelligence - Naming convention tec...Recipes 6 of Data Warehouse and Business Intelligence - Naming convention tec...
Recipes 6 of Data Warehouse and Business Intelligence - Naming convention tec...
 

Recently uploaded

HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantAxelRicardoTrocheRiq
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEOrtus Solutions, Corp
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationkaushalgiri8080
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityNeo4j
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio, Inc.
 
Introduction to Decentralized Applications (dApps)
Introduction to Decentralized Applications (dApps)Introduction to Decentralized Applications (dApps)
Introduction to Decentralized Applications (dApps)Intelisync
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataBradBedford3
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptkotipi9215
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxComplianceQuest1
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...stazi3110
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...Christina Lin
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...harshavardhanraghave
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)OPEN KNOWLEDGE GmbH
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - InfographicHr365.us smith
 

Recently uploaded (20)

HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service Consultant
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanation
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered Sustainability
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
 
Introduction to Decentralized Applications (dApps)
Introduction to Decentralized Applications (dApps)Introduction to Decentralized Applications (dApps)
Introduction to Decentralized Applications (dApps)
 
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.ppt
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - Infographic
 

Recipes 10 of Data Warehouse and Business Intelligence - The descriptions management

  • 1. Recipes of Data Warehouse and Business Intelligence The Descriptions Management
  • 2. Introduction The Descriptions Management The descriptions management, is a subject little discussed within the Data Warehouse and Business Intelligence community. It speaks little in books and articles, although it is a crucial issue because it affects the way we see the information. So, we'll talk about descriptions. In particular, the descriptions of the codes that we can found in the dimension tables of a Data Warehouse. I think it's important to show the complexity that is hidden behind what seem trivial. And it is necessary for all, (end users, project managers, data architects and the other stakeolders) know that the descriptions management is a particularly tricky topic. A real life example The best way to introduce this topic, is to start with a real example. Suppose we observe an end-user who is using a Business Intelligence interface. The user needs to extract numerical data, for example, the operations of an account, and he wants to filter only those customers who have a particular profession, for example, only employees. In order to meet this requirement, the Data Warehouse Team has just prepared a menu with a list of choices of values, (that is, codes and descriptions) on which the end user can choose the line "Employee". The following figure shows this case. This figure shows the visual world. Now we enter in the world of descriptions management , with a number of considerations that describe the logical and the physical world that hides behind the little word "just".
  • 3. The Descriptions Management The vision of the end-user The description "Employee" associated with the code "10" is a description wanted by the end user. It is very important to show a description that is in tune with the daily nomenclature of the end users. This need has a very important consequence. It is very unlikely that the code description, which comes from the source systems (we call it "external"), it is the same description of the end user, in fact, usually, the description will be different. This is because the description is present in the OLTP systems, not in the synthesis systems. So you need to always have in the Data Warehouse two descriptions. That of OLTP systems, let's call it "External description" and the final one, let's call it "Business description“ . To have both is important because it is the only link between two systems that have different purposes. The extended description As seen in the example, in the list of choices, must not appear only the description, but also the code. Often this does not happen, and it is shown only the description. My suggestion is to show them both. We define this new description as "extended description", and it must be obtained by concatenating the value of the code with a "-" (minus) and the description of the value. It is useful to do this, because end users are not all equal. Some user think using descriptions, but others think using codes. Often the end users have so many years working in a certain environment, that it is easier for them to find customers with profession "10" rather than those with the profession "Employee". Moreover, the descriptions, in time, may vary, so the presence of the code is undoubtedly safer. The best thing is to show them both, code and description, to meet all types of users. So "10 - Employee" Because we have an external description and a business description, we have to decide which of the two use to build the extended description. In general, it is appropriate to use a single description, namely the business description, but it can be an acceptable choice also have two extended descriptions. Let's now look at the physical world behind the list of values displayed to the end user. Because as we know, the logical considerations, always have physical consequences.
  • 4. The Descriptions Management
How to see the extended descriptions
When a user requests a list of values on the Business Intelligence interface, the displayed list is almost always the result of a query generated against the Data Warehouse database. Generally, it is very likely that a "select distinct" of the extended description values is run on the dimension table. The extended description can be implemented without taking up space, using the virtual column feature. Virtual columns are available in Oracle, but you can always create views that build those columns (see the sketch after this slide). We will show later how to define it.
Here is a very simplified example of the columns of a possible customers dimension table, which will be the base of the list of values:
• CUST_KEY - Artificial key associated with the customer
• CUST_COD - Natural key (customer code)
• PROF_COD - Profession code
• PROF_HDS - External description (OLTP description)
• PROF_BDS - Business description
• PROF_XDS - Extended description, obtained by concatenating the code with the business description
Where to find the extended descriptions
We have seen that a simple "click" on a list of choices hides various considerations. But the most surprising thing is that the above is just the tip of the iceberg. In this case, the metaphor is very apt. The tip is what the end user sees. But what happens below the surface? Is it really so trivial to get descriptions that come (maybe) from the source systems and insert them into the dimension tables? As always happens in a Data Warehouse, nothing is trivial and everything is more complicated than it seems. Indeed, in my experience, code descriptions can be located in seven different places.
1. The code description is already in the staging table (which will become the dimension table).
2. The code description is present in a table of descriptions, exclusively for that code.
3. The code description is present in a table of descriptions for all of the codes (or nearly all).
4. The code description is present in an Excel/CSV sheet.
5. The code description is present in a Word, PDF or XML document.
6. The code description is in an email.
7. The code description is hard-coded in the OLTP forms, so, after a phone call to some referent of the OLTP systems, we are back to point 4, 5 or 6.
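For databases or Oracle versions where virtual columns are not available, the same extended description can be exposed through a view, as mentioned above. A minimal sketch, assuming the simplified customers dimension described here (the table name CUST_DIT appears in the later slides; the view name is invented for illustration):

CREATE OR REPLACE VIEW CUST_DIT_V AS
SELECT CUST_KEY
      ,CUST_COD
      ,PROF_COD
      ,PROF_HDS
      ,PROF_BDS
      ,PROF_COD||' - '||PROF_BDS AS PROF_XDS   -- extended description built on the fly
FROM   CUST_DIT;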
  • 5. The Descriptions Management
Descriptions as metadata
Without thinking about where the descriptions are and how we will acquire them, how can we manage the descriptions in our Data Warehouse? My advice is to always have a metadata table like that of point 3 and reduce all the other cases to it. But to do so, we need to introduce the concept of a domain of values.
The concept is very simple: just give a name to the set of values associated with a certain code. For example, if we know that the set of possible values of the profession code is:
10 = Employee
20 = Free Lance
30 = Tradesman
we associate the name "PROF" to this domain of values and insert this data into a metadata table that we call MEF_DOM_CFT (MEF Domains Configuration). The table can have a very simple structure with the following information:
• DOM_COD - Domain code
• COD_TXT - Value of the code
• HDS_TXT - External description
• BDS_TXT - Business description
The Naming Convention of the descriptions
We adopt, as always, a naming convention. On this issue I have already written a lot on my blog; in this case we assume that all descriptive code fields, i.e. codes for which it makes sense to have a description, must be named *_COD, the external descriptions *_HDS, the business descriptions *_BDS, and the extended descriptions *_XDS.
Anyone with Data Warehouse experience knows that, unfortunately, if we wait for the end of the analysis of the domains, we lose a lot of time, so we must act preventively. My advice is to fill the domains table automatically with the values of the codes found in the staging tables, giving them a unique name and a default description like "nd" (not defined). In a second step we can take action on the generic domains by changing their default name. The default domain name that we will adopt will be <source>.<column name>.
  • 6. The Descriptions Management
Other considerations
Descriptions are names that rarely vary. It would be a waste of time to set up a daily loading/update of the domains table. Do an initial massive load and manage, when necessary, any changes and additions.
Often the source files provide both short and long descriptions. Obviously, if they are present in the source, you should load them, but you should always use the short description as the "external description". The trend today is to shorten information (see the language of SMS messages), so, in my opinion, you can ignore long descriptions, which can create sizing problems on graphical interfaces (PC, tablet and smartphone).
The business description is necessarily manual. It will be the end user who decides which description he prefers to see. The initial loading, however, must initialize the business description equal to the external description, if, and only if, the business description is empty (a sketch of this initialization is shown after this slide). Then, from time to time, we will proceed with the changes.
Another point that should not be overlooked is the following: how reliable are the codes and descriptions that we find in points 2-6? As often happens, the documentation is not always up to date, so we risk not having the full domain of the code. On this point, we have to be proactive, in the sense that we cannot generically ask for an update of the documentation of the source systems. We must be the ones who ask precisely for the descriptions of the codes that we do not find in the documentation.
The importance of the domains
Finally, the main reason for the existence of the domains table: the reuse of the description. In the tables of a Data Warehouse the same domain can be applied to many columns. They have different names, but refer to the same domain of values. For example, a currency code can be present in a table with name CURTRAT (currency of treatment) and CURTRA (currency of trading), but both must have the same domain, let's call it "CUR".
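A minimal sketch of that initialization on the MEF_DOM_CFT table introduced on the previous slide; it assumes the "nd" default described there and leaves manually entered business descriptions untouched:

UPDATE MEF_DOM_CFT
SET    BDS_TXT = HDS_TXT
WHERE  (BDS_TXT IS NULL OR BDS_TXT = 'nd')   -- business description still empty / defaulted
AND    HDS_TXT IS NOT NULL
AND    HDS_TXT <> 'nd';                      -- only when a real external description exists
COMMIT;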
  • 7. The Descriptions Management - Stage 1
The work plan
At this point, we explore the descriptions management with a real simulation, obviously very simplified, after creating the simulation environment. The work plan is as follows.
1. Create the environment
2. Load the domains table
3. Load the staging table
4. Update the domains table on the basis of the codes found in the Staging Area table
5. Load the dimension table
6. Update the dimension table with descriptions
Create the environment
We prepare a simulation environment, creating the necessary tables with only the information we need: the staging area table, the dimension table and the domains table. As we can see, the dimension table contains both the external description and the business description. The extended description is built using the virtual column syntax.
DROP TABLE CUST_STA CASCADE CONSTRAINTS;
CREATE TABLE CUST_STA (
  CUST_COD VARCHAR2(9),
  PROF_COD VARCHAR2(30)
);
DROP TABLE MEF_DOM_CFT CASCADE CONSTRAINTS;
CREATE TABLE MEF_DOM_CFT (
  DOM_COD VARCHAR2(61) NOT NULL
 ,COD_TXT VARCHAR2(60)
 ,HDS_TXT VARCHAR2(255)
 ,BDS_TXT VARCHAR2(255)
);
  • 8. The Descriptions Management - Stage 1
DROP TABLE CUST_DIT CASCADE CONSTRAINTS;
CREATE TABLE CUST_DIT (
  CUST_KEY NUMBER
 ,CUST_COD VARCHAR2(9)
 ,PROF_COD VARCHAR2(30)
 ,PROF_HDS VARCHAR2(60)
 ,PROF_BDS VARCHAR2(60)
 ,PROF_XDS VARCHAR2(93) GENERATED ALWAYS AS (PROF_COD||' - '||PROF_BDS) VIRTUAL
);
Load the domains table
The loading of this table will necessarily be a mix of automatic and manual loading. We refer to the 7 situations seen before. If the domains are already present in a specific table or in a common table, we just use a SQL statement to load them. If they are in an Excel spreadsheet, convert it to a CSV file, point an Oracle external table at it and then load them with a SQL statement (see the sketch after this slide). Otherwise, you have to load them manually one by one.
Suppose we are in the latter case and load the domain of the profession code. Note the difference between external description and business description. If, as often happens, the feeding system is a mainframe, the descriptions will all come in uppercase and they will not be very user-friendly for directional reporting.
INSERT INTO MEF_DOM_CFT VALUES ('PROF','10','EMPL.' ,'Employee');
INSERT INTO MEF_DOM_CFT VALUES ('PROF','20','FREE L.' ,'Free Lance');
INSERT INTO MEF_DOM_CFT VALUES ('PROF','30','TRADESMAN','Tradesman');
COMMIT;
After the insertion, the contents of the domains table will be as shown in the figure.
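A sketch of the external-table approach mentioned above, assuming the CSV file has one row per domain value with the same four fields and has been placed in an existing Oracle directory object (DOM_DIR and domains.csv are invented names for illustration):

CREATE TABLE MEF_DOM_EXT (
  DOM_COD VARCHAR2(61)
 ,COD_TXT VARCHAR2(60)
 ,HDS_TXT VARCHAR2(255)
 ,BDS_TXT VARCHAR2(255)
)
ORGANIZATION EXTERNAL (
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY DOM_DIR            -- directory object created beforehand with CREATE DIRECTORY
  ACCESS PARAMETERS (
    RECORDS DELIMITED BY NEWLINE
    FIELDS TERMINATED BY ';'
    MISSING FIELD VALUES ARE NULL
  )
  LOCATION ('domains.csv')
)
REJECT LIMIT UNLIMITED;

INSERT INTO MEF_DOM_CFT (DOM_COD,COD_TXT,HDS_TXT,BDS_TXT)
SELECT DOM_COD,COD_TXT,HDS_TXT,BDS_TXT FROM MEF_DOM_EXT;
COMMIT;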
  • 9. The Descriptions Management - Stage 1
Obviously, if we have not yet considered the analysis of the domains, or we are still waiting for documentation, this step can be skipped. The table remains empty and will be loaded automatically in a second step.
Load the staging table
The staging table loading is a job that will be part of the ETL process. For simplicity we simulate its load with the following SQL statements. The '?' as code value substitutes for NULL. From the methodological point of view, I consider it important to avoid NULL values in the Data Warehouse and to replace them immediately in the Staging Area with a default value; usually I use the question mark (for a detailed discussion refer to http://www.slideshare.net/jackbim/recipes-5-of-data-warehouse-the-null-values-management-in-the-etl-process). A sketch of this replacement is shown after this slide.
INSERT INTO CUST_STA VALUES ('Larry','10');
INSERT INTO CUST_STA VALUES ('Tom'  ,'15');
INSERT INTO CUST_STA VALUES ('Brown','20');
INSERT INTO CUST_STA VALUES ('Steve','40');
INSERT INTO CUST_STA VALUES ('Jack' ,'?');
COMMIT;
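A minimal sketch of that NULL replacement during the staging load; the source table name OLTP_CUSTOMERS and its columns are hypothetical, only the technique (NVL to the '?' default) is the point:

INSERT INTO CUST_STA (CUST_COD, PROF_COD)
SELECT CUST_COD
      ,NVL(PROF_COD, '?')   -- replace NULL codes with the default '?'
FROM   OLTP_CUSTOMERS;
COMMIT;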
  • 10. The Descriptions Management - Stage 1
After the insertion, the content of the staging table will be:
Update the domains table
As you can see from the staging area contents, I purposely simulated a situation where the domain is not up to date; in fact, not all the values that could arrive are present (the "15", the "40" and the "?" are missing). In order to keep the domains table updated, we run a procedure as post-processing of the Staging Area loading. The procedure is the following.
create or replace procedure p_dom_prof(p_dom_cod varchar2) is
begin
  for r in (select distinct p_dom_cod dom_cod
                  ,a.prof_cod
                  ,b.cod_txt
                  ,nvl(b.hds_txt,'nd') hds_txt
                  ,nvl(b.bds_txt,'nd') bds_txt
  • 11. The Descriptions Management - Stage 1
            from cust_sta a
            left outer join mef_dom_cft b
            on (a.prof_cod = b.cod_txt and b.dom_cod = p_dom_cod)) loop
    if (r.cod_txt is null) then
      insert into mef_dom_cft (dom_cod,cod_txt,hds_txt,bds_txt)
      values(p_dom_cod,r.prof_cod,r.hds_txt,r.bds_txt);
    end if;
  end loop;
  commit;
end;
/
sho errors
exec p_dom_prof('PROF');
The procedure performs a simple loop on the values of the profession code in the staging table, joined to the domains table using the domain name it receives as input. As we can see, the procedure fills the domains table even if the table is initially empty (by means of the outer join). It also always acts as an insert, so it does not alter or delete descriptions that have been entered manually. The procedure also inserts the missing codes with the default description "nd". If we now query the domains table, we will get the content shown in the figure.
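The same post-processing step could also be expressed as a single set-based statement instead of a row-by-row loop; a sketch, equivalent under the same assumptions (missing codes get the 'nd' default, existing rows are never touched):

INSERT INTO MEF_DOM_CFT (DOM_COD, COD_TXT, HDS_TXT, BDS_TXT)
SELECT DISTINCT 'PROF', a.PROF_COD, 'nd', 'nd'
FROM   CUST_STA a
WHERE  NOT EXISTS (SELECT 1
                   FROM   MEF_DOM_CFT b
                   WHERE  b.DOM_COD = 'PROF'
                   AND    b.COD_TXT = a.PROF_COD);   -- only codes not yet in the domain
COMMIT;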
  • 12. The Descriptions Management - Stage 1
Load the dimension table
The loading of the dimension table is one of the most challenging and delicate components of the entire ETL process. In my experience, almost all dimension tables should take account of the type 2 "Slowly Changing Dimension", on which I will not go into detail now. Precisely because of this complexity, I preferred not to add complexity to complexity: I prefer to manage the descriptions as post-processing of the loading of the dimension tables. So, first I load the dimension table by taking the data from the staging table, then I update the descriptive components. We load the dimension table in a very simple way with the following SQL statement:
INSERT INTO CUST_DIT(CUST_KEY,CUST_COD,PROF_COD)
SELECT ROWNUM,CUST_COD,PROF_COD FROM CUST_STA;
COMMIT;
Situation after loading:
  • 13. The Descriptions Management - Stage 1 Update the dimension table To update the dimension table, we create and execute the following procedure. create or replace procedure p_dit_prof(p_dom_cod varchar2) as begin for r in (select distinct prof_cod from cust_dit) loop update cust_dit d set (prof_hds,prof_bds) = (select hds_txt,bds_txt from mef_dom_cft s where d.prof_cod = s.cod_txt and s.dom_cod = p_dom_cod) where prof_cod = r.prof_cod and exists (select 1 from mef_dom_cft s where d.prof_cod = s.cod_txt and s.dom_cod = p_dom_cod); end loop; commit; end; / sho errors exec p_dit_prof('PROF'); The logic of the procedure is very simple. For each value of the profession code, it performs the update of the description taken from the domains table. If now we query the dimension table we will get:
  • 14. The Descriptions Management - Stage 1
After the nightly loading of the Data Warehouse, we can query the domains table (or the dimension table), and every situation in which the description "nd" appears must be brought to the attention of the people who are concerned with data quality control. In practice, for each code whose description is "nd", we must send a request for a description to the feeding systems.
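A sketch of such a control query on the domains table (the same filter can be applied to any *_BDS column of a dimension table):

SELECT DOM_COD, COD_TXT, HDS_TXT
FROM   MEF_DOM_CFT
WHERE  BDS_TXT = 'nd'        -- codes still waiting for a description
ORDER BY DOM_COD, COD_TXT;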
  • 15. The Descriptions Management - Stage 2
Phase 1 was useful to make us aware of the problems and difficulties associated with descriptions management. The solution shown, although functional, is practically unusable in a real Data Warehouse. The procedures that update the domains table and the dimension table are specific to the profession code, and such procedures would have to be repeated, with appropriate modifications, for all other codes that require a description. In a Data Warehouse we can have dozens of analytical dimensions, each with dozens of descriptive codes. Writing hundreds of procedures for the descriptions management is not feasible. So let's see how to implement a single procedure that handles everything automatically.
To achieve this result, we must necessarily get help from some configuration tables. In particular, a configuration table of the input streams, which indicates the name of the staging table and of the dimension table, and a configuration table of all the columns that need a description.
Environment creation
We enrich the staging table assuming the presence of three codes. The first, already seen in the previous phase, is the profession code. The second is the status code of the customer. The third is the type code of the customer. We also adapt the dimension table with the new codes.
DROP TABLE CUST2_STA CASCADE CONSTRAINTS;
CREATE TABLE CUST2_STA (
  CUST2_COD VARCHAR2(9)
 ,PROF_COD VARCHAR2(30)
 ,STATUS_COD VARCHAR2(1)
 ,TYPE_COD VARCHAR2(1)
);
DROP TABLE CUST2_DIT CASCADE CONSTRAINTS;
CREATE TABLE CUST2_DIT (
  CUST2_KEY NUMBER
 ,CUST2_COD VARCHAR2(9)
 ,PROF_COD VARCHAR2(30)
 ,PROF_BDS VARCHAR2(60)
 ,PROF_XDS VARCHAR2(93) GENERATED ALWAYS AS (PROF_COD||' - '||PROF_BDS) VIRTUAL
  • 16. The Descriptions Management - Stage 2
 ,STATUS_COD VARCHAR2(1)
 ,STATUS_BDS VARCHAR2(15)
 ,STATUS_XDS VARCHAR2(19) GENERATED ALWAYS AS (STATUS_COD||' - '||STATUS_BDS) VIRTUAL
 ,TYPE_COD VARCHAR2(1)
 ,TYPE_BDS VARCHAR2(15)
 ,TYPE_XDS VARCHAR2(19) GENERATED ALWAYS AS (TYPE_COD||' - '||TYPE_BDS) VIRTUAL
);
We now introduce two new configuration tables: a configuration table of the input streams (MEF_IO2_CFT) and a configuration table of the columns of the staging tables (MEF_STA2_CFT), which will become columns of the dimension table.
The MEF_IO2_CFT table contains, for each stream identifier (IO_COD), the name of the staging table (STT_COD) and the name of the dimension table (DIT_COD). We insert one configuration line.
The MEF_STA2_CFT table contains, for each stream identifier (IO_COD), the name of the code column (COLUMN_COD) and the associated domain name (DOM_COD). This configuration table is very important, because it is the only place where we define, for each code, the name of its domain. This name will allow access to the domains table to find the descriptions. For our example, we insert three lines, one for each code, assuming that the customer type has not yet been taken into account in the analysis of its domain. So, its domain code remains NULL.
DROP TABLE MEF_IO2_CFT CASCADE CONSTRAINTS;
CREATE TABLE MEF_IO2_CFT (
  IO_COD VARCHAR2(9)
 ,STT_COD VARCHAR2(30)
 ,DIT_COD VARCHAR2(30)
);
INSERT INTO MEF_IO2_CFT VALUES ('CUST2','CUST2_STA','CUST2_DIT');
DROP TABLE MEF_STA2_CFT CASCADE CONSTRAINTS;
CREATE TABLE MEF_STA2_CFT (
  IO_COD VARCHAR2(9)
 ,COLUMN_COD VARCHAR2(30)
 ,DOM_COD VARCHAR2(30)
);
  • 17. The Descriptions Management - Stage 2 INSERT INTO MEF_STA2_CFT VALUES ('CUST2','PROF_COD','PROF'); INSERT INTO MEF_STA2_CFT VALUES ('CUST2','STATUS_COD','STATUS'); INSERT INTO MEF_STA2_CFT VALUES ('CUST2','TYPE_COD',null); COMMIT; Load the domains table Recreate and load this table by adding to the domain already seen, that of the state of the customer: DROP TABLE MEF_DOM2_CFT CASCADE CONSTRAINTS; CREATE TABLE MEF_DOM2_CFT ( DOM_COD VARCHAR2(61) NOT NULL ,COD_TXT VARCHAR2(60) ,HDS_TXT VARCHAR2(255) NOT NULL ,BDS_TXT VARCHAR2(255) NOT NULL ); INSERT INTO MEF_DOM2_CFT VALUES ('PROF','10','EMPL.' ,'Employee'); INSERT INTO MEF_DOM2_CFT VALUES ('PROF','20','FREE L.' ,'Free Lance'); INSERT INTO MEF_DOM2_CFT VALUES ('PROF','30','TRADESMAN','Tradesman'); INSERT INTO MEF_DOM2_CFT VALUES ('STATUS','0','NO','Not active'); INSERT INTO MEF_DOM2_CFT VALUES ('STATUS','1','YES' ,'Active'); COMMIT; The table content will be the following:
  • 18. The Descriptions Management - Stage 2
Load the staging table
The loading of this table is similar to what has already been seen.
INSERT INTO CUST2_STA VALUES ('Larry','10','0','P');
INSERT INTO CUST2_STA VALUES ('Tom'  ,'15','1','P');
INSERT INTO CUST2_STA VALUES ('ACME' ,'20','1','L');
INSERT INTO CUST2_STA VALUES ('Steve','40','?','P');
INSERT INTO CUST2_STA VALUES ('Jack' ,'?','1','P');
COMMIT;
Update the domains table
We will use a single procedure. Using the information in the configuration tables, it dynamically creates the SQL statement needed to do the update. Furthermore, if some domains are not defined, it automatically associates the default domain code, as already described. Because it can sometimes be useful to call a field "*_COD" even though it makes no sense to associate a description to it, we assume that, if we set the domain code equal to the word "NULL" (not the SQL null), that field will not be taken into consideration in the treatment of the domains (see the configuration sketch after this slide).
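For example, a hypothetical FLAG_COD column of the CUST2 stream that should not receive a description could be configured as follows (a sketch; the column name is invented for illustration):

INSERT INTO MEF_STA2_CFT VALUES ('CUST2','FLAG_COD','NULL');   -- the literal word NULL: skip this code in the domains treatment
COMMIT;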
  • 19. The Descriptions Management - Stage 2 create or replace procedure p_dom2(p_io_cod varchar2 default '%' ,p_column_cod varchar2 default '%') is v_sql varchar2(4000); v_dom_cod varchar2(61); v_hds_txt varchar2(200); v_bds_txt varchar2(200); v_cod_txt varchar2(200); v_column_cod varchar2(200); type t_rc is ref cursor; v_cur t_rc; cr char(1) := chr(10); begin for r in (select a.STT_COD ,b.column_cod ,nvl(b.dom_cod,a.io_cod||'.'||b.column_cod) dom_cod from mef_io2_cft a inner join mef_sta2_cft b on (a.io_cod = b.io_cod and b.column_cod like p_column_cod and b.column_cod like '%_COD' and nvl(b.dom_cod,'*') <> 'NULL') where a.io_cod like p_io_cod) loop v_sql := 'select distinct '||''''||r.dom_cod||''''||cr|| ' ,a.'||r.column_cod||cr|| ' ,b.cod_txt'||cr|| ' ,nvl(b.hds_txt,''nd'') hds_txt '||cr|| ' ,nvl(b.bds_txt,''nd'') bds_txt '||cr|| 'from '||r.STT_COD||' a '||cr|| 'left outer join MEF_DOM2_CFT b '||cr|| 'on (a.'||r.column_cod||' = b.cod_txt '||cr|| ' and dom_cod='||''''||r.dom_cod||''''||')'; dbms_output.put_line(v_sql); open v_cur for v_sql;
  • 20. The Descriptions Management - Stage 2 loop fetch v_cur into v_dom_cod,v_column_cod,v_cod_txt,v_hds_txt,v_bds_txt; exit when v_cur%notfound; if (v_cod_txt is null) then insert into MEF_DOM2_CFT (dom_cod,cod_txt,hds_txt,bds_txt) values(v_dom_cod,v_column_cod,v_hds_txt,v_bds_txt); end if; end loop; close v_cur; end loop; commit; end; / sho errors set serveroutput on exec p_dom2('CUST2'); Through the command "set serveroutput on" we can see an example of the SQL statement that is generated automatically for each source / column.
  • 21. The Descriptions Management - Stage 2
After the procedure execution, the content of the table will be the following. Note the power of this procedure: although we may not yet have thought about the domains management, if we call it without parameters it is able to initialize the domains table for all the code fields (that have been configured) throughout the Data Warehouse.
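A sketch of the two ways of calling it as a post-processing step (the defaults '%' match every configured stream and column):

set serveroutput on
exec p_dom2;                          -- all streams, all configured *_COD columns
exec p_dom2('CUST2','STATUS_COD');    -- or restricted to a single stream / column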
  • 22. The Descriptions Management - Stage 2 Load the dimension table The loading is similar to that of stage 1 INSERT INTO CUST2_DIT(CUST2_KEY,CUST2_COD,PROF_COD,STATUS_COD,TYPE_COD) SELECT ROWNUM,CUST2_COD,PROF_COD,STATUS_COD,TYPE_COD FROM CUST2_STA; COMMIT; We see now the content of the dimension table before the next update Update the dimension table Also in this case we have to use a dynamic procedure create or replace procedure p_dit2( p_DIT_COD varchar2 default '%' ,p_col varchar2 default '%' ,p_force_cod number default 0) as type t_rc is ref cursor; v_cur t_rc;
  • 23. The Descriptions Management - Stage 2
  v_sql varchar2(2000);
  v_value_cod varchar2(60);
  v_where varchar2(80);
  v_count number;
  cr char(1) := chr(10);
begin
  for s in (select io_cod,DIT_COD from MEF_IO2_CFT where DIT_COD like p_DIT_COD) loop
    for r in (select b.column_cod
                    ,replace(b.column_cod,'_COD','_BDS') column_bds
                    ,nvl(dom_cod,s.io_cod||'.'||column_cod) dom_cod
              from cols a
              inner join MEF_STA2_CFT b
              on (a.column_name = b.column_cod and b.io_cod = s.io_cod)
              where a.table_name = s.DIT_COD
              and a.column_name like p_col
              and a.column_name like '%_COD'
              and nvl(b.dom_cod,'*') <> 'NULL') loop
      if (p_force_cod = 0) then
        v_where := ' where '||r.column_bds||' is null';
      else
        v_where := null;
      end if;
      v_sql := 'select distinct '||r.column_cod||' value_cod'||cr||
               ' from '||s.DIT_COD||v_where;
      dbms_output.put_line(v_sql);
      open v_cur for v_sql;
      loop
        fetch v_cur into v_value_cod;
        exit when v_cur%notfound;
        -- use bind variables (USING): if the value contains quotes, the SQL would otherwise fail
        v_sql :=
  • 24. The Descriptions Management - Stage 2
        'update '||s.DIT_COD||' d'||cr||
        'set ('||r.column_bds||') = ('||cr||'select bds_txt '||cr||
        'from MEF_DOM2_CFT s where d.'||r.column_cod||' = s.cod_txt '||
        'and s.dom_cod = '||''''||r.dom_cod||''''||')'||cr||
        'where '||r.column_cod||' = :1 '||
        ' and exists ('||cr||
        'select 1 from MEF_DOM2_CFT s where d.'||r.column_cod||' = s.cod_txt '||
        'and s.dom_cod = '||''''||r.dom_cod||''''||')';
        dbms_output.put_line(v_sql);
        execute immediate v_sql using v_value_cod;
      end loop;
      close v_cur;
    end loop;
  end loop;
  commit;
end;
/
sho errors
set serveroutput on
exec p_dit2('CUST2_DIT');
Some notes on this update procedure.
• The first loop is used to identify the names of all the dimension tables, if you do not provide a specific input.
• The second loop is used to identify the values and descriptions associated with the domains. They will be the basis for the subsequent updates.
• As we know, UPDATE operations are, in general, slow operations at the database level. But we perform the update only for the rows whose description is not yet populated (see, for example, the clause ' where '||r.column_bds||' is null'). This greatly reduces the weight of the update operation. In fact, in the case of the type 2 Slowly Changing Dimensions, we expect that new keys, or new versions of existing keys, will involve only between 5 and 10 percent of the total rows (this is why they are called slowly changing dimensions). Then, with appropriate indexing (see the sketch after the generated statements below), the weight of the update should not be excessive.
• We note again that the condition on NULL seen before is controlled by an input parameter of the procedure. So, after a massive update of the descriptions in the domains table, it is always possible to force a massive update of all dimension tables (off-line, of course).
• We used the USING clause in the dynamic execution, because if the value contains quotes, it could cause problems.
The dynamic SQL used in the procedure will produce statements of this type:
  • 25. The Descriptions Management - Stage 2
select distinct STATUS_COD value_cod
from CUST2_DIT
where STATUS_BDS is null;

update CUST2_DIT d
set (STATUS_BDS) = (
  select bds_txt
  from MEF_DOM2_CFT s
  where d.STATUS_COD = s.cod_txt and s.dom_cod = 'STATUS')
where STATUS_COD = :1
and exists (
  select 1 from MEF_DOM2_CFT s
  where d.STATUS_COD = s.cod_txt and s.dom_cod = 'STATUS')
At the end of the update, the dimension table will be as shown in the figure.
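As noted above, an index on each code column keeps these per-value updates cheap; a sketch for the CUST2_DIT example (the index names are invented for illustration):

CREATE INDEX CUST2_DIT_PROF_COD_IX   ON CUST2_DIT (PROF_COD);
CREATE INDEX CUST2_DIT_STATUS_COD_IX ON CUST2_DIT (STATUS_COD);
CREATE INDEX CUST2_DIT_TYPE_COD_IX   ON CUST2_DIT (TYPE_COD);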
  • 26. The Descriptions Management - Stage 3
The previous phases have produced a fairly accurate descriptions management. Phase two has made it dynamic and reusable for all the dimension tables of a Data Warehouse. Unfortunately, we are still quite far from reality. We have not taken into account a very important factor: the dependence between codes. In a medium/large Data Warehouse, and especially in an Enterprise Data Warehouse, where dependence between codes exists, the phase 2 approach no longer works, because we have to deal with situations like the following.
Example 1
Suppose we are in a bank that has acquired other banks. The Data Warehouse will contain the data of the whole group, so even the customer table must be unique, with a company code that differentiates the customers. Let us now apply this fact conceptually to our CUST_DIT example. Are we quite sure that the meaning "Employee" (as description) of the profession code "10" for bank B1 is the same for bank B2? The question is obviously rhetorical. The answer is NO. We cannot have this guarantee, nor can we hope that the data migration from the acquired bank to the acquiring bank has normalized all tables and meanings. This is the problem of the dependence between codes: a code no longer has a unique description, but has a different description depending on the value of another code.
Example 2
It must be clear that this problem does not occur only in the case of company mergers, but also in the tables of a single company. Another example can be the reason (causal) code of a banking operation. A causal code "01" may have the meaning "Descr XXX" if the movement takes place at the bank desk, but the same code "01" can have a different description "Descr YYY" if the transaction takes place via the Web. In this case the causal code depends on the transmission channel code.
Example 3
When I speak of dependence between codes, I do not mean only between two codes, but also among many codes. An example is the table of the financial instruments. Because the instruments are very numerous, there is almost always a hierarchy that classifies financial instruments by associating them to categories, groups and subgroups. As we can see from the example below, the subgroup with code "1" has three different descriptions depending on the group to which it belongs. In turn, the group code "2" has two different descriptions depending on the category to which it belongs.
Category          Group                Subgroup
4 COMMON FOUNDS   3 FOR.INV.FOUNDS     1 LUSS. MOB.
7 OPTIONS         2 PUT ITALY          1 EQUITIES
3 WARRANTS        2 FOREIGN WARRANTS   1 ORD. STOCK BUY
  • 27. The Descriptions Management - Stage 3
So, if we have dependence between codes, the descriptions management shown in stage 2 does not work. We have to manage it in a different way.
Change the configuration table
The first thing to do is to change the configuration table of the columns by adding the information on the dependencies between codes. We assume no more than 5 dependencies. The new example table will be the dimension table of the financial instruments.
DROP TABLE MEF_STA3_CFT CASCADE CONSTRAINTS;
CREATE TABLE MEF_STA3_CFT (
  IO_COD VARCHAR2(9)
 ,COLUMN_COD VARCHAR2(30)
 ,DOM_COD VARCHAR2(30)
 ,DC1_COD VARCHAR2(30)
 ,DC2_COD VARCHAR2(30)
 ,DC3_COD VARCHAR2(30)
 ,DC4_COD VARCHAR2(30)
 ,DC5_COD VARCHAR2(30)
);
INSERT INTO MEF_STA3_CFT VALUES ('FIN','CAT_COD','CAT','','','','','');
INSERT INTO MEF_STA3_CFT VALUES ('FIN','GRP_COD','GRP','CAT_COD','','','','');
INSERT INTO MEF_STA3_CFT VALUES ('FIN','SUBGRP_COD','SUBGRP','GRP_COD','CAT_COD','','','');
Looking at its contents, we can clearly see the dependencies between codes described earlier: the subgroup depends on the group and the category, the group depends on the category, and the category has no dependencies.
  • 28. The Descriptions Management - Stage 3
We also configure the new source/dimension pair.
DELETE MEF_IO2_CFT WHERE IO_COD='FIN';
INSERT INTO MEF_IO2_CFT VALUES ('FIN','FIN_STA','FIN_DIT');
Create and load the domains table
The domains table must contain the values of the dependency codes. These dependencies must be studied and analyzed. As in the previous stage, even if they are not known, all combinations of dependencies will be initialized automatically after the loading of the staging area. The DCLIST_TXT field is informational only: it is useful for knowing the names of the columns to which the DV* values refer.
DROP TABLE MEF_DOM3_CFT CASCADE CONSTRAINTS;
CREATE TABLE MEF_DOM3_CFT (
  DOM_COD VARCHAR2(61) NOT NULL
 ,COD_TXT VARCHAR2(60) NOT NULL
 ,HDS_TXT VARCHAR2(255)
 ,BDS_TXT VARCHAR2(255)
 ,DV1_COD VARCHAR2(30 BYTE)
 ,DV2_COD VARCHAR2(30 BYTE)
 ,DV3_COD VARCHAR2(30 BYTE)
 ,DV4_COD VARCHAR2(30 BYTE)
 ,DV5_COD VARCHAR2(30 BYTE)
 ,DCLIST_TXT VARCHAR2(160 BYTE)
);
INSERT INTO MEF_DOM3_CFT VALUES ('SUBGRP','1','LUSS. MOB.','LUSS. MOB.','3','4','','','','GRP_COD,CAT_COD');
INSERT INTO MEF_DOM3_CFT VALUES ('SUBGRP','1','EQUITIES','EQUITIES','2','7','','','','GRP_COD,CAT_COD');
INSERT INTO MEF_DOM3_CFT VALUES ('SUBGRP','1','ORD. STOCK BUY','ORD. STOCK BUY','2','3','','','','GRP_COD,CAT_COD');
INSERT INTO MEF_DOM3_CFT VALUES ('GRP','3','FOR.INV.FOUNDS','FOR.INV.FOUNDS','4','','','','','CAT_COD');
INSERT INTO MEF_DOM3_CFT VALUES ('GRP','2','PUT ITALY','PUT ITALY','7','','','','','CAT_COD');
INSERT INTO MEF_DOM3_CFT VALUES ('GRP','2','FOREIGN WARRANTS','FOREIGN WARRANTS','3','','','','','CAT_COD');
INSERT INTO MEF_DOM3_CFT VALUES ('CAT','4','COMMON FOUNDS','COMMON FOUNDS','','','','','','');
INSERT INTO MEF_DOM3_CFT VALUES ('CAT','7','OPTIONS','OPTIONS','','','','','','');
INSERT INTO MEF_DOM3_CFT VALUES ('CAT','3','WARRANTS','WARRANTS','','','','','','');
COMMIT;
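A sketch of how a dependent description is resolved from this table: the subgroup description is looked up with the subgroup code plus the values of the codes it depends on (here group and category, stored in DV1_COD and DV2_COD):

SELECT BDS_TXT
FROM   MEF_DOM3_CFT
WHERE  DOM_COD = 'SUBGRP'
AND    COD_TXT = '1'      -- subgroup code
AND    DV1_COD = '2'      -- value of GRP_COD
AND    DV2_COD = '7';     -- value of CAT_COD
-- with the data above this returns 'EQUITIES'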
  • 29. The Descriptions Management - Stage 3 See the domains table: Load the staging table The new staging table of financial instruments will be: DROP TABLE FIN_STA CASCADE CONSTRAINTS; CREATE TABLE FIN_STA ( FIN_COD VARCHAR2(12) ,CAT_COD VARCHAR2(3) ,GRP_COD VARCHAR2(3) ,SUBGRP_COD VARCHAR2(3) ); INSERT INTO FIN_STA VALUES ('LU0108803940','4','3','1'); INSERT INTO FIN_STA VALUES ('IT0002693197','7','2','1'); INSERT INTO FIN_STA VALUES ('CH0014852120','3','2','1'); INSERT INTO FIN_STA VALUES ('IT0001119848','3','1','1'); INSERT INTO FIN_STA VALUES ('IT0000566056','1','1','1'); COMMIT;
  • 30. The Descriptions Management - Stage 3
As we can see, there are two combinations (1,1,1 and 3,1,1) not present in the domain.
Update the domains table
The update procedure of the domains is certainly more complicated than in the previous case. The code is implemented only for demonstration and can surely be implemented more efficiently.
create or replace procedure p_dom3(p_io_cod varchar2 default '%'
                                  ,p_column_cod varchar2 default '%') is
  v_sql varchar2(4000);
  v_dom_cod varchar2(61);
  v_hds_txt varchar2(200);
  v_bds_txt varchar2(200);
  v_cod_txt varchar2(200);
  v_column_cod varchar2(200);
  type t_rc is ref cursor;
  v_cur t_rc;
  cr char(1) := chr(10);
  v_list varchar2(200);
  v_list2 varchar2(200);
  v_c1 varchar2(200);
  v_c2 varchar2(200);
  v_c3 varchar2(200);
  v_c4 varchar2(200);
  v_c5 varchar2(200);
  v_count number;
begin
  for r in (select a.STT_COD
                  ,b.column_cod
                  ,nvl(b.dom_cod,a.io_cod||'.'||b.column_cod) dom_cod
                  ,b.dc1_cod,b.dc2_cod,b.dc3_cod,b.dc4_cod,b.dc5_cod
            from mef_io2_cft a
            inner join mef_sta3_cft b
            on (a.io_cod = b.io_cod
                and column_cod like p_column_cod
                and b.column_cod like '%_COD'
                and nvl(b.dom_cod,'*') <> 'NULL')
            where a.io_cod like p_io_cod) loop
    v_list := null;
    v_list2 := null;
    v_count := 0;
• 31. The Descriptions Management - Stage 3
    if (r.dc1_cod is not null) then
      v_list  := v_list||','||r.dc1_cod;
      v_list2 := v_list2||' and '||r.dc1_cod||' = dv1_cod';
      v_count := v_count+1;
    end if;
    if (r.dc2_cod is not null) then
      v_list  := v_list||','||r.dc2_cod;
      v_list2 := v_list2||' and '||r.dc2_cod||' = dv2_cod';
      v_count := v_count+1;
    end if;
    if (r.dc3_cod is not null) then
      v_list  := v_list||','||r.dc3_cod;
      v_list2 := v_list2||' and '||r.dc3_cod||' = dv3_cod';
      v_count := v_count+1;
    end if;
    if (r.dc4_cod is not null) then
      v_list  := v_list||','||r.dc4_cod;
      v_list2 := v_list2||' and '||r.dc4_cod||' = dv4_cod';
      v_count := v_count+1;
    end if;
    if (r.dc5_cod is not null) then
      v_list  := v_list||','||r.dc5_cod;
      v_list2 := v_list2||' and '||r.dc5_cod||' = dv5_cod';
      v_count := v_count+1;
    end if;
    v_sql := 'select distinct '||''''||r.dom_cod||''''||cr||
             ' ,a.'||r.column_cod||cr||
             ' ,b.cod_txt'||cr||
             ' ,nvl(b.hds_txt,''nd'') hds_txt '||cr||
             ' ,nvl(b.bds_txt,''nd'') bds_txt '||cr||
             v_list||cr||
             'from '||r.STT_COD||' a '||cr||
             'left outer join mef_dom3_cft b '||cr||
             'on (a.'||r.column_cod||' = b.cod_txt '||cr||
             ' and dom_cod='||''''||r.dom_cod||''''||cr||
             v_list2||cr||
             ')';
    dbms_output.put_line(v_sql);
• 32. The Descriptions Management - Stage 3
    open v_cur for v_sql;
    loop
      if (v_count=0) then
        fetch v_cur into v_dom_cod,v_column_cod,v_cod_txt,v_hds_txt,v_bds_txt;
      elsif (v_count=1) then
        fetch v_cur into v_dom_cod,v_column_cod,v_cod_txt,v_hds_txt,v_bds_txt,v_c1;
      elsif (v_count=2) then
        fetch v_cur into v_dom_cod,v_column_cod,v_cod_txt,v_hds_txt,v_bds_txt,v_c1,v_c2;
      elsif (v_count=3) then
        fetch v_cur into v_dom_cod,v_column_cod,v_cod_txt,v_hds_txt,v_bds_txt,v_c1,v_c2,v_c3;
      elsif (v_count=4) then
        fetch v_cur into v_dom_cod,v_column_cod,v_cod_txt,v_hds_txt,v_bds_txt,v_c1,v_c2,v_c3,v_c4;
      else
        fetch v_cur into v_dom_cod,v_column_cod,v_cod_txt,v_hds_txt,v_bds_txt,v_c1,v_c2,v_c3,v_c4,v_c5;
      end if;
      exit when v_cur%notfound;
      if (v_cod_txt is null) then
        insert into mef_dom3_cft
          (dom_cod,cod_txt,hds_txt,bds_txt
          ,dv1_cod,dv2_cod,dv3_cod,dv4_cod,dv5_cod,dclist_txt)
        values(v_dom_cod,v_column_cod,v_hds_txt,v_bds_txt
              ,v_c1,v_c2,v_c3,v_c4,v_c5,substr(v_list,2));
      end if;
    end loop;
    close v_cur;
  end loop;
  commit;
end;
/
sho errors
set serveroutput on
exec p_dom3('FIN');
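As the nvl calls in the dynamic query show, the combinations found in the staging area but missing from the domains table are inserted with their descriptions set to 'nd'. A sketch of a control query to list the entries still waiting for a real description:
SELECT DOM_COD, COD_TXT, BDS_TXT, DV1_COD, DV2_COD, DCLIST_TXT
FROM   MEF_DOM3_CFT
WHERE  BDS_TXT = 'nd'
ORDER BY DOM_COD, COD_TXT;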
• 33. The Descriptions Management - Stage 3
Compared to the stage 2 version, the procedure also needs to acquire the names of the columns that hold the dependency codes. After the execution of the procedure (which, remember, must be performed as post-processing of the staging area load), the domains table will have the following values.

Creation and load of the dimension table
We create the new dimension table and load it from the staging table.
• 34. The Descriptions Management - Stage 3
DROP TABLE FIN_DIT CASCADE CONSTRAINTS;
CREATE TABLE FIN_DIT
(
 FIN_KEY     NUMBER
,FIN_COD     VARCHAR2(12)
,CAT_COD     VARCHAR2(30)
,CAT_BDS     VARCHAR2(60)
,CAT_XDS     VARCHAR2(93) GENERATED ALWAYS AS (CAT_COD||' - '||CAT_BDS) VIRTUAL
,GRP_COD     VARCHAR2(30)
,GRP_BDS     VARCHAR2(60)
,GRP_XDS     VARCHAR2(93) GENERATED ALWAYS AS (GRP_COD||' - '||GRP_BDS) VIRTUAL
,SUBGRP_COD  VARCHAR2(30)
,SUBGRP_BDS  VARCHAR2(60)
,SUBGRP_XDS  VARCHAR2(93) GENERATED ALWAYS AS (SUBGRP_COD||' - '||SUBGRP_BDS) VIRTUAL
);
INSERT INTO FIN_DIT(FIN_KEY,FIN_COD,CAT_COD,GRP_COD,SUBGRP_COD)
SELECT ROWNUM,FIN_COD,CAT_COD,GRP_COD,SUBGRP_COD
FROM FIN_STA;
COMMIT;
Here is the dimension table before the update of the descriptions.
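To look at the rows just loaded, and at the extended descriptions produced by the virtual XDS columns, a query like the following can be used as a sketch; at this point the BDS columns are still empty, so each XDS value contains only the code and the separator:
SELECT FIN_COD, CAT_XDS, GRP_XDS, SUBGRP_XDS
FROM   FIN_DIT
ORDER BY FIN_KEY;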
• 35. The Descriptions Management - Stage 3
Update the dimension table
The update procedure, too, will be more complex than the previous one, in order to manage the dependencies.
create or replace procedure p_dit3( p_DIT_COD   varchar2 default '%'
                                   ,p_col       varchar2 default '%'
                                   ,p_force_cod number   default 0) as
  type t_rc is ref cursor;
  v_cur        t_rc;
  v_sql        varchar2(2000);
  v_sql2       varchar2(2000);
  v_value_cod  varchar2(60);
  v_column_cod varchar2(30);
  v_dom_cod    varchar2(61);
  v_exec_cnt   number;
  v_where      varchar2(80);
  v_count      number;
  cr           char(1) := chr(10);
  v_c1         varchar2(200);
  v_c2         varchar2(200);
  v_c3         varchar2(200);
  v_c4         varchar2(200);
  v_c5         varchar2(200);
  v_ditc1      varchar2(200);
  v_ditc2      varchar2(200);
  v_ditc3      varchar2(200);
  v_ditc4      varchar2(200);
  v_ditc5      varchar2(200);
  v_list       varchar2(2000);
  v_list2      varchar2(2000);
begin
  for s in (select io_cod,DIT_COD
            from MEF_IO2_CFT
            where DIT_COD like p_DIT_COD)
  loop
    for r in (select b.column_cod
                    ,replace(b.column_cod,'_COD','_BDS') column_bds
                    ,nvl(dom_cod,s.io_cod||'.'||column_cod) dom_cod
                    ,dc1_cod
                    ,dc2_cod
                    ,dc3_cod
                    ,dc4_cod
                    ,dc5_cod
• 36. The Descriptions Management - Stage 3
              from cols a
              inner join MEF_STA3_CFT b
              on (a.column_name = b.column_cod
                  and b.io_cod = s.io_cod
                  and b.column_cod like '%_COD'
                  and nvl(b.dom_cod,'*') <> 'NULL')
              where a.table_name = s.DIT_COD
                and a.column_name like p_col)
    loop
      v_list  := null;
      v_list2 := null;
      v_count := 0;
      dbms_output.put_line(r.column_bds);
      if (r.dc1_cod is not null) then
        v_list := v_list||','||r.dc1_cod;
        v_count := v_count+1;
      end if;
      if (r.dc2_cod is not null) then
        v_list := v_list||','||r.dc2_cod;
        v_count := v_count+1;
      end if;
      if (r.dc3_cod is not null) then
        v_list := v_list||','||r.dc3_cod;
        v_count := v_count+1;
      end if;
      if (r.dc4_cod is not null) then
        v_list := v_list||','||r.dc4_cod;
        v_count := v_count+1;
      end if;
      if (r.dc5_cod is not null) then
        v_list := v_list||','||r.dc5_cod;
        v_count := v_count+1;
      end if;
      if (p_force_cod = 0) then
        v_where := ' where '||r.column_bds||' is null';
      else
        v_where := null;
      end if;
      dbms_output.put_line(v_count);
      v_sql := 'select distinct '||r.column_cod||' value_cod'||cr||
               v_list||' from '||s.DIT_COD||v_where;
      dbms_output.put_line(v_sql);
      open v_cur for v_sql;
      loop
        if (v_count=0) then
          fetch v_cur into v_value_cod;
• 37. The Descriptions Management - Stage 3
          v_list2 := null;
        elsif (v_count=1) then
          fetch v_cur into v_value_cod,v_ditc1;
          v_list2 := ' and d.'||r.dc1_cod||' = s.dv1_cod';
        elsif (v_count=2) then
          fetch v_cur into v_value_cod,v_ditc1,v_ditc2;
          v_list2 := ' and d.'||r.dc1_cod||' = s.dv1_cod'||
                     ' and d.'||r.dc2_cod||' = s.dv2_cod';
        elsif (v_count=3) then
          fetch v_cur into v_value_cod,v_ditc1,v_ditc2,v_ditc3;
          v_list2 := ' and d.'||r.dc1_cod||' = s.dv1_cod'||
                     ' and d.'||r.dc2_cod||' = s.dv2_cod'||
                     ' and d.'||r.dc3_cod||' = s.dv3_cod';
        elsif (v_count=4) then
          fetch v_cur into v_value_cod,v_ditc1,v_ditc2,v_ditc3,v_ditc4;
          v_list2 := ' and d.'||r.dc1_cod||' = s.dv1_cod'||
                     ' and d.'||r.dc2_cod||' = s.dv2_cod'||
                     ' and d.'||r.dc3_cod||' = s.dv3_cod'||
                     ' and d.'||r.dc4_cod||' = s.dv4_cod';
        else
          fetch v_cur into v_value_cod,v_ditc1,v_ditc2,v_ditc3,v_ditc4,v_ditc5;
          v_list2 := ' and d.'||r.dc1_cod||' = s.dv1_cod'||
                     ' and d.'||r.dc2_cod||' = s.dv2_cod'||
                     ' and d.'||r.dc3_cod||' = s.dv3_cod'||
                     ' and d.'||r.dc4_cod||' = s.dv4_cod'||
                     ' and d.'||r.dc5_cod||' = s.dv5_cod';
        end if;
        exit when v_cur%notfound;
        v_sql2 := 'update '||s.DIT_COD||' d'||cr||
                  'set ('||r.column_bds||') = ('||cr||
                  'select bds_txt '||cr||
                  'from mef_dom3_cft s where s.dom_cod = '||''''||r.dom_cod||''''||
                  ' and s.cod_txt = :1 '||v_list2||cr||
                  ')'||cr||
                  'where '||r.column_cod||' = :2 '||cr||
                  ' and exists (select 1 from MEF_DOM3_CFT s '||
                  'where s.dom_cod = '||''''||r.dom_cod||''''||' and s.cod_txt = :3 '||
                  v_list2||')';
        dbms_output.put_line(v_sql2);
        execute immediate v_sql2 using v_value_cod,v_value_cod,v_value_cod;
      end loop;
      close v_cur;
    end loop;
  end loop;
• 38. The Descriptions Management - Stage 3
  commit;
end;
/
sho errors
set serveroutput on
exec p_dit3('FIN_DIT');
After the execution of the procedure, the dimension table of the financial instruments will be as follows. The first three rows have combinations of codes that are present in the domains table, so the corresponding descriptions have been updated for all codes. In the fourth row only the description of the category has been updated, because the category is the only code that does not depend on other codes. The fifth row has all its descriptions undefined, because it is a combination of unknown dependencies.
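Since the unknown combinations were loaded into the domains table with 'nd' descriptions, the dimension rows that still need attention can be listed with a control query such as the following sketch:
SELECT FIN_COD, CAT_XDS, GRP_XDS, SUBGRP_XDS
FROM   FIN_DIT
WHERE  CAT_BDS = 'nd' OR GRP_BDS = 'nd' OR SUBGRP_BDS = 'nd';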
• 39. The Descriptions Management - Stage 4
Many of you will be surprised to see a further stage 4. It seems incredible, but we still have not said everything about the descriptions management. Let us summarize the analysis done so far.
• In stage 1 we highlighted the problem, introduced the domains table (a configuration table necessary to operate) and defined a naming convention.
• In stage 2 we automated the process and introduced other configuration tables to support a dynamic management of the descriptions. We showed how to pre-load the domains table before receiving that information from the source systems.
• In stage 3 we handled the dependencies between codes.
What is still missing? The answer depends on how international your Data Warehouse is. If you are sure that the Data Warehouse is internal and used only by users of your native language, you can stop at stage 3; otherwise you have to take the multilanguage management into account.
I will not work through a complete example as in the other stages. The change involves the domains table: add a language code field (and some technical columns that are always useful to have), and then add the language code as a key field in the procedures implemented so far. With the multilanguage management, the domains table will grow in volume by as many times as the number of languages you add.

The new domains table
The domains table that we see below can be considered complete, and I advise you to create it immediately with the language code, even if at first it will not be used.
DROP TABLE MEF_DOM4_CFT CASCADE CONSTRAINTS;
CREATE TABLE MEF_DOM4_CFT
(
 DOM_COD     VARCHAR2(61) NOT NULL
,COD_TXT     VARCHAR2(60) NOT NULL
,HDS_TXT     VARCHAR2(255)
,BDS_TXT     VARCHAR2(255)
,DV1_COD     VARCHAR2(30 BYTE)
,DV2_COD     VARCHAR2(30 BYTE)
,DV3_COD     VARCHAR2(30 BYTE)
,DV4_COD     VARCHAR2(30 BYTE)
,DV5_COD     VARCHAR2(30 BYTE)
,DCLIST_TXT  VARCHAR2(160 BYTE)
,LANG_COD    VARCHAR2(30 BYTE)
,INS_DTS     DATE
,STAMP_DTS   DATE DEFAULT SYSDATE
,CTRL_TXT    VARCHAR2(30)
);
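As a purely illustrative sketch (the language codes and the translated descriptions below are assumptions, not part of the example developed in the previous stages), the same domain value is simply stored once per language, and every lookup in the loading procedures adds the language code to its key:
INSERT INTO MEF_DOM4_CFT (DOM_COD,COD_TXT,HDS_TXT,BDS_TXT,LANG_COD,INS_DTS)
VALUES ('CAT','4','COMMON FOUNDS','COMMON FUNDS','EN',SYSDATE);
INSERT INTO MEF_DOM4_CFT (DOM_COD,COD_TXT,HDS_TXT,BDS_TXT,LANG_COD,INS_DTS)
VALUES ('CAT','4','COMMON FOUNDS','FONDI COMUNI','IT',SYSDATE);
COMMIT;
-- in p_dit3, the lookup on the domains table would gain a predicate such as
--   and s.lang_cod = :p_lang_cod   (hypothetical parameter name)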
• 40. Conclusion
As we have seen, the descriptions management is not a simple topic: it has many implications, it involves many people, and it is extremely important because it is not a technical detail but affects the way the end user sees the data, a determining factor for the success of the Data Warehouse. I hope that what I have written helps all the people involved in the design of a Data Warehouse to become aware of the problem. The "descriptions management" activity must become a fixed item in the Gantt chart of the project.