Hi,
We offer online IT training with placement assistance across different platforms, taught by consultants with real-time industry experience, for IT professionals, corporate clients, and students.
Cognos is a leading supplier of business intelligence solutions that help some of the world's largest and most successful organizations work more efficiently. Business intelligence (BI) is a category of technologies and applications for gathering, storing, analyzing, reporting on, and providing access to data so that enterprise users can make better business decisions. Cognos is one of the largest vendors of BI tools globally, and the name is widely recognized in the field of BI reporting. The tool adds context to the details that drive the business, which supports decision-making and makes enterprises more agile. It serves more than 17,000 clients in 120 countries.
Prerequisites
• Basic SQL knowledge
• Basic C language
• HTML
• CSS
Please visit us for demo classes. We run regular and weekend batches.
Other courses we offer:
Microsoft: ADO.NET, ASP.NET, C# .NET, MSBI, QTP, VB.NET, SharePoint
Data Warehouse: Informatica, OBIEE, Cognos
Programming: Core Java, Advanced Java, J2EE, Hibernate, Struts, JavaScript, Perl Scripting, Shell Scripting, Spring, Ruby on Rails
Mobile Apps: Android, iOS Training, Cloud Computing
Networking: CCNA, Unix Admin, Linux, Sun Solaris
Testing Tools: Manual Testing, QA, QTP, Selenium
Salesforce CRM: Salesforce Developer, Salesforce Administrator
Business Analyst, Hadoop, QlikView
Database: SQL/PL-SQL, OBIEE
For more details on training session schedules and career guidance, contact us:
3427 Vintage Cir SE, Smyrna, GA 30080
Email: info@quontrasolutions.com
Call us: (404) 900-9988
Web: http://www.quontrasolutions.com
Cognos ETL Process by Quontra Solutions
1. ETL Process in Data Warehouse
info@quontrasolutions.com
www.quontrasolutions.com Ph. (404)-900-9988
2. Outline
ETL
Extraction
Transformation
Loading
3. ETL Overview
Extraction, Transformation, Loading (ETL)
ETL gets data out of the source and loads it into the data warehouse. At its simplest, it is a process of copying data from one database to another.
Data is extracted from an OLTP database, transformed to match the data warehouse schema, and loaded into the data warehouse database.
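The three steps above can be sketched in a few lines of Python. This is only an illustration using in-memory SQLite databases; the `orders` and `fact_orders` tables, their columns, and the cleanup rules are assumptions made up for the example, not part of the original slides.

```python
import sqlite3

# Hypothetical OLTP source with one orders table (schema invented for illustration).
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (id INTEGER, customer TEXT, amount_cents INTEGER)")
source.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, " alice ", 1250), (2, "BOB", 399)],
)

# Hypothetical warehouse with a differently shaped target table.
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE fact_orders (order_id INTEGER, customer_name TEXT, amount REAL)"
)


def extract(conn):
    """Extraction: read the raw rows out of the source system."""
    return conn.execute(
        "SELECT id, customer, amount_cents FROM orders ORDER BY id"
    ).fetchall()


def transform(rows):
    """Transformation: reshape each row to match the warehouse schema."""
    return [(oid, name.strip().title(), cents / 100) for oid, name, cents in rows]


def load(conn, rows):
    """Loading: write the transformed rows into the warehouse table."""
    conn.executemany("INSERT INTO fact_orders VALUES (?, ?, ?)", rows)
    conn.commit()


load(warehouse, transform(extract(source)))
loaded = warehouse.execute("SELECT * FROM fact_orders ORDER BY order_id").fetchall()
```

Keeping the three steps as separate functions mirrors the idea that ETL is a process rather than one physical implementation: each step could be swapped for a different tool without touching the others.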
4. Many data warehouses also incorporate data from non-OLTP systems such as text files, legacy systems, and spreadsheets; such data also requires extraction, transformation, and loading.
When defining ETL for a data warehouse, it is important to think of ETL as a process, not a physical implementation.
5. ETL Overview
ETL is often a complex combination of process and technology that consumes a significant portion of the data warehouse development effort and requires the skills of business analysts, database designers, and application developers.
It is not a one-time event: new data is added to the data warehouse periodically (monthly, daily, or hourly).
ETL is an integral, ongoing, and recurring part of a data warehouse.
6. ETL Staging Database
ETL operations should be performed on a relational database server separate from the source databases and the data warehouse database.
This creates a logical and physical separation between the source systems and the data warehouse.
It minimizes the impact of the intense periodic ETL activity on the source and data warehouse databases.
8. Extraction
Integrating all of the disparate systems across the enterprise is the real challenge in getting the data warehouse to a state where it is usable.
Data is extracted from heterogeneous data sources.
Each data source has its own distinct set of characteristics that need to be managed and integrated into the ETL system in order to extract data effectively.
9. Extraction
The ETL process needs to effectively integrate systems that have different:
• DBMS
• Operating systems
• Hardware
• Communication protocols
You need a logical data map before the physical data can be transformed.
The logical data map describes the relationship between the extreme starting points and the extreme ending points of your ETL system, usually presented in a table.
10. The content of the logical data mapping document has proven to be the critical element required to plan ETL processes efficiently.
The table type gives us our cue for the ordinal position of our data load processes: first dimensions, then facts.
The primary purpose of this document is to provide the ETL developer with a clear-cut …
Columns of the logical data map:
Target: Table Name, Column Name, Data Type
Source: Table Name, Column Name, Data Type
Transformation
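One way to picture the logical data map is as a list of records, one per target column. The sketch below is an assumption-laden illustration: the field names follow the column headers on this slide, the `table_type` field reflects the "first dimensions, then facts" remark, and every table and column name in the sample entries is invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MappingRow:
    """One row of a logical data map (field names follow the slide's column headers)."""
    target_table: str
    target_column: str
    target_type: str
    table_type: str        # "dimension" or "fact"; used to order the load
    source_table: str
    source_column: str
    source_type: str
    transformation: str


# Hypothetical entries; table and column names are invented for illustration.
data_map = [
    MappingRow("fact_sales", "amount", "REAL", "fact",
               "orders", "amt", "TEXT", "cast to number"),
    MappingRow("dim_customer", "name", "TEXT", "dimension",
               "cust", "cust_nm", "TEXT", "trim, title-case"),
    MappingRow("dim_date", "date_key", "INTEGER", "dimension",
               "calendar", "dt", "TEXT", "format as YYYYMMDD"),
]


def load_order(rows):
    """Return target tables in load order: dimensions first, then facts."""
    rank = {"dimension": 0, "fact": 1}
    ordered = sorted(rows, key=lambda r: (rank[r.table_type], r.target_table))
    seen = []
    for row in ordered:
        if row.target_table not in seen:
            seen.append(row.target_table)
    return seen


order = load_order(data_map)
```

Here `order` lists the two dimension tables before the fact table, which is the ordering the table type is meant to provide.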
11. The analysis of the source system is usually broken into two major phases:
• The data discovery phase
• The anomaly detection phase
12. Extraction - Data Discovery Phase
A key criterion for the success of the data warehouse is the cleanliness and cohesiveness of the data within it.
Once you understand what the target needs to look like, you need to identify and examine the data sources.
13. Data Discovery Phase
It is up to the ETL team to drill down further into the data requirements to determine each and every source system, table, and attribute required to load the data warehouse:
• Collecting and documenting source systems
• Keeping track of source systems
• Determining the system of record, the point where the data originates
Defining the system of record is important because in most enterprises data is stored redundantly across many different systems.
14. Data Content Analysis - Extraction
Understanding the content of the data is crucial for determining the best approach for retrieval.
NULL values: An unhandled NULL value can destroy any ETL process. NULL values pose the biggest risk when they are in foreign key columns. Joining two or more tables based on a column that contains NULL values will cause data loss! Remember, in a relational database NULL is not equal to NULL, which is why rows with NULL keys drop out of those joins. Check for NULL values in every foreign key in the source database. When NULL values are present, you must outer join the tables.
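The data loss from NULL foreign keys is easy to reproduce. The following sketch uses an in-memory SQLite database; the `customer` and `orders` tables are invented for the example. It first counts NULLs in the foreign key column, then compares an inner join with a left outer join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER, name TEXT);
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER);
    INSERT INTO customer VALUES (1, 'Alice');
    INSERT INTO orders VALUES (100, 1), (101, NULL);
""")

# Step 1: check the foreign key column for NULLs before joining.
null_fk_count = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE customer_id IS NULL"
).fetchone()[0]

# Inner join: the order with a NULL customer_id silently disappears.
inner_rows = conn.execute("""
    SELECT o.order_id, c.name
    FROM orders o JOIN customer c ON o.customer_id = c.customer_id
    ORDER BY o.order_id
""").fetchall()

# Left outer join: every order survives; the missing customer shows up as None.
outer_rows = conn.execute("""
    SELECT o.order_id, c.name
    FROM orders o LEFT OUTER JOIN customer c ON o.customer_id = c.customer_id
    ORDER BY o.order_id
""").fetchall()
```

The inner join returns one row while the outer join returns both orders, so the NULL check in step 1 is what tells you which kind of join the extract needs.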
15. During the initial load, capturing changes to data content in the source data is unimportant because you are most likely extracting the entire data source or a portion of it from a predetermined point in time.
Later, the ability to capture data changes in the source system immediately becomes a priority.
The ETL team is responsible for capturing data-content changes during the incremental load.
16. Determining Changed Data
Audit columns: used by the database and updated by triggers.
Audit columns are appended to the end of each table to store the date and time a record was added or modified.
You must analyze and test each of the columns to ensure that it is a reliable source for indicating changed data. If you find any NULL values, you must find an alternative approach.
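A minimal sketch of audit-column change detection follows, again with in-memory SQLite. The `product` table and its `last_modified` column are assumptions for illustration. The example also shows why a NULL audit value makes the column unreliable: the affected row is never picked up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER, name TEXT, last_modified TEXT);
    INSERT INTO product VALUES
        (1, 'Widget', '2024-01-05 10:00:00'),
        (2, 'Gadget', '2024-02-10 09:30:00'),
        (3, 'Gizmo',  NULL);
""")


def audit_column_is_reliable(conn):
    """The audit column is only usable if it is never NULL."""
    nulls = conn.execute(
        "SELECT COUNT(*) FROM product WHERE last_modified IS NULL"
    ).fetchone()[0]
    return nulls == 0


def changed_since(conn, last_extract):
    """Return ids of rows added or modified after the previous extraction time."""
    rows = conn.execute(
        "SELECT id FROM product WHERE last_modified > ? ORDER BY id",
        (last_extract,),
    ).fetchall()
    return [r[0] for r in rows]


reliable = audit_column_is_reliable(conn)             # row 3 has a NULL audit value
changed = changed_since(conn, "2024-02-01 00:00:00")  # row 3 is silently missed
```

Timestamps are stored as ISO-formatted text here so that string comparison orders them correctly; a real source system would normally use its own date-time type.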
17. Determining Changed Data - Process of Elimination
Process of elimination preserves exactly one copy of each previous extraction in the staging area for future use.
During the next run, the process takes the entire source table(s) into the staging area and compares them against the retained data from the last process.
Only differences (deltas) are sent to the data warehouse.
It is not the most efficient technique, but it is the most reliable for capturing changed data.
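The process-of-elimination comparison can be sketched with plain dictionaries. This is a simplified illustration, assuming each extract fits in memory and is keyed by primary key; the sample rows are invented.

```python
def find_deltas(previous, current):
    """Compare two full extracts keyed by primary key.

    Returns (inserted, updated, deleted) relative to the previous extract.
    """
    inserted = {k: v for k, v in current.items() if k not in previous}
    updated = {k: v for k, v in current.items()
               if k in previous and previous[k] != v}
    deleted = {k: v for k, v in previous.items() if k not in current}
    return inserted, updated, deleted


# Hypothetical extracts: primary key -> remaining column values.
previous_extract = {1: ("Alice", "Austin"), 2: ("Bob", "Boston"), 3: ("Cara", "Chicago")}
current_extract = {1: ("Alice", "Austin"), 2: ("Bob", "Denver"), 4: ("Dan", "Dallas")}

inserted, updated, deleted = find_deltas(previous_extract, current_extract)

# Only the deltas go on to the warehouse; the current extract is then
# retained as the single "previous" copy for the next run.
previous_extract = current_extract
```

Because the whole source table is compared every run, the approach catches changes that audit columns or triggers might miss, at the cost of reading everything each time.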
18. Determining Changed Data - Initial and Incremental Loads
Create two tables: previous load and current load.
The initial process bulk loads into the current load table. Since change detection is irrelevant during the initial load, the data continues on to be transformed and loaded into the ultimate target fact table.
When the process is complete, it drops the previous load table, renames the current load table to previous load, and creates an empty current load table. Since none of these tasks involve database logging, they are very fast!
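The drop, rename, and recreate sequence might look like the sketch below. SQLite is used only because it ships with Python; whether these statements avoid logging depends on the DBMS, and the table layout (`id`, `name`) is an assumption for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE previous_load (id INTEGER, name TEXT);
    CREATE TABLE current_load (id INTEGER, name TEXT);
    INSERT INTO current_load VALUES (1, 'Alice'), (2, 'Bob');
""")


def rotate_load_tables(conn):
    """Drop previous_load, rename current_load to it, and recreate an empty current_load."""
    conn.executescript("""
        DROP TABLE previous_load;
        ALTER TABLE current_load RENAME TO previous_load;
        CREATE TABLE current_load (id INTEGER, name TEXT);
    """)


rotate_load_tables(conn)
previous_count = conn.execute("SELECT COUNT(*) FROM previous_load").fetchone()[0]
current_count = conn.execute("SELECT COUNT(*) FROM current_load").fetchone()[0]
```

After the rotation, the rows just loaded sit in `previous_load` ready for the next comparison, and `current_load` is empty for the next extract.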
20. Transformation
The main step where the ETL process adds value.
It actually changes data and provides guidance on whether the data can be used for its intended purposes.
It is performed in the staging area.
21. Transformation - Data Quality Paradigm
Data should be:
• Correct
• Unambiguous
• Consistent
• Complete
Data quality checks are run at two places: after extraction, and after cleaning and conforming, where additional checks are run.
22. Transformation - Cleaning Data
Anomaly detection
• Data sampling: count(*) of the rows for a department column
Column property enforcement
• NULL values in required columns
• Numeric values that fall outside of expected highs and lows
• Columns whose lengths are exceptionally short or long
• Columns with certain values outside of discrete valid value sets
• Adherence to a required pattern or membership in a set of patterns
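The column property checks listed above can be expressed as a small validation function. The sketch below is illustrative only: the column names, the age range, the status value set, and the five-digit ZIP pattern are all assumptions chosen for the example.

```python
import re

# Hypothetical rules for four columns; thresholds and patterns are invented.
VALID_STATUS = {"active", "inactive"}
ZIP_PATTERN = re.compile(r"\d{5}")


def check_row(row):
    """Return a list of column-property violations for one staged row."""
    problems = []
    if row.get("name") is None:
        problems.append("name: NULL in required column")
    elif not 2 <= len(row["name"]) <= 50:
        problems.append("name: unexpected length")
    age = row.get("age")
    if age is None or not 0 <= age <= 120:
        problems.append("age: outside expected range")
    if row.get("status") not in VALID_STATUS:
        problems.append("status: not in valid value set")
    if not ZIP_PATTERN.fullmatch(row.get("zip") or ""):
        problems.append("zip: does not match required pattern")
    return problems


good = {"name": "Alice", "age": 34, "status": "active", "zip": "30080"}
bad = {"name": None, "age": 430, "status": "unknown", "zip": "3008"}

good_problems = check_row(good)
bad_problems = check_row(bad)
```

Returning a list of problems rather than raising on the first one lets the ETL job log every violation for a row before deciding whether to reject or repair it.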
24. Transformation - Conforming
Structure enforcement
• Tables have proper primary and foreign keys
• Tables obey referential integrity
Data and rule value enforcement
• Simple business rules
• Logical data checks
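As a rough illustration of both kinds of enforcement, the sketch below looks for staged rows that break referential integrity and for rows that violate a simple business rule. The tables and the "quantity must not be negative" rule are assumptions invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE stage_sales (sale_id INTEGER, product_key INTEGER, quantity INTEGER);
    INSERT INTO dim_product VALUES (1, 'Widget'), (2, 'Gadget');
    INSERT INTO stage_sales VALUES (10, 1, 3), (11, 9, 1), (12, 2, -5);
""")

# Structure enforcement: staged rows whose foreign key has no parent row.
orphans = [r[0] for r in conn.execute("""
    SELECT s.sale_id
    FROM stage_sales s LEFT JOIN dim_product p ON s.product_key = p.product_key
    WHERE p.product_key IS NULL
    ORDER BY s.sale_id
""")]

# Data and rule value enforcement: a simple (hypothetical) business rule.
negative_quantities = [r[0] for r in conn.execute(
    "SELECT sale_id FROM stage_sales WHERE quantity < 0 ORDER BY sale_id"
)]
```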
27. Loading Dimensions
Dimension tables are physically built to have a minimal set of components.
The primary key is a single field containing a meaningless unique integer, called a surrogate key.
The data warehouse owns these keys and never allows any other entity to assign them.
Dimensions are de-normalized flat tables: all attributes in a dimension must take on a single value in the presence of a dimension primary key.
A dimension should possess one or more other fields that compose the natural key of the dimension.
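A minimal sketch of surrogate key assignment follows. The `DimensionLoader` class and the sample natural keys are hypothetical; the point is only that the warehouse side hands out meaningless sequential integers and keeps the natural key as an ordinary field.

```python
import itertools


class DimensionLoader:
    """Assigns warehouse-owned surrogate keys to dimension rows keyed by natural key."""

    def __init__(self):
        self._next_key = itertools.count(1)   # meaningless, sequential integers
        self.rows = {}                        # surrogate key -> row
        self.key_by_natural = {}              # natural key -> surrogate key

    def get_or_create(self, natural_key, attributes):
        """Insert a new dimension row, or return the existing surrogate key."""
        if natural_key in self.key_by_natural:
            return self.key_by_natural[natural_key]
        surrogate = next(self._next_key)
        self.rows[surrogate] = {"natural_key": natural_key, **attributes}
        self.key_by_natural[natural_key] = surrogate
        return surrogate


customers = DimensionLoader()
k1 = customers.get_or_create("CUST-0042", {"name": "Alice", "city": "Smyrna"})
k2 = customers.get_or_create("CUST-0007", {"name": "Bob", "city": "Atlanta"})
k3 = customers.get_or_create("CUST-0042", {"name": "Alice", "city": "Smyrna"})
```

The third call returns the key issued by the first, because the natural key already exists in the dimension.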
29. The data loading module consists of all the steps required to administer slowly changing dimensions (SCD) and write the dimension to disk as a physical table in the proper dimensional format with correct primary keys, correct natural keys, and final descriptive attributes.
Creating and assigning the surrogate keys occurs in this module.
The table is definitely staged, since it is the object to be loaded into the presentation system of the data warehouse.
30. Loading Dimensions
When the data warehouse receives notification that an existing row in a dimension has changed, it responds in one of three ways:
• Type 1
• Type 2
• Type 3
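The slide names the three response types without describing them. The sketch below assumes the commonly used slowly changing dimension definitions (Type 1 overwrites, Type 2 adds a new row, Type 3 keeps the prior value in an extra column); the row layout and the `current` flag are invented for illustration.

```python
import copy

# Hypothetical customer dimension: one row, surrogate key 1.
base = [{"key": 1, "natural_key": "CUST-0042", "city": "Austin", "current": True}]


def scd_type1(rows, natural_key, city):
    """Type 1: overwrite the attribute in place; history is lost."""
    rows = copy.deepcopy(rows)
    for row in rows:
        if row["natural_key"] == natural_key:
            row["city"] = city
    return rows


def scd_type2(rows, natural_key, city):
    """Type 2: expire the current row and add a new row with a new surrogate key."""
    rows = copy.deepcopy(rows)
    for row in rows:
        if row["natural_key"] == natural_key and row["current"]:
            row["current"] = False
    new_key = max(r["key"] for r in rows) + 1
    rows.append({"key": new_key, "natural_key": natural_key,
                 "city": city, "current": True})
    return rows


def scd_type3(rows, natural_key, city):
    """Type 3: keep the prior value in an extra column on the same row."""
    rows = copy.deepcopy(rows)
    for row in rows:
        if row["natural_key"] == natural_key:
            row["previous_city"] = row["city"]
            row["city"] = city
    return rows


t1 = scd_type1(base, "CUST-0042", "Denver")
t2 = scd_type2(base, "CUST-0042", "Denver")
t3 = scd_type3(base, "CUST-0042", "Denver")
```

Each function works on a copy so the three outcomes can be compared side by side against the same starting row.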
34. Loading Facts
Fact tables hold the measurements of an enterprise. The relationship between fact tables and measurements is extremely simple: if a measurement exists, it can be modeled as a fact table row, and if a fact table row exists, it is a measurement.
35. Key Building Process - Facts
When building a fact table, the final ETL step is converting the natural keys in the new input records into the correct, contemporary surrogate keys.
ETL maintains a special surrogate key lookup table for each dimension. This table is updated whenever a new dimension entity is created and whenever a Type 2 change occurs on an existing dimension entity.
All of the required lookup tables should be pinned in memory so that they can be randomly accessed as each incoming fact record presents its natural keys. This is one of the reasons for making the lookup tables separate from the original data warehouse dimension tables.
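The key-building step can be pictured with in-memory dictionaries standing in for the lookup tables. This is a sketch under assumptions: the dimension names, natural keys, surrogate key values, and function names are all hypothetical.

```python
# Hypothetical surrogate key lookup tables, one per dimension, held in memory.
# Each maps a natural key to the current (contemporary) surrogate key.
customer_lookup = {"CUST-0042": 17, "CUST-0007": 18}
product_lookup = {"SKU-1": 3, "SKU-2": 4}


def build_fact_row(record):
    """Swap the natural keys of an incoming fact record for surrogate keys."""
    return {
        "customer_key": customer_lookup[record["customer_id"]],
        "product_key": product_lookup[record["sku"]],
        "amount": record["amount"],
    }


def on_type2_change(lookup, natural_key, new_surrogate_key):
    """After a Type 2 change, point the natural key at the newest surrogate key."""
    lookup[natural_key] = new_surrogate_key


incoming = {"customer_id": "CUST-0042", "sku": "SKU-2", "amount": 19.5}
before = build_fact_row(incoming)

on_type2_change(customer_lookup, "CUST-0042", 19)
after = build_fact_row(incoming)
```

The same incoming record resolves to a different customer key after the Type 2 change, which is why the lookup table has to be kept up to date alongside the dimension.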
38. Loading Fact Tables
Managing indexes
Indexes are performance killers at load time.
• Drop all indexes before the load
• Segregate updates from inserts
• Load updates
• Rebuild indexes
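The sequence above can be sketched against an in-memory SQLite table. This is a simplified illustration: the `fact_sales` table, the index name, and the incoming rows are assumptions, and a production load would normally use the database's bulk loader for the inserts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fact_sales (sale_id INTEGER PRIMARY KEY, date_key INTEGER, amount REAL);
    CREATE INDEX idx_sales_date ON fact_sales (date_key);
    INSERT INTO fact_sales VALUES (1, 20240101, 10.0);
""")

incoming = [(1, 20240101, 12.0), (2, 20240102, 7.5), (3, 20240102, 3.0)]

# 1. Drop indexes before the load.
conn.execute("DROP INDEX idx_sales_date")

# 2. Segregate updates from inserts.
existing = {r[0] for r in conn.execute("SELECT sale_id FROM fact_sales")}
updates = [row for row in incoming if row[0] in existing]
inserts = [row for row in incoming if row[0] not in existing]

# 3. Load the updates, then the inserts.
conn.executemany(
    "UPDATE fact_sales SET date_key = ?, amount = ? WHERE sale_id = ?",
    [(d, a, s) for s, d, a in updates],
)
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)", inserts)

# 4. Rebuild the index once, after all rows are in place.
conn.execute("CREATE INDEX idx_sales_date ON fact_sales (date_key)")
conn.commit()

rows = conn.execute("SELECT * FROM fact_sales ORDER BY sale_id").fetchall()
index_names = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'index' AND name = 'idx_sales_date'"
)]
```

Rebuilding the index once at the end avoids maintaining it row by row during the load, which is the cost the slide is warning about.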
39. Managing Partitions
Partitions allow a table (and its indexes) to be physically divided into minitables for administrative purposes and to improve query performance.
The most common partitioning strategy on fact tables is to partition the table by the date key. Because the date dimension is preloaded and static, you know exactly what the surrogate keys are.
You need to partition the fact table on the key that joins to the date dimension.
40. Outwitting the Rollback Log
The rollback log, also known as the redo log, is invaluable in transaction (OLTP) systems. But in a data warehouse environment where all transactions are managed by the ETL process, the rollback log is a superfluous feature that must be dealt with to achieve optimal load performance. Reasons why the data warehouse does not need rollback logging are:
• All data is entered by a managed process: the ETL system.
• Data is loaded in bulk.
• Data can easily be reloaded if a load process fails.
Each database management system has different …