SlideShare a Scribd company logo
1 of 59
Data WarehouseData Warehouse
Lutfi FreijLutfi Freij
Konstantin RimarchukKonstantin Rimarchuk
Vasken ChamlaianVasken Chamlaian
John SahakianJohn Sahakian
Suzan TonSuzan Ton
InmonInmon
Father of the data warehouseFather of the data warehouse
Co-creator of the CorporateCo-creator of the Corporate
Information Factory.Information Factory.
He has 35 years ofHe has 35 years of
experience in databaseexperience in database
technology managementtechnology management
and data warehouse design.and data warehouse design.
Inmon-Cont’dInmon-Cont’d
Bill has written about a varietyBill has written about a variety
of topics on the building, usage,of topics on the building, usage,
& maintenance of the data warehouse& maintenance of the data warehouse
& the Corporate Information Factory.& the Corporate Information Factory.
He has written more than 650He has written more than 650
articles (Datamation, ComputerWorld,articles (Datamation, ComputerWorld,
and Byte Magazine).and Byte Magazine).
Inmon has published 45 books.Inmon has published 45 books.

Many of books has been translated to Chinese, Dutch, French, German,Many of books has been translated to Chinese, Dutch, French, German,
Japanese, Korean, Portuguese, Russian, and Spanish.Japanese, Korean, Portuguese, Russian, and Spanish.
IntroductionIntroduction
What is Data Warehouse?What is Data Warehouse?
A data warehouse is a collection of integratedA data warehouse is a collection of integrated
databases designed to support a DSS.databases designed to support a DSS.
According to Inmon’s (father of data warehousing)According to Inmon’s (father of data warehousing)
definition(Inmon,1992a,p.5):definition(Inmon,1992a,p.5):

It is a collection of integrated, subject-orientedIt is a collection of integrated, subject-oriented
databases designed to support the DSS function,databases designed to support the DSS function,
where each unit of data is non-volatile and relevantwhere each unit of data is non-volatile and relevant
to some moment in time.to some moment in time.
Introduction-Cont’d.Introduction-Cont’d.
Where is it used?Where is it used?
It is used for evaluating future strategy.It is used for evaluating future strategy.
It needs a successful technician:It needs a successful technician:

Flexible.Flexible.

Team player.Team player.

Good balance of business and technicalGood balance of business and technical
understanding.understanding.
Introduction-Cont’d.Introduction-Cont’d.
The ultimate use of data warehouse is Mass Customization.The ultimate use of data warehouse is Mass Customization.

For example, it increased Capital One’s customers from 1For example, it increased Capital One’s customers from 1
million to approximately 9 millions in 8 years.million to approximately 9 millions in 8 years.
Just like a muscle: DW increases in strength with active use.Just like a muscle: DW increases in strength with active use.

With each new test and product, valuable information isWith each new test and product, valuable information is
added to the DW, allowing the analyst to learn from theadded to the DW, allowing the analyst to learn from the
success and failure of the past.success and failure of the past.
The key to survival:The key to survival:

Is the ability to analyze, plan, and react to changingIs the ability to analyze, plan, and react to changing
business conditions in a much more rapid fashion.business conditions in a much more rapid fashion.
Data WarehouseData Warehouse
In order for data to be effective, DW must be:In order for data to be effective, DW must be:

Consistent.Consistent.

Well integrated.Well integrated.

Well defined.Well defined.

Time stamped.Time stamped.
DW environment:DW environment:

The data store, data mart & the metadata.The data store, data mart & the metadata.
The Data StoreThe Data Store
An operational data store (ODS) stores data for aAn operational data store (ODS) stores data for a
specific application. It feeds the data warehouse aspecific application. It feeds the data warehouse a
stream of desired raw data.stream of desired raw data.
Is the most common component of DW environment.Is the most common component of DW environment.
Data store is generally subject oriented, volatile,Data store is generally subject oriented, volatile,
current commonly focused on customers, products,current commonly focused on customers, products,
orders, policies, claims, etc…orders, policies, claims, etc…
Data Store & Data WarehouseData Store & Data Warehouse
Data store & Data warehouse, table 10-1 pageData store & Data warehouse, table 10-1 page
296296
The data store-Cont’d.The data store-Cont’d.
Its day-to-day function is to store the data for aIts day-to-day function is to store the data for a
single specific set of operational application.single specific set of operational application.
Its function is to feed the data warehouse dataIts function is to feed the data warehouse data
for the purpose of analysis.for the purpose of analysis.
The Data MartThe Data Mart
It is lower-cost, scaled down version of theIt is lower-cost, scaled down version of the
DW.DW.
Data Mart offer a targeted and less costlyData Mart offer a targeted and less costly
method of gaining the advantages associatedmethod of gaining the advantages associated
with data warehousing and can be scaled up towith data warehousing and can be scaled up to
a full DW environment over time.a full DW environment over time.
The Meta DataThe Meta Data
Last component of DW environments.Last component of DW environments.
It is information that is kept about the warehouseIt is information that is kept about the warehouse
rather than information kept within the warehouse.rather than information kept within the warehouse.
Legacy systems generally don’t keep a record ofLegacy systems generally don’t keep a record of
characteristics of the data (such as what pieces of datacharacteristics of the data (such as what pieces of data
exist and where they are located).exist and where they are located).
The metadata is simply data about data.The metadata is simply data about data.
ConclusionConclusion
A Data Warehouse is a collection of integrated subject-A Data Warehouse is a collection of integrated subject-
oriented databases designed to support a DSS.oriented databases designed to support a DSS.

Each unit of data is non-volatile and relevant to some moment in time.Each unit of data is non-volatile and relevant to some moment in time.
An operational data store (ODS) stores data for a specificAn operational data store (ODS) stores data for a specific
application. It feeds the data warehouse a stream of desiredapplication. It feeds the data warehouse a stream of desired
raw data.raw data.
A data mart is a lower-cost, scaled-down version of a dataA data mart is a lower-cost, scaled-down version of a data
warehouse, usually designed to support a small group of userswarehouse, usually designed to support a small group of users
(rather than the entire firm).(rather than the entire firm).
The metadata is information that is kept about the warehouse.The metadata is information that is kept about the warehouse.
Data WarehouseData Warehouse
Subject orientedSubject oriented
Data integratedData integrated
Time variantTime variant
NonvolatileNonvolatile
Characteristics of Data WarehouseCharacteristics of Data Warehouse
Subject oriented.Subject oriented. Data are organized based onData are organized based on
how the users refer to them.how the users refer to them.
IntegratedIntegrated. All inconsistencies regarding naming. All inconsistencies regarding naming
convention and value representations areconvention and value representations are
removed.removed.
NonvolatileNonvolatile. Data are stored in read-only format. Data are stored in read-only format
and do not change over time.and do not change over time.
Time variantTime variant.. Data are not current but normallyData are not current but normally
time series.time series.
Characteristics of Data WarehouseCharacteristics of Data Warehouse
SummarizedSummarized Operational data are mapped intoOperational data are mapped into
a decision-usable formata decision-usable format
Large volumeLarge volume.. Time series data sets areTime series data sets are
normally quite large.normally quite large.
Not normalizedNot normalized.. DW data can be, and oftenDW data can be, and often
are, redundant.are, redundant.
MetadataMetadata.. Data about data are stored.Data about data are stored.
Data sourcesData sources. Data come from internal and. Data come from internal and
external unintegrated operational systems.external unintegrated operational systems.
A Data Warehouse isA Data Warehouse is SubjectSubject OrientedOriented
Subject OrientationSubject Orientation
Application EnvironmentApplication Environment Data warehouseData warehouse
EnvironmentEnvironment
Design activities must be equallyDesign activities must be equally
focused on both process and databasefocused on both process and database
designdesign
DW world is primarily void of processDW world is primarily void of process
design and tends to focus exclusively ondesign and tends to focus exclusively on
issues of data modeling and databaseissues of data modeling and database
designdesign
Data IntegratedData Integrated
IntegrationIntegration –consistency naming–consistency naming
conventions and measurement attributers,conventions and measurement attributers,
accuracy, and common aggregation.accuracy, and common aggregation.
Establishment of a common unit ofEstablishment of a common unit of
measure for all synonymous datameasure for all synonymous data
elements from dissimilar database.elements from dissimilar database.
The data must be stored in the DW in anThe data must be stored in the DW in an
integrated, globally acceptable mannerintegrated, globally acceptable manner
Data IntegratedData Integrated
Time VariantTime Variant
In an operational application system, theIn an operational application system, the
expectation is that all data within the databaseexpectation is that all data within the database
are accurate as of the moment of access. In theare accurate as of the moment of access. In the
DW data are simply assumed to be accurate asDW data are simply assumed to be accurate as
of some moment in time and not necessarilyof some moment in time and not necessarily
right now.right now.
One of the places where DW data display timeOne of the places where DW data display time
variance is in the structure of the record key.variance is in the structure of the record key.
Every primary key contained within the DWEvery primary key contained within the DW
must contain, either implicitly or explicitly anmust contain, either implicitly or explicitly an
element of time( day, week, month, etc)element of time( day, week, month, etc)
Time VariantTime Variant
Every piece of data contained within theEvery piece of data contained within the
warehouse must be associated with awarehouse must be associated with a
particular point in time if any usefulparticular point in time if any useful
analysis is to be conducted with it.analysis is to be conducted with it.
Another aspect of time variance in DWAnother aspect of time variance in DW
data is that, once recorded, data within thedata is that, once recorded, data within the
warehouse cannot be updated orwarehouse cannot be updated or
changed.changed.
NonvolatilityNonvolatility
Typical activities such as deletes, inserts,Typical activities such as deletes, inserts,
and changes that are performed in anand changes that are performed in an
operational application environment areoperational application environment are
completely nonexistent in a DWcompletely nonexistent in a DW
environment.environment.
Only two data operations are everOnly two data operations are ever
performed in the DW: data loading andperformed in the DW: data loading and
data accessdata access
NonvolatilityNonvolatility
Application DW
The design issues must focus on data
integrity and update anomalies. Complex
processes must be coded to ensure that the
data update processes allow for high
integrity of the final product.
Such issues are no concern to in a DW
environment because data update is never
performed.
Data is placed in normalized form to
ensure a minimal redundancy (totals that
could be calculated would never be
stored)
Designers find it useful to store many of
such calculations or summarizations.
The technologies necessary to support
issues of transaction and data recovery,
roll back, and detection and remedy of
deadlock are quite complex.
Relative simplicity in technology
Issues of Data Redundancy betweenIssues of Data Redundancy between
DW and operational environmentsDW and operational environments
The lack of relevancy of issues such as dataThe lack of relevancy of issues such as data
normalization in the DW environment may suggest thatnormalization in the DW environment may suggest that
existence of massive data redundancy within the dataexistence of massive data redundancy within the data
warehouse and between the operational and DWwarehouse and between the operational and DW
environments.environments.
Inmon(1992) pointed out and proved that it is not true.Inmon(1992) pointed out and proved that it is not true.
Issues of Data Redundancy betweenIssues of Data Redundancy between
DW and operational environmentsDW and operational environments
The data being loaded into the DW are filtered and “cleansed” as theyThe data being loaded into the DW are filtered and “cleansed” as they
pass from the operational database to the warehouse. Because of thispass from the operational database to the warehouse. Because of this
cleansing numerous data that exists in the operational environmentcleansing numerous data that exists in the operational environment
never pass to the data warehouse. Only the data necessary fornever pass to the data warehouse. Only the data necessary for
processing by the DSS or EIS are ever actually loaded into the DW.processing by the DSS or EIS are ever actually loaded into the DW.
The time horizons for warehouse and operational data elements areThe time horizons for warehouse and operational data elements are
unique. Data in the operational environment are fresh, whereasunique. Data in the operational environment are fresh, whereas
warehouse data are generally much older.(so there is minimalwarehouse data are generally much older.(so there is minimal
opportunity of the data to overlap between two environments )opportunity of the data to overlap between two environments )
The data loaded into the DW often undergo a radical transformation asThe data loaded into the DW often undergo a radical transformation as
they pass form operational to the DW environment. So data in DW arethey pass form operational to the DW environment. So data in DW are
not the same.not the same.
Given this factors, Inmon suggests that data redundancy between the twoGiven this factors, Inmon suggests that data redundancy between the two
environments is a rare occurrence with a typical redundancy factor ofenvironments is a rare occurrence with a typical redundancy factor of
less than 1 %less than 1 %
The Data WarehouseThe Data Warehouse
ArchitectureArchitecture
The architecture consists of variousThe architecture consists of various
interconnected elements:interconnected elements:

Operational and external database layerOperational and external database layer – the– the
source data for the DWsource data for the DW

Information access layerInformation access layer – the tools the end– the tools the end
user access to extract and analyze the datauser access to extract and analyze the data

Data access layerData access layer – the interface between the– the interface between the
operational and information access layersoperational and information access layers

Metadata layerMetadata layer – the data directory or– the data directory or
repository of metadata informationrepository of metadata information
Components of the DataComponents of the Data
Warehouse ArchitectureWarehouse Architecture
The Data WarehouseThe Data Warehouse
ArchitectureArchitecture
Additional layers are:Additional layers are:

Process management layerProcess management layer – the scheduler or job– the scheduler or job
controllercontroller

Application messaging layerApplication messaging layer – the “middleware” that– the “middleware” that
transports information around the firmtransports information around the firm

Physical data warehouse layerPhysical data warehouse layer – where the actual– where the actual
data used in the DSS are locateddata used in the DSS are located

Data staging layerData staging layer – all of the processes necessary to– all of the processes necessary to
select, edit, summarize and load warehouse dataselect, edit, summarize and load warehouse data
from the operational and external data basesfrom the operational and external data bases
Data Warehousing TypologyData Warehousing Typology
TheThe virtual data warehousevirtual data warehouse – the end users– the end users
have direct access to the data stores, usinghave direct access to the data stores, using
tools enabled at the data access layertools enabled at the data access layer
TheThe central data warehousecentral data warehouse – a single physical– a single physical
database contains all of the data for a specificdatabase contains all of the data for a specific
functional areafunctional area
TheThe distributed data warehousedistributed data warehouse – the– the
components are distributed across severalcomponents are distributed across several
physical databasesphysical databases
The MetadataThe Metadata
The name suggests some high-levelThe name suggests some high-level
technological concept, but it really is fairlytechnological concept, but it really is fairly
simple. Metadata is “data about data”.simple. Metadata is “data about data”.
With the emergence of the data warehouse as aWith the emergence of the data warehouse as a
decision support structure, the metadata aredecision support structure, the metadata are
considered as much a resource as the businessconsidered as much a resource as the business
data they describe.data they describe.
Metadata are abstractions -- they are high levelMetadata are abstractions -- they are high level
data that provide concise descriptions of lower-data that provide concise descriptions of lower-
level data.level data.
The MetadataThe Metadata
For example, a line in a sales database may contain:For example, a line in a sales database may contain:
4056 KJ596 223.454056 KJ596 223.45
This is mostly meaningless until we consult the metadataThis is mostly meaningless until we consult the metadata
that tells us it was store number 4056, product KJ596that tells us it was store number 4056, product KJ596
and sales of $223.45and sales of $223.45
The metadata are essential ingredients in theThe metadata are essential ingredients in the
transformation of raw data into knowledge. They are thetransformation of raw data into knowledge. They are the
“keys” that allow us to handle the raw data.“keys” that allow us to handle the raw data.
General Metadata IssuesGeneral Metadata Issues
General metadata issues associated with DataGeneral metadata issues associated with Data
Warehouse use:Warehouse use:

What tables, attributes and keys does the DWWhat tables, attributes and keys does the DW
contain?contain?

Where did each set of data come from?Where did each set of data come from?

What transformations were applied with cleansing?What transformations were applied with cleansing?

How have the metadata changed over time?How have the metadata changed over time?

How often do the data get reloaded?How often do the data get reloaded?

Are there so many data elements that you need to beAre there so many data elements that you need to be
careful what you ask for?careful what you ask for?
Components of the MetadataComponents of the Metadata
Transformation mapsTransformation maps – records that show– records that show
what transformations were appliedwhat transformations were applied
Extraction & relationship historyExtraction & relationship history – records– records
that show what data was analyzedthat show what data was analyzed
Algorithms for summarizationAlgorithms for summarization – methods– methods
available for aggregating and summarizingavailable for aggregating and summarizing
Data ownershipData ownership – records that show origin– records that show origin
Patterns of accessPatterns of access – records that show– records that show
what data are accessed and how oftenwhat data are accessed and how often
Typical Mapping MetadataTypical Mapping Metadata
Transformation mapping records include:Transformation mapping records include:

Identification of original sourceIdentification of original source

Attribute conversionsAttribute conversions

Physical characteristic conversionsPhysical characteristic conversions

Encoding/reference table conversionsEncoding/reference table conversions

Naming changesNaming changes

Key changesKey changes

Values of default attributesValues of default attributes

Logic to choose from multiple sourcesLogic to choose from multiple sources

Algorithmic changesAlgorithmic changes
Implementing the Data WarehouseImplementing the Data Warehouse
Kozar list of “seven deadly sins” of data warehouseKozar list of “seven deadly sins” of data warehouse
implementation:implementation:
1.1. ““If you build it, they will come”If you build it, they will come” – the DW needs to be– the DW needs to be
designed to meet people’s needsdesigned to meet people’s needs
2.2. Omission of an architectural frameworkOmission of an architectural framework – you need– you need
to consider the number of users, volume of data,to consider the number of users, volume of data,
update cycle, etc.update cycle, etc.
3.3. Underestimating the importance of documentingUnderestimating the importance of documenting
assumptionsassumptions – the assumptions and potential– the assumptions and potential
conflicts must be included in the frameworkconflicts must be included in the framework
““Seven Deadly Sins”,Seven Deadly Sins”, continuedcontinued
4.4. Failure to use the right toolFailure to use the right tool – a DW project needs– a DW project needs
different tools than those used to develop andifferent tools than those used to develop an
applicationapplication
5.5. Life cycle abuseLife cycle abuse – in a DW, the life cycle really– in a DW, the life cycle really
never endsnever ends
6.6. Ignorance about data conflictsIgnorance about data conflicts – resolving these– resolving these
takes a lot more effort than most people realizetakes a lot more effort than most people realize
7.7. Failure to learn from mistakesFailure to learn from mistakes – since one DW– since one DW
project tends to beget another, learning from theproject tends to beget another, learning from the
early mistakes will yield higher quality laterearly mistakes will yield higher quality later
Data Warehouse TechnologiesData Warehouse Technologies
No one currently offers an end-to-end DWNo one currently offers an end-to-end DW
solution. Organizations buy bits and pieces fromsolution. Organizations buy bits and pieces from
a number of vendors and hopefully make thema number of vendors and hopefully make them
work together.work together.
SAS, IBM, Software AG, Information BuildersSAS, IBM, Software AG, Information Builders
and Platinum offer solutions that are at leastand Platinum offer solutions that are at least
fairly comprehensive.fairly comprehensive.
The market is very competitive. Table 10-6 inThe market is very competitive. Table 10-6 in
the text lists 90 firms that produce DW products.the text lists 90 firms that produce DW products.
The Future of Data WarehousingThe Future of Data Warehousing
As the DW becomes a standard part of anAs the DW becomes a standard part of an
organization, there will be efforts to find neworganization, there will be efforts to find new
ways to use the data. This will likely bring with itways to use the data. This will likely bring with it
several new challenges:several new challenges:

Regulatory constraintsRegulatory constraints may limit the ability to combinemay limit the ability to combine
sources of disparate data.sources of disparate data.

These disparate sources are likely to containThese disparate sources are likely to contain
unstructured dataunstructured data, which is hard to store., which is hard to store.

TheThe InternetInternet makes it possible to access data frommakes it possible to access data from
virtually “anywhere”. Of course, this just increasesvirtually “anywhere”. Of course, this just increases
the disparity.the disparity.
ObjectiveObjective
Interesting FactsInteresting Facts
Data Can be Used ToData Can be Used To
Robust InfrastructureRobust Infrastructure
Success of DataSuccess of Data
Warehouse ProjectsWarehouse Projects
Implementing DataImplementing Data
WarehouseWarehouse
Real Time Alerts &Real Time Alerts &
IntegrationIntegration
Identity TheftIdentity Theft
What Can You Do?What Can You Do?
Interesting FactsInteresting Facts
Harrah’s Entertainment’s Data Warehouse holdsHarrah’s Entertainment’s Data Warehouse holds
30 terabytes, or 30 trillion bytes of data, roughly30 terabytes, or 30 trillion bytes of data, roughly
three times the number of printed characters inthree times the number of printed characters in
the Library of Congressthe Library of Congress
Casinos, retailers, airlines, and banks are pilingCasinos, retailers, airlines, and banks are piling
up data so vast, it would have been unthinkableup data so vast, it would have been unthinkable
years ago; result from the curse of cheapyears ago; result from the curse of cheap
storagestorage
Interesting FactsInteresting Facts
Storage Shipments as of 2004: 22Storage Shipments as of 2004: 22
exabytes or 22 million trillion bytes of hardexabytes or 22 million trillion bytes of hard
disk space, double the amount in 2002.disk space, double the amount in 2002.
Equivalent to 4x’s the space needed toEquivalent to 4x’s the space needed to
store every word ever spoken by everystore every word ever spoken by every
human being who has ever lived.human being who has ever lived.
Should double again in 2006Should double again in 2006
Data Can be Used ToData Can be Used To
Quantify the volume impact of vehicles across theQuantify the volume impact of vehicles across the
marketing matrixmarketing matrix
Account for decay and saturation factors in theAccount for decay and saturation factors in the
determination of investment choices and returnsdetermination of investment choices and returns
Execute “what-if” simulations of pricing or promotionalExecute “what-if” simulations of pricing or promotional
scenarios before a proposed action is takenscenarios before a proposed action is taken
Provide a continuous planning, measurement, analysis andProvide a continuous planning, measurement, analysis and
optimization cycle supported by a software structureoptimization cycle supported by a software structure
Deliver robust data feeds into other systems supportingDeliver robust data feeds into other systems supporting
supply chain, sales, and financial reporting and endeavorssupply chain, sales, and financial reporting and endeavors
Robust InfrastructureRobust Infrastructure
Data Identification and AcquisitionData Identification and Acquisition
Data Cleansing, Mapping, andData Cleansing, Mapping, and
TransformationTransformation
Production System Loading and OngoingProduction System Loading and Ongoing
UpdateUpdate
Success of Data WarehouseSuccess of Data Warehouse
ProjectsProjects
Over half of Data Warehouse projects are DoomedOver half of Data Warehouse projects are Doomed

Fail due to lack of attention to Data Quality IssuesFail due to lack of attention to Data Quality Issues

More than half only have limited acceptanceMore than half only have limited acceptance

Consistency and Accuracy of DataConsistency and Accuracy of Data

Most businesses fail to use business intelligence (BI)Most businesses fail to use business intelligence (BI)
strategicallystrategically

IT organizations build data warehouses with little to no businessIT organizations build data warehouses with little to no business
involvementinvolvement
““A real-time enterpriseA real-time enterprise
without real-time businesswithout real-time business
intelligence is a real fast,intelligence is a real fast,
dumb organization.”dumb organization.”
Stephen BrobstStephen Brobst
Chief Technology OfficeChief Technology Office
TeradataTeradata
Success of Data WarehouseSuccess of Data Warehouse
ProjectsProjects
Most challenging type of deployment for anMost challenging type of deployment for an
enterpriseenterprise

Large scale and complex system configurationsLarge scale and complex system configurations

Sophisticated data modeling and analysis toolsSophisticated data modeling and analysis tools

High visibility in broad range of important businessHigh visibility in broad range of important business
functions within companyfunctions within company

Adoption of Linux-Based PlatformAdoption of Linux-Based Platform
Implementing Data WarehouseImplementing Data Warehouse
Challenges:Challenges:

Identifying new processesIdentifying new processes

Assuring there were of real useAssuring there were of real use

Implementing and ensuring cultural shiftsImplementing and ensuring cultural shifts

Managing content and New communitiesManaging content and New communities
towards a common benefittowards a common benefit

Linear modelsLinear models

Standards, Governance, Controls, ValuationStandards, Governance, Controls, Valuation
TeradataTeradata
Division of NCR in Dayton, OhioDivision of NCR in Dayton, Ohio
Competitor of IBM and OracleCompetitor of IBM and Oracle
Multi-million Dollar Machines to run theMulti-million Dollar Machines to run the
world’s biggest data warehousesworld’s biggest data warehouses

Wal-MartWal-Mart

Bank of AmericaBank of America

Verizon WirelessVerizon Wireless
Teradata’s SuccessTeradata’s Success
Conventional IBM or Sun MicrosystemsConventional IBM or Sun Microsystems
overload for a couple hours to days on aoverload for a couple hours to days on a
few terabytes and/or data queriesfew terabytes and/or data queries
IBM cannot return computation on certainIBM cannot return computation on certain
complex requestscomplex requests
Equivalent to having data but not able toEquivalent to having data but not able to
use it.use it.
Real Time Alerts & IntegrationReal Time Alerts & Integration
Teradata 8.0 Version released in Oct 2004Teradata 8.0 Version released in Oct 2004

Improves real-time alerts and integrationImproves real-time alerts and integration
Businesses can analyze operational info againstBusinesses can analyze operational info against
historical info to identify events in near real-timehistorical info to identify events in near real-time
using the new table designusing the new table design
Used by:Used by:

Continental Airlines in the US: reroute passengers onContinental Airlines in the US: reroute passengers on
delayed flights, reissuing tickets, reserving a room indelayed flights, reissuing tickets, reserving a room in
a hotel booking systema hotel booking system

Southwest Airlines- savings between $1.2-$1.4 MillionSouthwest Airlines- savings between $1.2-$1.4 Million
Identity TheftIdentity Theft
Government Regulation of Personal Data is NeededGovernment Regulation of Personal Data is Needed
(National Consumer Protection Standards)(National Consumer Protection Standards)
ChoicePoint FollyChoicePoint Folly

Georgia-based data-collection companyGeorgia-based data-collection company

Founded in 1997 to analyze insurance claims information, butFounded in 1997 to analyze insurance claims information, but
now provides data to customers including finance companies,now provides data to customers including finance companies,
law enforcement, and governmentlaw enforcement, and government

Obtain personal information by perusing public records, orObtain personal information by perusing public records, or
purchasing the information from other companiespurchasing the information from other companies
Identity TheftIdentity Theft
Duped by scammers who set 150 phonyDuped by scammers who set 150 phony
accounts to access personal data of as many asaccounts to access personal data of as many as
145,000 people nationwide145,000 people nationwide
Scammers set user accounts by faxing in phonyScammers set user accounts by faxing in phony
business licenses, undetected for one yearbusiness licenses, undetected for one year
750 people had their identities stolen750 people had their identities stolen
Theft would have gone unnoticed withoutTheft would have gone unnoticed without
California Identity theft law SB 1386California Identity theft law SB 1386
Identity TheftIdentity Theft
MSN EventMSN Event
Data Warehouse Information GatheringData Warehouse Information Gathering
Over the Phone InterviewsOver the Phone Interviews
Trash Can HuntingTrash Can Hunting
Gathered from Doctors, Internet Transactions,Gathered from Doctors, Internet Transactions,
Telephone Operators (Overseas or Prisoners)Telephone Operators (Overseas or Prisoners)
MSN EmailMSN Email
What Can You Do?What Can You Do?
Carefully monitor your credit card bills and creditCarefully monitor your credit card bills and credit
reportsreports
Request a once a year free access credit reportRequest a once a year free access credit report
via the three big credit agencies.via the three big credit agencies.

Equifax, Experian, TransUnionEquifax, Experian, TransUnion
Victims: contact Federal Trade Commission toVictims: contact Federal Trade Commission to
report the theft and monitor credit reports.report the theft and monitor credit reports.

1-800-IDTHEFT1-800-IDTHEFT
ReferencesReferences
Decision Support Systems in the 21Decision Support Systems in the 21stst
CenturyCentury 22ndnd
Edition, by George M.Edition, by George M.
Marakas, Prentice Hall, Upper Saddle River, NJ, 2003Marakas, Prentice Hall, Upper Saddle River, NJ, 2003
http://seattletimes.nwsource.com/html/editorialsopinion/2002191098_credited27.hthttp://seattletimes.nwsource.com/html/editorialsopinion/2002191098_credited27.ht
Seattle times, plugging holes in data warehousingSeattle times, plugging holes in data warehousing
Teradata warehouse improves real-time alerts and integrationTeradata warehouse improves real-time alerts and integration
Cliff Saran.Cliff Saran. Computer Weekly.Computer Weekly. Sutton: Oct 12, 2004. p. 22 (1 page)Sutton: Oct 12, 2004. p. 22 (1 page)
ON THE MARKON THE MARK
Mark Hall.Mark Hall. Computerworld.Computerworld. Framingham: Oct 18, 2004. Vol. 38, Iss. 42; p.Framingham: Oct 18, 2004. Vol. 38, Iss. 42; p.
6 (1 page)6 (1 page)
Optimization: It's All About the DataOptimization: It's All About the Data Brandweek: Ellen Pederson, MarkBrandweek: Ellen Pederson, Mark
AndersonAnderson
THE NO-SACRIFICE, AFFORDABLE DATA WAREHOUSE APPTHE NO-SACRIFICE, AFFORDABLE DATA WAREHOUSE APP IntelligentIntelligent
Enterprises, Michael GonzalezEnterprises, Michael Gonzalez
ReferencesReferences
http://www.dmreview.com/article_sub.cfm?articleId=7071 Convergence-http://www.dmreview.com/article_sub.cfm?articleId=7071 Convergence-
Beyond the Data WarehouseBeyond the Data Warehouse
http://www.computerworld.com/printthis/2001/0,4814,56969,00.html Micro-http://www.computerworld.com/printthis/2001/0,4814,56969,00.html Micro-
segmentation – Computerworldsegmentation – Computerworld
Too Much InformationToo Much Information Forbes article on data warehouseForbes article on data warehouse
http://reviews.cnet.com/4520-3513_7-5690533-1.htmlhttp://reviews.cnet.com/4520-3513_7-5690533-1.html When identityWhen identity
thieves strike data warehousesthieves strike data warehouses
Over half of data warehouse projects doomedOver half of data warehouse projects doomed VNU Business PublicationsVNU Business Publications
Limited, Robert Jaques 25 February 2005Limited, Robert Jaques 25 February 2005
http://www.linuxworld.com/magazine/?issueid=571 Linux World Articlehttp://www.linuxworld.com/magazine/?issueid=571 Linux World Article
Questions?Questions?

More Related Content

What's hot

Data Ware Housing And Data Mining
Data Ware Housing And Data MiningData Ware Housing And Data Mining
Data Ware Housing And Data Miningcpjcollege
 
L16 l17 Data Warehousing
L16 l17  Data WarehousingL16 l17  Data Warehousing
L16 l17 Data WarehousingRushdi Shams
 
Data Warehouse Design and Best Practices
Data Warehouse Design and Best PracticesData Warehouse Design and Best Practices
Data Warehouse Design and Best PracticesIvo Andreev
 
Data ware housing - Introduction to data ware housing process.
Data ware housing - Introduction to data ware housing process.Data ware housing - Introduction to data ware housing process.
Data ware housing - Introduction to data ware housing process.Vibrant Technologies & Computers
 
Data ware house architecture
Data ware house architectureData ware house architecture
Data ware house architectureDeepak Chaurasia
 
Warehouse components
Warehouse componentsWarehouse components
Warehouse componentsganblues
 
Complement Your Existing Data Warehouse with Big Data & Hadoop
Complement Your Existing Data Warehouse with Big Data & HadoopComplement Your Existing Data Warehouse with Big Data & Hadoop
Complement Your Existing Data Warehouse with Big Data & HadoopDatameer
 
Real World Business Intelligence and Data Warehousing
Real World Business Intelligence and Data WarehousingReal World Business Intelligence and Data Warehousing
Real World Business Intelligence and Data Warehousingukc4
 
Datawarehousing
DatawarehousingDatawarehousing
Datawarehousingsumit621
 
An introduction to data warehousing
An introduction to data warehousingAn introduction to data warehousing
An introduction to data warehousingShahed Khalili
 
Components of a Data-Warehouse
Components of a Data-WarehouseComponents of a Data-Warehouse
Components of a Data-WarehouseAbdul Aslam
 
Dw & etl concepts
Dw & etl conceptsDw & etl concepts
Dw & etl conceptsjeshocarme
 
Data Warehousing And Data Mining Presentation Transcript
Data Warehousing And Data Mining   Presentation TranscriptData Warehousing And Data Mining   Presentation Transcript
Data Warehousing And Data Mining Presentation TranscriptSUBODH009
 
Data Warehouse Best Practices
Data Warehouse Best PracticesData Warehouse Best Practices
Data Warehouse Best PracticesEduardo Castro
 

What's hot (20)

Data Warehouse
Data Warehouse Data Warehouse
Data Warehouse
 
Data Ware Housing And Data Mining
Data Ware Housing And Data MiningData Ware Housing And Data Mining
Data Ware Housing And Data Mining
 
L16 l17 Data Warehousing
L16 l17  Data WarehousingL16 l17  Data Warehousing
L16 l17 Data Warehousing
 
Introduction to Data Warehousing
Introduction to Data WarehousingIntroduction to Data Warehousing
Introduction to Data Warehousing
 
Data warehouse
Data warehouse Data warehouse
Data warehouse
 
Data Warehouse Design and Best Practices
Data Warehouse Design and Best PracticesData Warehouse Design and Best Practices
Data Warehouse Design and Best Practices
 
Data ware housing - Introduction to data ware housing process.
Data ware housing - Introduction to data ware housing process.Data ware housing - Introduction to data ware housing process.
Data ware housing - Introduction to data ware housing process.
 
Data ware house architecture
Data ware house architectureData ware house architecture
Data ware house architecture
 
Warehouse components
Warehouse componentsWarehouse components
Warehouse components
 
Complement Your Existing Data Warehouse with Big Data & Hadoop
Complement Your Existing Data Warehouse with Big Data & HadoopComplement Your Existing Data Warehouse with Big Data & Hadoop
Complement Your Existing Data Warehouse with Big Data & Hadoop
 
Real World Business Intelligence and Data Warehousing
Real World Business Intelligence and Data WarehousingReal World Business Intelligence and Data Warehousing
Real World Business Intelligence and Data Warehousing
 
Data Warehouse
Data WarehouseData Warehouse
Data Warehouse
 
Datawarehousing
DatawarehousingDatawarehousing
Datawarehousing
 
An introduction to data warehousing
An introduction to data warehousingAn introduction to data warehousing
An introduction to data warehousing
 
Data ware house
Data ware houseData ware house
Data ware house
 
Components of a Data-Warehouse
Components of a Data-WarehouseComponents of a Data-Warehouse
Components of a Data-Warehouse
 
Dw & etl concepts
Dw & etl conceptsDw & etl concepts
Dw & etl concepts
 
Data Warehousing And Data Mining Presentation Transcript
Data Warehousing And Data Mining   Presentation TranscriptData Warehousing And Data Mining   Presentation Transcript
Data Warehousing And Data Mining Presentation Transcript
 
Data Warehouse Best Practices
Data Warehouse Best PracticesData Warehouse Best Practices
Data Warehouse Best Practices
 
Dw 07032018-dr pl pradhan
Dw 07032018-dr pl pradhanDw 07032018-dr pl pradhan
Dw 07032018-dr pl pradhan
 

Similar to Data warehouse

Introduction to data warehousing
Introduction to data warehousingIntroduction to data warehousing
Introduction to data warehousinguncleRhyme
 
Introduction to Warehousing
Introduction to WarehousingIntroduction to Warehousing
Introduction to WarehousingDrHiyamHatem
 
DATA WAREHOUSING
DATA WAREHOUSINGDATA WAREHOUSING
DATA WAREHOUSINGKing Julian
 
Introduction to Data Warehouse
Introduction to Data WarehouseIntroduction to Data Warehouse
Introduction to Data WarehouseShanthi Mukkavilli
 
Date warehousing concepts
Date warehousing conceptsDate warehousing concepts
Date warehousing conceptspcherukumalla
 
Data warehouse
Data warehouseData warehouse
Data warehouseMR Z
 
DWH_PROJECT [Compatibility Mode]
DWH_PROJECT [Compatibility Mode]DWH_PROJECT [Compatibility Mode]
DWH_PROJECT [Compatibility Mode]vasanth kumar C
 
Data Warehouse Tutorial For Beginners | Data Warehouse Concepts | Data Wareho...
Data Warehouse Tutorial For Beginners | Data Warehouse Concepts | Data Wareho...Data Warehouse Tutorial For Beginners | Data Warehouse Concepts | Data Wareho...
Data Warehouse Tutorial For Beginners | Data Warehouse Concepts | Data Wareho...Edureka!
 
Introduction to Data Warehouse
Introduction to Data WarehouseIntroduction to Data Warehouse
Introduction to Data WarehouseSOMASUNDARAM T
 
Datawarehousing
DatawarehousingDatawarehousing
Datawarehousingwork
 
1-_Intro_to_Data_Minning__DWH.ppt
1-_Intro_to_Data_Minning__DWH.ppt1-_Intro_to_Data_Minning__DWH.ppt
1-_Intro_to_Data_Minning__DWH.pptBsMath3rdsem
 
BI_LECTURE_4-2021.pptx
BI_LECTURE_4-2021.pptxBI_LECTURE_4-2021.pptx
BI_LECTURE_4-2021.pptxhajon27910
 
dw_concepts_2_day_course.ppt
dw_concepts_2_day_course.pptdw_concepts_2_day_course.ppt
dw_concepts_2_day_course.pptDougSchoemaker
 
Traditional Data-warehousing / BI overview
Traditional Data-warehousing / BI overviewTraditional Data-warehousing / BI overview
Traditional Data-warehousing / BI overviewNagaraj Yerram
 

Similar to Data warehouse (20)

Introduction to data warehousing
Introduction to data warehousingIntroduction to data warehousing
Introduction to data warehousing
 
bi notes.docx
bi notes.docxbi notes.docx
bi notes.docx
 
Introduction to Warehousing
Introduction to WarehousingIntroduction to Warehousing
Introduction to Warehousing
 
DATA WAREHOUSING
DATA WAREHOUSINGDATA WAREHOUSING
DATA WAREHOUSING
 
Introduction to Data Warehouse
Introduction to Data WarehouseIntroduction to Data Warehouse
Introduction to Data Warehouse
 
Date warehousing concepts
Date warehousing conceptsDate warehousing concepts
Date warehousing concepts
 
Data warehousing
Data warehousingData warehousing
Data warehousing
 
Lecture 2
Lecture 2Lecture 2
Lecture 2
 
Data warehouse
Data warehouseData warehouse
Data warehouse
 
Data and Database
Data and DatabaseData and Database
Data and Database
 
DWH_PROJECT [Compatibility Mode]
DWH_PROJECT [Compatibility Mode]DWH_PROJECT [Compatibility Mode]
DWH_PROJECT [Compatibility Mode]
 
Data warehousing
Data warehousingData warehousing
Data warehousing
 
Data Warehouse Tutorial For Beginners | Data Warehouse Concepts | Data Wareho...
Data Warehouse Tutorial For Beginners | Data Warehouse Concepts | Data Wareho...Data Warehouse Tutorial For Beginners | Data Warehouse Concepts | Data Wareho...
Data Warehouse Tutorial For Beginners | Data Warehouse Concepts | Data Wareho...
 
Introduction to Data Warehouse
Introduction to Data WarehouseIntroduction to Data Warehouse
Introduction to Data Warehouse
 
Datawarehousing
DatawarehousingDatawarehousing
Datawarehousing
 
1-_Intro_to_Data_Minning__DWH.ppt
1-_Intro_to_Data_Minning__DWH.ppt1-_Intro_to_Data_Minning__DWH.ppt
1-_Intro_to_Data_Minning__DWH.ppt
 
BI_LECTURE_4-2021.pptx
BI_LECTURE_4-2021.pptxBI_LECTURE_4-2021.pptx
BI_LECTURE_4-2021.pptx
 
dw_concepts_2_day_course.ppt
dw_concepts_2_day_course.pptdw_concepts_2_day_course.ppt
dw_concepts_2_day_course.ppt
 
Informatica doc
Informatica docInformatica doc
Informatica doc
 
Traditional Data-warehousing / BI overview
Traditional Data-warehousing / BI overviewTraditional Data-warehousing / BI overview
Traditional Data-warehousing / BI overview
 

Recently uploaded

VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...Suhani Kapoor
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiSuhani Kapoor
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationshipsccctableauusergroup
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxolyaivanovalion
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130Suhani Kapoor
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfLars Albertsson
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% SecurePooja Nehwal
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...Suhani Kapoor
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Delhi Call girls
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Callshivangimorya083
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiSuhani Kapoor
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionfulawalesam
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfLars Albertsson
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxMohammedJunaid861692
 

Recently uploaded (20)

VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdf
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
 

Data warehouse

  • 1. Data WarehouseData Warehouse Lutfi FreijLutfi Freij Konstantin RimarchukKonstantin Rimarchuk Vasken ChamlaianVasken Chamlaian John SahakianJohn Sahakian Suzan TonSuzan Ton
  • 2. InmonInmon Father of the data warehouseFather of the data warehouse Co-creator of the CorporateCo-creator of the Corporate Information Factory.Information Factory. He has 35 years ofHe has 35 years of experience in databaseexperience in database technology managementtechnology management and data warehouse design.and data warehouse design.
  • 3. Inmon-Cont’dInmon-Cont’d Bill has written about a varietyBill has written about a variety of topics on the building, usage,of topics on the building, usage, & maintenance of the data warehouse& maintenance of the data warehouse & the Corporate Information Factory.& the Corporate Information Factory. He has written more than 650He has written more than 650 articles (Datamation, ComputerWorld,articles (Datamation, ComputerWorld, and Byte Magazine).and Byte Magazine). Inmon has published 45 books.Inmon has published 45 books.  Many of books has been translated to Chinese, Dutch, French, German,Many of books has been translated to Chinese, Dutch, French, German, Japanese, Korean, Portuguese, Russian, and Spanish.Japanese, Korean, Portuguese, Russian, and Spanish.
  • 4. IntroductionIntroduction What is Data Warehouse?What is Data Warehouse? A data warehouse is a collection of integratedA data warehouse is a collection of integrated databases designed to support a DSS.databases designed to support a DSS. According to Inmon’s (father of data warehousing)According to Inmon’s (father of data warehousing) definition(Inmon,1992a,p.5):definition(Inmon,1992a,p.5):  It is a collection of integrated, subject-orientedIt is a collection of integrated, subject-oriented databases designed to support the DSS function,databases designed to support the DSS function, where each unit of data is non-volatile and relevantwhere each unit of data is non-volatile and relevant to some moment in time.to some moment in time.
  • 5. Introduction-Cont’d.Introduction-Cont’d. Where is it used?Where is it used? It is used for evaluating future strategy.It is used for evaluating future strategy. It needs a successful technician:It needs a successful technician:  Flexible.Flexible.  Team player.Team player.  Good balance of business and technicalGood balance of business and technical understanding.understanding.
  • 6. Introduction-Cont’d.Introduction-Cont’d. The ultimate use of data warehouse is Mass Customization.The ultimate use of data warehouse is Mass Customization.  For example, it increased Capital One’s customers from 1For example, it increased Capital One’s customers from 1 million to approximately 9 millions in 8 years.million to approximately 9 millions in 8 years. Just like a muscle: DW increases in strength with active use.Just like a muscle: DW increases in strength with active use.  With each new test and product, valuable information isWith each new test and product, valuable information is added to the DW, allowing the analyst to learn from theadded to the DW, allowing the analyst to learn from the success and failure of the past.success and failure of the past. The key to survival:The key to survival:  Is the ability to analyze, plan, and react to changingIs the ability to analyze, plan, and react to changing business conditions in a much more rapid fashion.business conditions in a much more rapid fashion.
  • 7. Data WarehouseData Warehouse In order for data to be effective, DW must be:In order for data to be effective, DW must be:  Consistent.Consistent.  Well integrated.Well integrated.  Well defined.Well defined.  Time stamped.Time stamped. DW environment:DW environment:  The data store, data mart & the metadata.The data store, data mart & the metadata.
  • 8. The Data StoreThe Data Store An operational data store (ODS) stores data for aAn operational data store (ODS) stores data for a specific application. It feeds the data warehouse aspecific application. It feeds the data warehouse a stream of desired raw data.stream of desired raw data. Is the most common component of DW environment.Is the most common component of DW environment. Data store is generally subject oriented, volatile,Data store is generally subject oriented, volatile, current commonly focused on customers, products,current commonly focused on customers, products, orders, policies, claims, etc…orders, policies, claims, etc…
  • 9. Data Store & Data WarehouseData Store & Data Warehouse Data store & Data warehouse, table 10-1 pageData store & Data warehouse, table 10-1 page 296296
  • 10. The data store-Cont’d.The data store-Cont’d. Its day-to-day function is to store the data for aIts day-to-day function is to store the data for a single specific set of operational application.single specific set of operational application. Its function is to feed the data warehouse dataIts function is to feed the data warehouse data for the purpose of analysis.for the purpose of analysis.
  • 11. The Data MartThe Data Mart It is lower-cost, scaled down version of theIt is lower-cost, scaled down version of the DW.DW. Data Mart offer a targeted and less costlyData Mart offer a targeted and less costly method of gaining the advantages associatedmethod of gaining the advantages associated with data warehousing and can be scaled up towith data warehousing and can be scaled up to a full DW environment over time.a full DW environment over time.
  • 12. The Meta DataThe Meta Data Last component of DW environments.Last component of DW environments. It is information that is kept about the warehouseIt is information that is kept about the warehouse rather than information kept within the warehouse.rather than information kept within the warehouse. Legacy systems generally don’t keep a record ofLegacy systems generally don’t keep a record of characteristics of the data (such as what pieces of datacharacteristics of the data (such as what pieces of data exist and where they are located).exist and where they are located). The metadata is simply data about data.The metadata is simply data about data.
  • 13. ConclusionConclusion A Data Warehouse is a collection of integrated subject-A Data Warehouse is a collection of integrated subject- oriented databases designed to support a DSS.oriented databases designed to support a DSS.  Each unit of data is non-volatile and relevant to some moment in time.Each unit of data is non-volatile and relevant to some moment in time. An operational data store (ODS) stores data for a specificAn operational data store (ODS) stores data for a specific application. It feeds the data warehouse a stream of desiredapplication. It feeds the data warehouse a stream of desired raw data.raw data. A data mart is a lower-cost, scaled-down version of a dataA data mart is a lower-cost, scaled-down version of a data warehouse, usually designed to support a small group of userswarehouse, usually designed to support a small group of users (rather than the entire firm).(rather than the entire firm). The metadata is information that is kept about the warehouse.The metadata is information that is kept about the warehouse.
  • 14. Data WarehouseData Warehouse Subject orientedSubject oriented Data integratedData integrated Time variantTime variant NonvolatileNonvolatile
  • 15. Characteristics of Data WarehouseCharacteristics of Data Warehouse Subject oriented.Subject oriented. Data are organized based onData are organized based on how the users refer to them.how the users refer to them. IntegratedIntegrated. All inconsistencies regarding naming. All inconsistencies regarding naming convention and value representations areconvention and value representations are removed.removed. NonvolatileNonvolatile. Data are stored in read-only format. Data are stored in read-only format and do not change over time.and do not change over time. Time variantTime variant.. Data are not current but normallyData are not current but normally time series.time series.
  • 16. Characteristics of Data WarehouseCharacteristics of Data Warehouse SummarizedSummarized Operational data are mapped intoOperational data are mapped into a decision-usable formata decision-usable format Large volumeLarge volume.. Time series data sets areTime series data sets are normally quite large.normally quite large. Not normalizedNot normalized.. DW data can be, and oftenDW data can be, and often are, redundant.are, redundant. MetadataMetadata.. Data about data are stored.Data about data are stored. Data sourcesData sources. Data come from internal and. Data come from internal and external unintegrated operational systems.external unintegrated operational systems.
  • 17. A Data Warehouse isA Data Warehouse is SubjectSubject OrientedOriented
  • 18. Subject OrientationSubject Orientation Application EnvironmentApplication Environment Data warehouseData warehouse EnvironmentEnvironment Design activities must be equallyDesign activities must be equally focused on both process and databasefocused on both process and database designdesign DW world is primarily void of processDW world is primarily void of process design and tends to focus exclusively ondesign and tends to focus exclusively on issues of data modeling and databaseissues of data modeling and database designdesign
  • 19. Data IntegratedData Integrated IntegrationIntegration –consistency naming–consistency naming conventions and measurement attributers,conventions and measurement attributers, accuracy, and common aggregation.accuracy, and common aggregation. Establishment of a common unit ofEstablishment of a common unit of measure for all synonymous datameasure for all synonymous data elements from dissimilar database.elements from dissimilar database. The data must be stored in the DW in anThe data must be stored in the DW in an integrated, globally acceptable mannerintegrated, globally acceptable manner
  • 21. Time VariantTime Variant In an operational application system, theIn an operational application system, the expectation is that all data within the databaseexpectation is that all data within the database are accurate as of the moment of access. In theare accurate as of the moment of access. In the DW data are simply assumed to be accurate asDW data are simply assumed to be accurate as of some moment in time and not necessarilyof some moment in time and not necessarily right now.right now. One of the places where DW data display timeOne of the places where DW data display time variance is in the structure of the record key.variance is in the structure of the record key. Every primary key contained within the DWEvery primary key contained within the DW must contain, either implicitly or explicitly anmust contain, either implicitly or explicitly an element of time( day, week, month, etc)element of time( day, week, month, etc)
  • 22. Time VariantTime Variant Every piece of data contained within theEvery piece of data contained within the warehouse must be associated with awarehouse must be associated with a particular point in time if any usefulparticular point in time if any useful analysis is to be conducted with it.analysis is to be conducted with it. Another aspect of time variance in DWAnother aspect of time variance in DW data is that, once recorded, data within thedata is that, once recorded, data within the warehouse cannot be updated orwarehouse cannot be updated or changed.changed.
  • 23. NonvolatilityNonvolatility Typical activities such as deletes, inserts,Typical activities such as deletes, inserts, and changes that are performed in anand changes that are performed in an operational application environment areoperational application environment are completely nonexistent in a DWcompletely nonexistent in a DW environment.environment. Only two data operations are everOnly two data operations are ever performed in the DW: data loading andperformed in the DW: data loading and data accessdata access
  • 24. NonvolatilityNonvolatility Application DW The design issues must focus on data integrity and update anomalies. Complex processes must be coded to ensure that the data update processes allow for high integrity of the final product. Such issues are no concern to in a DW environment because data update is never performed. Data is placed in normalized form to ensure a minimal redundancy (totals that could be calculated would never be stored) Designers find it useful to store many of such calculations or summarizations. The technologies necessary to support issues of transaction and data recovery, roll back, and detection and remedy of deadlock are quite complex. Relative simplicity in technology
  • 25. Issues of Data Redundancy betweenIssues of Data Redundancy between DW and operational environmentsDW and operational environments The lack of relevancy of issues such as dataThe lack of relevancy of issues such as data normalization in the DW environment may suggest thatnormalization in the DW environment may suggest that existence of massive data redundancy within the dataexistence of massive data redundancy within the data warehouse and between the operational and DWwarehouse and between the operational and DW environments.environments. Inmon(1992) pointed out and proved that it is not true.Inmon(1992) pointed out and proved that it is not true.
  • 26. Issues of Data Redundancy betweenIssues of Data Redundancy between DW and operational environmentsDW and operational environments The data being loaded into the DW are filtered and “cleansed” as theyThe data being loaded into the DW are filtered and “cleansed” as they pass from the operational database to the warehouse. Because of thispass from the operational database to the warehouse. Because of this cleansing numerous data that exists in the operational environmentcleansing numerous data that exists in the operational environment never pass to the data warehouse. Only the data necessary fornever pass to the data warehouse. Only the data necessary for processing by the DSS or EIS are ever actually loaded into the DW.processing by the DSS or EIS are ever actually loaded into the DW. The time horizons for warehouse and operational data elements areThe time horizons for warehouse and operational data elements are unique. Data in the operational environment are fresh, whereasunique. Data in the operational environment are fresh, whereas warehouse data are generally much older.(so there is minimalwarehouse data are generally much older.(so there is minimal opportunity of the data to overlap between two environments )opportunity of the data to overlap between two environments ) The data loaded into the DW often undergo a radical transformation asThe data loaded into the DW often undergo a radical transformation as they pass form operational to the DW environment. So data in DW arethey pass form operational to the DW environment. So data in DW are not the same.not the same. Given this factors, Inmon suggests that data redundancy between the twoGiven this factors, Inmon suggests that data redundancy between the two environments is a rare occurrence with a typical redundancy factor ofenvironments is a rare occurrence with a typical redundancy factor of less than 1 %less than 1 %
  • 27. The Data WarehouseThe Data Warehouse ArchitectureArchitecture The architecture consists of variousThe architecture consists of various interconnected elements:interconnected elements:  Operational and external database layerOperational and external database layer – the– the source data for the DWsource data for the DW  Information access layerInformation access layer – the tools the end– the tools the end user access to extract and analyze the datauser access to extract and analyze the data  Data access layerData access layer – the interface between the– the interface between the operational and information access layersoperational and information access layers  Metadata layerMetadata layer – the data directory or– the data directory or repository of metadata informationrepository of metadata information
  • 28. Components of the DataComponents of the Data Warehouse ArchitectureWarehouse Architecture
  • 29. The Data WarehouseThe Data Warehouse ArchitectureArchitecture Additional layers are:Additional layers are:  Process management layerProcess management layer – the scheduler or job– the scheduler or job controllercontroller  Application messaging layerApplication messaging layer – the “middleware” that– the “middleware” that transports information around the firmtransports information around the firm  Physical data warehouse layerPhysical data warehouse layer – where the actual– where the actual data used in the DSS are locateddata used in the DSS are located  Data staging layerData staging layer – all of the processes necessary to– all of the processes necessary to select, edit, summarize and load warehouse dataselect, edit, summarize and load warehouse data from the operational and external data basesfrom the operational and external data bases
  • 30. Data Warehousing TypologyData Warehousing Typology TheThe virtual data warehousevirtual data warehouse – the end users– the end users have direct access to the data stores, usinghave direct access to the data stores, using tools enabled at the data access layertools enabled at the data access layer TheThe central data warehousecentral data warehouse – a single physical– a single physical database contains all of the data for a specificdatabase contains all of the data for a specific functional areafunctional area TheThe distributed data warehousedistributed data warehouse – the– the components are distributed across severalcomponents are distributed across several physical databasesphysical databases
  • 31. The MetadataThe Metadata The name suggests some high-levelThe name suggests some high-level technological concept, but it really is fairlytechnological concept, but it really is fairly simple. Metadata is “data about data”.simple. Metadata is “data about data”. With the emergence of the data warehouse as aWith the emergence of the data warehouse as a decision support structure, the metadata aredecision support structure, the metadata are considered as much a resource as the businessconsidered as much a resource as the business data they describe.data they describe. Metadata are abstractions -- they are high levelMetadata are abstractions -- they are high level data that provide concise descriptions of lower-data that provide concise descriptions of lower- level data.level data.
  • 32. The MetadataThe Metadata For example, a line in a sales database may contain:For example, a line in a sales database may contain: 4056 KJ596 223.454056 KJ596 223.45 This is mostly meaningless until we consult the metadataThis is mostly meaningless until we consult the metadata that tells us it was store number 4056, product KJ596that tells us it was store number 4056, product KJ596 and sales of $223.45and sales of $223.45 The metadata are essential ingredients in theThe metadata are essential ingredients in the transformation of raw data into knowledge. They are thetransformation of raw data into knowledge. They are the “keys” that allow us to handle the raw data.“keys” that allow us to handle the raw data.
  • 33. General Metadata IssuesGeneral Metadata Issues General metadata issues associated with DataGeneral metadata issues associated with Data Warehouse use:Warehouse use:  What tables, attributes and keys does the DWWhat tables, attributes and keys does the DW contain?contain?  Where did each set of data come from?Where did each set of data come from?  What transformations were applied with cleansing?What transformations were applied with cleansing?  How have the metadata changed over time?How have the metadata changed over time?  How often do the data get reloaded?How often do the data get reloaded?  Are there so many data elements that you need to beAre there so many data elements that you need to be careful what you ask for?careful what you ask for?
  • 34. Components of the MetadataComponents of the Metadata Transformation mapsTransformation maps – records that show– records that show what transformations were appliedwhat transformations were applied Extraction & relationship historyExtraction & relationship history – records– records that show what data was analyzedthat show what data was analyzed Algorithms for summarizationAlgorithms for summarization – methods– methods available for aggregating and summarizingavailable for aggregating and summarizing Data ownershipData ownership – records that show origin– records that show origin Patterns of accessPatterns of access – records that show– records that show what data are accessed and how oftenwhat data are accessed and how often
  • 35. Typical Mapping MetadataTypical Mapping Metadata Transformation mapping records include:Transformation mapping records include:  Identification of original sourceIdentification of original source  Attribute conversionsAttribute conversions  Physical characteristic conversionsPhysical characteristic conversions  Encoding/reference table conversionsEncoding/reference table conversions  Naming changesNaming changes  Key changesKey changes  Values of default attributesValues of default attributes  Logic to choose from multiple sourcesLogic to choose from multiple sources  Algorithmic changesAlgorithmic changes
  • 36. Implementing the Data WarehouseImplementing the Data Warehouse Kozar list of “seven deadly sins” of data warehouseKozar list of “seven deadly sins” of data warehouse implementation:implementation: 1.1. ““If you build it, they will come”If you build it, they will come” – the DW needs to be– the DW needs to be designed to meet people’s needsdesigned to meet people’s needs 2.2. Omission of an architectural frameworkOmission of an architectural framework – you need– you need to consider the number of users, volume of data,to consider the number of users, volume of data, update cycle, etc.update cycle, etc. 3.3. Underestimating the importance of documentingUnderestimating the importance of documenting assumptionsassumptions – the assumptions and potential– the assumptions and potential conflicts must be included in the frameworkconflicts must be included in the framework
  • 37. ““Seven Deadly Sins”,Seven Deadly Sins”, continuedcontinued 4.4. Failure to use the right toolFailure to use the right tool – a DW project needs– a DW project needs different tools than those used to develop andifferent tools than those used to develop an applicationapplication 5.5. Life cycle abuseLife cycle abuse – in a DW, the life cycle really– in a DW, the life cycle really never endsnever ends 6.6. Ignorance about data conflictsIgnorance about data conflicts – resolving these– resolving these takes a lot more effort than most people realizetakes a lot more effort than most people realize 7.7. Failure to learn from mistakesFailure to learn from mistakes – since one DW– since one DW project tends to beget another, learning from theproject tends to beget another, learning from the early mistakes will yield higher quality laterearly mistakes will yield higher quality later
  • 38. Data Warehouse TechnologiesData Warehouse Technologies No one currently offers an end-to-end DWNo one currently offers an end-to-end DW solution. Organizations buy bits and pieces fromsolution. Organizations buy bits and pieces from a number of vendors and hopefully make thema number of vendors and hopefully make them work together.work together. SAS, IBM, Software AG, Information BuildersSAS, IBM, Software AG, Information Builders and Platinum offer solutions that are at leastand Platinum offer solutions that are at least fairly comprehensive.fairly comprehensive. The market is very competitive. Table 10-6 inThe market is very competitive. Table 10-6 in the text lists 90 firms that produce DW products.the text lists 90 firms that produce DW products.
  • 39. The Future of Data WarehousingThe Future of Data Warehousing As the DW becomes a standard part of anAs the DW becomes a standard part of an organization, there will be efforts to find neworganization, there will be efforts to find new ways to use the data. This will likely bring with itways to use the data. This will likely bring with it several new challenges:several new challenges:  Regulatory constraintsRegulatory constraints may limit the ability to combinemay limit the ability to combine sources of disparate data.sources of disparate data.  These disparate sources are likely to containThese disparate sources are likely to contain unstructured dataunstructured data, which is hard to store., which is hard to store.  TheThe InternetInternet makes it possible to access data frommakes it possible to access data from virtually “anywhere”. Of course, this just increasesvirtually “anywhere”. Of course, this just increases the disparity.the disparity.
  • 40. ObjectiveObjective Interesting FactsInteresting Facts Data Can be Used ToData Can be Used To Robust InfrastructureRobust Infrastructure Success of DataSuccess of Data Warehouse ProjectsWarehouse Projects Implementing DataImplementing Data WarehouseWarehouse Real Time Alerts &Real Time Alerts & IntegrationIntegration Identity TheftIdentity Theft What Can You Do?What Can You Do?
  • 41. Interesting FactsInteresting Facts Harrah’s Entertainment’s Data Warehouse holdsHarrah’s Entertainment’s Data Warehouse holds 30 terabytes, or 30 trillion bytes of data, roughly30 terabytes, or 30 trillion bytes of data, roughly three times the number of printed characters inthree times the number of printed characters in the Library of Congressthe Library of Congress Casinos, retailers, airlines, and banks are pilingCasinos, retailers, airlines, and banks are piling up data so vast, it would have been unthinkableup data so vast, it would have been unthinkable years ago; result from the curse of cheapyears ago; result from the curse of cheap storagestorage
  • 42. Interesting FactsInteresting Facts Storage Shipments as of 2004: 22Storage Shipments as of 2004: 22 exabytes or 22 million trillion bytes of hardexabytes or 22 million trillion bytes of hard disk space, double the amount in 2002.disk space, double the amount in 2002. Equivalent to 4x’s the space needed toEquivalent to 4x’s the space needed to store every word ever spoken by everystore every word ever spoken by every human being who has ever lived.human being who has ever lived. Should double again in 2006Should double again in 2006
  • 43. Data Can be Used ToData Can be Used To Quantify the volume impact of vehicles across theQuantify the volume impact of vehicles across the marketing matrixmarketing matrix Account for decay and saturation factors in theAccount for decay and saturation factors in the determination of investment choices and returnsdetermination of investment choices and returns Execute “what-if” simulations of pricing or promotionalExecute “what-if” simulations of pricing or promotional scenarios before a proposed action is takenscenarios before a proposed action is taken Provide a continuous planning, measurement, analysis andProvide a continuous planning, measurement, analysis and optimization cycle supported by a software structureoptimization cycle supported by a software structure Deliver robust data feeds into other systems supportingDeliver robust data feeds into other systems supporting supply chain, sales, and financial reporting and endeavorssupply chain, sales, and financial reporting and endeavors
  • 44. Robust InfrastructureRobust Infrastructure Data Identification and AcquisitionData Identification and Acquisition Data Cleansing, Mapping, andData Cleansing, Mapping, and TransformationTransformation Production System Loading and OngoingProduction System Loading and Ongoing UpdateUpdate
  • 45. Success of Data WarehouseSuccess of Data Warehouse ProjectsProjects Over half of Data Warehouse projects are DoomedOver half of Data Warehouse projects are Doomed  Fail due to lack of attention to Data Quality IssuesFail due to lack of attention to Data Quality Issues  More than half only have limited acceptanceMore than half only have limited acceptance  Consistency and Accuracy of DataConsistency and Accuracy of Data  Most businesses fail to use business intelligence (BI)Most businesses fail to use business intelligence (BI) strategicallystrategically  IT organizations build data warehouses with little to no businessIT organizations build data warehouses with little to no business involvementinvolvement
  • 46. ““A real-time enterpriseA real-time enterprise without real-time businesswithout real-time business intelligence is a real fast,intelligence is a real fast, dumb organization.”dumb organization.” Stephen BrobstStephen Brobst Chief Technology OfficeChief Technology Office TeradataTeradata
  • 47. Success of Data WarehouseSuccess of Data Warehouse ProjectsProjects Most challenging type of deployment for anMost challenging type of deployment for an enterpriseenterprise  Large scale and complex system configurationsLarge scale and complex system configurations  Sophisticated data modeling and analysis toolsSophisticated data modeling and analysis tools  High visibility in broad range of important businessHigh visibility in broad range of important business functions within companyfunctions within company  Adoption of Linux-Based PlatformAdoption of Linux-Based Platform
  • 48. Implementing Data WarehouseImplementing Data Warehouse Challenges:Challenges:  Identifying new processesIdentifying new processes  Assuring there were of real useAssuring there were of real use  Implementing and ensuring cultural shiftsImplementing and ensuring cultural shifts  Managing content and New communitiesManaging content and New communities towards a common benefittowards a common benefit  Linear modelsLinear models  Standards, Governance, Controls, ValuationStandards, Governance, Controls, Valuation
  • 49. TeradataTeradata Division of NCR in Dayton, OhioDivision of NCR in Dayton, Ohio Competitor of IBM and OracleCompetitor of IBM and Oracle Multi-million Dollar Machines to run theMulti-million Dollar Machines to run the world’s biggest data warehousesworld’s biggest data warehouses  Wal-MartWal-Mart  Bank of AmericaBank of America  Verizon WirelessVerizon Wireless
  • 50. Teradata’s SuccessTeradata’s Success Conventional IBM or Sun MicrosystemsConventional IBM or Sun Microsystems overload for a couple hours to days on aoverload for a couple hours to days on a few terabytes and/or data queriesfew terabytes and/or data queries IBM cannot return computation on certainIBM cannot return computation on certain complex requestscomplex requests Equivalent to having data but not able toEquivalent to having data but not able to use it.use it.
  • 51. Real Time Alerts & IntegrationReal Time Alerts & Integration Teradata 8.0 Version released in Oct 2004Teradata 8.0 Version released in Oct 2004  Improves real-time alerts and integrationImproves real-time alerts and integration Businesses can analyze operational info againstBusinesses can analyze operational info against historical info to identify events in near real-timehistorical info to identify events in near real-time using the new table designusing the new table design Used by:Used by:  Continental Airlines in the US: reroute passengers onContinental Airlines in the US: reroute passengers on delayed flights, reissuing tickets, reserving a room indelayed flights, reissuing tickets, reserving a room in a hotel booking systema hotel booking system  Southwest Airlines- savings between $1.2-$1.4 MillionSouthwest Airlines- savings between $1.2-$1.4 Million
  • 52. Identity TheftIdentity Theft Government Regulation of Personal Data is NeededGovernment Regulation of Personal Data is Needed (National Consumer Protection Standards)(National Consumer Protection Standards) ChoicePoint FollyChoicePoint Folly  Georgia-based data-collection companyGeorgia-based data-collection company  Founded in 1997 to analyze insurance claims information, butFounded in 1997 to analyze insurance claims information, but now provides data to customers including finance companies,now provides data to customers including finance companies, law enforcement, and governmentlaw enforcement, and government  Obtain personal information by perusing public records, orObtain personal information by perusing public records, or purchasing the information from other companiespurchasing the information from other companies
  • 53. Identity TheftIdentity Theft Duped by scammers who set 150 phonyDuped by scammers who set 150 phony accounts to access personal data of as many asaccounts to access personal data of as many as 145,000 people nationwide145,000 people nationwide Scammers set user accounts by faxing in phonyScammers set user accounts by faxing in phony business licenses, undetected for one yearbusiness licenses, undetected for one year 750 people had their identities stolen750 people had their identities stolen Theft would have gone unnoticed withoutTheft would have gone unnoticed without California Identity theft law SB 1386California Identity theft law SB 1386
  • 54. Identity TheftIdentity Theft MSN EventMSN Event Data Warehouse Information GatheringData Warehouse Information Gathering Over the Phone InterviewsOver the Phone Interviews Trash Can HuntingTrash Can Hunting Gathered from Doctors, Internet Transactions,Gathered from Doctors, Internet Transactions, Telephone Operators (Overseas or Prisoners)Telephone Operators (Overseas or Prisoners)
  • 56. What Can You Do?What Can You Do? Carefully monitor your credit card bills and creditCarefully monitor your credit card bills and credit reportsreports Request a once a year free access credit reportRequest a once a year free access credit report via the three big credit agencies.via the three big credit agencies.  Equifax, Experian, TransUnionEquifax, Experian, TransUnion Victims: contact Federal Trade Commission toVictims: contact Federal Trade Commission to report the theft and monitor credit reports.report the theft and monitor credit reports.  1-800-IDTHEFT1-800-IDTHEFT
  • 57. ReferencesReferences Decision Support Systems in the 21Decision Support Systems in the 21stst CenturyCentury 22ndnd Edition, by George M.Edition, by George M. Marakas, Prentice Hall, Upper Saddle River, NJ, 2003Marakas, Prentice Hall, Upper Saddle River, NJ, 2003 http://seattletimes.nwsource.com/html/editorialsopinion/2002191098_credited27.hthttp://seattletimes.nwsource.com/html/editorialsopinion/2002191098_credited27.ht Seattle times, plugging holes in data warehousingSeattle times, plugging holes in data warehousing Teradata warehouse improves real-time alerts and integrationTeradata warehouse improves real-time alerts and integration Cliff Saran.Cliff Saran. Computer Weekly.Computer Weekly. Sutton: Oct 12, 2004. p. 22 (1 page)Sutton: Oct 12, 2004. p. 22 (1 page) ON THE MARKON THE MARK Mark Hall.Mark Hall. Computerworld.Computerworld. Framingham: Oct 18, 2004. Vol. 38, Iss. 42; p.Framingham: Oct 18, 2004. Vol. 38, Iss. 42; p. 6 (1 page)6 (1 page) Optimization: It's All About the DataOptimization: It's All About the Data Brandweek: Ellen Pederson, MarkBrandweek: Ellen Pederson, Mark AndersonAnderson THE NO-SACRIFICE, AFFORDABLE DATA WAREHOUSE APPTHE NO-SACRIFICE, AFFORDABLE DATA WAREHOUSE APP IntelligentIntelligent Enterprises, Michael GonzalezEnterprises, Michael Gonzalez
  • 58. ReferencesReferences http://www.dmreview.com/article_sub.cfm?articleId=7071 Convergence-http://www.dmreview.com/article_sub.cfm?articleId=7071 Convergence- Beyond the Data WarehouseBeyond the Data Warehouse http://www.computerworld.com/printthis/2001/0,4814,56969,00.html Micro-http://www.computerworld.com/printthis/2001/0,4814,56969,00.html Micro- segmentation – Computerworldsegmentation – Computerworld Too Much InformationToo Much Information Forbes article on data warehouseForbes article on data warehouse http://reviews.cnet.com/4520-3513_7-5690533-1.htmlhttp://reviews.cnet.com/4520-3513_7-5690533-1.html When identityWhen identity thieves strike data warehousesthieves strike data warehouses Over half of data warehouse projects doomedOver half of data warehouse projects doomed VNU Business PublicationsVNU Business Publications Limited, Robert Jaques 25 February 2005Limited, Robert Jaques 25 February 2005 http://www.linuxworld.com/magazine/?issueid=571 Linux World Articlehttp://www.linuxworld.com/magazine/?issueid=571 Linux World Article

Editor's Notes

  1. To maintain a robust infrastructure, you should follow these guidelines. Which data sources reflect the dynamics of the contemporary and historical market place. Validity Checks, Above loaded into the integrated data repository enabling a continuous cycle of near-time data availability
  2. These challenges are especially predominant in dot-com companies. The key word of Implementing Data Warehouse is Convergence! The inherent Linear relationship was either business has driven requirements for IT or IT has been an enabler of business. This was a practical need of separation. Structural changes in business as a result of endemic information technology (information power on all desktops and many businesspersons acting as their own information centers, automated information is a basic lubricant of all business processes.) Convergence means that info is no longer a separate commodity. For the Final Note: Information, as any other business commodity, deserves the same treatment as any other core component of business
  3. IBM and Oracle is designed more to handle online transaction processing
  4. Southwest is a low-cost infrastructure