Data warehouse presentation
1.
2. Data warehouses integrate data from
multiple heterogeneous information sources and
transform it into a multidimensional representation for
decision making. Apart from a complex architecture,
involving source systems, the data staging area,
operational data stores, the global data warehouse, etc.,
a data warehouse is also characterized by a complex
lifecycle.
The business rules of an organization change, new
data are requested by the end users, new sources of
information become available, and the data warehouse
architecture must evolve to efficiently support the
decision-making process within the organization that
owns the data warehouse.
3. Data Warehouse (DWH) | Operational Data Store (ODS)
It involves historical processing of information. | It involves day-to-day processing.
DWH systems are used by knowledge workers such as executives, managers, and analysts. | ODS systems are used by clerks, DBAs, or database professionals.
It is used to analyze the business. | It is used to run the business.
It focuses on information out. | It focuses on data in.
It is based on Star Schema, Snowflake Schema, and Fact Constellation Schema. | It is based on the Entity Relationship Model.
It is subject oriented. | It is application oriented.
It contains historical data. | It contains current data.
It provides summarized and consolidated data. | It provides primitive and highly detailed data.
It provides a summarized and multidimensional view of data. | It provides a detailed and flat relational view of data.
The number of users is in hundreds. | The number of users is in thousands.
The number of records accessed is in millions. | The number of records accessed is in tens.
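To make the "summarized and multidimensional view" row concrete, here is a minimal sketch of a star schema queried for totals per dimension. It assumes an in-memory SQLite database; the table names (fact_sales, dim_product, dim_date) and the sample rows are hypothetical and not taken from the presentation.

```python
import sqlite3

# Hypothetical star schema: one fact table referencing two dimension tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_date    (date_id INTEGER PRIMARY KEY, year INTEGER);
CREATE TABLE fact_sales  (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id    INTEGER REFERENCES dim_date(date_id),
    amount     REAL
);
INSERT INTO dim_product VALUES (1, 'Books'), (2, 'Games');
INSERT INTO dim_date    VALUES (10, 2022), (11, 2023);
INSERT INTO fact_sales  VALUES (1, 10, 20.0), (1, 11, 30.0),
                               (2, 11, 50.0), (2, 11, 25.0);
""")


def sales_by_category_and_year(conn):
    """Return total sales per (category, year): a summarized, multidimensional view."""
    return conn.execute("""
        SELECT p.category, d.year, SUM(f.amount)
        FROM fact_sales f
        JOIN dim_product p ON p.product_id = f.product_id
        JOIN dim_date d    ON d.date_id = f.date_id
        GROUP BY p.category, d.year
        ORDER BY p.category, d.year
    """).fetchall()


summary = sales_by_category_and_year(conn)
print(summary)
```

An operational system would typically read or write single detailed rows instead; the warehouse-style query aggregates many fact rows along the chosen dimensions.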
4. Data Warehouse Architecture
[Diagram: Client Data (Source 1, Source 2, Source 3) -> flat file generation using a script tool and copy to Secure FTP -> ETL -> Staging DB -> ETL -> Data Warehouse (Data Mart 1, Data Mart 2, Data Mart 3) -> BI (Dashboard, Reports)]
5. Bring data from heterogeneous source systems into
an ETL-readable format, i.e., a flat file, using any
scripting language, and load it into the staging database with
an ETL tool.
The staging data is then loaded into the data warehouse. The load
strategy is an initial full load followed by incremental
loads.
In the data warehouse, data marts are created per
subject area using a Star schema or Snowflake
schema; for either schema, the fact
tables and dimension tables have to be identified.
Dashboards for making quick business decisions can be
developed with a BI tool on top of the data marts.
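The load strategy (initial full load, then incremental loads) can be sketched with a high-water mark. This is only an illustration under stated assumptions: the tables staging_orders and dwh_orders, the updated_at column, and the load_orders helper are hypothetical, and the presentation itself uses an ETL tool for this step, not hand-written Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE staging_orders (order_id INTEGER PRIMARY KEY, amount REAL, updated_at TEXT);
CREATE TABLE dwh_orders     (order_id INTEGER PRIMARY KEY, amount REAL, updated_at TEXT);
""")


def load_orders(conn):
    """Copy staging rows that are newer than the warehouse's high-water mark.

    With an empty warehouse the mark is '' and every row is copied (full load);
    on later runs only new or changed rows are copied (incremental load).
    ISO-8601 date strings are assumed so that text comparison matches time order.
    """
    (watermark,) = conn.execute(
        "SELECT COALESCE(MAX(updated_at), '') FROM dwh_orders"
    ).fetchone()
    rows = conn.execute(
        "SELECT order_id, amount, updated_at FROM staging_orders WHERE updated_at > ?",
        (watermark,),
    ).fetchall()
    # INSERT OR REPLACE overwrites a warehouse row that has the same order_id.
    conn.executemany("INSERT OR REPLACE INTO dwh_orders VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


conn.executemany(
    "INSERT INTO staging_orders VALUES (?, ?, ?)",
    [(1, 10.0, "2024-01-01"), (2, 20.0, "2024-01-02")],
)
first = load_orders(conn)   # initial full load

conn.execute("INSERT INTO staging_orders VALUES (3, 30.0, '2024-01-03')")
conn.execute(
    "UPDATE staging_orders SET amount = 15.0, updated_at = '2024-01-04' WHERE order_id = 1"
)
second = load_orders(conn)  # incremental load: one new row, one changed row
print(first, second)
```

The strict comparison on updated_at would skip a row that arrives later with the same timestamp as the current mark, so a real pipeline would need a finer-grained or otherwise more robust change marker.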
6. Source Systems: This data is maintained by
clients. The sources may be homogeneous or
heterogeneous.
Staging Database: This is used to load data
from the sources and to modify and cleanse it
before the final load into the DWH.
Data Mart: Data marts contain a subset of
organization-wide data that is valuable to
specific groups of people in an organization.
In other words, a data mart contains only
the data that is specific to a particular
group.
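As an illustration of the "modify and cleanse" step in the staging database, the following sketch normalizes text fields, drops rows without a key, and removes duplicates. The field names (customer_id, name, region) and the cleansing rules are assumptions chosen for the example; the rules in a real project depend on the source systems.

```python
def cleanse(rows):
    """Trim and normalize fields, skip rows without a customer_id,
    and keep only the first occurrence of each customer_id."""
    seen = set()
    clean = []
    for row in rows:
        customer_id = (row.get("customer_id") or "").strip()
        if not customer_id or customer_id in seen:
            continue  # missing key or duplicate record
        seen.add(customer_id)
        clean.append({
            "customer_id": customer_id,
            "name": (row.get("name") or "").strip().title(),
            # A missing region is replaced by an explicit placeholder value.
            "region": (row.get("region") or "UNKNOWN").strip().upper(),
        })
    return clean


raw = [
    {"customer_id": " C1 ", "name": "alice smith", "region": "south"},
    {"customer_id": "C1", "name": "Alice Smith", "region": "South"},  # duplicate
    {"customer_id": "", "name": "no id", "region": "north"},          # missing key
    {"customer_id": "C2", "name": " bob ", "region": None},           # missing region
]
cleaned = cleanse(raw)
print(cleaned)
```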
7. Windows-based or Unix/Linux-based servers are used to implement
data marts. They are implemented on low-cost servers.
The implementation cycle of a data mart is measured in short
periods of time, i.e., in weeks rather than months or years.
The life cycle of data marts may be complex in the long run, if their
planning and design are not organization-wide.
Data marts are small in size.
Data marts are customized by subject area.
The source of a data mart is a departmentally structured data
warehouse.
Data marts are flexible.
[Figure: Graphical Representation]
9. Scripting Tool: Python (Source system to Flat
File generation and copy into Secure FTP)
ETL Tool: Talend (Flat file to Staging DB,
Staging DB to Data Marts)
BI Tool: SpagoBI
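The Python step (source system to flat file) might look like the sketch below. It assumes the source is reachable through a DB-API connection, represented here by an in-memory SQLite database; the customers table, the export_to_flat_file helper, and the output path are hypothetical. The copy to Secure FTP is left out because it would need an external client or a third-party library.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path


def export_to_flat_file(conn, query, out_path):
    """Run `query` on the source connection and write the result as CSV
    with a header row. Returns the number of data rows written."""
    cur = conn.execute(query)
    header = [col[0] for col in cur.description]
    count = 0
    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        for row in cur:
            writer.writerow(row)
            count += 1
    return count


# Stand-in source system with a hypothetical table.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
source.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Alice"), (2, "Bob")])

out_file = Path(tempfile.mkdtemp()) / "customers.csv"
row_count = export_to_flat_file(
    source, "SELECT id, name FROM customers ORDER BY id", out_file
)
print(row_count, out_file)
```

Writing a header row keeps the file self-describing, which makes the column mapping in the ETL tool easier to check.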
10. Functional Users: Dashboards help by
providing operational data for planning,
scheduling, and controlling, and further help
them in decision making.
Middle-level Management Users: It helps in
short-term planning, target setting, and
controlling the business functions.
Management: It helps them in goal setting,
strategic planning and also evolving the
business plans in addition to their
implementation.
11. Integrating data from multiple sources.
Master Data Management.
Performing new types of analyses and
improving turnaround time for analysis and
reporting.
Reducing cost to access historical data.
12. Data Quality – When a data warehouse tries to combine inconsistent
data from disparate sources, it encounters errors. Inconsistent
data, duplicates, logic conflicts, and missing data all result in data
quality challenges. Poor data quality results in faulty reporting and
analytics, which are necessary for optimal decision making.
Performance – A data warehouse must also be carefully designed to meet overall
performance requirements. While the final product can be
customized to fit the performance needs of the organization, the
initial overall design must be carefully thought out to provide a
stable foundation from which to start.
User Acceptance – People are reluctant to change their daily
routine, especially if the new process is not intuitive. There are
many challenges to overcome to make a data warehouse that is
quickly adopted by an organization. Having a comprehensive user
training program can ease this hesitation but will require planning
and additional resources.