Shikha Gautam
Asst.Professor
 Data Warehouse Process and Technology: Warehousing
Strategy, Warehouse management and Support Processes.
 Warehouse Planning and Implementation.
 Hardware and Operating Systems for Data Warehousing, Client/Server Computing
Model & Data Warehousing, Parallel Processors &
Cluster Systems, Distributed DBMS implementations.
 Warehousing Software, Warehouse Schema Design.
 Data Extraction, Cleanup & Transformation Tools,
Warehouse Metadata
“Storage or warehousing provides the place
utility as part of logistics for any business and
along with Transportation is a critical
component of customer service standards”.
 To support the company’s customer policy.
 To maintain a source of supply without interruptions.
 To support changing market conditions and sudden
changes in demand.
 To provide customers with the right mix of products
at all times and all locations.
 To ensure least logistics cost for a desired level of
customer service.
 More cost-effective decision making.
 Better enterprise intelligence: Increasing
quality and flexibility of enterprise analysis.
 Enhanced customer service.
 Business re-engineering: Knowing what
information is important provides direction and
priority for re-engineering efforts.
 Information system re-engineering.
 Private warehouses: A storage facility that is mostly
owned by big companies or single manufacturing units. It
is also known as proprietary warehousing.
 Public warehouses: A facility that stores inventory for
many different businesses, as opposed to a
"private warehouse".
 Contract warehouses: A contract warehouse handles the
shipping, receiving and storage of goods on a contract
basis. This type of warehouse usually requires a client to
commit to services for a particular period of time.
 An integrated warehouse strategy focuses on two
questions:
1. How many warehouses should be employed?
2. Which warehouse types should be used to meet
market requirements?
 Many firms utilize a combination of private, public,
and contract facilities.
 Data warehouse planning involves the following activities:
1. Establish sponsorship.
2. Identify enterprise needs.
3. Determine measurement cycle.
4. Validate measures.
5. Design data warehouse architecture.
6. Apply appropriate technologies.
7. Implement the data warehouse.
1. Establish sponsorship: Establishing the right
sponsorship chain helps ensure successful development
and implementation. The sponsorship chain should include
a data warehousing manager and two key individuals.
2. Identify enterprise needs: Interviews with key
enterprise managers and analysis of other pertinent
documentation are the techniques used to determine
enterprise needs.
3. Determine measurement cycle: Describe
the cycles or time periods used for each measure. Are
quarters, months or hours appropriate for capturing
useful measurement data? Is historical data
needed?
4. Validate measures: After the enterprise needs have been
identified, it is necessary to
"reality check" the measures derived from them. The feedback is used to
refine the measures.
5. Design data warehouse architecture: This activity
involves active user participation in facilitated design
sessions.
6. Apply appropriate technologies: The enterprise selects
technologies and addresses key technology issues, security policies, etc.
7. Implement the data warehouse: Load preliminary
data, design the user interface, develop standard
queries and reports, etc.
There are four major processes that build a data
warehouse:
1. Extract and load data: Data extraction takes data
from the source systems. Data load takes the
extracted data and loads it into the data warehouse.
It involves:
 Controlling the process: Determines when to start data
extraction and ensures that the tools, the logic modules,
and the programs are executed in the correct sequence and at
the correct time.
 When to initiate the extract: The data warehouse should
present a single, consistent version of the information to
the user, so the source data needs to be in a consistent state when it is extracted.
 Loading the data: Data is loaded into a temporary data
store where it is cleaned up and made consistent.
2. Cleaning and transforming the data: Clean
the loaded data and transform it into the warehouse structure,
partition the data, and build aggregations.
3. Backing up and archiving the data: In order to recover the
data in the event of data loss, software failure, or
hardware failure, it is necessary to keep regular
backups.
4. Managing queries and directing them to the
appropriate data sources: This process manages the queries,
helps speed up their execution, directs
them to their most effective data sources,
ensures that all the system resources are used in the
most effective way, and monitors actual query profiles.
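The idea of directing a query to its most effective data source can be sketched as a small router. This is only an illustrative sketch: the table names `sales_by_month` and `sales_detail`, and the `choose_source` helper, are hypothetical and do not come from the slides. The router picks the smallest source whose columns cover everything the query groups by, and falls back to the detail table otherwise.

```python
from typing import Iterable

# Hypothetical catalog, ordered from smallest (cheapest) to largest source.
# Each entry pairs a table name with the grouping columns it can serve.
SOURCES = [
    ("sales_by_month", {"month", "region"}),
    ("sales_detail", {"day", "month", "region", "product"}),
]


def choose_source(group_by: Iterable[str]) -> str:
    """Return the cheapest source that covers all requested columns."""
    needed = set(group_by)
    for table, columns in SOURCES:
        if needed <= columns:  # every requested column is available
            return table
    raise ValueError(f"no source can answer columns: {sorted(needed)}")
```

A query grouped only by month would be sent to the summary table, while one grouped by product has to read the detail table.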
 A warehouse management system (WMS) is a
software application designed to support and
optimize warehouse or distribution center
management.
 A WMS helps management with the daily planning,
organizing, staffing, directing, and controlling of
available resources to move and store
materials into, within, and out of a warehouse, while
supporting staff in performing material
movement and storage in and around the warehouse.
1. Load management: Relates to the collection of
information from internal or external sources.
The loading process includes summarizing and
manipulating the data and changing its structures
into a format that lends itself to analytical
processing.
2. Warehouse management: The management
tasks include ensuring the warehouse's availability, the
effective backup of its contents, and its security.
3. Query management: Relates to the provision of
access to the contents of the warehouse and may
include the partitioning of information into
different areas, with different privileges for
different users.
Access may be provided through custom-built
applications, or ad hoc query tools.
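Partitioning information into areas with different privileges can be illustrated with a minimal lookup. The area names, role names and the `can_query` helper below are assumptions made up for this sketch; a real warehouse would normally rely on the access-control features of its DBMS.

```python
# Hypothetical warehouse areas and the roles allowed to query each of them.
AREA_ROLES = {
    "sales_summary": {"analyst", "manager"},
    "payroll_detail": {"manager"},
}


def can_query(role: str, area: str) -> bool:
    """Return True if the role has been granted access to the area."""
    return role in AREA_ROLES.get(area, set())
```

An unknown area yields `False` rather than an error, so access is denied by default.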
 Implementation includes loading preliminary data, implementing
transformation programs, designing the user interface,
developing standard queries and reports, and
training warehouse users.
[Figure: implementation flow: ETL, design user interface, develop standard queries, train users]
The process of extracting data from source systems and
bringing it into the data warehouse is commonly
called ETL, which stands for:
 Extraction: Retrieving all the required data from the
source system using as few resources as possible.
 Transformation, and
 Loading.
 Ways to perform the extract:
 Update notification: If the source system is able to
provide a notification that a record has been changed,
this is the easiest way to get the data.
 Incremental extract: The source system is able to identify which
records have been modified and provide an extract of
such records. With a periodic (e.g. daily) extract, deleted records may not be
handled.
 Full extract: A full extract requires keeping a copy of
the last extract in the same format in order to be able to
identify changes. It handles deletions as well.
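How a full extract identifies changes, including deletions, can be sketched by comparing it with the copy kept from the previous run. The `diff_extracts` function and the sample rows are illustrative assumptions; both extracts are assumed to be keyed by the source table's primary key.

```python
def diff_extracts(previous, current):
    """Compare two full extracts that map primary key -> row.

    Returns (inserted, updated, deleted) as sorted lists of keys.
    """
    inserted = sorted(k for k in current if k not in previous)
    deleted = sorted(k for k in previous if k not in current)
    updated = sorted(
        k for k in current if k in previous and current[k] != previous[k]
    )
    return inserted, updated, deleted


# Hypothetical sample data: yesterday's and today's full extracts.
previous = {1: ("Ann", "Delhi"), 2: ("Raj", "Pune"), 3: ("Mia", "Goa")}
current = {1: ("Ann", "Delhi"), 2: ("Raj", "Mumbai"), 4: ("Lee", "Agra")}
inserted, updated, deleted = diff_extracts(previous, current)
```

Here key 4 is new, key 2 has changed and key 3 has disappeared from the source, which is exactly the case an incremental extract may miss.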
2. Clean: Ensures the quality of the data in the data
warehouse.
3. Transform: Applies a set of rules to transform the
data from the source to the target.
This includes converting measured data to the same dimension,
using the same units, so that it can later be joined.
Transformation also requires joining data from several sources,
generating aggregates, generating surrogate keys,
sorting, and deriving new calculated values.
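Three of the transformation rules named above (unit conversion, surrogate key generation and aggregation) can be shown in one short sketch. The input layout `(product_code, quantity, unit)`, the unit table and the `transform` function are assumptions chosen for illustration only.

```python
from collections import defaultdict

# Assumed source units, all converted to grams.
GRAMS_PER_UNIT = {"g": 1.0, "kg": 1000.0}


def transform(rows):
    """rows: iterable of (product_code, quantity, unit) from several sources."""
    surrogate_keys = {}           # natural key -> generated integer key
    totals = defaultdict(float)   # surrogate key -> total weight in grams
    for code, quantity, unit in rows:
        # Reuse the key if the product was seen before, else issue the next one.
        key = surrogate_keys.setdefault(code, len(surrogate_keys) + 1)
        # Convert to a common unit before aggregating.
        totals[key] += quantity * GRAMS_PER_UNIT[unit]
    return surrogate_keys, dict(sorted(totals.items()))


keys, totals = transform([("A", 2, "kg"), ("B", 500, "g"), ("A", 250, "g")])
```

Product "A" arrives in two different units, yet ends up under a single surrogate key with one total in grams.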
4. Load: The load must be performed correctly
and with as few resources as possible. The target of the
load process is often a database. Referential integrity
needs to be maintained by the ETL tool to ensure consistency.
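Maintaining referential integrity during the load can be sketched with Python's built-in `sqlite3` module. The `product` and `sales` tables and the `load_sales` helper are hypothetical; the sketch loads a batch inside one transaction so that a row pointing at a missing product causes the whole batch to be rolled back.

```python
import sqlite3


def load_sales(conn, rows):
    """Load (product_id, amount) rows atomically; return True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.executemany(
                "INSERT INTO sales (product_id, amount) VALUES (?, ?)", rows
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.execute("CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE sales ("
    "product_id INTEGER NOT NULL REFERENCES product(product_id), amount REAL)"
)
conn.execute("INSERT INTO product VALUES (1, 'widget')")
conn.commit()
```

A batch that refers only to product 1 is accepted, while a batch containing an unknown `product_id` is rejected as a whole, leaving the target table consistent.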
5. Managing the ETL process:
The ETL process may fail. Failures can be
caused by missing values in one of the reference tables, or
simply by a connection or power outage. It is therefore necessary to
design the ETL process with fail-recovery in mind.
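One common way to design for fail-recovery is to record a checkpoint after each completed step, so that a rerun resumes where the previous run stopped. The sketch below is an assumption about how this could look, not a method taken from the slides; `run_pipeline` and the in-memory `completed` set are hypothetical, and in practice the checkpoints would be stored in a file or a control table.

```python
def run_pipeline(steps, completed):
    """Run named steps in order, skipping those already recorded as done.

    steps: list of (name, callable) pairs.
    completed: a set of step names that persists between runs.
    """
    for name, action in steps:
        if name in completed:
            continue             # finished in an earlier run
        action()                 # may raise; the checkpoint is then not written
        completed.add(name)      # record success only after the step finishes
```

If the load step fails because of a connection outage, the next call with the same `completed` set skips the extract step and retries only the load.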
6. Staging:
A staging area, or landing zone, is an intermediate storage
area used for data processing during the ETL process.
The primary motivations for using one are to increase the
efficiency of ETL processes, ensure data integrity and
support data quality operations.
 Commercial tools : Ab Initio, IBM InfoSphere
DataStage, Informatica, Oracle Data
Integrator and SAP Data Integrator.
 Open source ETL tools: CloverETL, Apatar,
Pentaho and Talend.
 Data warehouses come in all shapes and sizes,
which directly affect the cost and
time involved.
 The steps listed below are a summary of some of
the points to consider:
 Get Professional Advice
 Plan the Data
 Who will use the Data Warehouse
 Integration to External Applications
The key steps in developing a data warehouse can
be summarized as follows:
 Project initiation
 Requirements analysis
 Design (architecture, databases and applications)
 Construction (selecting and installing tools,
developing data feeds and building reports)
 Deployment (release & training)
 Maintenance
 The client/server model applies to a software architecture that describes
processing between applications and supporting services.
 It represents distributed cooperative processing; the
relationship between client and server is a relationship
between both hardware and software components.
 It covers a wide range of functions, services and other
aspects of the distributed environment.
 Host-based application processing is performed on one
computer system with attached unintelligent, "dumb"
terminals.
 A single stand-alone PC or an IBM mainframe with
attached character-based display terminals are examples
of a host-based processing environment.
 Host-based processing is totally non-distributed.
 Slave computers are attached to a master computer and
perform application-processing-related functions only as
directed by their master.
 Distribution of processing tends to be unidirectional,
from master to slaves.
 Slaves are capable of some limited local application
processing.
 E.g. a mainframe (host) computer, such as an IBM 3090, used
with cluster controllers and intelligent terminals.
 This generation used two models:
1. Shared device LAN processing environment: PCs
are attached to a system device that allows these
PCs to share a common resource, such as a file server
(hard disk) or a print server.
E.g. Microsoft's LAN Manager, which allows a LAN
to have a system dedicated to file and print services.
2. Client/server LAN processing environment:
An extension of shared device processing.
E.g. SYBASE SQL Server.
An application running on a PC sends a read request
to its database server. The database server processes it locally
and sends only the requested records to the PC
application.
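The difference between the two models can be sketched by counting how many records have to travel to the PC. This is a simulation under stated assumptions: an in-memory SQLite database stands in for the server (SQLite is an embedded library, not a network server), and the `customer` table and both function names are hypothetical.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, city TEXT)")
db.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [(1, "Delhi"), (2, "Pune"), (3, "Delhi"), (4, "Agra")],
)


def shared_device_read(city):
    """File-server style: every record is transferred, the PC filters."""
    all_rows = db.execute("SELECT id, city FROM customer ORDER BY id").fetchall()
    matches = [row for row in all_rows if row[1] == city]
    return matches, len(all_rows)        # (result, records transferred)


def client_server_read(city):
    """Database-server style: the server filters, only matches are sent."""
    matches = db.execute(
        "SELECT id, city FROM customer WHERE city = ? ORDER BY id", (city,)
    ).fetchall()
    return matches, len(matches)         # (result, records transferred)
```

Both functions return the same two Delhi customers, but the shared-device version moves all four records to the client while the client/server version moves only the two that were requested.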
 Moves from a two-tiered architecture to a multi-tiered
architecture.
 The computing model deals with servers
dedicated to applications, data, transaction
management and system management.
 Supports data structures ranging from relational to
multidimensional to multimedia.
 A distributed database system consists of
loosely coupled sites that share no physical
component.
 Database systems that run on each site are
independent of each other.
 Transactions may access data at one or more
sites.
 In a homogeneous distributed database:
 All sites have identical software
 Are aware of each other and agree to cooperate in processing
user requests.
 Each site surrenders part of its autonomy in terms of right to
change schemas or software
 Appears to user as a single system
 In a heterogeneous distributed database:
 Different sites may use different schemas and software
▪ Difference in schema is a major problem for query processing
▪ Difference in software is a major problem for transaction processing
 Sites may not be aware of each other and may provide only
limited facilities for cooperation in transaction processing
DDBMS architectures are generally developed
depending on three parameters −
 Distribution − It states the physical distribution of
data across the different sites.
 Autonomy − It indicates the distribution of control
of the database system and the degree to which each
constituent DBMS can operate independently.
 Heterogeneity − It refers to the uniformity or
dissimilarity of the data models, system components
and databases.
 Data Replication
 Fragmentation
The three dimensions of distribution
transparency are −
 Location transparency
 Fragmentation transparency
 Replication transparency
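The three kinds of transparency can be illustrated with a toy catalog that hides fragments, replicas and sites from the caller. Everything in this sketch is hypothetical (the class, the region-based horizontal fragmentation, the site names); it only shows that a reader asks for a row by key and never names a site.

```python
class DistributedTable:
    """Toy catalog that hides where customer rows are stored."""

    def __init__(self):
        # Horizontal fragments by region; each fragment is replicated on two sites.
        self.placement = {"north": ["site1", "site2"], "south": ["site3", "site4"]}
        self.sites = {name: {} for name in ("site1", "site2", "site3", "site4")}
        self.region_of = {}  # customer id -> region (fragment) holding the row

    def insert(self, cust_id, region, row):
        self.region_of[cust_id] = region
        for site in self.placement[region]:       # write to every replica
            self.sites[site][cust_id] = row

    def get(self, cust_id, down=()):
        """Read by key only; the caller never names a site or a fragment."""
        region = self.region_of[cust_id]
        for site in self.placement[region]:
            if site not in down:                   # skip unavailable replicas
                return self.sites[site][cust_id]
        raise RuntimeError("no replica available")


table = DistributedTable()
table.insert(7, "north", {"name": "Ann"})
```

`table.get(7)` returns the row whether it is served from `site1` or, when that site is listed in `down`, from its replica on `site2`.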
[Figure: Sites 1 to 4 connected through a communication network]
 Data warehouse operations mainly consist of huge data loads and
index builds, generation of materialized views, and queries over large
volumes of data. The underlying I/O system of a data warehouse should
be built to meet these heavy requirements.
 Architecture options:
1. Symmetric multiprocessing (SMP): two or more identical
processors are connected to a single, shared main memory.
2. Massively parallel processing (MPP): a large number of processors
perform a set of coordinated computations in parallel.
 Number of CPUs
 Memory of data warehouse
 Number of Disks
 The server OS determines:
 how quickly the server can fulfill client requests,
 how many clients it can support concurrently and
reliably,
 how efficiently system resources such as
memory, disk I/O and communication components are
utilized.
 Multiuser Support
 Preemptive multitasking
 Multithreaded Design
 Memory Protection: Concurrent tasks should not
violate each other's memory.
 Scalability
 Security
 Reliability
 Availability
 A microkernel is relatively small and highly secure.
 It offers a simplified architecture, extensibility, portability, real-time
support, robust system security and multiprocessor
support.
 This architecture results in a highly modular OS that
can support multiple OS "personalities" by configuring
outside services as needed.
 E.g. the Mach 3.0 microkernel was used by IBM to allow
DOS, OS/2 and AIX to coexist on a single machine.
 Distributed Memory Architecture:
 Shared-Nothing Architecture
 Shared Disk Architecture
[Figure: shared-nothing architecture: four processor units (PU), each with its own local memory, connected by an interconnection network]
[Figure: shared-disk architecture: four processor units (PU), each with its own local memory, connected by an interconnection network to a global shared disk subsystem]
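How a shared-nothing system splits work can be sketched as hash partitioning followed by local aggregation and a final merge. This is a simplified illustration: threads stand in for independent nodes, and the function names and sample rows are assumptions, not part of the slides.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(rows, n_nodes):
    """Hash-partition (key, amount) rows so each key lives on exactly one node."""
    parts = [[] for _ in range(n_nodes)]
    for key, amount in rows:
        parts[hash(key) % n_nodes].append((key, amount))
    return parts


def local_sum(part):
    """Work done by one node using only its own partition."""
    totals = Counter()
    for key, amount in part:
        totals[key] += amount
    return totals


def parallel_sum(rows, n_nodes=4):
    """Aggregate each partition independently, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        partials = list(pool.map(local_sum, partition(rows, n_nodes)))
    merged = Counter()
    for partial in partials:
        merged.update(partial)   # Counter.update adds the partial totals
    return dict(merged)


rows = [("east", 10), ("west", 5), ("east", 7), ("north", 1)]
totals = parallel_sum(rows)
```

Because every key is assigned to a single partition, the nodes never need to share memory or disk while aggregating; only the small partial results cross the interconnect.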
 A cluster is a set of loosely coupled SMP machines
connected by a high-speed interconnection
network.
 A cluster behaves just like a single large
machine.

Warehouse Planning and Implementation

  • 2.
      • Data Warehouse Process and Technology: Warehousing Strategy, Warehouse Management and Support Processes.
      • Warehouse Planning and Implementation.
      • Hardware and Operating Systems for Data Warehousing, Client/Server Computing Model and Data Warehousing, Parallel Processors and Cluster Systems, Distributed DBMS Implementations.
      • Warehousing Software, Warehouse Schema Design.
      • Data Extraction, Cleanup and Transformation Tools, Warehouse Metadata.
  • 3. “Storage or warehousing provides the place utility as part of logistics for any business and along with Transportation is a critical component of customer service standards”.
  • 4.
      • To support the company’s customer policy.
      • To maintain a source of supply without interruptions.
      • To support changing market conditions and sudden changes in demand.
      • To provide customers with the right mix of products at all times and all locations.
      • To ensure the least logistics cost for a desired level of customer service.
  • 5.
      • More cost-effective decision making.
      • Better enterprise intelligence: Increasing quality and flexibility of enterprise analysis.
      • Enhanced customer service.
      • Business re-engineering: Knowing what information is important provides direction and priority for re-engineering efforts.
      • Information system re-engineering.
  • 6.
      • Private warehouses: A private warehouse is a storage facility that is mostly owned by big companies or single manufacturing units. It is also known as proprietary warehousing.
      • Public warehouses: A public warehouse is a facility that stores inventory for many different businesses, as opposed to a “private warehouse”.
      • Contract warehouses: A contract warehouse handles the shipping, receiving and storage of goods on a contract basis. This type of warehouse usually requires a client to commit to services for a particular period of time.
  • 7.
      • An integrated warehouse strategy focuses on two questions:
          1. How many warehouses should be employed?
          2. Which warehouse types should be used to meet market requirements?
      • Many firms utilize a combination of private, public, and contract facilities.
  • 8. It involves the following activities:
      1. Establish sponsorship.
      2. Identify enterprise needs.
      3. Determine measurement cycle.
      4. Validate measures.
      5. Design data warehouse architecture.
      6. Apply appropriate technologies.
      7. Implement the data warehouse.
  • 9.
      1. Establish sponsorship: Establishing the right sponsorship chain helps ensure successful development and implementation. The sponsorship chain should include a data warehousing manager and two key individuals.
      2. Identify enterprise needs: Interviews with key enterprise managers and analysis of other pertinent documentation are the techniques used to determine enterprise needs.
  • 10.
      3. Determine measurement cycle: Describe the cycles or time periods used for each measure. Are quarters, months or hours appropriate for capturing useful measurement data? Is historical data needed?
      4. Validate measures: After the enterprise needs have been identified, it is necessary to run a “reality check” on the resulting measures. The feedback is used to refine the measures.
  • 11.
      5. Design data warehouse architecture: This activity involves active user participation in facilitated design sessions.
      6. Apply appropriate technologies: The enterprise selects technologies and settles key technology issues, security policies, etc.
      7. Implement the data warehouse: Loading preliminary data, designing the user interface, developing standard queries and reports, etc.
  • 13. There are four major processes that build a data warehouse:
      1. Extract and load data: Data extraction takes data from the source systems. Data load takes the extracted data and loads it into the data warehouse. It involves:
          • Controlling the process: Determining when to start data extraction, and ensuring that the tools, the logic modules, and the programs are executed in the correct sequence and at the correct time.
  • 14.
          • When to initiate the extract: The data warehouse should present a single, consistent version of the information to the user, so the data needs to be in a consistent state when it is extracted.
          • Loading the data: Data is loaded into a temporary data store, where it is cleaned up and made consistent.
      2. Cleaning and transforming the data: Clean and transform the loaded data into a structure, partition the data, and build aggregations.
  • 15.
      3. Back up and archive the data: In order to recover the data in the event of data loss, software failure, or hardware failure, it is necessary to keep regular backups.
      4. Managing queries and directing them to the appropriate data sources: This process manages the queries, helps speed up their execution, directs them to their most effective data sources, ensures that all the system sources are used in the most effective way, and monitors actual query profiles.
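As an illustration of "directing queries to their most effective data sources", the sketch below picks the smallest table that can still answer a query. The table names, column sets, and row counts are hypothetical, and "fewest rows" is only an assumed stand-in for "most effective"; a real query manager would use richer statistics.

```python
def choose_source(requested_columns, sources):
    """Return the name of the smallest source that has every requested column.

    sources: list of (name, available_columns, row_count) tuples (hypothetical).
    """
    needed = set(requested_columns)
    candidates = [s for s in sources if needed <= set(s[1])]
    if not candidates:
        raise LookupError(f"no source can answer columns {sorted(needed)}")
    # Assumption: fewer rows to scan means a cheaper query.
    return min(candidates, key=lambda s: s[2])[0]


# Hypothetical detail table plus two pre-built aggregates.
SOURCES = [
    ("sales_detail", {"day", "month", "store", "product", "amount"}, 1_000_000),
    ("sales_by_month_store", {"month", "store", "amount"}, 5_000),
    ("sales_by_month", {"month", "amount"}, 60),
]

print(choose_source({"month", "amount"}, SOURCES))
print(choose_source({"product", "amount"}, SOURCES))
```

A monthly total is routed to the small aggregate, while a per-product query falls back to the detail table because no aggregate keeps the product column.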
  • 17.
      • A warehouse management system (WMS) is a software application designed to support and optimize warehouse or distribution center management.
      • It helps management with the daily planning, organizing, staffing, directing, and controlling of available resources in order to move and store materials into, within, and out of a warehouse, while supporting staff in the performance of material movement and storage in and around a warehouse.
  • 18.
      1. Load management: Relates to the collection of information from internal or external sources. The loading process includes summarizing, manipulating and changing the data structures into a format that lends itself to analytical processing.
      2. Warehouse management: The management tasks include ensuring the warehouse's availability, the effective backup of its contents, and its security.
  • 19.
      3. Query management: Relates to the provision of access to the contents of the warehouse, and may include partitioning the information into different areas with different privileges for different users. Access may be provided through custom-built applications or ad hoc query tools.
  • 21.
      • Implementation includes loading preliminary data, implementing transformation programs, designing the user interface, developing standard queries and reports, and training warehouse users.
  • 22. ETL; Design user interface; Develop standard queries; Train users.
  • 23. The process of extracting data from source systems and bringing it into the data warehouse is commonly called ETL, which stands for:
      • Extraction: Retrieving all the required data from the source system using as few resources as possible.
      • Transformation, and
      • Loading.
  • 24.
      • Ways to perform the extract:
          • Update notification: If the source system is able to provide a notification that a record has been changed, this is the easiest way to get the data.
          • Incremental extract: Some source systems can identify which records have been modified and provide an extract of only those records. With a daily incremental extract, deleted records may not be handled.
          • Full extract: A full extract requires keeping a copy of the last extract in the same format in order to be able to identify changes. It handles deletions as well.
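The full-extract approach above can be sketched as a comparison between the previous extract and the new one. This is a minimal illustration, assuming each extract fits in memory as a dictionary keyed by primary key; the sample rows are hypothetical.

```python
def diff_full_extracts(previous, current):
    """Compare two full extracts, each a dict of primary key -> row.

    Returns (inserted, updated, deleted) as sorted lists of keys.
    """
    prev_keys = set(previous)
    curr_keys = set(current)
    inserted = sorted(curr_keys - prev_keys)
    # Keys missing from the new extract were deleted at the source.
    deleted = sorted(prev_keys - curr_keys)
    updated = sorted(k for k in prev_keys & curr_keys
                     if previous[k] != current[k])
    return inserted, updated, deleted


# Hypothetical extracts: yesterday's saved copy and today's full extract.
last_extract = {1: ("Alice", "Delhi"), 2: ("Bob", "Pune"), 3: ("Chen", "Agra")}
new_extract = {1: ("Alice", "Delhi"), 2: ("Bob", "Mumbai"), 4: ("Dev", "Goa")}

inserted, updated, deleted = diff_full_extracts(last_extract, new_extract)
print("inserted:", inserted, "updated:", updated, "deleted:", deleted)
```

Because the comparison sees every key on both sides, it detects the deleted record (key 3), which an extract containing only modified records would miss.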
  • 25.
      2. Clean: Ensures the quality of the data in the data warehouse.
      3. Transform: Applies a set of rules to transform the data from the source to the target. This includes converting measured data to the same dimension, using the same units, so that the data can later be joined. It also involves joining data from several sources, generating aggregates, generating surrogate keys, sorting, and deriving new calculated values.
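A few of the transform rules listed above (unit conversion, surrogate key generation, and aggregation) can be sketched as follows. The row layout, the source-system names, and the choice of grams as the common unit are assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical conversion factors to the common unit (grams).
TO_GRAMS = {"g": 1.0, "kg": 1000.0}


def transform(rows):
    """rows: iterable of (source_system, natural_key, weight, unit).

    Returns (surrogate_keys, facts, totals):
      surrogate_keys maps (source_system, natural_key) to a warehouse key,
      facts is a list of (surrogate_key, weight_in_grams),
      totals aggregates the weight per surrogate key.
    """
    surrogate_keys = {}
    facts = []
    for source, natural_key, weight, unit in rows:
        business_key = (source, natural_key)
        if business_key not in surrogate_keys:
            # Surrogate keys are simple sequential integers in this sketch.
            surrogate_keys[business_key] = len(surrogate_keys) + 1
        facts.append((surrogate_keys[business_key], weight * TO_GRAMS[unit]))

    totals = defaultdict(float)
    for key, grams in facts:
        totals[key] += grams
    return surrogate_keys, facts, dict(totals)


# Hypothetical rows from two source systems that reuse the same natural key.
sample = [("erp", "A1", 2.0, "kg"), ("crm", "A1", 500.0, "g"), ("erp", "A1", 1.0, "kg")]
keys, facts, totals = transform(sample)
print(keys, facts, totals)
```

The surrogate key keeps the two "A1" records apart even though their natural keys collide, and the unit conversion makes the weights comparable before they are aggregated.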
  • 26.
      4. Load: The load must be performed correctly and using as few resources as possible. The target of the load process is often a database. Referential integrity needs to be maintained by the ETL tool to ensure consistency.
      5. Managing the ETL process: The ETL process may fail. A failure can be caused by missing values in one of the reference tables, or simply by a connection or power outage. It is therefore necessary to design the ETL process with failure recovery in mind.
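One common way to design for failure recovery is to record a checkpoint after each successfully loaded batch, so that a restarted run resumes where the failed one stopped. The sketch below assumes in-memory stand-ins for the target table and the checkpoint store; `fail_on` is a hypothetical switch used only to simulate an outage.

```python
def run_load(batches, target, checkpoint, fail_on=None):
    """Load batches in order, resuming after the last committed batch.

    checkpoint: dict whose "done" entry counts committed batches.
    fail_on: index of the batch at which to simulate an outage (testing aid).
    """
    for index in range(checkpoint.get("done", 0), len(batches)):
        if index == fail_on:
            raise ConnectionError(f"simulated outage before batch {index}")
        target.extend(batches[index])
        # In a real system the load and this checkpoint update would have to
        # be committed atomically, for example in one database transaction.
        checkpoint["done"] = index + 1


batches = [[1, 2], [3, 4], [5]]
target, checkpoint = [], {}

try:
    run_load(batches, target, checkpoint, fail_on=1)  # first run fails midway
except ConnectionError as error:
    print("load interrupted:", error)

run_load(batches, target, checkpoint)  # the restart skips the committed batch
print(target, checkpoint)
```

The restarted run does not reload the first batch, so the target ends up with each row exactly once.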
  • 27.
      6. Staging: A staging area, or landing zone, is an intermediate storage area used for data processing during the ETL process. The primary motivations for using one are to increase the efficiency of ETL processes, ensure data integrity, and support data quality operations.
  • 28.
      • Commercial tools: Ab Initio, IBM InfoSphere DataStage, Informatica, Oracle Data Integrator and SAP Data Integrator.
      • Open source ETL tools: CloverETL, Apatar, Pentaho and Talend.
  • 29.
      • Data warehouses come in all shapes and sizes, which has a direct bearing on the cost and time involved.
      • The steps listed below summarize some of the points to consider:
          • Get professional advice
          • Plan the data
          • Who will use the data warehouse
          • Integration with external applications
  • 30. The key steps in developing a data warehouse can be summarized as follows:
      • Project initiation
      • Requirements analysis
      • Design (architecture, databases and applications)
      • Construction (selecting and installing tools, developing data feeds and building reports)
      • Deployment (release and training)
      • Maintenance
  • 32.
      • The client/server computing model applies to the software architecture that describes processing between applications and supporting services.
      • It represents distributed, cooperative processing; the relationship between client and server is a relationship between hardware and software components.
      • It covers a wide range of functions, services and other aspects of the distributed environment.
  • 33.
      • In host-based application processing, all processing is performed on one computer system with attached unintelligent, “dumb” terminals.
      • A single stand-alone PC, or an IBM mainframe with attached character-based display terminals, is an example of a host-based processing environment.
      • Host-based processing is totally non-distributed.
  • 35.
      • Slave computers are attached to a master computer and perform application-processing-related functions only as directed by their master.
      • Distribution of processing tends to be unidirectional, from the master to the slaves.
      • Slaves are capable of some limited local application processing.
      • E.g. a mainframe (host) computer, such as an IBM 3090, used with cluster controllers and intelligent terminals.
  • 37.
      • This generation used two models:
          1. Shared-device LAN processing environment: PCs are attached to a system device that allows them to share a common resource, such as a file server's hard disk or a print server's printer. E.g. Microsoft’s LAN Manager, which allows a LAN to have a system dedicated to file and print services.
  • 39.
          2. Client/server LAN processing environment: An extension of shared-device processing, e.g. SYBASE SQL Server. An application running on a PC sends a read request to its database server. The database server processes the request locally and sends only the requested records back to the PC application.
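The difference between the two LAN environments can be illustrated by counting how many records cross the network for the same request. This is a sketch only: the in-memory `sqlite3` database stands in for a remote database server (SQLite actually runs in-process), and the `sales` table and its rows are hypothetical.

```python
import sqlite3

# Hypothetical sales records: (id, region, amount).
ROWS = [(1, "north", 120), (2, "south", 80), (3, "north", 45), (4, "east", 60)]


def file_server_read(region):
    """Shared-device style: the whole file is shipped and the PC filters it."""
    transferred = list(ROWS)  # every record crosses the "network"
    matches = [row for row in transferred if row[1] == region]
    return matches, len(transferred)


def database_server_read(conn, region):
    """Client/server style: the server filters and ships only the matches."""
    cursor = conn.execute(
        "SELECT id, region, amount FROM sales WHERE region = ? ORDER BY id",
        (region,),
    )
    matches = cursor.fetchall()
    return matches, len(matches)


conn = sqlite3.connect(":memory:")  # stand-in for the database server
conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", ROWS)

print(file_server_read("north"))
print(database_server_read(conn, "north"))
```

Both calls return the same two matching records, but the shared-device version moves all four records to the PC first, while the client/server version moves only the two that were requested.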
  • 41.
      • Moves from a two-tiered architecture to a multi-tiered architecture.
      • The computing model deals with servers dedicated to applications, data, transaction management and system management.
      • Supports data structures ranging from relational to multidimensional to multimedia.
  • 43.
      • A distributed database system consists of loosely coupled sites that share no physical component.
      • The database systems that run on each site are independent of each other.
      • Transactions may access data at one or more sites.
  • 44.
      • In a homogeneous distributed database:
          • All sites have identical software.
          • Sites are aware of each other and agree to cooperate in processing user requests.
          • Each site surrenders part of its autonomy in terms of the right to change schemas or software.
          • The system appears to the user as a single system.
      • In a heterogeneous distributed database:
          • Different sites may use different schemas and software.
              ▪ Difference in schema is a major problem for query processing.
              ▪ Difference in software is a major problem for transaction processing.
          • Sites may not be aware of each other and may provide only limited facilities for cooperation in transaction processing.
  • 45. DDBMS architectures are generally developed depending on three parameters:
      • Distribution: It states the physical distribution of data across the different sites.
      • Autonomy: It indicates the distribution of control of the database system and the degree to which each constituent DBMS can operate independently.
      • Heterogeneity: It refers to the uniformity or dissimilarity of the data models, system components and databases.
  • 46.
      • Data replication
      • Fragmentation
    The three dimensions of distribution transparency are:
      • Location transparency
      • Fragmentation transparency
      • Replication transparency
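The three transparencies can be illustrated with a small in-memory sketch in which the caller reads and writes rows by key and never names a site, a fragment, or a replica. The class name, the site names, and the modulo placement rule are hypothetical; integer keys and a replication factor no larger than the number of sites are assumed.

```python
class DistributedTable:
    """Hides fragment placement and replicas behind put() and get()."""

    def __init__(self, site_names, replication_factor=2):
        self.site_names = list(site_names)
        self.sites = {name: {} for name in self.site_names}  # site -> rows
        self.replication_factor = replication_factor

    def _placement(self, key):
        # Horizontal fragmentation: the key decides the primary site, and
        # the copies go to the following sites in the list.
        start = key % len(self.site_names)
        return [self.site_names[(start + i) % len(self.site_names)]
                for i in range(self.replication_factor)]

    def put(self, key, row):
        for site in self._placement(key):  # write every replica
            self.sites[site][key] = row

    def get(self, key, unavailable=()):
        # Try the primary copy first, then fall back to the replicas.
        for site in self._placement(key):
            if site not in unavailable and key in self.sites[site]:
                return self.sites[site][key]
        raise KeyError(key)


table = DistributedTable(["delhi", "pune", "goa"])
table.put(4, {"customer": "Dev"})
print(table.get(4))
print(table.get(4, unavailable=("pune",)))  # served from the replica
```

The caller's code is identical whether the row is read from its primary site or from a replica, which is the point of location and replication transparency.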
  • 51.
      • Data warehouse operations mainly consist of huge data loads and index builds, generation of materialized views, and queries over large volumes of data. The underlying I/O system of a data warehouse should be built to meet these heavy requirements.
      • Architecture options:
          1. Symmetric multiprocessing (SMP): Two or more identical processors are connected to a single, shared main memory.
          2. Massively parallel processing (MPP): A large number of processors perform a set of coordinated computations in parallel.
      • Number of CPUs
      • Memory of the data warehouse
      • Number of disks
  • 52.
      • The server OS determines:
          • how quickly the server can fulfill client requests,
          • how many clients it can support concurrently and reliably,
          • how efficiently system resources such as memory, disk I/O and communication components are utilized.
  • 53.
      • Multiuser support
      • Preemptive multitasking
      • Multithreaded design
      • Memory protection: Concurrent tasks should not violate each other's memory.
      • Scalability
      • Security
      • Reliability
      • Availability
  • 54.
      • Relatively small and highly secure than uniprocessors.
      • Simplified architecture, extensibility, portability, real-time support, robust system security and multiprocessor support.
      • This architecture results in a highly modular OS that can support multiple OS “personalities” by configuring outside services as needed.
      • E.g. the Mach 3.0 microkernel, used by IBM to allow DOS, OS/2 and AIX to coexist on a single machine.
  • 55.
      • Distributed memory architecture:
          • Shared-nothing architecture
          • Shared-disk architecture
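The shared-nothing idea can be sketched as a partitioned aggregation: each node owns a disjoint slice of the rows, computes a partial result using only its own slice, and a coordinator merges the partials. In this illustration threads stand in for separate machines and hash partitioning is an assumed placement rule; the sample rows are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(rows, node_count):
    """Hash-partition (key, value) rows so each node owns a disjoint slice."""
    nodes = [[] for _ in range(node_count)]
    for key, value in rows:
        nodes[hash(key) % node_count].append((key, value))
    return nodes


def local_sum(node_rows):
    """Work done on one node, touching only that node's own rows."""
    totals = {}
    for key, value in node_rows:
        totals[key] = totals.get(key, 0) + value
    return totals


def parallel_sum(rows, node_count=4):
    nodes = partition(rows, node_count)
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        partials = list(pool.map(local_sum, nodes))
    merged = {}
    for partial in partials:
        # All rows for a key hash to the same node, so keys never overlap.
        merged.update(partial)
    return merged


rows = [("north", 10), ("south", 5), ("north", 7), ("east", 1)]
print(parallel_sum(rows))
```

Because every row for a given key lands on the same node, the nodes never need to exchange data during the aggregation; only the small partial results travel to the coordinator.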
  • 57. [Figure: four processor units (PU), each with its own local memory, connected through an interconnection network to a global shared disk subsystem.]
  • 58.
      • A cluster is a set of loosely coupled SMP machines connected by a high-speed interconnection network.
      • A cluster behaves just like a single large machine.