The document discusses data warehousing and OLAP technology for data mining. It defines a data warehouse as a subject-oriented, integrated, time-variant, and nonvolatile collection of data to support management decision making. It describes how a data warehouse uses a multi-dimensional data model with dimensions and measures. It also discusses efficient computation of data cubes, OLAP operations, and further developments in data cube technology like discovery-driven and multi-feature cubes to support data mining applications from information processing to analytical processing and knowledge discovery.
The document discusses data warehousing and OLAP technology for data mining. It defines a data warehouse as a subject-oriented, integrated, time-variant, and non-volatile collection of data that supports management decision making. It describes how data warehouses use a multi-dimensional data model with dimensions and facts to organize data into cubes that can be sliced, diced, and aggregated. It also discusses how data warehouse architecture, implementation, indexing techniques, and metadata repositories help optimize online analytical processing queries on historical and summarized data to support data mining.
DATA WAREHOUSE IMPLEMENTATION BY SAIKIRAN PANJALA (Saikiran Panjala)
This document discusses data warehouses, including what they are, how they are implemented, and how they can be further developed. It provides definitions of key concepts like data warehouses, data cubes, and OLAP. It also describes techniques for efficient data cube computation, indexing of OLAP data, and processing of OLAP queries. Finally, it discusses different approaches to data warehouse implementation and development of data cube technology.
These slides help in understanding what a data warehouse is and why we need it, and cover data warehouse architecture, OLAP, metadata, data marts, schemas for multidimensional data, and partitioning of a data warehouse.
The document provides an introduction to data warehousing. It defines a data warehouse as a subject-oriented, integrated, time-varying, and non-volatile collection of data used for organizational decision making. It describes key characteristics of a data warehouse such as maintaining historical data, facilitating analysis to improve understanding, and enabling better decision making. It also discusses dimensions, facts, ETL processes, and common data warehouse architectures like star schemas.
Designing high performance datawarehouse (Uday Kothari)
Just when the world of “Data 1.0” showed some signs of maturing, “Outside In” driven demands seem to have already initiated some disruptive changes to the data landscape. Parallel growth in the volume, velocity and variety of data, coupled with an incessant push to find newer insights and value from data, has posed a Big Question: Is Your Data Warehouse Relevant?
In short, the surrounding changes happening in real time are the new “Data 2.0”. It is characterized by feeding ever-hungry minds with sharper insights, whether related to regulation, finance, corporate action or risk management, or purely aimed at improving operational efficiency. The source in this new “Data 2.0” has to be commensurate with the outside-in demands from customers, regulators, stakeholders and business users; hence you need a high "relformance" (relevance + performance) data warehouse that is relevant to your business ecosystem and has the power to scale exponentially.
This webinar starts by giving the audience a sneak preview of what happened in the Data 1.0 world and which characteristics are shaping the new Data 2.0 world. It then delves deep into the challenges that growing data volumes have posed to data warehouse teams. It also presents some practical and proven methodologies to address these performance challenges. Finally, it highlights some thought-provoking ways to turbocharge your data warehouse initiatives by leveraging newer technologies like Hadoop. Overall, the webinar educates the audience on building high-performance, relevant data warehouses that are capable of meeting the newer demands while significantly driving down the total cost of ownership.
The document discusses various techniques for data warehousing and online analytical processing (OLAP), including constructing data warehouses, star schemas, materialized views, data cubes, and data mining. Specifically, it describes how a data warehouse can be used to integrate data from multiple sources and support complex OLAP queries run against historical data. It provides examples of star schemas, materialized views, data cubes, and market basket analysis to find frequent itemsets.
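As a rough illustration of the market basket analysis mentioned in this summary, the sketch below counts itemsets by brute force and keeps those that reach a minimum support. The baskets, the function name `frequent_itemsets`, and the thresholds are all hypothetical; the summarized document may well use a different method (Apriori, for example), which this sketch does not implement.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(baskets, min_support, max_size=2):
    """Count every itemset up to max_size and keep those that appear
    in at least min_support baskets (brute force, not Apriori)."""
    counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # dedupe and fix an item order
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[combo] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


# Hypothetical transactions, invented for illustration only.
baskets = [
    ["bread", "milk"],
    ["bread", "diapers", "beer"],
    ["milk", "diapers", "beer"],
    ["bread", "milk", "diapers"],
]

result = frequent_itemsets(baskets, min_support=2)
for itemset, support in sorted(result.items()):
    print(itemset, support)
```

Brute-force counting is only workable for small item catalogs; it is used here because it keeps the notion of support easy to see.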
This document provides an overview of OLAP cubes and multidimensional databases. It discusses key concepts such as star schemas, dimensions and hierarchies, cube aggregation and operators like roll-up and drill-down. It also compares the relational and multidimensional models, highlighting how multidimensional databases allow for intuitive analysis and fast retrieval of large datasets by predefining dimensional perspectives.
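The roll-up and drill-down operators named in this summary can be sketched with plain aggregation over a small fact list. The rows, the year > quarter hierarchy, and the helper `aggregate` are assumptions made for illustration, not taken from the summarized deck.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, quarter, city, sales).
facts = [
    (2023, "Q1", "Toronto", 100),
    (2023, "Q1", "Vancouver", 80),
    (2023, "Q2", "Toronto", 120),
    (2024, "Q1", "Toronto", 90),
]


def aggregate(rows, key_fn):
    """Sum the sales measure for each group produced by key_fn."""
    totals = defaultdict(int)
    for row in rows:
        totals[key_fn(row)] += row[3]
    return dict(totals)


# Finer level of the time hierarchy: (year, quarter).
by_quarter = aggregate(facts, lambda r: (r[0], r[1]))

# Roll-up: climb the hierarchy from quarter to year.
by_year = aggregate(facts, lambda r: r[0])

# Drill-down: return to the quarter level for one year of interest.
drill_2023 = {k: v for k, v in by_quarter.items() if k[0] == 2023}

print(by_year)
print(drill_2023)
```

Roll-up and drill-down are inverse moves along the same hierarchy: one coarsens the grouping key, the other refines it.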
A data warehouse is a subject-oriented, integrated, time-variant collection of data that supports management's decision-making processes. It contains data extracted from various operational databases and data sources. The data is cleaned, transformed, integrated and loaded into the data warehouse for analysis. A data warehouse uses a multidimensional model with facts and dimensions to allow for complex analytical and ad-hoc queries from multiple perspectives. It is separately administered from operational databases to avoid impacting transaction processing systems and allow optimized access for decision support.
The document discusses dimensional modeling and data warehousing. It describes how dimensional models are designed for understandability and ease of reporting rather than updates. Key aspects include facts and dimensions, with facts being numeric measures and dimensions providing context. Slowly changing dimensions are also covered, with types 1-3 handling changes to dimension attribute values over time.
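To make the three slowly changing dimension types concrete, here is a minimal sketch over dictionary rows. The column names (`customer_id`, `city`, `start_date`, `end_date`) and the function names are hypothetical, and a production Type 2 design would normally also assign a new surrogate key to each version, which is omitted here.

```python
from datetime import date


def scd_type1(row, attr, new_value):
    """Type 1: overwrite the attribute in place; no history is kept."""
    updated = dict(row)
    updated[attr] = new_value
    return updated


def scd_type2(rows, key_col, key, attr, new_value, change_date):
    """Type 2: close the current row and append a new version of it."""
    out = []
    for row in rows:
        if row[key_col] == key and row["end_date"] is None:
            out.append({**row, "end_date": change_date})
            out.append({**row, attr: new_value,
                        "start_date": change_date, "end_date": None})
        else:
            out.append(row)
    return out


def scd_type3(row, attr, new_value):
    """Type 3: keep only the immediately previous value in an extra column."""
    updated = dict(row)
    updated["previous_" + attr] = row[attr]
    updated[attr] = new_value
    return updated


# Hypothetical dimension row, invented for illustration.
current = {"customer_id": 7, "city": "Austin",
           "start_date": date(2020, 1, 1), "end_date": None}

t1 = scd_type1(current, "city", "Denver")
t2 = scd_type2([current], "customer_id", 7, "city", "Denver", date(2024, 6, 1))
t3 = scd_type3(current, "city", "Denver")
```

The trade-off is history versus size: Type 1 loses the old value, Type 2 keeps every version as a separate row, and Type 3 keeps exactly one prior value.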
The document discusses the goals and requirements for building a data warehouse for the SF Goodwill Retail organization. The data warehouse would provide a single place for sales and inventory reports, allow automated reporting available from any location, and pull data from POS systems for consolidated performance reporting and comparisons to goals. It would also standardize the design and development process using common tools like SQL, HTML and PHP. A variety of standardized reports would be available through a web interface, including high-level summaries, drill-down details, filtering and exporting capabilities.
The document discusses the need for data warehousing and provides examples of how data warehousing can help companies analyze data from multiple sources to help with decision making. It describes common data warehouse architectures like star schemas and snowflake schemas. It also outlines the process of building a data warehouse, including data selection, preprocessing, transformation, integration and loading. Finally, it discusses some advantages and disadvantages of data warehousing.
This document discusses data warehousing and online analytic processing (OLAP). It introduces key concepts such as data warehouses, OLAP, multidimensional data models, dimension hierarchies, and OLAP queries including roll-up, drill-down, pivoting, slicing and dicing. It also covers implementation issues such as indexing techniques and view maintenance to enable interactive queries for OLAP.
The document discusses data warehousing and OLAP (online analytical processing). It defines a data warehouse as a subject-oriented, integrated, time-variant and non-volatile collection of data used to support management decision making. The document outlines common data warehouse architectures like star schemas and snowflake schemas and discusses how data is modeled and organized in multidimensional data cubes. It also describes typical OLAP operations for analyzing and exploring cube data like roll-up, drill-down, slice and dice.
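Slice and dice can be sketched on a tiny cube stored as a dictionary keyed by dimension coordinates. The dimension names, cell values, and the helpers `slice_cube` and `dice_cube` are assumptions for illustration only.

```python
# Hypothetical cube cells: (time, item, location) -> sales.
DIMS = ("time", "item", "location")
cube = {
    ("Q1", "phone", "Toronto"): 100,
    ("Q1", "laptop", "Toronto"): 50,
    ("Q2", "phone", "Toronto"): 70,
    ("Q2", "phone", "Chicago"): 40,
}


def slice_cube(cells, dim, value):
    """Slice: fix one dimension to a single value and drop it from the key."""
    i = DIMS.index(dim)
    return {c[:i] + c[i + 1:]: v for c, v in cells.items() if c[i] == value}


def dice_cube(cells, **allowed):
    """Dice: keep cells whose coordinates fall inside the allowed value
    sets of two or more dimensions; dimensionality is unchanged."""
    def keep(coords):
        return all(coords[DIMS.index(d)] in vals for d, vals in allowed.items())
    return {c: v for c, v in cells.items() if keep(c)}


q1_slice = slice_cube(cube, "time", "Q1")
phone_dice = dice_cube(cube, time={"Q1", "Q2"}, item={"phone"})

print(q1_slice)
print(phone_dice)
```

A slice yields a sub-cube with one fewer dimension, whereas a dice selects a smaller cube over the same dimensions.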
A data warehouse is a subject-oriented, consolidated collection of integrated data from multiple sources used to support management decision making. It is separate from operational databases and contains historical data for analysis. Data warehouses use a star schema with fact and dimension tables and support online analytical processing (OLAP) for complex analysis and reporting.
A data warehouse is a consolidated view of enterprise data structured for dynamic queries and analytics. It has the following key characteristics: integrated, subject-oriented, time-variant, and non-volatile. A data warehouse uses a three-tier architecture including a database bottom tier, middle OLAP server tier, and top reporting tools tier. It enables improved decision making by storing large volumes of historical data separately from operational systems and facilitating analysis through dimensional modeling.
Business Intelligence: Multidimensional Analysis (Michael Lamont)
An introduction to multidimensional business intelligence and OnLine Analytical Processing (OLAP) suitable for both a technical and non-technical audience. Covers dimensions, attributes, measures, Key Performance Indicators (KPIs), aggregates, hierarchies, and data cubes.
The document discusses various business analysis tools and techniques. It begins by defining business analysis and the responsibilities of business analysts. It then covers topics like reporting tools, query tools, OLAP, data mining, and executive information systems. Under OLAP, it discusses multidimensional data modeling concepts like star schemas, snowflake schemas, and fact constellations. It also covers OLAP operations and different types of OLAP servers including MOLAP, ROLAP, and HOLAP servers.
This document provides an overview of key concepts related to data warehousing including what a data warehouse is, common data warehouse architectures, types of data warehouses, and dimensional modeling techniques. It defines key terms like facts, dimensions, star schemas, and snowflake schemas and provides examples of each. It also discusses business intelligence tools that can analyze and extract insights from data warehouses.
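A star schema of the kind described here can be sketched with Python's built-in `sqlite3` module: one fact table holding measures and foreign keys, surrounded by dimension tables holding descriptive attributes. The table names, columns, and rows below are hypothetical and chosen only to show the shape of a typical fact-to-dimension join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: descriptive context (hypothetical columns).
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, quarter TEXT);

-- Fact table: numeric measures plus a foreign key to each dimension.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id INTEGER REFERENCES dim_date(date_id),
    units INTEGER,
    revenue REAL
);

INSERT INTO dim_product VALUES (1, 'phone', 'electronics'), (2, 'desk', 'furniture');
INSERT INTO dim_date VALUES (10, 2023, 'Q1'), (11, 2023, 'Q2');
INSERT INTO fact_sales VALUES (1, 10, 3, 900.0), (1, 11, 2, 600.0), (2, 10, 1, 250.0);
""")

# Typical star join: aggregate the fact measure by dimension attributes.
rows = conn.execute("""
    SELECT p.category, d.year, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    JOIN dim_date AS d ON d.date_id = f.date_id
    GROUP BY p.category, d.year
    ORDER BY p.category
""").fetchall()

for row in rows:
    print(row)
conn.close()
```

A snowflake schema would further normalize a dimension, for example by moving `category` into its own table referenced from `dim_product`.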
This document provides an overview and summary of Hyperion products and the Hyperion System 9 platform. It describes the key components of Hyperion including BI+, Planning, Performance Management, and Essbase. It then summarizes the typical architecture of a multidimensional database with Essbase and the lifecycle of building an Essbase database including dimensional modeling, data loading, and reporting.
The document outlines the agenda for a data warehousing training course. The agenda covers topics such as data warehouse structure and modeling, extract transform load (ETL) processes, dimensional modeling, aggregation, online analytical processing (OLAP), and data marts. Time is allocated to discuss loading, refreshing, and querying the data warehouse.
Data Warehousing and Bitmap Indexes - More than just some bits (Trivadis)
The document discusses bitmap indexes and their usage in data warehousing. It begins with an overview of bitmap index concepts and how they compare to B-tree indexes. It then covers best practices for using bitmap indexes in a star schema data warehouse, including maintaining bitmap indexes during ETL processes. The presentation concludes that bitmap indexes are highly effective for data warehousing queries and there are few reasons to use B-tree indexes within a data warehouse.
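The core idea of a bitmap index can be sketched in a few lines: each distinct column value gets a bit vector with one bit per row, and predicates on several columns combine through bitwise AND and OR. This is a conceptual sketch using Python integers as bit vectors, with invented column data; it does not reflect how any particular database stores or compresses its bitmaps.

```python
def build_bitmap_index(values):
    """Map each distinct value to an int whose bit i is set
    when row i holds that value."""
    index = {}
    for row_id, value in enumerate(values):
        index[value] = index.get(value, 0) | (1 << row_id)
    return index


def matching_rows(bitmap):
    """Decode a bitmap back into an ascending list of row ids."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


# Hypothetical low-cardinality columns of a five-row fact table.
region = ["east", "west", "east", "north", "west"]
status = ["open", "open", "closed", "open", "closed"]

region_idx = build_bitmap_index(region)
status_idx = build_bitmap_index(status)

# WHERE region = 'east' AND status = 'open'
and_hits = matching_rows(region_idx["east"] & status_idx["open"])

# WHERE region = 'west' OR region = 'north'
or_hits = matching_rows(region_idx["west"] | region_idx["north"])

print(and_hits, or_hits)
```

Because each bitmap costs one bit per row per distinct value, the approach suits low-cardinality columns, which are common in dimension attributes.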
Introduction to Data Warehousing: Introduction, Necessity, Framework of the data warehouse, options, developing data warehouses, end points.
Data Warehousing Design Consideration and Dimensional Modeling: Defining Dimensional Model, Granularity of Facts, Additivity of Facts, Functional dependency of the Data, Helper Tables, Implementation of many-to-many relationships between fact and dimensional modelling.
Business Intelligence Data Warehouse System (Kiran kumar)
This document provides an overview of data warehousing and business intelligence concepts. It discusses:
- What a data warehouse is and its key properties like being integrated, non-volatile, time-variant and subject-oriented.
- Common data warehouse architectures including dimensional modeling, ETL processes, and different layers like the data storage layer and presentation layer.
- How data marts are subsets of the data warehouse that focus on specific business functions or departments.
- Different types of dimension tables and slowly changing dimensions.
- How business intelligence uses the data warehouse for analysis, querying, reporting and generating insights to help with decision making.
Data warehouse implementation design for a Retail business (Arsalan Qadri)
The document contains an end-to-end data warehouse design, from SKU procurement to SKU sale. Additionally, a BI dashboard has been created in Tableau to mine the warehouse, with SKU as the grain. The data can be aggregated at the level of Supplier, Store, Location, Inventory, Sale Date, Time in Warehouse, etc.
Data warehouse-dimensional-modeling-and-design (Sarita Kataria)
This document provides an overview of data warehousing, dimensional modeling, and online analytical processing (OLAP). It defines key concepts in data warehousing like the data mart, metadata, cube, extraction transformation and loading (ETL), and data mining. Dimensional modeling is presented as an important technique for data warehouse design that uses facts, dimensions, and star or snowflake schemas. Finally, the document discusses OLAP features like multidimensional views and time intelligence, and different OLAP system types including multidimensional, relational, and hybrid OLAP.
This document discusses key concepts related to data warehousing including:
- The definition of a data warehouse as a subject-oriented, integrated, time-variant collection of data used for analysis and decision making.
- Common features of data warehouses such as being separate from operational databases, containing consolidated historical data, and being non-volatile.
- Types of data warehouse applications including information processing, analytical processing, and data mining.
- Common schemas used in data warehousing including star schemas, snowflake schemas, and fact constellation schemas.
Become BI Architect with 1KEY Agile BI Suite - OLAP (Dhiren Gala)
Business intelligence uses applications and technologies to analyze data and help users make better business decisions. Online transaction processing (OLTP) is used for daily operations such as transaction processing, while online analytical processing (OLAP) is used for data analysis and decision making. Data warehouses integrate data from different sources to provide a centralized system for analysis and reporting. Dimensional modeling approaches like star schemas and snowflake schemas organize data to support OLAP.
The document discusses decision support, data warehousing, and online analytical processing (OLAP). It outlines the evolution of decision support from batch reporting in the 1960s to modern data warehousing with OLAP engines. Key aspects covered include the differences between OLTP and OLAP systems, data warehouse architecture including star schemas, and approaches to OLAP including relational and multidimensional servers.
This document discusses data warehousing and OLAP (online analytical processing) technology. It defines a data warehouse as a subject-oriented, integrated, time-variant, and nonvolatile collection of data to support management decision making. It describes how data warehouses use a multi-dimensional data model with facts and dimensions to organize historical data from multiple sources for analysis. Common data warehouse architectures like star schemas and snowflake schemas are also summarized.
Data Warehousing for students education.pptx (jainyshah20)
This document discusses data warehousing and OLAP technology. It defines a data warehouse as a subject-oriented, integrated, time-variant, and nonvolatile collection of data used to support management decision making. Key aspects covered include the multi-dimensional data model using cubes and dimensions, various data warehouse architectures like star schemas and snowflake schemas, and OLAP operations for analysis like roll-up, drill-down, slice and dice. Building a data warehouse requires a range of business, technology, and program management skills.
The document discusses data warehousing and data mining. It covers topics like data warehouse implementation, efficient cube computation, indexing OLAP data, OLAP query processing, and OLAP server architectures. It also discusses challenges in data mining like data types, quality, preprocessing, and measures of similarity. The document focuses on efficient implementation of data warehouses to support fast OLAP queries through techniques like partial cube materialization and indexing.
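Partial cube materialization can be sketched as follows: enumerate every possible group-by (cuboid) over the dimensions, then precompute only a chosen subset. The dimension names, fact rows, and the hand-picked `chosen` list are hypothetical; real systems select cuboids from the query workload and storage cost, which this sketch does not attempt. It also ignores concept hierarchies, so n dimensions give exactly 2^n cuboids.

```python
from collections import defaultdict
from itertools import combinations

DIMS = ("item", "city", "year")

# Hypothetical base fact rows with a single measure, sales.
facts = [
    {"item": "phone", "city": "Toronto", "year": 2023, "sales": 10},
    {"item": "phone", "city": "Chicago", "year": 2023, "sales": 5},
    {"item": "laptop", "city": "Toronto", "year": 2024, "sales": 7},
]


def all_cuboids(dims):
    """Every subset of the dimensions, from the apex () to the base cuboid."""
    return [combo for k in range(len(dims) + 1)
            for combo in combinations(dims, k)]


def materialize(rows, group_by):
    """Precompute SUM(sales) for one cuboid."""
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[d] for d in group_by)] += row["sales"]
    return dict(totals)


cuboids = all_cuboids(DIMS)  # 2**3 = 8 group-bys

# Partial materialization: store only a hand-picked subset (an assumption).
chosen = [(), ("item",), ("item", "city")]
store = {group_by: materialize(facts, group_by) for group_by in chosen}

print(len(cuboids))
print(store[("item",)])
```

Queries on a cuboid that was not stored can still be answered by aggregating a stored finer one: totals by `("item",)` could be derived from `("item", "city")` without rescanning the base facts.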
The document discusses OLAP cubes and data warehousing. It defines OLAP as online analytical processing used to analyze aggregated data in data warehouses. Key concepts covered include star schemas, dimensions and facts, cube operations like roll-up and drill-down, and different OLAP architectures like MOLAP and ROLAP that use multidimensional or relational storage respectively.
Chapter 4. Data Warehousing and On-Line Analytical Processing.pptSubrata Kumer Paul
Jiawei Han, Micheline Kamber and Jian Pei
Data Mining: Concepts and Techniques, 3rd ed.
The Morgan Kaufmann Series in Data Management Systems
Morgan Kaufmann Publishers, July 2011. ISBN 978-0123814791
This document discusses OLAP (Online Analytical Processing) and how it compares to OLTP (Online Transactional Processing). It defines OLAP as software for performing multidimensional analysis on large volumes of data from a data warehouse or data mart. Key points:
- OLAP is optimized for analysis and complex queries, while OLTP is optimized for processing high volumes of transactions.
- OLAP uses multidimensional data models (cubes) to organize and analyze data across multiple dimensions like time, products, locations. This allows for fast analysis on aggregated data.
- A hypercube model can represent data with more than three dimensions by displaying the data across multiple tables and pages.
The document provides an overview of key concepts in data warehousing and business intelligence, including:
1) It defines data warehousing concepts such as the characteristics of a data warehouse (subject-oriented, integrated, time-variant, non-volatile), grain/granularity, and the differences between OLTP and data warehouse systems.
2) It discusses the evolution of business intelligence and key components of a data warehouse such as the source systems, staging area, presentation area, and access tools.
3) It covers dimensional modeling concepts like star schemas, snowflake schemas, and slowly and rapidly changing dimensions.
Data warehousing and online analytical processingVijayasankariS
The document discusses data warehousing and online analytical processing (OLAP). It defines a data warehouse as a subject-oriented, integrated, time-variant and non-volatile collection of data used to support management decision making. It describes key concepts such as data warehouse modeling using data cubes and dimensions, extraction, transformation and loading of data, and common OLAP operations. The document also provides examples of star schemas and how they are used to model data warehouses.
This document provides an overview of data warehousing and related concepts. It defines a data warehouse as a centralized database for analysis and reporting that stores current and historical data from multiple sources. The document describes key elements of data warehousing including Extract-Transform-Load (ETL) processes, multidimensional data models, online analytical processing (OLAP), and data marts. It also outlines advantages such as enhanced access and consistency, and disadvantages like time required for data extraction and loading.
William Inmon is considered the father of data warehousing. He has over 35 years of experience in database technology management and data warehouse design. Inmon helped define key characteristics of data warehouses such as being subject oriented, integrated, nonvolatile, and time-variant. He has authored over 45 books and 650 articles on topics related to building, using, and maintaining data warehouses and their role in decision support.
The document discusses advances in database querying and summarizes key topics including data warehousing, online analytical processing (OLAP), and data mining. It describes how data warehouses integrate data from various sources to enable decision making, and how OLAP tools allow users to analyze aggregated data and model "what-if" scenarios. The document also covers data transformation techniques used to build the data warehouse.
The document defines a data warehouse as a subject-oriented, integrated, time-variant and non-volatile collection of data to support management decision making. A data warehouse is maintained separately from operational databases and provides a platform for consolidated historical data analysis. Key features of a data warehouse include dimensional modeling using facts, dimensions, and star or snowflake schemas.
Data Mining Concept & Technique-ch04.pptMutiaSari53
This chapter discusses data warehousing and online analytical processing (OLAP). It defines a data warehouse as a subject-oriented collection of integrated and nonvolatile data used for analysis. Key concepts covered include the multidimensional data cube model used to organize warehouse data, ETL processes for loading data into the warehouse, and star and snowflake schemas for conceptual modeling. The chapter also distinguishes between OLTP and OLAP systems and operations.
The document provides an overview of data warehousing, decision support, online analytical processing (OLAP), and data mining. It discusses what data warehousing is, how it can help organizations make better decisions by integrating data from various sources and making it available for analysis. It also describes OLAP as a way to transform warehouse data into meaningful information for interactive analysis, and lists some common OLAP operations like roll-up, drill-down, slice and dice, and pivot. Finally, it gives a brief introduction to data mining as the process of extracting patterns and relationships from data.
The document discusses data warehousing concepts including:
1) A data warehouse is a subject-oriented, integrated, and non-volatile collection of data used for decision making. It stores historical and current data from multiple sources.
2) The architecture of a data warehouse is typically three-tiered, with an operational data tier, data warehouse/data mart tier for storage, and client access tier. OLAP servers allow analysis of stored data.
3) ROLAP and MOLAP refer to relational and multidimensional approaches for OLAP. ROLAP dynamically generates data cubes from relational databases, while MOLAP pre-calculates and stores aggregated data in multidimensional structures.
This document discusses data warehousing and online analytical processing (OLAP) technology. It defines a data warehouse, compares it to operational databases, and explains how OLAP systems organize and present data for analysis. The document also describes multidimensional data models, common OLAP operations, and the steps to design and construct a data warehouse. Finally, it discusses applications of data warehouses and efficient processing of OLAP queries.
This document discusses data warehousing and online analytical processing (OLAP) technology. It defines a data warehouse, compares it to operational databases, and explains how OLAP systems organize and present data for analysis. The document also describes multidimensional data models, common OLAP operations, and the steps to design and construct a data warehouse. Finally, it discusses applications of data warehouses and efficient processing of OLAP queries.
2. Data Warehousing and OLAP
Technology for Data Mining
What is a data warehouse?
A multi-dimensional data model
Data warehouse architecture
Data warehouse implementation
Further development of data cube technology
From data warehousing to data mining
3. What is a Data Warehouse?
Defined in many different ways, but not rigorously.
A decision support database that is maintained separately from
the organization’s operational database
Supports information processing by providing a solid platform of
consolidated, historical data for analysis.
“A data warehouse is a subject-oriented, integrated, time-variant,
and nonvolatile collection of data in support of management’s
decision-making process.”—W. H. Inmon
4. Data Warehouse: Subject-Oriented
Organized around major subjects, such as
customer, product, sales.
Focusing on the modeling and analysis of data
for decision makers, not on daily operations or
transaction processing.
Provide a simple and concise view around
particular subject issues by excluding data that
are not useful in the decision support process.
5. Data Warehouse: Integrated
Constructed by integrating multiple,
heterogeneous data sources
relational databases, flat files, on-line transaction
records
Data cleaning and data integration techniques
are applied.
Ensure consistency in naming conventions,
encoding structures, attribute measures, etc. among
different data sources
E.g., Hotel price: currency, tax, breakfast covered, etc.
When data is moved to the warehouse, it is
converted.
6. Data Warehouse: Time-Variant
The time horizon for the data warehouse is significantly longer than
that of operational systems.
Operational database: current value data.
Data warehouse data: provide information from a historical
perspective (e.g., past 5-10 years)
Every key structure in the data warehouse
Contains an element of time, explicitly or implicitly
But the key of operational data may or may not contain “time
element”.
7. Data Warehouse: Nonvolatile
A physically separate store of data transformed
from the operational environment.
Operational update of data does not occur in the
data warehouse environment.
Does not require transaction processing, recovery,
and concurrency control mechanisms
Requires only two operations in data accessing:
initial loading of data and access of data.
8. Data Warehouse vs. Operational DBMS
OLTP (on-line transaction processing)
Major task of traditional relational DBMS
Day-to-day operations: purchasing, inventory, banking,
manufacturing, payroll, registration, accounting, etc.
OLAP (on-line analytical processing)
Major task of data warehouse system
Data analysis and decision making
9. OLTP vs. OLAP
Feature | OLTP | OLAP
users | clerk, IT professional | knowledge worker
function | day to day operations | decision support
DB design | application-oriented | subject-oriented
data | current, up-to-date; detailed, flat relational; isolated | historical; summarized, multidimensional; integrated, consolidated
usage | repetitive | ad-hoc
access | read/write; index/hash on prim. key | lots of scans
unit of work | short, simple transaction | complex query
# records accessed | tens | millions
# users | thousands | hundreds
DB size | 100MB-GB | 100GB-TB
metric | transaction throughput | query throughput, response
10. Why a Separate Data Warehouse?
High performance for both systems
DBMS— tuned for OLTP: access methods, indexing,
concurrency control, recovery
Warehouse—tuned for OLAP: complex OLAP queries,
multidimensional view, consolidation.
Different functions and different data:
missing data: Decision support requires historical data
which operational DBs do not typically maintain
data consolidation: DS requires consolidation
(aggregation, summarization) of data from
heterogeneous sources
data quality: different sources typically use inconsistent
data representations, codes and formats which have to
be reconciled
12. From Tables and Spreadsheets to Data Cubes
A data warehouse is based on a multidimensional data model
which views data in the form of a data cube
A data cube, such as sales, allows data to be modeled and viewed
in multiple dimensions
Dimension tables, such as item (item_name, brand, type), or
time(day, week, month, quarter, year)
Fact table contains measures (such as dollars_sold) and keys
to each of the related dimension tables
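The fact/dimension layout above can be sketched in a few lines of Python. This is only an illustration: the key names (item_key, time_key), the helper dollars_by, and every value are assumptions, not part of the original slides.

```python
# Hypothetical dimension tables, keyed by surrogate keys (illustrative values only).
item_dim = {
    1: {"item_name": "TV", "brand": "Acme", "type": "electronics"},
    2: {"item_name": "Radio", "brand": "Acme", "type": "electronics"},
}
time_dim = {
    10: {"day": 1, "month": 1, "quarter": "Q1", "year": 1997},
    11: {"day": 1, "month": 4, "quarter": "Q2", "year": 1997},
}

# Fact table: each row holds the measure dollars_sold plus keys into the dimensions.
sales_fact = [
    {"item_key": 1, "time_key": 10, "dollars_sold": 400.0},
    {"item_key": 2, "time_key": 10, "dollars_sold": 100.0},
    {"item_key": 1, "time_key": 11, "dollars_sold": 250.0},
]


def dollars_by(attribute, dim, key_name):
    """Sum dollars_sold grouped by one attribute of one dimension table."""
    totals = {}
    for row in sales_fact:
        value = dim[row[key_name]][attribute]
        totals[value] = totals.get(value, 0.0) + row["dollars_sold"]
    return totals


by_quarter = dollars_by("quarter", time_dim, "time_key")
by_item = dollars_by("item_name", item_dim, "item_key")
```

The fact rows stay narrow (keys plus measures), while descriptive attributes live once in the dimension tables.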
13. Multidimensional Data
Sales volume as a function of product, month, and region
Dimensions: Product, Location, Time
Hierarchical summarization paths
Product: Industry > Category > Product
Location: Region > Country > City > Office
Time: Year > Quarter > Month > Day, and Week > Day
[Figure: sales cube with axes Product, Month, and Region]
15. Cuboids Corresponding to the Cube
0-D (apex) cuboid: all
1-D cuboids: product; date; country
2-D cuboids: (product, date); (product, country); (date, country)
3-D (base) cuboid: (product, date, country)
16. Typical OLAP Operations
Roll up (drill-up): summarize data
by climbing up hierarchy or by dimension reduction
Drill down (roll down): reverse of roll-up
from higher level summary to lower level summary or detailed
data, or introducing new dimensions
Slice and dice:
project and select
Pivot (rotate):
reorient the cube, visualization, 3D to series of 2D planes.
Other operations
drill across: involving (across) more than one fact table
drill through: through the bottom level of the cube to its backend relational tables (using SQL)
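A minimal sketch of the operations listed above, using a plain dictionary as the cube. The cell values, the city-to-country hierarchy, and all function names are assumptions made for illustration.

```python
from collections import defaultdict

# Tiny illustrative cube: (product, city, quarter) -> sales. All values are made up.
cube = {
    ("TV", "Vancouver", "Q1"): 100,
    ("TV", "Toronto", "Q1"): 80,
    ("TV", "Vancouver", "Q2"): 120,
    ("PC", "Vancouver", "Q1"): 200,
    ("PC", "Toronto", "Q2"): 150,
}
DIMS = ("product", "city", "quarter")

# Assumed concept hierarchy for the location dimension: city -> country.
CITY_TO_COUNTRY = {"Vancouver": "Canada", "Toronto": "Canada"}


def roll_up_remove(cells, dim):
    """Roll up by dimension reduction: aggregate away one dimension."""
    i = DIMS.index(dim)
    out = defaultdict(int)
    for key, value in cells.items():
        out[key[:i] + key[i + 1:]] += value
    return dict(out)


def roll_up_city_to_country(cells):
    """Roll up by climbing the hierarchy city -> country."""
    out = defaultdict(int)
    for (product, city, quarter), value in cells.items():
        out[(product, CITY_TO_COUNTRY[city], quarter)] += value
    return dict(out)


def slice_(cells, dim, value):
    """Slice: select a single value on one dimension."""
    i = DIMS.index(dim)
    return {k: v for k, v in cells.items() if k[i] == value}


def dice(cells, **allowed):
    """Dice: select on two or more dimensions, e.g. city={"Vancouver"}."""
    return {
        k: v for k, v in cells.items()
        if all(k[DIMS.index(d)] in vals for d, vals in allowed.items())
    }


def pivot(cells, order):
    """Pivot: reorder the dimensions of every cell key."""
    idx = [DIMS.index(d) for d in order]
    return {tuple(k[i] for i in idx): v for k, v in cells.items()}


by_product_quarter = roll_up_remove(cube, "city")
by_country = roll_up_city_to_country(cube)
q1_only = slice_(cube, "quarter", "Q1")
tv_vancouver = dice(cube, product={"TV"}, city={"Vancouver"})
rotated = pivot(cube, ("quarter", "city", "product"))
```

Drill-down is the reverse of roll-up and needs the finer-grained cells, so it cannot be derived from a rolled-up result alone.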
18. Data Warehouse Design Process
Top-down, bottom-up approaches or a combination of both
Top-down: Starts with overall design and planning (mature)
Bottom-up: Starts with experiments and prototypes (rapid)
From software engineering point of view
Waterfall: structured and systematic analysis at each step before
proceeding to the next
Spiral: rapid generation of increasingly functional systems, with short
turnaround time
Typical data warehouse design process
Choose a business process to model, e.g., orders, invoices, etc.
Choose the grain (atomic level of data) of the business process
Choose the dimensions that will apply to each fact table record
Choose the measure that will populate each fact table record
21. Efficient Data Cube Computation
Data cube can be viewed as a lattice of cuboids
The bottom-most cuboid is the base cuboid
The top-most cuboid (apex) contains only one cell
How many cuboids are there in an n-dimensional cube where dimension i has L_i levels?
$T = \prod_{i=1}^{n} (L_i + 1)$
Materialization of data cube
Materialize every (cuboid) (full materialization), none
(no materialization), or some (partial materialization)
Selection of which cuboids to materialize
Based on size, sharing, access frequency, etc.
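As a quick check of the cuboid-count formula, the short sketch below evaluates it for two cases. The level counts in the second case are assumed purely for illustration.

```python
from math import prod


def total_cuboids(levels_per_dimension):
    """T = product over dimensions of (L_i + 1); the +1 is the virtual top level 'all'."""
    return prod(levels + 1 for levels in levels_per_dimension)


# No hierarchies: each of 3 dimensions has a single level, so T = 2**3.
flat = total_cuboids([1, 1, 1])
# Assumed hierarchies: time has 4 levels, location 4, product 3 -> 5 * 5 * 4.
with_hierarchies = total_cuboids([4, 4, 3])
```

The count grows multiplicatively with every added dimension or level, which is why partial materialization is usually the practical choice.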
22. Cube Operation
Cube definition and computation in DMQL
define cube sales[item, city, year]: sum(sales_in_dollars)
compute cube sales
Transform it into a SQL-like language (with a new operator
cube by, introduced by Gray et al.’96)
SELECT item, city, year, SUM (amount)
FROM SALES
CUBE BY item, city, year
[Figure: lattice of the group-bys to compute: (); (city), (item), (year); (city, item), (city, year), (item, year); (city, item, year)]
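What the cube by operator asks for can be sketched naively in Python: one group-by per subset of the listed dimensions. The rows and the function name cube_by are illustrative assumptions, and this version recomputes every group-by from the base rows rather than sharing work between cuboids as an efficient implementation would.

```python
from itertools import combinations
from collections import defaultdict

# Illustrative SALES rows: (item, city, year, amount). Values are made up.
SALES = [
    ("TV", "Vancouver", 1997, 100),
    ("TV", "Toronto", 1997, 80),
    ("PC", "Vancouver", 1998, 200),
]
DIMS = ("item", "city", "year")


def cube_by(rows, dims):
    """Compute SUM(amount) for every subset of dims (2**len(dims) group-bys)."""
    result = {}
    for k in range(len(dims) + 1):
        for group in combinations(range(len(dims)), k):
            totals = defaultdict(int)
            for row in rows:
                key = tuple(row[i] for i in group)
                totals[key] += row[-1]  # the amount is the last field of each row
            result[tuple(dims[i] for i in group)] = dict(totals)
    return result


cube = cube_by(SALES, DIMS)
```

The empty group-by `()` is the apex cuboid and holds the single grand total.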
23. Indexing OLAP Data: Bitmap Index
Index on a particular column
Each value in the column has a bit vector: bit-op is fast
The length of the bit vector: # of records in the base table
The i-th bit is set if the i-th row of the base table has the value for
the indexed column
not suitable for high cardinality domains
Base table
Cust | Region | Type
C1 | Asia | Retail
C2 | Europe | Dealer
C3 | Asia | Dealer
C4 | America | Retail
C5 | Europe | Dealer
Index on Region
RecID | Asia | Europe | America
1 | 1 | 0 | 0
2 | 0 | 1 | 0
3 | 1 | 0 | 0
4 | 0 | 0 | 1
5 | 0 | 1 | 0
Index on Type
RecID | Retail | Dealer
1 | 1 | 0
2 | 0 | 1
3 | 0 | 1
4 | 1 | 0
5 | 0 | 1
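The bitmap index on this slide can be sketched with Python integers as bit vectors. The five rows are the base table from the slide; the function names and the choice of storing bit i for row i (counting from the least significant bit) are assumptions of this sketch.

```python
# Base table from the slide: (Cust, Region, Type), in RecID order 1..5.
BASE = [
    ("C1", "Asia", "Retail"),
    ("C2", "Europe", "Dealer"),
    ("C3", "Asia", "Dealer"),
    ("C4", "America", "Retail"),
    ("C5", "Europe", "Dealer"),
]


def build_bitmap_index(rows, column):
    """Return {value: int}; bit i (from the least significant) is set if row i has the value."""
    index = {}
    for i, row in enumerate(rows):
        index[row[column]] = index.get(row[column], 0) | (1 << i)
    return index


def matching_customers(bits):
    """Decode a bit vector back into the customer ids of the matching rows."""
    return [row[0] for i, row in enumerate(BASE) if (bits >> i) & 1]


region_index = build_bitmap_index(BASE, 1)
type_index = build_bitmap_index(BASE, 2)

# "Region = Asia AND Type = Dealer" is one bitwise AND over two bit vectors.
asia_dealers = matching_customers(region_index["Asia"] & type_index["Dealer"])
# "Region = Europe OR Region = America" is one bitwise OR.
non_asia = matching_customers(region_index["Europe"] | region_index["America"])
```

Each distinct value needs its own vector of one bit per row, which is why the approach suits low-cardinality columns such as Region and Type.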
24. Indexing OLAP Data: Join Indices
Traditional indices map the values to a list of
record ids
A join index materializes a relational join in a JI file and
speeds up the relational join, a rather costly
operation
In data warehouses, a join index relates the values
of the dimensions of a star schema to rows in
the fact table.
E.g. fact table: Sales and two dimensions city
and product
A join index on city maintains for each
distinct city a list of R-IDs of the tuples
recording the Sales in the city
Join indices can span multiple dimensions
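A minimal sketch of the Sales example above: a join index on city, and one spanning city and product. The R-IDs, amounts, and the helper build_join_index are hypothetical.

```python
from collections import defaultdict

# Illustrative Sales fact rows, keyed by record id (R-ID). Values are made up.
SALES = {
    "R1": {"city": "Vancouver", "product": "TV", "amount": 100},
    "R2": {"city": "Toronto", "product": "TV", "amount": 80},
    "R3": {"city": "Vancouver", "product": "PC", "amount": 200},
}


def build_join_index(fact, *dims):
    """Map each distinct dimension value (or value combination) to the R-IDs of its fact rows."""
    index = defaultdict(list)
    for rid, row in fact.items():
        index[tuple(row[d] for d in dims)].append(rid)
    return dict(index)


city_index = build_join_index(SALES, "city")
city_product_index = build_join_index(SALES, "city", "product")

# Total sales in Vancouver, read through the index instead of scanning and joining.
vancouver_total = sum(SALES[rid]["amount"] for rid in city_index[("Vancouver",)])
```

The index is built once at load time, so later queries follow the stored R-ID lists instead of recomputing the join.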
25. Efficient Processing of OLAP Queries
Determine which operations should be performed
on the available cuboids:
transform drill, roll, etc. into corresponding SQL and/or
OLAP operations, e.g., dice = selection + projection
Determine to which materialized cuboid(s) the
relevant operations should be applied.
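One simple way to pick a materialized cuboid is sketched below: among the cuboids whose dimensions cover the query, take the smallest. This is a simplification under stated assumptions: the cuboid sizes are invented, hierarchy levels are ignored, and the measure is assumed to be one (such as sum) that can be re-aggregated from a finer cuboid.

```python
# Hypothetical materialized cuboids: dimension set -> number of cells (sizes are made up).
MATERIALIZED = {
    frozenset({"item", "city", "year"}): 1_000_000,
    frozenset({"item", "year"}): 50_000,
    frozenset({"city", "year"}): 2_000,
    frozenset({"year"}): 10,
}


def choose_cuboid(query_dims, materialized=MATERIALIZED):
    """Pick the smallest materialized cuboid whose dimensions cover the query's dimensions."""
    needed = frozenset(query_dims)
    candidates = [dims for dims in materialized if needed <= dims]
    if not candidates:
        return None  # no materialized cuboid can answer the query
    return min(candidates, key=materialized.get)


best_for_year = choose_cuboid({"year"})
best_for_item = choose_cuboid({"item"})
best_for_item_city = choose_cuboid({"item", "city"})
no_match = choose_cuboid({"customer"})
```

A real optimizer would also weigh available indices and selection predicates, not only cuboid size.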
26. Metadata Repository
Metadata is the data defining warehouse objects. It includes
the following kinds:
Description of the structure of the warehouse
schema, view, dimensions, hierarchies, derived data definitions, data mart
locations and contents
Operational meta-data
data lineage (history of migrated data and transformation path),
currency of data (active, archived, or purged), monitoring information
(warehouse usage statistics, error reports, audit trails)
The algorithms used for summarization
The mapping from operational environment to the data warehouse
Data related to system performance
warehouse schema, view and derived data definitions
Business data
business terms and definitions, ownership of data, charging policies
27. Data Warehouse Back-End Tools and Utilities
Data extraction:
get data from multiple, heterogeneous, and external
sources
Data cleaning:
detect errors in the data and rectify them when
possible
Data transformation:
convert data from legacy or host format to warehouse
format
Load:
sort, summarize, consolidate, compute views, check
integrity, and build indices and partitions
Refresh:
propagate the updates from the data sources to the
warehouse
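The back-end steps above can be sketched as a tiny pipeline. Everything here is hypothetical: the two sources, their field names, the unit conversion, and the function names are invented to show extraction, cleaning, transformation, and load in sequence.

```python
# Two hypothetical sources with inconsistent encodings (all names and values are made up).
SOURCE_A = [{"cust": "C1", "price_usd": "100.0"}, {"cust": "C2", "price_usd": "bad"}]
SOURCE_B = [{"customer_id": "C3", "price_cents": 2550}]


def extract():
    """Extraction: gather raw rows from every source, tagged with their origin."""
    return [("A", r) for r in SOURCE_A] + [("B", r) for r in SOURCE_B]


def clean_and_transform(tagged_rows):
    """Cleaning + transformation: reject unparsable rows, unify names and units."""
    good, rejected = [], []
    for origin, row in tagged_rows:
        try:
            if origin == "A":
                record = {"customer": row["cust"], "price": float(row["price_usd"])}
            else:
                record = {"customer": row["customer_id"], "price": row["price_cents"] / 100}
        except (KeyError, ValueError):
            rejected.append((origin, row))
            continue
        good.append(record)
    return good, rejected


def load(warehouse, records):
    """Load: append sorted records; existing warehouse rows are never updated in place."""
    warehouse.extend(sorted(records, key=lambda r: r["customer"]))
    return warehouse


records, rejected = clean_and_transform(extract())
warehouse = load([], records)
```

Rejected rows are kept aside rather than silently dropped, so they can be reported or repaired later.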
29. Discovery-Driven Exploration of Data Cubes
Hypothesis-driven: exploration by user, huge search space
Discovery-driven
pre-compute measures indicating exceptions, guide user in the
data analysis, at all levels of aggregation
Exception: significantly different from the value anticipated,
based on a statistical model
Visual cues such as background color are used to reflect the
degree of exception of each cell
Computation of exception indicators (model fitting and
computing SelfExp, InExp, and PathExp values) can be
overlapped with cube construction
31. Complex Aggregation at Multiple Granularities: Multi-Feature Cubes
Ex. Grouping by all subsets of {item, region, month}, find the
maximum price in 1997 for each group, and the total sales among all
maximum price tuples
select item, region, month, max(price), sum(R.sales)
from purchases
where year = 1997
cube by item, region, month: R
such that R.price = max(price)
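The query above can be read as two dependent aggregates per group: first the maximum price, then the sum of sales restricted to the tuples that attain it. The sketch below spells that out in plain Python; the purchases rows and the function name are illustrative assumptions.

```python
from itertools import combinations

# Illustrative purchases rows (values are made up).
PURCHASES = [
    {"item": "TV", "region": "West", "month": 1, "year": 1997, "price": 500, "sales": 10},
    {"item": "TV", "region": "West", "month": 2, "year": 1997, "price": 500, "sales": 5},
    {"item": "TV", "region": "East", "month": 1, "year": 1997, "price": 450, "sales": 7},
    {"item": "PC", "region": "West", "month": 1, "year": 1997, "price": 900, "sales": 3},
    {"item": "TV", "region": "West", "month": 1, "year": 1996, "price": 999, "sales": 1},
]
DIMS = ("item", "region", "month")


def multi_feature_cube(rows, dims, year):
    """For every subset of dims: per group, (max price, total sales among max-price tuples)."""
    rows = [r for r in rows if r["year"] == year]
    result = {}
    for k in range(len(dims) + 1):
        for group in combinations(dims, k):
            groups = {}
            for r in rows:
                groups.setdefault(tuple(r[d] for d in group), []).append(r)
            cells = {}
            for key, members in groups.items():
                top = max(r["price"] for r in members)
                # R ranges over the tuples of the group whose price equals the maximum.
                cells[key] = (top, sum(r["sales"] for r in members if r["price"] == top))
            result[group] = cells
    return result


cube = multi_feature_cube(PURCHASES, DIMS, 1997)
```

The second aggregate depends on the result of the first within the same group, which is what distinguishes a multi-feature cube from a plain cube by.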
33. Data Warehouse Usage
Three kinds of data warehouse applications
Information processing
supports querying, basic statistical analysis, and reporting
using crosstabs, tables, charts and graphs
Analytical processing
multidimensional analysis of data warehouse data
supports basic OLAP operations, slice-dice, drilling,
pivoting
Data mining
knowledge discovery from hidden patterns
supports associations, constructing analytical models,
performing classification and prediction, and presenting the
mining results using visualization tools.
34. Summary
Data warehouse
A subject-oriented, integrated, time-variant, and nonvolatile collection of
data in support of management’s decision-making process
A multi-dimensional model of a data warehouse
Star schema, snowflake schema, fact constellations
A data cube consists of dimensions & measures
OLAP operations: drilling, rolling, slicing, dicing and pivoting
Efficient computation of data cubes
Partial vs. full vs. no materialization
Multiway array aggregation
Bitmap index and join index implementations
Further development of data cube technology
Discovery-driven and multi-feature cubes
35. References (I)
S. Agarwal, R. Agrawal, P. M. Deshpande, A. Gupta, J. F. Naughton, R. Ramakrishnan, and S.
Sarawagi. On the computation of multidimensional aggregates. In Proc. 1996 Int. Conf. Very Large
Data Bases, 506-521, Bombay, India, Sept. 1996.
D. Agrawal, A. E. Abbadi, A. Singh, and T. Yurek. Efficient view maintenance in data warehouses.
In Proc. 1997 ACM-SIGMOD Int. Conf. Management of Data, 417-427, Tucson, Arizona, May 1997.
R. Agrawal, J. Gehrke, D. Gunopulos, and P. Raghavan. Automatic subspace clustering of high
dimensional data for data mining applications. In Proc. 1998 ACM-SIGMOD Int. Conf. Management
of Data, 94-105, Seattle, Washington, June 1998.
R. Agrawal, A. Gupta, and S. Sarawagi. Modeling multidimensional databases. In Proc. 1997 Int.
Conf. Data Engineering, 232-243, Birmingham, England, April 1997.
K. Beyer and R. Ramakrishnan. Bottom-Up Computation of Sparse and Iceberg CUBEs. In Proc.
1999 ACM-SIGMOD Int. Conf. Management of Data (SIGMOD'99), 359-370, Philadelphia, PA, June
1999.
S. Chaudhuri and U. Dayal. An overview of data warehousing and OLAP technology. ACM SIGMOD
Record, 26:65-74, 1997.
OLAP council. MDAPI specification version 2.0. In http://www.olapcouncil.org/research/apily.htm,
1998.
J. Gray, S. Chaudhuri, A. Bosworth, A. Layman, D. Reichart, M. Venkatrao, F. Pellow, and H.
Pirahesh. Data cube: A relational aggregation operator generalizing group-by, cross-tab and subtotals. Data Mining and Knowledge Discovery, 1:29-54, 1997.
36. References (II)
V. Harinarayan, A. Rajaraman, and J. D. Ullman. Implementing data cubes efficiently. In Proc.
1996 ACM-SIGMOD Int. Conf. Management of Data, pages 205-216, Montreal, Canada, June
1996.
Microsoft. OLEDB for OLAP programmer's reference version 1.0. In
http://www.microsoft.com/data/oledb/olap, 1998.
K. Ross and D. Srivastava. Fast computation of sparse datacubes. In Proc. 1997 Int. Conf. Very
Large Data Bases, 116-125, Athens, Greece, Aug. 1997.
K. A. Ross, D. Srivastava, and D. Chatziantoniou. Complex aggregation at multiple granularities.
In Proc. Int. Conf. of Extending Database Technology (EDBT'98), 263-277, Valencia, Spain, March
1998.
S. Sarawagi, R. Agrawal, and N. Megiddo. Discovery-driven exploration of OLAP data cubes. In
Proc. Int. Conf. of Extending Database Technology (EDBT'98), pages 168-182, Valencia, Spain,
March 1998.
E. Thomsen. OLAP Solutions: Building Multidimensional Information Systems. John Wiley & Sons,
1997.
Y. Zhao, P. M. Deshpande, and J. F. Naughton. An array-based algorithm for simultaneous
multidimensional aggregates. In Proc. 1997 ACM-SIGMOD Int. Conf. Management of Data, 159-170, Tucson, Arizona, May 1997.