This document provides an overview of business intelligence (BI). It discusses the evolution of BI from traditional decision support systems to current approaches like the Inmon top-down model and Kimball bottom-up model. It also covers BI concepts like the data warehouse, data marts, ETL process, OLAP, dimensions and facts. Common BI techniques like reporting, dashboards, and algorithms for regression analysis, decision trees, association analysis and cluster analysis are also summarized.
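Of the algorithms listed above, regression analysis is the easiest to show concretely. Below is a minimal ordinary-least-squares sketch in plain Python; the data and variable names are invented for illustration and do not come from the document:

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x with a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by variance of x.
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
        / sum((x - mean_x) ** 2 for x in xs)
    a = mean_y - b * mean_x
    return a, b

# Made-up monthly figures; the points lie exactly on y = 1 + 2x.
a, b = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
# a == 1.0, b == 2.0
```

The same covariance-over-variance formula underlies the regression features in most BI tools, which add significance tests and multiple predictors on top.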
Master data management and data warehousing - Zahra Mansoori
This document discusses master data management (MDM) and its role in data warehousing. It describes how MDM can consolidate and cleanse master data from various transactional systems to create a single version of truth. This unified master data is then used to support both operational and analytical initiatives. The document also provides an overview of key components of a data warehouse, including the extraction, transformation, and loading of data from operational systems. It notes that the ideal information architecture places an MDM component between operational and analytical systems to ensure consistent, high-quality master data is available throughout the organization.
Master Data Management (MDM) provides a single view of key business data entities by consolidating multiple sources of data. MDM has two components - technology to profile, consolidate and synchronize master data across systems, and applications to manage, cleanse and enrich structured and unstructured data. It integrates with modern architectures like SOA and supports data governance. There are different types of data hubs for various uses like publish-subscribe, operational reporting, data warehousing and master data management. Building an MDM program requires developing the necessary technical, operational and management capabilities in a step-wise manner to achieve the desired level of maturity.
This document discusses data resource management and different types of databases. It describes how companies like Amazon, eBay, and Google are opening up some of their databases to developers. It also discusses the roles of database administrators and data stewards in managing organizational data resources. The document outlines different types of databases including operational databases, distributed databases, external databases, hypermedia databases, data warehouses, and traditional file processing systems. It compares the database management approach to traditional file processing.
More than 70% of Master Data Management programs fail to reach full ROI due to inadequate implementation. I tried to highlight some of the key areas to watch for during MDM implementation.
The document discusses different approaches to data resource management. It describes traditional file processing, where data is organized across independent files, leading to issues like data redundancy and lack of integration. The modern approach is database management, which consolidates organizational data into centralized databases managed by a database management system (DBMS). The DBMS allows many applications to access integrated data and maintains data quality. The chapter also covers logical and physical database design, different database structures, and types of databases like operational, distributed, external, and data warehouses.
Master Data Management (MDM) is a feature of Microsoft Dynamics AX 2012 R3 that lets you synchronize master data records across multiple instances of Microsoft Dynamics AX 2012. By creating and maintaining a single copy of master data, you can help guarantee the consistency of important information, such as customer and product data, that is shared across AX 2012 instances.
The document discusses business intelligence platforms and data warehousing. It explains that a data warehouse collects and integrates data from different operational systems and organizes it into subject-specific data marts to support analysis. Choosing the right tools and technologies is important for extracting, cleaning, storing, and presenting this historical and consistent data to business users in a fast and easy to understand way.
This document provides an overview of data warehousing concepts. It defines a data warehouse as a collection of data marts representing historical data from different company operations. It discusses the top-down and bottom-up approaches to building a data warehouse, as well as considerations for data warehouse design including data content, metadata, data distribution, and tools. Finally, it briefly describes different architectures for mapping a data warehouse to a multiprocessor system, including shared memory, shared disk, and shared nothing architectures.
Business Intelligence Data Warehouse System - Kiran Kumar
This document provides an overview of data warehousing and business intelligence concepts. It discusses:
- What a data warehouse is and its key properties like being integrated, non-volatile, time-variant and subject-oriented.
- Common data warehouse architectures including dimensional modeling, ETL processes, and different layers like the data storage layer and presentation layer.
- How data marts are subsets of the data warehouse that focus on specific business functions or departments.
- Different types of dimensions tables and slowly changing dimensions.
- How business intelligence uses the data warehouse for analysis, querying, reporting and generating insights to help with decision making.
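The slowly changing dimensions mentioned in the bullets above can be illustrated with a Type 2 update, which expires the current dimension row and appends a new version instead of overwriting history. This is a minimal sketch in Python; the column names (`customer_id`, `valid_from`, `is_current`) are illustrative assumptions, not taken from the document:

```python
from datetime import date

def scd2_update(dim_rows, key_field, key, new_attrs, today):
    """Type 2 slowly changing dimension update: close the current
    version of the row and append a new current version."""
    for row in dim_rows:
        if row[key_field] == key and row["is_current"]:
            if all(row.get(k) == v for k, v in new_attrs.items()):
                return dim_rows  # nothing changed, keep current row
            row["valid_to"] = today       # expire the old version
            row["is_current"] = False
            new_row = {**row, **new_attrs,
                       "valid_from": today, "valid_to": None,
                       "is_current": True}
            dim_rows.append(new_row)
            return dim_rows
    raise KeyError(key)

dim = [{"customer_id": 42, "city": "Boston",
        "valid_from": date(2015, 1, 1), "valid_to": None, "is_current": True}]
scd2_update(dim, "customer_id", 42, {"city": "Chicago"}, date(2016, 3, 15))
# dim now holds two versions of customer 42; only the Chicago row is current.
```

Type 1 (overwrite in place) and Type 3 (keep a "previous value" column) are simpler variants; Type 2 is the one that preserves full history for time-variant analysis.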
EAI - Master Data Management - MDM - Use Case - Sherif Rasmy
KGX is a broker-dealer that was fined by the SEC for regulatory reporting failures and frequent trade errors caused by using multiple inconsistent versions of master data across its systems. To address this, KGX implemented a Master Data Management (MDM) solution to create a single source of security master data. The MDM solution standardized, cleansed, matched and consolidated security data from multiple sources into a consistent version stored in the MDM database. Changes are then synchronized to operational systems in real-time or batch to provide accurate security master data across KGX's systems. Governance policies and processes were also established to manage the quality and usage of security master data going forward.
This document discusses the essentials of business intelligence (BI). It describes key drivers of BI including understanding customer segments, lifetime customer value, and fraud detection. It also outlines the process of intelligence creation including identifying BI projects, estimating costs and benefits. Finally, it discusses major components of BI systems like data warehousing, business analytics, data mining, and business performance management.
Enterprise resource planning system & data warehousing implementation - Sumya Abdelrazek
This document discusses Enterprise Resource Planning (ERP) systems and data warehousing implementation. It defines ERP as software that integrates all functions of an organization, including development, manufacturing, sales and marketing. ERP offers solutions for all business functions and packages for organizations of various sizes and types. Implementing ERP is complex, expensive and time-consuming. The document also defines data warehousing as an integrated, subject-oriented database that supports decision making. It discusses factors to consider for data warehousing implementation such as available funding, management views, and corporate culture.
Data Warehouses & Deployment - Ankita Dubey
This document contains notes about data warehouses and the life cycle of a data warehouse deployment project. It can be useful for students or working professionals who want to gain basic knowledge of data warehouses.
A data warehouse is a subject-oriented, consolidated collection of integrated data from multiple sources used to support management decision making. It is separate from operational databases and contains historical data for analysis. Data warehouses use a star schema with fact and dimension tables and support online analytical processing (OLAP) for complex analysis and reporting.
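The star schema and OLAP-style roll-up described above can be sketched with Python's built-in sqlite3 module: a fact table of sales keyed to product and date dimension tables, aggregated with a join and GROUP BY. The table and column names are invented for illustration:

```python
import sqlite3

# A tiny star schema: one fact table keyed to two dimension tables.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_date    (date_id INTEGER PRIMARY KEY, year INTEGER);
CREATE TABLE fact_sales  (product_id INTEGER, date_id INTEGER, amount REAL);
""")
con.executemany("INSERT INTO dim_product VALUES (?, ?)",
                [(1, "Widgets"), (2, "Gadgets")])
con.executemany("INSERT INTO dim_date VALUES (?, ?)",
                [(10, 2015), (11, 2016)])
con.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)",
                [(1, 10, 100.0), (1, 11, 150.0), (2, 11, 80.0)])

# An OLAP-style roll-up: total sales by category and year.
rows = con.execute("""
    SELECT p.category, d.year, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_product p ON p.product_id = f.product_id
    JOIN dim_date d    ON d.date_id    = f.date_id
    GROUP BY p.category, d.year
    ORDER BY p.category, d.year
""").fetchall()
# rows == [('Gadgets', 2016, 80.0), ('Widgets', 2015, 100.0),
#          ('Widgets', 2016, 150.0)]
```

The fact table holds the additive measures; the dimension tables hold the descriptive attributes users slice and dice by, which is exactly the shape OLAP engines optimize for.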
1) MDM is the process of creating a single point of reference for highly shared types of data like customers, products, and suppliers. It links multiple data sources to ensure consistent policies for accessing, updating, and routing exceptions for master data.
2) Successful MDM requires defining business needs, setting up governance roles, designing flexible platforms, and engaging lines of business in incremental programs. Common challenges include lack of clear business cases and roadmaps.
3) Key aspects of MDM include modeling shared data, managing data quality, enabling stewardship of data, and integrating/propagating master data to operational systems in real-time or batch processes.
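The match-and-merge behaviour in point 3 can be sketched as a small consolidation routine: records that agree on a normalized business key are merged into one golden record, using a simple first-non-empty survivorship rule. The field names and the survivorship policy are illustrative assumptions, not taken from any particular MDM product:

```python
def consolidate(records, key_fields):
    """Group records that agree on normalized key fields and merge each
    group into one golden record (first-non-empty survivorship)."""
    def norm(v):
        return " ".join(str(v).lower().split()) if v else ""
    golden = {}
    for rec in records:
        k = tuple(norm(rec.get(f)) for f in key_fields)
        merged = golden.setdefault(k, {})
        for field, value in rec.items():
            # Keep the first non-empty value seen for each field.
            if value and not merged.get(field):
                merged[field] = value
    return list(golden.values())

sources = [
    {"name": "ACME Corp", "ticker": "ACME", "sector": None},
    {"name": "acme  corp", "ticker": "ACME", "sector": "Industrial"},
    {"name": "Globex", "ticker": "GLBX", "sector": "Energy"},
]
masters = consolidate(sources, ["ticker"])
# Two golden records: ACME (with sector filled in from the second
# source) and Globex.
```

Real MDM hubs replace the exact-key match with fuzzy matching and configurable survivorship (most recent source wins, most trusted source wins, and so on), but the group-then-merge shape is the same.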
James A. O'Brien and George Marakas. Management Information Systems with MISource 2007, 8th ed. Boston, MA: McGraw-Hill, 2007. ISBN-13: 9780073323091.
Business intelligence - Components, Tools, Need and Applications - raj
As part of the research project for the course Technical Foundations of Information Systems at the University of Illinois, our team worked on the topic, Business Intelligence. The presentation focuses on what is Business Intelligence, its various components, latest tools, the need of BI as well as applications of this technology. This project deals with the latest development of BI technologies (hardware or software) and includes comprehensive literature survey from Journals, and the Internet.
A data warehouse is a collection of data integrated from multiple sources to support decision making. It contains subject-oriented, integrated, time-variant, and non-volatile data stored in a way that makes it readily available for analysis. Data marts can be dependent on the warehouse or independent subsets designed for specific departments. Successful implementation requires identifying data sources and governance, planning data quality and modeling, selecting ETL and database tools, and supporting end users. Key challenges include unrealistic expectations, technical issues, and ensuring ongoing value.
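The extract-transform-load flow implied above can be sketched as three small functions: extract raw rows from a source, transform them (cleansing and conforming), and load them into a target keyed by business id. All names and data here are illustrative:

```python
def extract(rows):
    """Extract: pull raw rows from a source system (here, an in-memory list)."""
    return list(rows)

def transform(rows):
    """Transform: cleanse and conform - drop rows missing a business key,
    trim and title-case names, standardize amounts to floats."""
    out = []
    for r in rows:
        if not r.get("id"):
            continue  # reject rows without a business key
        out.append({"id": r["id"],
                    "name": r["name"].strip().title(),
                    "amount": float(r["amount"])})
    return out

def load(rows, warehouse):
    """Load: upsert into the target table keyed by id."""
    for r in rows:
        warehouse[r["id"]] = r
    return warehouse

source = [{"id": 1, "name": "  alice  ", "amount": "9.50"},
          {"id": None, "name": "bad row", "amount": "0"},
          {"id": 2, "name": "BOB", "amount": "3"}]
dw = load(transform(extract(source)), {})
# dw holds two cleaned rows: Alice (9.5) and Bob (3.0); the keyless row
# was rejected in the transform step.
```

In a real pipeline each stage would also log rejects and lineage, which is where the governance and data quality planning mentioned above comes in.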
Future of Horizontal Services by Harrick Vin, VP & Chief Scientist, TCS. The two functions of enterprise IT -- run the business (RTB) and change the business (CTB) -- are undergoing significant changes because of automation. In this presentation, we talked about what is fueling this change, and some of the challenges in realizing automation benefits in enterprises.
ASUG 10_27_2016 Entegris PLM-MDM Business Process Optimization 3 - keefe008
This document discusses Entegris' project to optimize their PLM-MDM business processes with LeverX. The project goals are to reduce time to market, enable a sustainable and extendable model, and increase data quality. The project will occur in phases from 2016-2017, starting with quick wins to optimize material master extensions and changes (Release 1.0, 1.5). Future phases will integrate CAD, optimize specifications/EHS, and enable future PLM design enhancements. LeverX tools like BMAX and IPS will be implemented to automate workflows and rules-based processes. The expected benefits include connecting functions, preparing for acquisitions, eliminating data re-entry, and improving the bottom line.
IBM's InfoSphere Master Data Management v11 features a unified MDM solution that supports virtual, physical and hybrid implementation styles within a single instance. It provides enhanced governance capabilities, improved support for reference data management and advanced hierarchies. The release also aims to accelerate time to value through simplifying upgrades, pre-built accelerators and modularity. Additionally, v11 further integrates MDM with big data and analytics capabilities, allowing the augmentation of master data with insights from unstructured sources.
Business intelligence (BI) uses data about past and present to help companies make better decisions for the future. BI provides timely, accurate insights that are valuable and can be acted upon. It helps companies operate more efficiently and profitably by supporting better strategic and tactical decision making. As BI systems evolve to deliver analytics to mobile devices in near real-time, more companies are using BI to promote a data-driven culture and rational decision making processes.
The document discusses databases and data warehouses. It begins by explaining the differences between traditional file organization and database management approaches. It then describes how relational and object-oriented databases are used to construct, populate, and manipulate databases. Finally, it discusses how data is transferred from transactional databases to data warehouses for analysis and decision making.
The document discusses implementing a single view of the customer (SVC) using IBM Infosphere (formerly Websphere Customer Center). It provides an overview of the product's features such as a flexible data model, pre-defined services, and integration with data quality tools. A phased approach to MDM implementation is proposed starting with a customer profile data mart and expanding to a customer data integration hub and full synchronization of master data across systems.
Data Profiling, Data Catalogs and Metadata Harmonisation - Alan McSweeney
These notes discuss the related topics of Data Profiling, Data Catalogs and Metadata Harmonisation. It describes a detailed structure for data profiling activities. It identifies various open source and commercial tools and data profiling algorithms. Data profiling is a necessary pre-requisite activity in order to construct a data catalog. A data catalog makes an organisation’s data more discoverable. The data collected during data profiling forms the metadata contained in the data catalog. This assists with ensuring data quality. It is also a necessary activity for Master Data Management initiatives. These notes describe a metadata structure and provide details on metadata standards and sources.
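The data profiling activity described here can be sketched as a routine that computes, per column, the basic statistics a data catalog typically records: row count, null count, distinct count, and value lengths. The sample column names are invented for illustration:

```python
def profile(rows):
    """Column-level profile: row count, null count, distinct count,
    and min/max value length for each column."""
    cols = {}
    for row in rows:
        for col, val in row.items():
            s = cols.setdefault(col, {"nulls": 0, "values": set(), "lengths": []})
            if val is None or val == "":
                s["nulls"] += 1
            else:
                s["values"].add(val)
                s["lengths"].append(len(str(val)))
    return {c: {"rows": len(rows),
                "nulls": s["nulls"],
                "distinct": len(s["values"]),
                "min_len": min(s["lengths"], default=0),
                "max_len": max(s["lengths"], default=0)}
            for c, s in cols.items()}

stats = profile([{"email": "a@x.com", "country": "IE"},
                 {"email": None,      "country": "IE"},
                 {"email": "b@y.org", "country": "US"}])
# stats["email"]   -> 1 null, 2 distinct values
# stats["country"] -> 0 nulls, 2 distinct values
```

These per-column statistics become the metadata stored in the catalog; production profilers add pattern analysis, type inference, and cross-column dependency checks on top.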
The document discusses the journey organizations take to establish trusted data through effective data management. It outlines key barriers such as a disconnect between business and IT needs as well as a lack of data ownership and governance. The document promotes establishing repeatable data processes through a single data management solution that provides data quality, integration and master data management capabilities. This helps improve business user productivity, reduce costs and risks, and support data-driven decisions.
Data warehousing Demo PPTs | Overview | Introduction - Kernel Training
This document provides an overview of data warehousing concepts including:
- Data warehousing involves collecting, integrating, and organizing data from multiple sources to support business intelligence and decision making.
- It discusses the differences between data, information, and knowledge and how they relate.
- Two common approaches to data warehousing are described - the Inmon approach involving a centralized data warehouse and the Kimball approach involving decentralized data marts.
- The roles and responsibilities of different types of data stores in a warehousing environment are outlined.
Data: it's big, so grab it, store it, analyse it, make it accessible... mine, warehouse and visualise... use the pictures in your mind and others will see it your way!
Business Intelligence Presentation 1 (15th March'16) - Muhammad Fahad
Business intelligence (BI) involves methods, processes, technologies, and tools to convert data into useful information that helps organizations make better plans and decisions. It has evolved from the executive information systems and decision support systems of the 1980s to include data warehousing, dashboards, analytics, and big data capabilities today. BI provides benefits like improved management and operations, better adjustments to trends, and the ability to predict the future. It has applications across private and public sector organizations. The BI process involves requirements analysis, data modeling, ETL, analytics, and presentation. Key components are the data warehouse, OLAP, data mining, and visualization tools like reports, dashboards, and scorecards. The global BI market is expected to grow significantly.
This document provides an overview of business analytics. It begins with defining key terms like data, databases, and the DIKW pyramid. It then discusses what business analytics is, the steps involved, and examples of its history. Different types of business analytics models are described, as well as the components and benefits. Various analysis tools like MOST and PESTLE are explained. Finally, emerging trends in business analytics are highlighted.
Business analysis involves identifying business needs, capturing requirements, and supporting communication to define solutions. Business analysts collect and analyze data to support decision making. Reporting and query tools include reporting tools to generate reports from multiple data sources, managed query tools for editing queries, executive information systems for senior executives, and OLAP tools for intuitive data views. Data mining uses statistical algorithms to discover patterns in data for decision making like predicting future purchases.
Data Warehouse Design on Cloud ,A Big Data approach Part_OnePanchaleswar Nayak
This document discusses data warehouse design on the cloud using a big data approach. It covers topics such as business intelligence, data warehousing, data marts, data mining, ETL architecture, data warehouse design methodologies, Bill Inmon's top-down approach, Ralph Kimball's bottom-up approach, and addressing the new challenges of volume, velocity and variety of big data with Hadoop. The document proposes an architecture for next generation data warehousing using Hadoop to handle these new big data challenges.
This document discusses dimensional modeling and Kimball's Business Dimensional Lifecycle methodology for data warehouse design. It begins by explaining the objectives of dimensional modeling and Kimball's methodology. It then covers the key aspects of dimensional modeling, including star schemas, facts, dimensions, and grain. The document uses examples from a case study on property sales to illustrate concepts like selecting the business process, declaring the grain, choosing dimensions, and identifying facts in dimensional modeling.
The document discusses concepts and activities related to data warehousing and business intelligence management. It provides an overview of key terms and components, including Inmon and Kimball's approaches to data warehouse architecture. Inmon's Corporate Information Factory model describes the major components as the raw data applications, operational data store, data warehouse, operational data marts, and data marts. Kimball's approach focuses on dimensional modeling and his "data warehouse chess pieces" which include the business process, data, data warehouse, and access layers. The document then covers typical data warehousing and business intelligence activities.
Chapter 9: Data Warehousing and Business Intelligence ManagementAhmed Alorage
The document discusses concepts related to data warehousing and business intelligence management. It provides an overview of key terms and components, including Inmon and Kimball's approaches to data warehouse architecture. Inmon defined the classic characteristics of a data warehouse and his "Corporate Information Factory" model, which includes raw operational data, an operational data store, data warehouse, and data marts. Kimball emphasized dimensional modeling and his "DW chess pieces" components to structure data for analysis. The document then covers typical activities involved in data warehousing and business intelligence management.
This document provides an overview of business intelligence, data warehousing, data marts, and data mining presented by Mr. Manish Tripathi. It defines business intelligence as a process for analyzing data to help business decisions. Data warehousing is described as a centralized repository for storing historical data from various sources to support analysis and reporting. Data marts are subsets of data warehouses focused on specific business units or teams. Common business intelligence tools and the benefits of these systems are also summarized.
The document provides an overview of data warehousing concepts including:
1) A data warehouse is a subject-oriented collection of integrated data used to support management decisions. It contains current and historical data.
2) A data warehouse architecture typically includes source systems, a staging area, and presentation layer for querying and reporting.
3) Data marts are focused subsets of a data warehouse tailored for specific business units or departments. There are dependent, independent, and hybrid approaches to building data marts.
This document provides an introduction to data warehousing. It defines a data warehouse as a single, consistent store of data from various sources made available to end users in a way they can understand and use in a business context. Data warehouses consolidate information, improve query performance, and separate decision support functions from operational systems. They support knowledge discovery, reporting, data mining, and analysis to help answer business questions and make better decisions.
This document provides an overview of data warehousing and related concepts. It defines a data warehouse as a centralized database for analysis and reporting that stores current and historical data from multiple sources. The document describes key elements of data warehousing including Extract-Transform-Load (ETL) processes, multidimensional data models, online analytical processing (OLAP), and data marts. It also outlines advantages such as enhanced access and consistency, and disadvantages like time required for data extraction and loading.
- Business intelligence (BI) is the process of collecting data from various sources and analyzing it to help businesses make more informed decisions. It has evolved over time from simply collecting and reporting on retrospective data to also performing predictive analytics.
- The key stages in a closed-loop BI process are track, analyze, model, decide, and monitor. Data is tracked from operational systems and analyzed using BI tools to generate insights. Models are developed and used for forecasting and scenario planning. Decisions are made based on the analysis and models. Actions are then monitored and data is tracked again.
- Successful BI architecture has four parts: information architecture, data architecture, technical architecture, and product architecture.
Data Warehousing and Business Intelligence is one of the hottest skills today, and is the cornerstone for reporting, data science, and analytics. This course teaches the fundamentals with examples plus a project to fully illustrate the concepts.
This document discusses using data warehouses in retail and finance. It provides examples of how data warehouses are used in both industries, including for market basket analysis, product placement, supply chain management, and customer profiling. It also outlines some opportunities and challenges of implementing data warehouses, such as improved sales and customer loyalty but also large data volumes and data preparation difficulties. Specific company examples are given, like how Netflix uses customer streaming data and how Raymond James improved data backups and reporting with a new solution.
What is a Data Warehouse and How Do I Test It?RTTS
ETL Testing: A primer for Testers on Data Warehouses, ETL, Business Intelligence and how to test them.
Are you hearing and reading about Big Data, Enterprise Data Warehouses (EDW), the ETL Process and Business Intelligence (BI)? The software markets for EDW and BI are quickly approaching $22 billion, according to Gartner, and Big Data is growing at an exponential pace.
Are you being tasked to test these environments or would you like to learn about them and be prepared for when you are asked to test them?
RTTS, the Software Quality Experts, provided this groundbreaking webinar, based upon our many years of experience in providing software quality solutions for more than 400 companies.
You will learn the answer to the following questions:
• What is Big Data and what does it mean to me?
• What are the business reasons for building a Data Warehouse and for using Business Intelligence software?
• How do Data Warehouses, Business Intelligence tools and ETL work from a technical perspective?
• Who are the primary players in this software space?
• How do I test these environments?
• What tools should I use?
This slide deck is geared towards:
QA Testers
Data Architects
Business Analysts
ETL Developers
Operations Teams
Project Managers
...and anyone else who is (a) new to the EDW space, (b) wants to be educated in the business and technical sides and (c) wants to understand how to test them.
Agile Data Warehouse Design for Big Data PresentationVishal Kumar
Synopsis:
[Video link: http://www.youtube.com/watch?v=ZNrTxSU5IQ0 ]
Jim Stagnitto and John DiPietro of consulting firm a2c will discuss Agile Data Warehouse Design - a step-by-step method for data warehousing / business intelligence (DW/BI) professionals to better collect and translate business intelligence requirements into successful dimensional data warehouse designs.
The method utilizes BEAM✲ (Business Event Analysis and Modeling) - an agile approach to dimensional data modeling that can be used throughout analysis and design to improve productivity and communication between DW designers and BI stakeholders. BEAM✲ builds upon the body of mature "best practice" dimensional DW design techniques, and collects "just enough" non-technical business process information from BI stakeholders to allow the modeler to slot their business needs directly and simply into proven DW design patterns.
BEAM✲ encourages DW/BI designers to move away from the keyboard and their entity relationship modeling tools and begin "white board" modeling interactively with BI stakeholders. With the right guidance, BI stakeholders can and should model their own BI data requirements, so that they can fully understand and govern what they will be able to report on and analyze.
The BEAM✲ method is fully described in Agile Data Warehouse Design, a text co-written by Lawrence Corr and Jim Stagnitto.
About the speaker:
Jim Stagnitto Director of a2c Data Services Practice
Data Warehouse Architect: specializing in powerful designs that extract the maximum business benefit from Intelligence and Insight investments.
Master Data Management (MDM) and Customer Data Integration (CDI) strategist and architect.
Data Warehousing, Data Quality, and Data Integration thought-leader: co-author with Lawrence Corr of "Agile Data Warehouse Design", guest author of Ralph Kimball’s “Data Warehouse Designer” column, and contributing author to Ralph and Joe Caserta's latest book: “The DW ETL Toolkit”.
John DiPietro Chief Technology Officer at A2C IT Consulting
John DiPietro is the Chief Technology Officer for a2c. Mr. DiPietro is responsible for setting the vision, strategy, delivery, and methodologies for a2c's Solution Practice Offerings for all national accounts. The a2c CTO brings with him an expansive depth and breadth of specialized skills in his field.
Sponsor Note:
Thanks to:
Microsoft NERD for providing an awesome venue for the event.
http://A2C.com IT Consulting for providing the food and drinks.
http://Cognizeus.com for providing a book to give away as a raffle prize.
Business Intelligence and Multidimensional DatabaseRussel Chowdhury
It was an honor that my employer assigned me to study Business Intelligence with SQL Server Analysis Services. I therefore prepared this presentation as a starter guide for new learners.
* Thanks to all the contributors whose material is gathered here.
Promote the Good of the People of the United Kingdom by Maintaining Monetary ...DataWorks Summit
The Bank of England is the central Bank of the United Kingdom, established in 1694. Representatives from the Bank’s Data Analytics & Modelling team will discuss the Bank of England's journey to delivering a Big Data capability and how the Hortonworks HDP platform is helping us deliver on our mission statement of “promote the good of the people of the United Kingdom by maintaining monetary and financial stability". We will explore the challenges we've faced, how we have overcome some of these and those that remain to be conquered. We will also present our strategy for the Bank’s future Big Data platform as we look to scale up further in the coming years.
We will focus in particular on our first successful ‘Big Data’ production system. This exists in response to the financial crises of 2008 and the subsequent push to make the derivative markets safer by reducing systemic risk. In Europe this was delivered through the European Market Infrastructure Regulation (EMIR). We will explain the Bank of England’s role in monitoring UK entities within this important market and describe the significant challenges facing our team in building a data analytics platform to facilitate this
Speakers
Nick Vaughan, Domain SME - Data Analytics & Modelling
Bank of England
Adrian Waddy, Technical Lead
Bank of England
This document discusses customer relationship management (CRM) and campaign management. It defines operational, analytical, and collaborative CRM and explains how they are related. It also outlines 10 common CRM functionalities like lead management, account management, and sales activity tracking. Additionally, it defines campaign management and differentiates between inbound and outbound marketing as well as multi-channel and cross-channel marketing. Finally, it references various sources for additional information.
Master data management (MDM) involves managing core business entities that are used across many business processes and systems. These entities include customers, products, suppliers, and more. MDM provides a single source of truth for key business data and ensures consistency. There are different domains of MDM, including customer data integration which manages party data, and product information management which manages product definitions. MDM systems can be used collaboratively to achieve agreement on topics, operationally as transaction systems, or for analytics on the managed data. Common implementation styles include registry, consolidation, transactional hub, and coexistence. MDM systems include repositories to store master data, services to manage it, and integration with other systems and applications.
4. Preface
Over the past two decades:
• Companies have gathered tons and tons of data about their operations.
• Information is said to double every 18 months.
The theory behind BI systems:
• You cannot improve what you do not measure.
• Without some sort of feedback mechanism, you are essentially driving blind.
7. Decision making
• Operational decision making: supported by operational systems.
• Tactical decision making: meeting certain business objectives within a specific time frame.
• Strategic decision making: long-term goals with a far-reaching impact on the organization.
9. Data Evolution (DIKW Pyramid)
• Data is the foundation of Information, Knowledge and, ultimately, Wisdom.
[Pyramid diagram: moving up from Data to Information to Knowledge to Wisdom, context and understanding increase. Data is a gathering of parts (researching); Information is a connection of parts (absorbing); Knowledge is the formation of a whole (doing); Wisdom is the joining of wholes (reflecting).]
11. Definition: OLAP vs. OLTP
OLAP
• Online Analytical Processing (OLAP) is an approach to answering multi-dimensional analytical queries.
• OLAP tools enable users to analyze multidimensional data interactively from multiple perspectives.
• Databases configured for OLAP use a multidimensional data model, allowing complex analytical and ad hoc queries with rapid execution times. They borrow aspects of navigational databases, hierarchical databases, and relational databases.
OLTP
• Online Transaction Processing (OLTP) is a class of information systems that manage transaction-oriented applications, typically for data entry and retrieval transaction processing. OLTP has also been used to refer to processing in which the system responds immediately to user requests.
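The contrast between the two workloads can be sketched with a small example. This is only an illustration, not from the deck; the `sales` table, its columns, and the sample rows are hypothetical.

```python
# A minimal sketch contrasting OLTP and OLAP query styles using SQLite.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("""CREATE TABLE sales (
    order_id INTEGER PRIMARY KEY,
    region   TEXT,
    product  TEXT,
    year     INTEGER,
    amount   REAL)""")

# OLTP-style workload: many small, transaction-oriented writes and
# single-row lookups keyed by an identifier.
rows = [
    (1, "North", "Widget", 2015, 120.0),
    (2, "North", "Gadget", 2015, 80.0),
    (3, "South", "Widget", 2016, 200.0),
    (4, "South", "Gadget", 2016, 50.0),
]
cur.executemany("INSERT INTO sales VALUES (?, ?, ?, ?, ?)", rows)
one_order = cur.execute(
    "SELECT amount FROM sales WHERE order_id = ?", (3,)).fetchone()

# OLAP-style workload: an ad hoc, multi-dimensional aggregation that
# slices the same data by region and year.
cube = cur.execute("""
    SELECT region, year, SUM(amount)
    FROM sales
    GROUP BY region, year
    ORDER BY region, year""").fetchall()

print(one_order)  # (200.0,) - single-row OLTP lookup
print(cube)       # per-(region, year) OLAP rollup
```

A real OLAP system would run such rollups over a multidimensional model rather than a single flat table, but the query shape (aggregate, then group by several dimensions) is the same.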
12. Definition: KPI
• A Performance Indicator or Key Performance Indicator (KPI) is a type of performance measurement. An organization may use KPIs to evaluate its success, or to evaluate the success of a particular activity in which it is engaged. Sometimes success is defined in terms of making progress toward strategic goals, but often success is simply the repeated, periodic achievement of some level of operational goal (e.g. zero defects, 10/10 customer satisfaction, etc.).
13. Nature of Data Warehouse
Historical Data
Easy to query
Shows relationships between seemingly unrelated data
Time-stamped data
User-friendly access tools
Reasonable response time
17. Traditional DSS Models
Traditional DSS systems consist of:
• One: a Central Data Warehouse that contains company
transaction data
• Two: a Reporting Mechanism that allows users to access
the data in several summary and ad hoc formats
• Three: a Common Interface, a Dashboard that reports
how the company is doing on Key Performance Indicators
(KPIs)
An IBM Systems Journal article published in 1988, “An architecture for
a business information system”, coined the term “Business Data
Warehouse”.
19. Merits and Demerits of Traditional
Model
With a Traditional BI system:
• You are no longer driving blind, but,
• Because all information is historical, your only view of the world is
through your rear-view mirror
• If the road on which you are driving is long, featureless, and
straight, you can stay on course by making small corrections and
watching how the road drifts behind you
• However, if there is a fork in the road ahead (an opportunity) you
won't see it until it passes
• And, if there is a sharp curve, you crash!
What you need is a system that gives you a forward view
20. 1990 - Bill Inmon Model
• The term Business Intelligence was popularized by the
Gartner Group in 1989
• In 1990, Bill Inmon became known as the “Father of
Data Warehousing”
• The industry soon began to implement Inmon’s vision
• In 2002 Inmon introduced a new concept to his model
• Data is stored in a single database called the Data
Warehouse
• Data is extracted from this database into smaller
departmental databases
• Decision support users query and create reports from
the departmental databases – a TOP-DOWN approach
21. 1996 - Ralph Kimball Model
• In 1996, Kimball, a scholar-practitioner,
developed a model that competed with Inmon’s
• In 2002 he completed his model
• Recommends an architecture of multiple
databases, called Data Marts, organized
by business processes
• The sum of the Data Marts comprises the
Data Warehouse
• A BOTTOM-UP approach that must adhere
to an enterprise-wide standard “Data Bus”
23. Definition
• The Corporate Information Factory (CIF) contains all the
data of an organization:
Operational – current transactional data
Atomic Data Warehouse – historical data
Departmental – summarized data of the DW, specific to each department
Individual – unstructured, user-generated data
24. Inmon’s Top-down design
• The Atomic Data Warehouse serves as a centralized
repository for the entire enterprise
• Departmental data is extracted from the Data
Warehouse
25. Inmon Top-Down Schema
• Data is stored in ERD (entity-relationship) form
• Summarized data flows from the Data Warehouse to the
Data Marts
[Diagram: the Data Warehouse feeding Departmental Data (Data Marts)]
27. Kimball Model
• Uses a data modeling method unique to the Data
Warehouse, known as “Dimensional Data Modeling”
• Multiple databases serve as Data Marts that conform to
one another – highly interoperable
• The Data Bus – another Kimball invention
29. Definition: Fact
Fact
• If the business process is SALES, then the corresponding fact table will typically
contain rows representing both raw facts and aggregations, such as:
$12,000, being "sales for New York store for 15-Jan-2005"
$34,000, being "sales for Los Angeles store for 15-Jan-2005"
$22,000, being "sales for New York store for 16-Jan-2005"
$50,000, being "sales for Los Angeles store for 16-Jan-2005"
$21,000, being "average daily sales for Los Angeles Store for Jan-2005"
$65,000, being "average daily sales for Los Angeles Store for Feb-2005"
$33,000, being "average daily sales for Los Angeles Store for year 2005"
30. Definitions: Dimension
Dimension
•The dimension is a data set composed of individual, non-overlapping data elements. The
primary functions of dimensions are threefold: to provide filtering, grouping and labeling.
•Typically dimensions in a data warehouse are organized internally into one or more
hierarchies. "Date" is a common dimension, with several possible hierarchies:
•"Days (are grouped into) Months (which are grouped into) Years",
•"Days (are grouped into) Weeks (which are grouped into) Years"
•"Days (are grouped into) Months (which are grouped into) Quarters (which are grouped into)
Years"
•etc.
31. Fact vs. Dimension Table
Fact Table
• Contains the metrics
• Contains many rows and relatively few columns
(for query performance)
Dimension Table
• Contains attributes of the metrics in the fact table
• Has only hundreds or thousands of rows
• May have a hundred columns or more
34. Example of Fact and Dimension table
4 dimensions: Service, Time, Sales Point, Customer
1 fact table: Transactions
[Star schema diagram: the Transactions fact table at the center,
linked to the Service, Time, Sales Point and Customer dimensions]
36. Operation: Slice
• Slice is the act of picking a rectangular Subset of a cube
by choosing a Single Value for one of its dimensions,
creating a new cube with One Fewer Dimension
37. Operation: Dice
• Dice operation produces a Subcube by allowing the
analyst to pick specific values of multiple dimensions
38. Operation: Drill Down/ Roll Up
• Drill Down/Roll Up allows the user to Navigate Among
Levels Of Data ranging from the Most Summarized (Roll
Up) to the Most Detailed (Drill Down).
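The three operations above can be sketched on a tiny cube held as a plain Python dictionary. This is only an illustration of the semantics, not how an OLAP engine is implemented; the store/month/product data is made up.

```python
# A hypothetical 3-D sales cube: (store, month, product) -> sales figure.
cube = {
    ("NY", "Jan", "bikes"): 120, ("NY", "Jan", "helmets"): 30,
    ("NY", "Feb", "bikes"): 150, ("LA", "Jan", "bikes"): 200,
    ("LA", "Feb", "bikes"): 180, ("LA", "Feb", "helmets"): 40,
}

def slice_cube(cube, axis, value):
    """Fix one dimension to a single value, dropping that axis (one fewer dimension)."""
    return {k[:axis] + k[axis + 1:]: v for k, v in cube.items() if k[axis] == value}

def dice_cube(cube, selections):
    """Keep only cells whose coordinate on each listed axis is in the allowed set."""
    return {k: v for k, v in cube.items()
            if all(k[axis] in allowed for axis, allowed in selections.items())}

def roll_up(cube, axis):
    """Aggregate (sum) over one dimension -- the summarizing half of drill down/roll up."""
    out = {}
    for k, v in cube.items():
        key = k[:axis] + k[axis + 1:]
        out[key] = out.get(key, 0) + v
    return out

jan = slice_cube(cube, 1, "Jan")                       # 2-D slice: all January cells
ny_bikes = dice_cube(cube, {0: {"NY"}, 2: {"bikes"}})  # subcube on two dimensions
by_store_product = roll_up(cube, 1)                    # summed over months
```

Drilling down is simply the reverse direction: moving from the rolled-up totals back to the detailed cells.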
41. BI Structure
The ETL process:
• Extracts data from outside sources
• Transforms it to fit operational needs
• Loads it into the end target (a data
mart or data warehouse)
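As a minimal sketch of the three steps, assuming nothing beyond the Python standard library (the inline CSV source and the in-memory SQLite table are illustrative stand-ins for a real source system and warehouse):

```python
import csv
import io
import sqlite3

# Extract: read raw rows from a source (a CSV string stands in for an outside source).
raw = "date,store,amount\n2005-01-15,New York,12000\n2005-01-15,Los Angeles,34000\n"
rows = list(csv.DictReader(io.StringIO(raw)))

# Transform: fit the data to the target model (typed amount, normalized store key).
clean = [(r["date"], r["store"].upper(), float(r["amount"])) for r in rows]

# Load: write into the end target (an in-memory SQLite table stands in for the warehouse).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sales_fact (sale_date TEXT, store TEXT, amount REAL)")
db.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)", clean)
total = db.execute("SELECT SUM(amount) FROM sales_fact").fetchone()[0]
```

Real ETL tools add scheduling, error handling, and incremental loads on top of this same extract/transform/load skeleton.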
50. EDW Bus Architecture
• A database-independent bus architecture that decomposes
the DW/BI process by focusing on the organization’s core
business processes, using conformed dimensions
• Conformed dimensions: master, common, standardized
dimensions, created once in the ETL and reused by
multiple fact tables
54. Inmon vs. Kimball
Methodology and Architecture
• Overall approach – Inmon: Top-Down; Kimball: Bottom-Up
• Architectural structure – Inmon: an enterprise-wide (atomic)
data warehouse “feeds” departmental databases; Kimball: Data
Marts model a single business process, with enterprise
consistency achieved through the data bus and conformed
dimensions
• Complexity – Inmon: quite complex; Kimball: fairly simple
Data Modeling
• Data orientation – Inmon: subject- or data-driven; Kimball:
process-oriented
• Tools – Inmon: traditional (ERD, DIS); Kimball: dimensional
modeling
• End-user accessibility – Inmon: low; Kimball: high
57. Standard, static reports
• Subject oriented, reported data defined precisely before
creation
• Reports with fixed layout defined by a report designer
when the report is created
• Very often the static reports contain sub-reports and
perform calculations or implement advanced functions
• Generated either on request by an end user or refreshed
periodically from a scheduler
• Usually made available on a web server or a shared
drive
58. Ad-Hoc Reports
• Simple reports created by the end users on demand
• Designed from scratch or using a standard report as a
template
59. Interactive, multidimensional OLAP
reports
• Usually provide more general information - using
dynamic drill-down, slicing, dicing and filtering users can
get the information they need
• Reports with fixed design defined by a report designer
• Generated either on request by an end user or refreshed
periodically from a scheduler
• Usually made available on a web server or a shared
drive
60. Dashboards
• Contain high-level, aggregated company strategic data
with comparisons and performance indicators
• Include both static and interactive reports
• Lots of graphics, charts and illustrations
61. Write-back reports
• These are interactive reports directly linked to the Data
Warehouse which allow modification of the data
warehouse data.
The most common uses of this kind of report are:
Editing and customizing product and customer groupings
Entering budget figures and forecasts
Setting sales targets
Refining business-relevant data
62. Technical reports
• This group of reports is usually generated to fulfill the
needs of the following areas:
IT technical reports - for monitoring the BI system, generating
execution performance statistics, data volumes, system workload,
user activity, etc.
Data quality reports - which are an input for business analysts to
the data cleansing process
Metadata reports - for system analysts and data modelers
65. Most widely used BI Systems:
IBM Cognos
SAP BusinessObjects and Crystal Reports
Oracle Hyperion and Siebel Analytics
MicroStrategy
Microsoft Business Intelligence (SQL Server Reporting Services)
SAS
Pentaho Reporting and Analysis
BIRT - open source Business Intelligence and Reporting Tools
JasperReports
QlikView
68. Regression Analysis
• Y = aX + b
• Example: Profit is a linear (or non-linear) function of Revenue, so we
forecast future Profit from historical information

            2012    2013    2014
  Revenue  1,000   2,000      ?
  Profit     200     300      ?

Profit = 0.1 × Revenue + 100
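A minimal sketch of fitting the line from the two historical points above. With only two points the slope and intercept are determined exactly; the 2014 revenue of 3,000 used for the forecast is a hypothetical input, since the table leaves it open.

```python
# Historical points from the table: (Revenue, Profit) for 2012 and 2013.
revenue = [1000, 2000]
profit = [200, 300]

# Fit Profit = a * Revenue + b through the two points.
a = (profit[1] - profit[0]) / (revenue[1] - revenue[0])  # slope: 100 / 1000 = 0.1
b = profit[0] - a * revenue[0]                            # intercept: 200 - 0.1*1000 = 100

def forecast(rev):
    """Project Profit for a given Revenue using the fitted line."""
    return a * rev + b

# Hypothetical 2014 revenue of 3,000 gives a forecast of about 400.
projected = forecast(3000)
```

With more than two historical points, the same idea generalizes to a least-squares fit rather than an exact solution.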
69. Decision Tree
• Decision trees are used to learn from historical data and to make
predictions about the future
• Example: customer satisfaction
[Tree diagram: a split on X = 1, then branches for Y > 1 and Y < 1,
with leaves labeled Z = 1 and Z = 2]
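A decision tree of depth one (a "decision stump") is enough to show the learn-from-history, predict-the-future idea. The response-time/satisfaction data below is hypothetical, standing in for the slide's customer-satisfaction example.

```python
def learn_stump(xs, ys):
    """Find the threshold on one numeric feature that best separates the labels,
    and return a predictor that routes new values to the majority label of each side."""
    def majority(part):
        return max(set(part), key=part.count) if part else None

    best = None
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        # Count misclassifications if each side predicts its majority label.
        errors = (sum(y != majority(left) for y in left)
                  + sum(y != majority(right) for y in right))
        if best is None or errors < best[0]:
            best = (errors, t, majority(left), majority(right))

    _, t, left_label, right_label = best
    return lambda x: left_label if x <= t else right_label

# Historical data: support response time (hours) vs. whether the customer was satisfied.
hours = [1, 2, 3, 10, 12, 15]
satisfied = ["yes", "yes", "yes", "no", "no", "no"]
predict = learn_stump(hours, satisfied)
```

A full decision-tree learner applies this same best-split search recursively to each side, across all available features.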
70. Association Analysis
• Helps you to identify cross-selling opportunities, for
example. You can use the rules resulting from the
analysis to place associated products together in a
catalog
• Let I = {I1, I2, ..., Im} be the set of items, and let a
transaction T ⊆ I, with an itemset X ⊆ T
• Define a rule X ⇒ Y, where Y ⊆ T and X ∩ Y = ∅
• Example: if a customer purchases an airline ticket,
then he is likely to rent a car and make a
hotel reservation
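The standard measures behind a rule X ⇒ Y, support and confidence, can be sketched in a few lines. The travel-booking transactions below are hypothetical, echoing the airline-ticket example.

```python
# Hypothetical transactions: each basket is a set of purchased items.
transactions = [
    {"airline_ticket", "car_rental", "hotel"},
    {"airline_ticket", "hotel"},
    {"airline_ticket", "car_rental", "hotel"},
    {"car_rental"},
]

def support(itemset):
    """Fraction of transactions that contain every item in the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(x, y):
    """Of the transactions containing X, the fraction that also contain Y."""
    return support(x | y) / support(x)

# Confidence of the rule {airline_ticket} => {car_rental, hotel}.
rule_conf = confidence({"airline_ticket"}, {"car_rental", "hotel"})
```

Algorithms such as Apriori search for all rules whose support and confidence exceed chosen thresholds; the catalog-placement use case then keeps the high-confidence rules.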
71. Cluster Analysis
[Scatter plot: two groups of points (Series 1 and Series 2)
separating into distinct clusters]
• Example:
1. Gathers attributes of customers with the same purchases
2. Predicts which product would be chosen by a specific customer
with specific attributes
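The grouping step can be sketched with a bare-bones k-means loop, a common cluster-analysis algorithm (the slide does not name a specific one). The 2-D points are illustrative, mirroring the two separated groups in the scatter plot.

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Cluster 2-D points into k groups by alternating assignment and mean-update steps."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k distinct points
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k),
                    key=lambda i: (p[0] - centers[i][0]) ** 2 + (p[1] - centers[i][1]) ** 2)
            clusters[i].append(p)
        # Update step: move each center to the mean of its cluster.
        centers = [
            (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters

# Two visually separated groups of "customers" described by two numeric attributes.
points = [(1, 1), (1, 2), (2, 1), (5, 4), (5, 5), (6, 4)]
centers, clusters = kmeans(points, 2)
```

Once customers are clustered by their attributes, the prediction step amounts to recommending what the rest of a new customer's cluster tends to buy.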
72. 8. Summary
• BI Objectives
• Traditional BI
• Inmon Departmental model
• Kimball fact-dimension model and
bus architecture
• BI Dashboard and Reporting
• BI Algorithms
73. 9. References
• David Butler, Bob Stackowiak, “Master Data Management, An Oracle
White Paper”, June 2009
• Wikipedia.com
• businessintelligence.com/dictionary/
• www.vanguardsw.com/products/vanguard-system/business-intelligence.htm
• swiki.net/reliable-business-intelligence-datawarehouse-datamart-and-datamining.html
• www.oracle.com/technetwork/articles/madison-models-086845.html