Introduction to Data Warehousing: Introduction, Necessity, Framework
of the data warehouse, options, developing data warehouses, end points.
Data Warehousing Design Considerations and Dimensional Modeling:
Defining the Dimensional Model, Granularity of Facts, Additivity of Facts,
Functional Dependency of the Data, Helper Tables, Implementing many-to-many
relationships between fact and dimension tables.
ETL: Transformations and Other Operators: STORE mapping, Adding
source and target operators, Adding Transformation Operators, Using a Key
Lookup operator, Creating an external table, Creating and loading a lookup
table, Retrieving the key to use for a Lookup Operator, Adding a Key
Lookup operator, PRODUCT mapping, SALES cube mapping, Dimension
attributes in the cube, Measures and other attributes in the cube, Mapping
values to cube attributes, Mapping measures' values to a cube, Mapping
PRODUCT and STORE dimension values to the cube, Mapping
DATE_DIM values to the cube, Features and benefits of OWB.
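Several of the topics above revolve around the Key Lookup operator, whose core job is resolving a source system's natural key to a warehouse surrogate key through a lookup table. A minimal sketch of that idea in Python (hypothetical names, independent of any OWB operator):

```python
# Minimal key lookup: resolve source natural keys to surrogate keys
# via a lookup table, as an ETL mapping's Key Lookup step would.

lookup = {}           # natural key -> surrogate key
next_surrogate = [1]  # next surrogate key to hand out

def key_lookup(natural_key):
    # Return the surrogate key, allocating one for unseen natural keys.
    if natural_key not in lookup:
        lookup[natural_key] = next_surrogate[0]
        next_surrogate[0] += 1
    return lookup[natural_key]

facts = [{"product": "P-100", "qty": 3},
         {"product": "P-200", "qty": 1},
         {"product": "P-100", "qty": 2}]
# Replace each natural product code with its surrogate key before loading.
loaded = [{"product_key": key_lookup(f["product"]), "qty": f["qty"]}
          for f in facts]
print(loaded)
```

In a real tool the lookup table lives in the database and unseen keys are usually an error or a "late-arriving dimension" case rather than being allocated on the fly.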
Data warehousing has quickly evolved into a unique and popular business application class.
Early builders of data warehouses already consider their systems to be key components of their
IT strategy and architecture. Numerous examples can be cited of highly successful data
warehouses developed and deployed for businesses of all sizes and all types. Hardware and
software vendors have quickly developed products and services that specifically target the data
warehousing market. This paper will introduce key concepts surrounding the data warehousing
systems.
What is a data warehouse? A simple answer could be that a data warehouse is managed data
situated after and outside the operational systems. A complete definition requires discussion of
many key attributes of a data warehouse system. Later in Section 2, we will identify these key
attributes and discuss the definition they provide for a data warehouse. Section 3 briefly reviews
the activity against a data warehouse system. Initially in Section 1, however, we will take a brief
tour of the traditions of managing data after it passes through the operational systems and the
types of analysis generated from this historical data.
Evolution of an application class
This section reviews the historical management of the analysis data and the factors that have led
to the evolution of the data warehousing application class.
Traditional approaches to historical data
In reviewing the development of data warehousing, we need to begin with a review of what had
been done with the data before the evolution of data warehouses. Let us first look at how the kind
of data that ends up in today's data warehouses had been managed historically.
Throughout the history of systems development, the primary emphasis had been given to the
operational systems and the data they process. It is not practical to keep data in the operational
systems indefinitely; and only as an afterthought was a structure designed for archiving the data
that the operational system has processed. The fundamental requirements of the operational and
analysis systems are different: the operational systems need performance, whereas the analysis
systems need flexibility and broad scope. It has rarely been acceptable to have business analysis
interfere with and degrade performance of the operational systems.
Data from legacy systems
In the 1970s virtually all business system development was done on the IBM mainframe
computers using tools such as Cobol, CICS, IMS, DB2, etc. The 1980s brought in the new mini-
computer platforms such as AS/400 and VAX/VMS. The late eighties and early nineties made
UNIX a popular server platform with the introduction of client/server architecture.
Despite all the changes in platforms, architectures, tools, and technologies, a remarkably
large number of business applications continue to run in the mainframe environment of the
1970s. By some estimates, more than 70 percent of business data for large corporations still
resides in this environment.
An Introduction to Oracle Warehouse Builder: Installation of the
database and OWB, About hardware and operating systems, Installing
Oracle database software, Configuring the listener, Creating the database,
Installing the OWB standalone software, OWB components and
architecture, Configuring the repository and workspaces.
Defining and Importing Source Data Structures: An overview of
Warehouse Builder Design Center, Importing/defining source metadata,
Creating a project, Creating a module, Creating an Oracle Database module,
Creating a SQL Server database module, Importing source metadata from a
database, Defining source metadata manually with the Data Object Editor,
Importing source metadata from files.
Validating, Generating, Deploying, and Executing Objects: Validating,
Validating in the Design Center, Validating from the editors, Validating in
the Data Object Editor, Validating in the Mapping Editor, Generating,
Generating in the Design Center, Generating from the editors, Generating in
the Data Object Editor, Generating in the Mapping Editor, Deploying, The
Control Center Service, Deploying in the Design Center and Data Object
Editor, The Control Center Manager, The Control Center Manager window
overview, Deploying in the Control Center Manager, Executing, Deploying
and executing remaining objects, Deployment Order, Execution order.
Validating, Generating, Deploying, and Executing Objects:
Designing and building an ETL mapping: Designing our staging area,
Designing the staging area contents, Building the staging area table with the
Data Object Editor, Designing our mapping, Review of the Mapping Editor,
Creating a mapping.
1. COURSES WE OFFER:
BSC(IT) FY,SY,TY
BSC(CS) FY,SY,TY
BSC(IT/CS) PROJECTS
MCA (ENTRANCE)
ENGG(IT/ELECTRONICS/EXTC)
ADDRESS: 302 PARANJPE UDYOG BHAVAN, NEAR KHANDELWAL SWEETS, THANE STATION, THANE
WEST.
TEL: 8097071144/55
STAY CONNECTED FOR MORE UPDATES AND STUDY NOTES
FACEBOOK : https://www.facebook.com/weittutorial
EMAIL: weit.tutorials@gmail.com
PHONE: 8097071144/55
2. UNIT 1
Introduction to Data Warehousing: Introduction, Necessity,
Framework of the data warehouse, options, developing
data warehouses, end points.
Data Warehousing Design Considerations and Dimensional
Modeling: Defining the Dimensional Model, Granularity of Facts,
Additivity of Facts, Functional Dependency of the Data, Helper Tables,
Implementing many-to-many relationships between fact and
dimension tables.
3. What Is A Data Warehouse?
A data warehouse is a powerful database model that significantly enhances the user's ability to quickly analyze large, multidimensional data sets.
It cleanses and organizes data to allow users to make business decisions based on facts.
Making data analytical requires that it be subject-oriented, integrated, time-referenced, and non-volatile.
4. Subject-Oriented Data
Data warehouses group data by subject rather than by activity.
Typical subjects: employees, accounts, sales, products.
This subject-specific design helps reduce query response time.
Integrated Data
Integrated data refers to de-duplicating information and merging it from many sources into one consistent location.
Much of the transformation and loading work that goes into the data warehouse is centered on integrating data and standardizing it.
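As a toy illustration of this integration step, the sketch below (hypothetical source and field names, not any particular ETL tool) standardizes customer records from two sources and de-duplicates them on a shared business key:

```python
# Toy integration step: map two differently-shaped source extracts onto
# one schema, then de-duplicate them on a common business key.

def standardize(record, source):
    # Each source spells its fields differently; conform to one schema.
    if source == "crm":
        return {"cust_id": record["CustomerID"],
                "name": record["Name"].strip().title()}
    if source == "billing":
        return {"cust_id": record["cust_no"],
                "name": record["cust_name"].strip().title()}
    raise ValueError(f"unknown source: {source}")

def integrate(crm_rows, billing_rows):
    merged = {}
    for rec in (standardize(r, "crm") for r in crm_rows):
        merged[rec["cust_id"]] = rec
    for rec in (standardize(r, "billing") for r in billing_rows):
        merged.setdefault(rec["cust_id"], rec)  # keep first copy seen
    return list(merged.values())

crm = [{"CustomerID": 1, "Name": " alice smith "}]
billing = [{"cust_no": 1, "cust_name": "ALICE SMITH"},
           {"cust_no": 2, "cust_name": "bob jones"}]
rows = integrate(crm, billing)
print(rows)  # one Alice Smith (duplicate collapsed) and one Bob Jones
```

The same customer arriving from two systems with different casing and spacing collapses to a single consistent record, which is exactly the de-duplication and standardization work described above.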
5. Time-Referenced Data
Time-referenced data essentially refers to its time-valued characteristic.
E.g., the user may ask, "What were the total sales of product 'A' for the past three years on New Year's Day across region 'Y'?"
This kind of exploration activity is termed "data mining".
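A time-referenced question like the one above maps naturally onto a star-schema query. The sketch below (hypothetical table and column names, with SQLite purely for illustration) sums a fact table joined to date, product, and region dimensions:

```python
import sqlite3

# Minimal star schema: one fact table plus date/product/region dimensions.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, year INT, month INT, day INT);
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_region (region_key INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE fact_sales (date_key INT, product_key INT, region_key INT, amount REAL);

INSERT INTO dim_product VALUES (1, 'A'), (2, 'B');
INSERT INTO dim_region VALUES (1, 'Y'), (2, 'Z');
-- Three New Year's Days, plus one ordinary day.
INSERT INTO dim_date VALUES (1, 2021, 1, 1), (2, 2022, 1, 1),
                            (3, 2023, 1, 1), (4, 2023, 6, 15);
INSERT INTO fact_sales VALUES
  (1, 1, 1, 100.0),  -- product A, region Y, 2021-01-01
  (2, 1, 1, 150.0),  -- product A, region Y, 2022-01-01
  (3, 1, 1, 200.0),  -- product A, region Y, 2023-01-01
  (4, 1, 1, 999.0),  -- product A, region Y, not a New Year's Day
  (3, 2, 1, 50.0);   -- product B: excluded by the filter
""")

total = con.execute("""
SELECT SUM(f.amount)
FROM fact_sales f
JOIN dim_date d ON f.date_key = d.date_key
JOIN dim_product p ON f.product_key = p.product_key
JOIN dim_region r ON f.region_key = r.region_key
WHERE p.name = 'A' AND r.name = 'Y'
  AND d.month = 1 AND d.day = 1 AND d.year BETWEEN 2021 AND 2023
""").fetchone()[0]
print(total)  # 450.0
```

Note that the dimension tables do all the time-referencing work; the fact table itself stores only keys and measures.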
Non-Volatile Data
The non-volatility of data, a characteristic of the data warehouse, enables users to dig deep into history and arrive at specific business decisions based on facts.
6. Why A Data Warehouse?
The Data Access Crisis
Every day, organizations large and small create billions of bytes of data about all aspects of their business: millions of individual facts about their customers, products, operations and people. But for the most part, this data is locked up in a maze of computer systems and is exceedingly difficult to get at.
This phenomenon has been described as "data in jail".
7. Data Warehousing
Data warehousing is a field that has grown from the
integration of a number of different technologies and
experiences over the past two decades. These
experiences have allowed the IT industry to identify
the key problems that need to be solved.
8. Operational vs. Informational Systems
OPERATIONAL
Operational systems, as their name implies, are the systems that support the everyday operation of the enterprise.
These are the backbone systems of any enterprise, and include order entry, inventory, manufacturing, payroll and accounting.
INFORMATIONAL
Informational systems deal with analyzing data and making decisions.
Informational data needs often span a number of different business areas and require large amounts of related operational data.
9. Framework Of The Data Warehouse
One of the reasons that data warehousing has taken such a long time to develop is that it is actually a very comprehensive technology.
In fact, it can best be represented as an enterprise-wide framework for managing informational data within the organization.
In order to understand how all the components involved in a data warehousing strategy are related, it is essential to have a Data Warehouse Architecture.
10. Data Warehouse Architecture
A Data Warehouse Architecture (DWA) is a way of
representing the overall structure of data,
communication, processing and presentation that
exists for end-user computing within the enterprise.
11. The architecture is made up of a number of interconnected parts:
Source system
Source data transport layer
Data quality control and data profiling layer
Metadata management layer
Data integration layer
Data processing layer
End user reporting layer
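As a rough sketch of how a record might flow through these layers (hypothetical function names; real warehouses use dedicated ETL and reporting tools for each stage):

```python
# Sketch of a record flowing from a source system to the reporting layer.
# Each function stands in for a whole layer of the architecture.

def transport(source_rows):
    # Source data transport layer: move raw rows out of the source system.
    return list(source_rows)

def quality_check(rows):
    # Data quality control layer: drop rows that fail basic profiling rules.
    return [r for r in rows if r.get("amount") is not None and r["amount"] >= 0]

def integrate(rows):
    # Data integration layer: conform values to warehouse standards.
    return [{**r, "region": r["region"].upper()} for r in rows]

def process(rows):
    # Data processing layer: aggregate to the grain the reports need.
    totals = {}
    for r in rows:
        totals[r["region"]] = totals.get(r["region"], 0) + r["amount"]
    return totals

def report(totals):
    # End user reporting layer: present the result in a stable order.
    return sorted(totals.items())

source = [{"region": "y", "amount": 100},
          {"region": "y", "amount": 50},
          {"region": "z", "amount": None}]   # fails the quality check
result = report(process(integrate(quality_check(transport(source)))))
print(result)  # [('Y', 150)]
```

The metadata management layer is not shown; in practice it describes each of these stages (schemas, lineage, load times) rather than touching the data itself.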
13. Data Warehouse Options
A number of key factors need to be considered when
developing a data warehouse:
1. Scope of the data warehouse
The scope of a data warehouse may be as broad as all the
informational data for the entire enterprise from the
beginning of time, or it may be as narrow as a personal data
warehouse for a single manager for a single year.
The broader the scope, the more valuable the warehouse is to the
enterprise, and the more expensive and time-consuming it is to
create and maintain.
14. 2. Data redundancy
There are three levels of data redundancy:
“Virtual” or “point-to-point” data warehouses
End users are allowed to access the operational databases
directly, so no redundant copy of the data is kept.
Central data warehouses
Central data warehouses are real. The data stored here is
accessible from one place and must be loaded and
maintained on a regular basis.
Distributed data warehouses
Distributed data warehouses are those in which certain
components are distributed across a number of different
physical databases.
15. 3. Type of End-user
Executives and managers
Power users (business and financial analysts, engineers)
Support users (clerical, administrative)
16. Developing Data Warehouses
Developing a good data warehouse is no different from any other IT
project: it requires careful planning, requirements definition, design,
prototyping and implementation.
Developing Strategy
There are a number of strategies by which organizations can get into data
warehousing.
Installing a set of data access, data directory and process management
facilities
Training the end-users
Monitoring how the data warehouse facilities are actually used
Based on actual usage, creating a physical data warehouse to support the
high-frequency requests
17. Evolving DWA
The DWA (Data Warehouse Architecture) is simply a framework for
understanding data warehousing and how the components of data warehouse
fit together.
One of the keys to data warehousing is flexibility.
18. Designing Data Warehouses
Designing data warehouses is very different from
designing traditional operational systems.
1. Data warehouse users have different needs from operational users.
2. Designers must think in terms of much broader, and more difficult
to define, decision-support requirements.
3. The design process is quite close to Business Process Reengineering (BPR).
4. A clear design strategy is therefore essential for a data warehouse.
19. Managing Data Warehouses
Managers must decide how they want their warehouses to perform.
They must also recognize that maintaining the data warehouse
structure is an ongoing task.
IT management must understand that embarking on a data
warehousing program is a long-term commitment.
20. End Points
Data warehousing is growing by leaps and bounds and
it is becoming increasingly difficult to estimate what
new developments are most likely to affect it.
The development of parallel database servers with improved
query engines is likely to be one of the most important.
Parallel servers will make it possible to access huge
databases in much less time.
Data warehouse planners and developers should therefore have a
clear idea of what they are looking for, and then choose
strategies and methods that will provide them with
performance today and flexibility for tomorrow.
21. Goals
Provide Easy Access to Corporate Data
must be easy to use
Access should be graphic
They must easily get answers to their questions and ask new
questions, all without getting the IT team involved
The process of getting and analyzing data must be fast.
Provide Clean and Reliable Data for Analysis
For consistent analysis, the data environment must be stable
One department doing an analysis must get the same result as
any other
Source conflicts must be resolved.
Historical analysis must be possible, so that data can be
analyzed across a span of time
23. Warehouses support business decisions by collecting,
consolidating, and organizing data for reporting and
analysis with tools such as online analytical processing
(OLAP) and data mining models.
Although data warehouses are built on relational
database technology, the design of a data warehouse
data model and subsequent physical implementation
differs substantially from the design of an online
transaction processing (OLTP) system.
24. How do these two systems differ and what design considerations
should be kept in mind while designing a data warehouse data
model?
OLTP Database | Data Warehouse Database
Designed for real-time business transactions and processes. | Designed for analysis of business measures by subject area, category and attributes.
Optimized for a common and known set of transactions. | Optimized for bulk loads and large, complex, unpredictable queries.
Designed for validation of data during transactions. | Designed to be loaded with consistent, valid data; uses very minimal validation.
Supports large user bases, often distributed across geographies. | Supports few concurrent users relative to the OLTP environment.
Houses very minimal historical data. | Houses a mix of the most current information as well as historical data.
25. Defining Dimensional Model
The purpose of dimensional model is to improve
performance by matching data structures to queries.
Users query the data warehouse looking for data like
Total sales in volume and revenue for the NE region for
product ‘XYZ’ for a certain period this year compared to
the same period last year
The central theme of a dimensional model is the star schema, which
consists of a central fact table, containing measures, surrounded by
qualifiers or descriptors called ‘dimensions’.
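The sample query above can be run against a minimal star schema. The sketch below is illustrative, not from the source: the table and column names (fact_sales, dim_region, and so on) are assumptions, and SQLite stands in for a real warehouse engine.

```python
import sqlite3

# Hypothetical star schema: one central fact table surrounded by dimensions.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE dim_region  (region_id INTEGER PRIMARY KEY, region_name TEXT);
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE dim_period  (period_id INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER);
CREATE TABLE fact_sales (            -- measures keyed by dimension foreign keys
    region_id INTEGER, product_id INTEGER, period_id INTEGER,
    volume INTEGER, revenue REAL);
INSERT INTO dim_region  VALUES (1, 'NE');
INSERT INTO dim_product VALUES (10, 'XYZ');
INSERT INTO dim_period  VALUES (100, 2023, 1), (101, 2024, 1);
INSERT INTO fact_sales  VALUES (1, 10, 100, 50, 5000.0), (1, 10, 101, 80, 8200.0);
""")

# The sample user query: NE-region sales of product XYZ, this year vs last.
rows = con.execute("""
    SELECT p.year, SUM(f.volume), SUM(f.revenue)
    FROM fact_sales f
    JOIN dim_region  r ON f.region_id  = r.region_id
    JOIN dim_product d ON f.product_id = d.product_id
    JOIN dim_period  p ON f.period_id  = p.period_id
    WHERE r.region_name = 'NE' AND d.product_name = 'XYZ' AND p.quarter = 1
    GROUP BY p.year ORDER BY p.year
""").fetchall()
print(rows)  # [(2023, 50, 5000.0), (2024, 80, 8200.0)]
```

Note how the fact table holds only foreign keys and measures; all descriptive text lives in the dimensions, which is what makes the query pattern (join, filter on descriptors, aggregate measures) so uniform.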
26. In a star schema, if a dimension is complex and
contains relationships such as hierarchies, it is
compressed, or flattened, into a single denormalized dimension.
Another version of star schema is a snowflake schema. In a
snowflake schema complex dimensions are normalized.
Here, dimensions maintain relationships with other levels
of the same dimension.
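A hypothetical sketch of the same product dimension in both forms makes the contrast concrete; the field names here are invented for illustration.

```python
# Star schema: the product dimension is flattened; category attributes are
# repeated on every product row (denormalized).
star_dim_product = [
    {"product_id": 10, "name": "XYZ", "category_id": 3, "category_name": "Widgets"},
    {"product_id": 11, "name": "ABC", "category_id": 3, "category_name": "Widgets"},
]

# Snowflake schema: the same dimension normalized; the category level becomes
# its own table, and product rows hold only a foreign key to it.
snow_dim_category = {3: {"category_name": "Widgets"}}
snow_dim_product = [
    {"product_id": 10, "name": "XYZ", "category_id": 3},
    {"product_id": 11, "name": "ABC", "category_id": 3},
]

# Resolving a product's category in the snowflake form needs one extra lookup
# (one extra join in SQL terms).
cat = snow_dim_category[snow_dim_product[0]["category_id"]]["category_name"]
print(cat)  # Widgets
```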
29. Granularity Of Facts
The granularity of a fact is the level of detail at which it
is recorded. If data is to be analyzed effectively, it must
be all at the same level of granularity.
As a general rule, data should be kept at the finest
(most detailed) level of granularity.
31. Granularity is determined by:
Number of parts to a key
Granularity of those parts
Adding elements to an existing key always increases
the granularity of the data; removing any part of an
existing key decreases it.
Using customer sales to illustrate this, a key of
customer ID and period ID is less granular than
customer ID, product ID and period ID.
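The effect of adding or removing key parts can be sketched with hypothetical sales rows; the data and helper function below are illustrative only.

```python
from collections import defaultdict

# Hypothetical sales facts at the finest grain: (customer, product, period).
sales = [
    ("C1", "P1", "2024Q1", 100),
    ("C1", "P2", "2024Q1", 150),
    ("C1", "P1", "2024Q2",  80),
    ("C2", "P1", "2024Q1",  60),
]

def aggregate(rows, key_parts):
    """Roll facts up to the grain defined by the given key positions."""
    totals = defaultdict(int)
    for *keys, amount in rows:
        totals[tuple(keys[i] for i in key_parts)] += amount
    return dict(totals)

# Three-part key (customer, product, period): most granular, 4 rows survive.
fine = aggregate(sales, (0, 1, 2))
# Dropping product from the key lowers the granularity: rows collapse to 3,
# and C1's two Q1 facts merge into a single total of 250.
coarse = aggregate(sales, (0, 2))
print(len(fine), len(coarse))  # 4 3
```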
32. Additivity Of Facts
A fact is additive over a particular dimension if summing
it over that dimension results in a fact with
the same essential meaning as the original, but now
relative to the new granularity.
A fact is said to be fully additive if it is additive over
every dimension of its dimensionality
partially additive if additive over at least one but not all
of the dimensions
non-additive if not additive over any dimension
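A small, assumed set of facts can illustrate the three cases. The column choices are hypothetical: revenue as a fully additive measure, an account balance as a partially additive one (additive across accounts, not across time), and a margin percentage as a non-additive one.

```python
# Hypothetical facts: (account, period, revenue, balance, margin_pct)
facts = [
    ("A1", "Q1", 100.0, 500.0, 0.20),
    ("A1", "Q2", 120.0, 560.0, 0.25),
    ("A2", "Q1",  90.0, 300.0, 0.10),
]

# Fully additive: total revenue is meaningful summed over every dimension.
total_revenue = sum(f[2] for f in facts)

# Partially additive: balances sum across accounts within one period,
# but summing them over time double-counts (Q1's 800.0 is meaningful;
# A1's 500.0 + 560.0 = 1060.0 is not a balance of anything).
q1_balance = sum(f[3] for f in facts if f[1] == "Q1")

# Non-additive: percentages must be recomputed from additive components,
# never summed (and naive averaging is wrong when weights differ).
print(total_revenue, q1_balance)  # 310.0 800.0
```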
33. Functional Dependency Of The Data
Functional dependency of data means that the
attributes within a given entity are fully dependent on
the entire primary key of the entity, no more and no less.
For example, in a customer entity (cust_id, cust_name, cust_add, …, …),
every non-key attribute should be determined by cust_id alone.
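This rule can be checked mechanically. The sketch below is a minimal, assumed example: it groups hypothetical customer rows by the key and verifies that each key value maps to exactly one combination of attribute values.

```python
from collections import defaultdict

# Hypothetical customer rows: every non-key attribute should be determined
# by the full primary key (here just cust_id) and nothing else.
rows = [
    {"cust_id": 1, "cust_name": "Ada", "cust_add": "1 Elm St"},
    {"cust_id": 2, "cust_name": "Bo",  "cust_add": "2 Oak St"},
    {"cust_id": 1, "cust_name": "Ada", "cust_add": "1 Elm St"},  # consistent repeat
]

def functionally_dependent(rows, key, attrs):
    """True if each key value maps to exactly one combination of attrs."""
    seen = defaultdict(set)
    for r in rows:
        seen[tuple(r[k] for k in key)].add(tuple(r[a] for a in attrs))
    return all(len(values) == 1 for values in seen.values())

print(functionally_dependent(rows, ["cust_id"], ["cust_name", "cust_add"]))  # True
```

If the same cust_id appeared with two different addresses, the check would fail, signalling that cust_add is not functionally dependent on the key as recorded.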
34. Helper Tables
Helper tables usually take one of two forms:
Help for multi-valued dimensions
Helper tables for complex hierarchies
35. Multi-Valued Dimensions
Take a situation where a household can own many
insurance policies, yet any policy could be owned by
multiple households.
The simple approach to this is the traditional
resolution of the many-to-many relationship, called an
associative entity.
The traditional way to resolve a many-to-many
relationship is to create an associative entity whose key
is formed from the keys of each participating parent.
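The household/policy example can be sketched as a small associative (bridge) table. The weighting factor shown is an assumed, but common, refinement that lets facts be allocated across co-owners without double counting.

```python
# Hypothetical parents of the many-to-many relationship.
households = {"H1": "Smith", "H2": "Jones"}
policies = {"P100": 1200.0, "P200": 800.0}  # policy -> annual premium

# Bridge table: its key is formed from the keys of both parents.
bridge = [
    # (household_id, policy_id, weight)
    ("H1", "P100", 0.5),   # P100 is co-owned by H1 and H2
    ("H2", "P100", 0.5),
    ("H1", "P200", 1.0),   # P200 owned solely by H1
]

# Allocate premium to households through the bridge; because the weights
# for each policy sum to 1.0, the grand total is never inflated.
premium_by_household = {}
for hh, pol, w in bridge:
    premium_by_household[hh] = premium_by_household.get(hh, 0.0) + policies[pol] * w

print(premium_by_household)  # {'H1': 1400.0, 'H2': 600.0}
```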
37. Complex Hierarchies
A hierarchy is a tree structure, such as an organization
chart. Hierarchies can involve some form of recursive
relationship.
Recursive relationships come in two forms: “self”
relationships (1:M) and “bill of materials” relationships
(M:M). A self relationship involves one table, whereas
a bill of materials relationship involves two.
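A minimal sketch of a “self” relationship, using an assumed org-chart table in which each row points at its parent row in the same table:

```python
# Hypothetical "self" (1:M) recursive relationship: one table, where each
# row references its parent in that same table, as in an organization chart.
org = {
    # employee_id: (name, manager_id) -- manager_id of None marks the root
    1: ("CEO", None),
    2: ("VP Sales", 1),
    3: ("VP Eng", 1),
    4: ("Sales Rep", 2),
}

def chain_to_root(emp_id):
    """Walk the recursive relationship from a node up to the root."""
    path = []
    while emp_id is not None:
        name, parent = org[emp_id]
        path.append(name)
        emp_id = parent
    return path

print(chain_to_root(4))  # ['Sales Rep', 'VP Sales', 'CEO']
```

A bill-of-materials (M:M) hierarchy would instead need a second, associative table, since a component can appear under many parents.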