Information & Knowledge Management - Class 3
Data-warehousing
Marielba Zacarias
Assistant Professor, DEEI
FCT I, Office 2.69, Ext. 7749
mzacaria@ualg.pt
http://w3.ualg.pt/~mzacaria
Summary

Data-warehouses
 The architected environment
 Design Process
 Data-modeling schemas
Data Warehousing
Data collection for analysis and
reporting tasks
Historical data
Stored in a distinct environment from
operational data
Structured differently from operational databases
Why
Operational and analytical data have
different requirements in terms of
 usage (frequency, response time)
 hardware
 software
 structure
Data-warehousing
     Users
Before Data-Warehouses....
      The “spider web”




The “architected” environment

Operational            Atomic dw          Dept. dw (“data-marts”)   Individual dw
Detailed               More granular      Derived,                  Temporal
Daily                  Temporal           some primitive            Ad-hoc
Current value          Integrated         Typical of Marketing,     Heuristic
High access prob.      Subject oriented   Engineering,              Non-repetitive
Application oriented   Summarized         Production,               Oriented to PC or
                                          Accounting                workstations
Type of questions

Level         Example data                                        Typical question
Operational   J. Jones, 123 Main St., Credit - AA                 Jones credit?
Atomic dw     1986-87: J. Jones, 456 High St., Credit - B         Jones credit history?
              1987-89: J. Jones, 456 High St., Credit - A
              1989-present: J. Jones, 123 Main St., Credit - AA
Dept.         Jan - 4101, Feb - 4209, Mar - 4175, Apr - 4215      Monthly sales?
Individual    Customers since 1982 with balances > 5,000          Client types in analysis?
              and credit >= B
Architected Environment
Production Environment
 Operational environment
 Analytical environment
Data-warehouse design
 Requirements Gathering
 Physical Environment Setup
 Data Modeling
 ETL
 OLAP Cube Design
 Front End Development
 Report Development
 Performance Tuning
 Query Optimization
 Quality Assurance
 Rolling out to Production
 Production Maintenance
 Incremental Enhancements
Requirements Gathering
Take the users into account
  Executives have little time and limited knowledge of
  technical terms
  Interviews, JAD sessions
    User reporting/analysis requirements
    Hardware and training requirements
    Data source identification
    Concrete project plan
Physical Environment Setup
Set up servers, DBMS and databases,
ETL, OLAP cubes and reporting services
Create three environments
 development, testing, production
Data-modeling
 Depends on initial data source identification
 Conceptual, logical and physical data modeling
 Should be related to the information architecture!
Data Modeling
  Dimensional Approach
Transactional data is partitioned into facts and dimensions
Facts
  Numeric transaction data
    products ordered, price
Dimensions
  provide context for facts
    order date, customer name, product
    number, location info, salesperson
Dimensional Approaches
 Star
   Fact table (typically a transaction)
   Dimensions (context of the transaction)
 Snowflake
   Some dimensions are linked to the fact table
   only indirectly, through other dimension tables
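
To make the star layout concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names (dim_customer, dim_product, dim_date, fact_order) and the sample rows are hypothetical and chosen only to mirror the fact and dimension examples from the slides. It is an illustration under those assumptions, not a prescribed design.

```python
import sqlite3

# Hypothetical star schema: one fact table surrounded by dimension tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, customer_name TEXT);
    CREATE TABLE dim_product  (product_id  INTEGER PRIMARY KEY, product_name  TEXT);
    CREATE TABLE dim_date     (date_id     INTEGER PRIMARY KEY, order_date    TEXT);
    -- The fact table holds the numeric measures plus one key per dimension.
    CREATE TABLE fact_order (
        customer_id INTEGER REFERENCES dim_customer(customer_id),
        product_id  INTEGER REFERENCES dim_product(product_id),
        date_id     INTEGER REFERENCES dim_date(date_id),
        quantity    INTEGER,
        price       REAL
    );
""")
conn.executemany("INSERT INTO dim_customer VALUES (?, ?)", [(1, "Ana"), (2, "Rui")])
conn.executemany("INSERT INTO dim_product VALUES (?, ?)", [(1, "widget"), (2, "gadget")])
conn.executemany("INSERT INTO dim_date VALUES (?, ?)", [(1, "2011-01-10"), (2, "2011-02-03")])
conn.executemany(
    "INSERT INTO fact_order VALUES (?, ?, ?, ?, ?)",
    [(1, 1, 1, 2, 10.0), (2, 1, 2, 1, 10.0), (1, 2, 2, 3, 5.0)],
)

# A typical analytical query: join the fact table to a dimension and aggregate.
revenue_by_product = conn.execute("""
    SELECT p.product_name, SUM(f.quantity * f.price) AS revenue
    FROM fact_order AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.product_name
    ORDER BY p.product_name
""").fetchall()
```

In a snowflake variant, a dimension such as dim_product would itself reference a further table (for example a product category), so that table reaches the fact table only through dim_product.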
Star Metaphor
Star Schema
Relational model
Star schema
Snow-flake schema
OLAP Cube Design
Specification of detailed reporting needs
in terms of the multi-dimensional
structure previously defined (star or
snowflake), but regarded as an
n-dimensional cube
Star/snowflake schemas and cubes are
pretty much the same thing
Cubes are more appropriate for non-IT
users
The Cube Metaphor
Slicing
Dicing
Rotating
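
The three cube operations can be sketched on a tiny in-memory cube. The dimension names (product, region, month), the cell values and the helper function names below are all hypothetical. The cube is modeled as a plain dictionary from coordinates to a measure, which is only one possible representation.

```python
# Hypothetical 3-dimensional cube: (product, region, month) -> units sold.
DIMENSIONS = ("product", "region", "month")

cube = {
    ("widget", "north", "jan"): 10,
    ("widget", "south", "jan"): 7,
    ("gadget", "north", "jan"): 4,
    ("gadget", "north", "feb"): 6,
}

def slice_cube(cube, dim, value):
    """Slicing: fix one dimension to a single value, leaving a cube with one dimension fewer."""
    i = DIMENSIONS.index(dim)
    return {k[:i] + k[i + 1:]: v for k, v in cube.items() if k[i] == value}

def dice_cube(cube, allowed):
    """Dicing: keep only cells whose coordinates fall in the allowed value sets."""
    return {
        k: v
        for k, v in cube.items()
        if all(k[DIMENSIONS.index(d)] in values for d, values in allowed.items())
    }

def rotate_cube(cube, new_order):
    """Rotating (pivoting): reorder the axes without changing any cell value."""
    idx = [DIMENSIONS.index(d) for d in new_order]
    return {tuple(k[i] for i in idx): v for k, v in cube.items()}

january = slice_cube(cube, "month", "jan")
gadgets = dice_cube(cube, {"product": {"gadget"}, "month": {"jan", "feb"}})
by_month_first = rotate_cube(cube, ("month", "region", "product"))
```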
ETL

Extraction
Transformation
Loading
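
The three ETL steps can be sketched as three small functions. The sample CSV text, the column names and the orders table are hypothetical, and the cleaning rules (trimming and normalizing names, rejecting rows without a customer) are assumed for illustration only. A real pipeline would normally be built with a dedicated ETL tool.

```python
import csv
import io
import sqlite3

# Hypothetical raw export from an operational system.
RAW_CSV = """order_id,customer,amount
1, J. Jones ,4101
2,j. jones,4209
3,,4175
"""

def extract(text):
    """Extraction: read the raw rows from the source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transformation: normalize names, convert types, drop rows with no customer."""
    cleaned = []
    for row in rows:
        name = row["customer"].strip().title()
        if not name:
            continue  # assumed rule: a row without a customer is rejected
        cleaned.append((int(row["order_id"]), name, float(row["amount"])))
    return cleaned

def load(conn, rows):
    """Loading: write the cleaned rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()

warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(extract(RAW_CSV)))
loaded = warehouse.execute(
    "SELECT order_id, customer, amount FROM orders ORDER BY order_id"
).fetchall()
```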
SQL Server Integration Services
SQL Server Integration Examples
SQL Server Integration Examples II
    Qualitative data
                 Description term                 ActionId
                 team meeting                          18
                 hr distribution                       19
                 project list                          19
                 team meeting                          19
                 hr distribution                       26
                 project list                          26
                 claims application                    27
                 claims application                    28
                 cards application maintenance         29
                 claims application integration        30
                 hr distribution                       31
                 project list                          31
                 claims application                    34
                 claims application                    35
                 hr distribution                       36
                 project list                          36
SQL Server Integration Examples III
   Fuzzy Transformations
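
The idea behind a fuzzy transformation can be illustrated by matching a noisy description term against a list of clean terms. The sketch below uses Python's standard difflib module. It is not the algorithm used by SQL Server Integration Services, and the misspelled inputs, the 0.8 cutoff and the function name fuzzy_lookup are assumptions made for the example.

```python
import difflib

# Clean reference terms, taken from the description terms in the qualitative data table.
CANONICAL_TERMS = [
    "team meeting",
    "hr distribution",
    "project list",
    "claims application",
]

def fuzzy_lookup(term, candidates=CANONICAL_TERMS, cutoff=0.8):
    """Return the closest reference term, or None when nothing is similar enough."""
    normalized = " ".join(term.lower().split())  # lower-case and collapse whitespace
    matches = difflib.get_close_matches(normalized, candidates, n=1, cutoff=cutoff)
    return matches[0] if matches else None

# Hypothetical noisy inputs, as they might arrive from free-text entry.
fixed_meeting = fuzzy_lookup("Team  Meting")
fixed_hr = fuzzy_lookup("hr distributon")
unknown = fuzzy_lookup("budget review")
```

Terms that fall below the cutoff come back as None, so they can be routed to manual review instead of being silently merged.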
Front-end development
 Front-ends range from
   in-house development with scripting
   languages (PHP, ASP, or Perl)
   to off-the-shelf products such as Crystal
   Reports or higher-end products such as
   Actuate
   OLAP vendors also offer front-ends of their
   own
Report Development
Derived from requirements
Main point of contact between the data-
warehouse and users
User customization
Report delivery (web, e-mail, SMS, file
formats)
Access privileges
Performance Tuning
ETL
Query Processing
 Users lose interest after 30 seconds!
 Query optimization
Report Delivery
Query Optimization
Understand how your DBMS executes queries
Store intermediate results in temporary tables
Query Optimization tips
  Use indexes
  Partition tables (vertically and horizontally)
  De-normalize (fewer joins)
  Server Tuning
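
Two of these tips, using indexes and storing intermediate results in temporary tables, can be tried out with SQLite from Python. The sales table, the index name idx_sales_region and the sample figures are hypothetical. EXPLAIN QUERY PLAN is SQLite-specific, and other DBMSs expose their execution plans through different commands.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, month TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("north", "jan", 4101.0), ("north", "feb", 4209.0),
     ("south", "jan", 4175.0), ("south", "feb", 4215.0)],
)

QUERY = "SELECT SUM(amount) FROM sales WHERE region = ?"

def query_plan(conn, sql, params=()):
    """Ask SQLite how it would execute the query; the last column holds the plan text."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)

# Without an index the only option is a full scan of the table.
plan_before = query_plan(conn, QUERY, ("north",))

# With an index on the filtered column the plan can search the index instead.
conn.execute("CREATE INDEX idx_sales_region ON sales (region)")
plan_after = query_plan(conn, QUERY, ("north",))

# Store an intermediate aggregate once, so later reports do not recompute it.
conn.execute(
    "CREATE TEMP TABLE monthly_totals AS "
    "SELECT month, SUM(amount) AS total FROM sales GROUP BY month"
)
totals = dict(conn.execute("SELECT month, total FROM monthly_totals"))
```

Comparing plan_before and plan_after is a quick way to check that the DBMS actually uses an index before relying on it.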
Quality Assurance
Test plan with quality criteria for data
Critical success factor
Often overlooked
Performed by people who know the
business data, not data-warehouses
  Resistance
Rolling out to production

Seems easy, but...
Putting everyone online may take a full
week in some cases
Online access can be as simple as
sending a link by e-mail
Production Maintenance
 Backup and recovery processes
 Crisis Management
 Monitoring end-user usage
  Catch runaway queries before the
  whole system slows down
  Measure usage for ROI calculations
  and future enhancements
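
One way to stop a runaway query is to give every query a time budget and abort it when the budget is exceeded. The sketch below does this for SQLite with the progress handler of Python's sqlite3 module. The function name run_with_time_budget, the budget values and the deliberately endless query are hypothetical, and production DBMSs usually provide their own resource-governor or query-timeout settings instead.

```python
import sqlite3
import time

def run_with_time_budget(conn, sql, budget_seconds):
    """Run a query, aborting it if it exceeds the time budget. Returns None when aborted."""
    deadline = time.monotonic() + budget_seconds

    def over_budget():
        # A non-zero return value tells SQLite to interrupt the running query.
        return 1 if time.monotonic() > deadline else 0

    # Call the handler every 1000 virtual-machine instructions.
    conn.set_progress_handler(over_budget, 1000)
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.DatabaseError:
        # Simplification: any database error, including the interrupt, is reported as None.
        return None
    finally:
        conn.set_progress_handler(None, 1000)  # remove the handler again

conn = sqlite3.connect(":memory:")

# Hypothetical runaway query: counts up to a billion rows.
RUNAWAY = """
    WITH RECURSIVE counter(x) AS (
        SELECT 1 UNION ALL SELECT x + 1 FROM counter WHERE x < 1000000000
    )
    SELECT COUNT(*) FROM counter
"""

aborted = run_with_time_budget(conn, RUNAWAY, 0.05)
quick = run_with_time_budget(conn, "SELECT 1", 1.0)
```

Logging each aborted query together with its elapsed time would also give the usage figures mentioned above for ROI calculations.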
Incremental enhancements

  Accomplish small changes, such as
  changing the original geographical
  designations
   e.g. a company may add new sales regions
  No matter how simple, never make them
  directly in the production environment
Architected environment
Tools for unstructured
information management
 Content Management Systems
 Record Management Systems
 Digital Image Management Systems
 Digital Asset Management Systems
 Digital Imaging Systems

 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptx
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
 
PSYCHIATRIC History collection FORMAT.pptx
PSYCHIATRIC   History collection FORMAT.pptxPSYCHIATRIC   History collection FORMAT.pptx
PSYCHIATRIC History collection FORMAT.pptx
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Education
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 

Gic2011 aula3-ingles

  • 1. Information & Knowledge Management - Class 3 Marielba Zacarias Prof. Auxiliar DEEI FCT I, Gab 2.69, Ext. 7749 Data-warehousing mzacaria@ualg.pt http://w3.ualg.pt/~mzacaria
  • 2. Summary. Data-warehouses: the architected environment, design process, data-modeling schemas
  • 3. Data Warehousing: data collection for analysis and reporting tasks; historical data; stored in an environment distinct from operational data; structured differently from operational databases
  • 4. Why: operational and analytical data have different requirements in terms of usage (frequency, response time), hardware, software, and structure
  • 6. Before Data-Warehouses: the “spider web”
  • 7. The “architected” environment. Operational: detailed, daily, current value, high access probability, application oriented. Atomic dw: more granular, temporal, integrated, subject oriented, summarized. Dept. dw (“data-marts”): derived, some primitive; typical of Marketing, Engineering, Production, Accounting. Individual dw: temporal, ad-hoc, heuristic, non-repetitive, oriented to PC or workstations.
  • 8. Type of questions. Operational: J. Jones, 123 Main St., Credit - AA (question: Jones credit?). Atomic dw: J. Jones, 1986-87, 456 High St., Credit - B; J. Jones, 1987-89, 456 High St., Credit - A; J. Jones, 1989 - present, 123 Main St., Credit - AA (question: Jones credit history?). Dept. dw: Jan - 4101, Feb - 4209, Mar - 4175, Apr - 4215 (question: monthly sales?). Individual dw: clients since 1982 with balances > 5,000 and credit >= B (question: client types in analysis?).
  • 9. Architected Environment: Production Environment, Operational Environment, Analytical Environment
  • 10. Data-warehouse design: Requirement Gathering; Physical Environment Setup; Data Modeling; ETL; OLAP Cube Design; Front End Development; Report Development; Performance Tuning; Query Optimization; Quality Assurance; Rolling out to Production; Production Maintenance; Incremental Enhancements
  • 11. Requirements Gathering: take users into account (executives with little time and little knowledge of technical terms); interviews, JAD sessions; user reporting/analysis requirements; hardware and training requirements; data source identification; concrete project plan
  • 12. Physical Environment Setup: set up servers, DBMS and databases, ETL, OLAP cubes and reporting services; create three environments: development, testing, production
  • 13. Data-modeling: depends on the initial data source identification; conceptual, logical and physical data modeling; should be related to the information architecture!
  • 14. Data Modeling, Dimensional Approach: transactional data is partitioned into facts and dimensions. Facts are numeric transaction data (products ordered, price). Dimensions provide context for facts (order date, customer name, product number, location info, salesperson).
  • 15. Dimensional Approaches. Star: a fact table (typically a transaction) and its dimensions (the context of the transaction). Snowflake: dimensions indirectly linked to fact tables.
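The star layout described on this slide can be sketched with Python's built-in `sqlite3` module. This is only an illustrative sketch: the table names (`fact_order`, `dim_customer`, `dim_product`, `dim_date`), the column names, and the sample rows are hypothetical and do not come from the slides, and SQLite stands in for whatever DBMS the course used.

```python
import sqlite3

# Hypothetical star schema: one fact table surrounded by dimension tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, customer_name TEXT);
CREATE TABLE dim_product  (product_id  INTEGER PRIMARY KEY, product_name  TEXT);
CREATE TABLE dim_date     (date_id     INTEGER PRIMARY KEY, order_date    TEXT);
-- Fact table: numeric measures plus one foreign key per dimension.
CREATE TABLE fact_order (
    customer_id INTEGER REFERENCES dim_customer(customer_id),
    product_id  INTEGER REFERENCES dim_product(product_id),
    date_id     INTEGER REFERENCES dim_date(date_id),
    quantity    INTEGER,
    price       REAL
);
""")

# Made-up sample data, for illustration only.
conn.executemany("INSERT INTO dim_customer VALUES (?, ?)",
                 [(1, "J. Jones"), (2, "A. Smith")])
conn.executemany("INSERT INTO dim_product VALUES (?, ?)",
                 [(10, "widget"), (11, "gadget")])
conn.executemany("INSERT INTO dim_date VALUES (?, ?)",
                 [(100, "2011-01-15"), (101, "2011-02-20")])
conn.executemany("INSERT INTO fact_order VALUES (?, ?, ?, ?, ?)", [
    (1, 10, 100, 2, 5.0),
    (1, 11, 101, 1, 20.0),
    (2, 10, 101, 4, 5.0),
])

# A typical analytical query: join the fact table to one dimension
# and aggregate a measure by a dimension attribute.
revenue_by_customer = conn.execute("""
    SELECT c.customer_name, SUM(f.quantity * f.price) AS revenue
    FROM fact_order AS f
    JOIN dim_customer AS c ON c.customer_id = f.customer_id
    GROUP BY c.customer_name
    ORDER BY c.customer_name
""").fetchall()
print(revenue_by_customer)
```

In a snowflake variant of the same sketch, `dim_product` would itself reference a further table (for example a hypothetical `dim_category`), so that dimension would reach the fact table only indirectly.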
  • 21. OLAP Cube Design: specification of detailed reporting needs in terms of the multi-dimensional structure previously defined (star or snowflake), but regarded as an n-dimensional cube; star/snowflake schemas and cubes are pretty much the same thing; cubes are more appropriate for non-IT users
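To make the cube view concrete, the sketch below treats a small list of fact rows as a three-dimensional cube and rolls it up along chosen dimensions. The dimension names, the sample rows, and the `roll_up` helper are all hypothetical; a real OLAP server would precompute and store such aggregates rather than scan rows on demand.

```python
from collections import defaultdict

# Hypothetical fact rows: (month, region, product, sales amount).
facts = [
    ("Jan", "North", "widget", 100),
    ("Jan", "South", "widget", 150),
    ("Feb", "North", "gadget", 200),
    ("Feb", "North", "widget", 50),
]
DIMENSIONS = ("month", "region", "product")


def roll_up(rows, keep):
    """Sum the measure over every dimension not listed in `keep`."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[i] for i in positions)
        totals[key] += row[-1]  # the last field is the measure
    return dict(totals)


sales_by_month = roll_up(facts, ["month"])
sales_by_month_region = roll_up(facts, ["month", "region"])
grand_total = roll_up(facts, [])
print(sales_by_month)
print(sales_by_month_region)
print(grand_total)
```

Each call corresponds to viewing the same cube at a different level of detail, which is the kind of operation a non-IT user performs through a cube front-end without writing joins.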
  • 29. SQL Server Integration Examples II: qualitative data, listed as description term (ActionId): team meeting (18); hr distribution (19); project list (19); team meeting (19); hr distribution (26); project list (26); claims application (27); claims application (28); cards application maintenance (29); claims application integration (30); hr distribution (31); project list (31); claims application (34); claims application (35); hr distribution (36); project list (36)
  • 30. SQL Server Integration Examples III Fuzzy Transformations
  • 31. Front-end development: front-ends range from in-house development with scripting languages (PHP, ASP, or Perl), to off-the-shelf products such as Crystal Reports, to higher-end products such as Actuate; OLAP vendors also offer front-ends of their own
  • 32. Report Development: derived from requirements; main point of contact between the data-warehouse and users; user customization; report delivery (web, e-mail, SMS, file formats); access privileges
  • 33. Performance Tuning: ETL; query processing (users lose interest after 30 sec!); query optimization; report delivery
  • 34. Query Optimization: understand how your DBMS executes queries; store intermediate results in temporary tables. Query optimization tips: use indexes; partition tables (vertically and horizontally); de-normalize (fewer joins); server tuning
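The first two points on this slide (understand how the DBMS executes a query, then use indexes) can be tried out with the small sketch below. It uses SQLite through Python's `sqlite3` module purely as a stand-in; the slides refer to SQL Server, whose execution plans are inspected with different tools. The `sales` table, the `idx_sales_region` index, and the `query_plan` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("North", 10.0), ("South", 20.0), ("North", 5.0)])

QUERY = "SELECT SUM(amount) FROM sales WHERE region = 'North'"


def query_plan(connection, sql):
    """Return SQLite's plan description for `sql` as a single string."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


plan_before = query_plan(conn, QUERY)   # full table scan: no index exists yet
conn.execute("CREATE INDEX idx_sales_region ON sales (region)")
plan_after = query_plan(conn, QUERY)    # the plan should now mention the index

total_north = conn.execute(QUERY).fetchone()[0]
print(plan_before)
print(plan_after)
print(total_north)
```

The exact wording of the plan text varies between SQLite versions, so the sketch only relies on whether the index name appears in it.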
  • 35. Quality Assurance: test plan with quality criteria for data; critical success factor; often overlooked; performed by people with knowledge of the business data, not of data-warehouses; resistance
  • 36. Rolling out to production: seems easy, but putting everyone online may take a full week in some cases; online access can be as simple as sending a link by e-mail
  • 37. Production Maintenance: backup and recovery processes; crisis management; monitoring end-user usage, both to capture runaway queries before the whole system is slowed down and to measure usage for ROI calculations and future enhancements
  • 38. Incremental enhancements: accomplish small changes such as changing original geographical designations (for example, a company may add new sales regions); no matter how simple, never make them directly in the production environment
  • 43. Tools for unstructured information management: Content Management Systems; Record Management Systems; Digital Image Management Systems; Digital Asset Management Systems; Digital Imaging Systems