Why Be Normal

    Why Be Normal - Presentation Transcript

    1. Why Be Normal: Understanding the benefits of a solid data model Rob Armstrong Teradata, Director of Data Warehouse Support
    2. What’s new? Lots of migrations from other platforms – Forklift old models – Data mart consolidations Database versus Company messages – Database doesn’t care – Experiences illustrate the business value Speed for Business Agility and Active Data Warehousing
    3. Big Points to Keep in Mind Logical Models are about relationships – Independent of function – Independent of technical limits Physical models are about functions – Performance – Data Management Your Physical model should preserve relationships while improving function
    4. Logical Modeling Normalized – Third Normal Form is enough while being useful – No surrogate or identity keys – No history or summary tables – Preserves relationships between entities Dimensional – Looks at usage of data – Embeds “dimensions” into “fact” tables – Logical Model typically retro-fitted from Physical Design
    5. What is the difference? Relationships are constant – Who provides what? – Who pays for what? – Where is service provided? – When is the transaction effective? Functions constantly change – Which customers paid with cash? – Which customers have not contacted the call center in the past 12 months?
    6. The benefits of Normal Models Referential integrity is inherent to the model and therefore can be instantiated at the core level Transactional systems favor normalized models because they replicate less data, which makes ETL and ELT easier Costs are lowered by less replication, minimized data management, and quicker application development
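
The first point on slide 6 can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in database, not Teradata, and the `customer` and `account` tables and both helper functions are hypothetical names chosen for illustration. The idea is only that when the relationship is declared in the model, the database itself rejects orphan rows.

```python
import sqlite3


def build_schema() -> sqlite3.Connection:
    """Create two hypothetical tables whose relationship is declared in the schema."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE customer (
            customer_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL
        );
        CREATE TABLE account (
            account_id  INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customer (customer_id)
        );
    """)
    return conn


def try_insert_account(conn: sqlite3.Connection, account_id: int, customer_id: int) -> bool:
    """Return True if the row was accepted, False if referential integrity rejected it."""
    try:
        conn.execute("INSERT INTO account VALUES (?, ?)", (account_id, customer_id))
        return True
    except sqlite3.IntegrityError:
        return False
```

With one customer loaded, an account pointing at that customer is accepted and an account pointing at a nonexistent customer is refused, so downstream queries can use inner joins without losing rows.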
    7. The benefits of Normal Models Relationships are preserved, therefore new analytics are readily supported Supports natural growth as new subject areas are prioritized for inclusion into the enterprise model Normalized models support native unbalanced or ragged hierarchies
    8. The benefits of Normal Models Normalized models enable data mining and statistical analytics Supports complex analytics which are based on relational algebra Creates environment of “what if” instead of “how come”
    9. The REAL benefit of a Normal Model Supports change over time – Integration of new subject areas – Effective dating eliminates slowly changing dimensions – Provides multiple views of same data with consistency – New applications and user communities are absorbed with little effort
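
The effective-dating point on slide 9 can be sketched as follows. This is an illustration under assumptions, not the presenter's design: the `customer_segment` table, its columns, and the sample rows are hypothetical, and sqlite3 stands in for the warehouse. Each row carries the period in which it was true, so an "as of" query replaces separate slowly-changing-dimension handling.

```python
import sqlite3


def build_history() -> sqlite3.Connection:
    """Create a hypothetical effective-dated table with two periods for one customer."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE customer_segment (
            customer_id    INTEGER NOT NULL,
            segment        TEXT    NOT NULL,
            effective_date TEXT    NOT NULL,  -- ISO dates compare correctly as text
            end_date       TEXT,              -- NULL means the row is still in effect
            PRIMARY KEY (customer_id, effective_date)
        )
    """)
    conn.executemany(
        "INSERT INTO customer_segment VALUES (?, ?, ?, ?)",
        [
            (1, "Retail", "2007-01-01", "2008-06-30"),
            (1, "Premium", "2008-07-01", None),
        ],
    )
    return conn


def segment_as_of(conn: sqlite3.Connection, customer_id: int, as_of: str):
    """Return the segment that was in effect on the given ISO date, or None."""
    row = conn.execute(
        """
        SELECT segment
        FROM customer_segment
        WHERE customer_id = ?
          AND effective_date <= ?
          AND (end_date IS NULL OR end_date >= ?)
        """,
        (customer_id, as_of, as_of),
    ).fetchone()
    return row[0] if row else None
```

The same table answers both "what is the segment today" and "what was it when the transaction happened", which is one way to read the slide's "multiple views of same data with consistency".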
    10. To be fair, the benefits of Denormalized Tuned for the known access paths to give higher performance Model reflects the output to minimize data manipulation Easier for users to navigate and understand Can be built quickly
    11. The optimization escalation Normalized Model Views, Indexes, and Priority Cross functional denormalization and aggregation Specific denormalization and aggregation Extract, Expand, Examine
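
The second step of the escalation on slide 11 (views before any physical denormalization) might look like the sketch below. The `store` and `sales_txn` tables, the `sales_by_region` view, and the sample data are hypothetical, and sqlite3 is used only as a convenient stand-in. The base tables stay normalized while the view presents the output shape users ask for.

```python
import sqlite3


def build_with_view() -> sqlite3.Connection:
    """Create normalized base tables plus a view that presents a summarized shape."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE store (
            store_id INTEGER PRIMARY KEY,
            region   TEXT NOT NULL
        );
        CREATE TABLE sales_txn (
            txn_id   INTEGER PRIMARY KEY,
            store_id INTEGER NOT NULL REFERENCES store (store_id),
            amount   REAL NOT NULL
        );
        -- The view is computed from the base tables on every query,
        -- so there is no second copy of the data to maintain.
        CREATE VIEW sales_by_region AS
            SELECT s.region, SUM(t.amount) AS total_amount
            FROM sales_txn AS t
            JOIN store AS s ON s.store_id = t.store_id
            GROUP BY s.region;
        INSERT INTO store VALUES (1, 'East'), (2, 'West');
        INSERT INTO sales_txn VALUES (100, 1, 20.0), (101, 1, 5.0), (102, 2, 7.5);
    """)
    return conn


def region_totals(conn: sqlite3.Connection) -> dict:
    """Read the view and return {region: total_amount}."""
    return dict(conn.execute("SELECT region, total_amount FROM sales_by_region"))
```

Because the view holds no data of its own, a new row in `sales_txn` shows up in the totals immediately. Physical summary tables would only be considered further down the escalation, if the view proves too slow.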
    12. Recent enhancements to help Recursive statement in SQL PPI (and Multi-level PPI) – Possibly remove cube builds to a great degree Bulk Merge – Removes obstacles for advanced indexes and multi-load In-database OLAP processing – Advanced AJIs, wizards, SAS procs TASM – Workload-based and service-level-goal reporting
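
The recursive SQL item on slide 12 pairs naturally with the ragged hierarchies mentioned on slide 7. Here is a minimal sketch, assuming a hypothetical self-referencing `org_unit` table and using SQLite's `WITH RECURSIVE` through Python's sqlite3 module as a stand-in; the exact Teradata syntax and limits may differ.

```python
import sqlite3


def build_org() -> sqlite3.Connection:
    """Create a hypothetical parent-child table holding an unbalanced hierarchy."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE org_unit (
            unit_id   INTEGER PRIMARY KEY,
            parent_id INTEGER REFERENCES org_unit (unit_id),
            name      TEXT NOT NULL
        );
        -- 'Legal' is a leaf at depth 1 while the 'Sales' branch runs to depth 3.
        INSERT INTO org_unit VALUES
            (1, NULL, 'Company'),
            (2, 1,    'Sales'),
            (3, 1,    'Legal'),
            (4, 2,    'Sales East'),
            (5, 4,    'Sales East Retail');
    """)
    return conn


def descendants(conn: sqlite3.Connection, root_id: int) -> list:
    """Return (name, depth) for the root and everything beneath it."""
    return conn.execute(
        """
        WITH RECURSIVE subtree (unit_id, name, depth) AS (
            SELECT unit_id, name, 0 FROM org_unit WHERE unit_id = ?
            UNION ALL
            SELECT c.unit_id, c.name, s.depth + 1
            FROM org_unit AS c
            JOIN subtree AS s ON c.parent_id = s.unit_id
        )
        SELECT name, depth FROM subtree ORDER BY depth, unit_id
        """,
        (root_id,),
    ).fetchall()
```

The table stores only the parent-child relationship, so branches of any depth need no extra level columns; the query walks whatever depth exists.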
    13. Mixed Workload Optimization
        Workload        SLA             Base     PSA 1    TDWM 4   Wrkld 1  Wrkld 3  CPU Cap  Final
        Tactical        90% < 1 sec     56.61    44.09    56.60    56.23    65.15    90.46    91.50
        BAM             95% < 60 sec    40       29.41    48.50    63.00    42.85    88.57    100.00
        DSS             85% < 600 sec   73.7     88.98    90.50    94.04    86.31    88.80    87.60
        Tactical        2000/hr         5292     4654     4316     5274     7002     31750    7874
        BAM             60/hr           70       68       70       60       70       70       70
        DSS             200/hr          122      236      274      302      190      268      306
        Mini Batch      10/sec          11.1     22.2     22.2     33.3     33.3     44.44    33.33
        Sales_Txn       50/sec          88       53       55.5     44.4     91.57    79.49    144.44
        Sales_txn_line  25/sec          16.4     7.7      19.89    17.27    26.67    13.11    30.73
    14. Pitfalls to avoid LDM to PDM – Over-compromising for known queries – Addition of indexes and summary tables – Use of history tables Primary Index selection – Model is correct but PI is wrong – Distribution first, access path second Data Integrity – Missing referential integrity leads to outer joins – Data type inconsistency leads to over-processing
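
"Distribution first, access path second" on slide 14 can be made concrete with a rough simulation. Everything here is an assumption made for illustration: `amp_for` uses an MD5-based stand-in rather than Teradata's actual row-hash algorithm, and the column names in the usage note are hypothetical. The sketch only shows why a low-cardinality Primary Index column leaves most units of parallelism idle.

```python
import hashlib
from collections import Counter


def amp_for(value, amp_count: int) -> int:
    """Map a Primary Index value to an AMP number using a stand-in hash (not Teradata's)."""
    digest = hashlib.md5(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % amp_count


def rows_per_amp(pi_values, amp_count: int) -> list:
    """Count how many rows land on each AMP for the given Primary Index values."""
    counts = Counter(amp_for(v, amp_count) for v in pi_values)
    return [counts.get(amp, 0) for amp in range(amp_count)]


def skew_ratio(per_amp: list) -> float:
    """Busiest AMP divided by the average; 1.0 would be a perfectly even spread."""
    return max(per_amp) / (sum(per_amp) / len(per_amp))
```

For example, comparing `rows_per_amp(range(10_000), 8)` (a unique transaction id) with `rows_per_amp(["cash", "card", "check"][i % 3] for i in range(10_000))` (a payment type with three values) shows the second choice placing all rows on at most three of the eight AMPs, however convenient that column might be as an access path.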
    15. Other modeling points to watch Surrogate Columns – Used to “simplify” joins – Have to be ingrained everywhere – Rarely known for access purposes Identity Columns – Definitions – Same problems as surrogates Intelligent Keys – Embedded information within larger datatypes – Example: VIN – Creates maintenance obstacles if parts need to change
    16. Going Forward… Remember: data warehousing exists to drive change and therefore must support constant change. Data relationships and transactions are constant; it is access and output that change. For processes to change quickly, the data manipulation must be removed from the path. Have the model reflect the atomic data relationship and historical relevance.
    17. Now what? New migrations – Get model correct if at all possible – If consolidating, realize integrating is the next, and more important, step – At least get major data elements consistent Existing systems – Look at subject areas with high overlap – Look for the analytics that are proving tricky – Work to show the value of normalization with more cross functional analytics

    © All Rights Reserved
