Microsoft Analysis Services
Physical Design

James Snape
Application Development Consulting
Microsoft Limited
The Kimball Process
Agenda

 Hardware
 Dimensions
 Facts
 Relational stuff
 Performance tuning next steps

 NB: Relational design not complete –
 logging, auditing etc discussed in next
 session
Prime Directive:

  Sequential IO Good,

  Random IO Bad
Hardware

 SQL Server Fast Track Data Warehouse
   www.microsoft.com/sqlserver/2008/en/us/fasttrack.aspx


 Pre-tested hardware configurations
 Specific disk and filegroup layouts
 Minimal indexing

 To feed CPU at maximum capacity
Dimensions vs Facts

  Dimension
    Small (relatively)
    Repeating data


  Fact
    Large
    Numeric data + keys


  Treat them differently
Dimensions in Relational Terms

  Table structure
  Keys
  Indexes
  Null handling
  Managing change
  Processing

  Customer
    Full Name
    Post Code
    City
    State
    Country
    Gender
    Occupation
    Marital Status
    Email Address

  Customer Geography
    1.   Country
    2.   State
    3.   City
    4.   Post Code
    5.   Full Name
Star vs. Snowflake Schemas

  Star
    dbo.Customer
      CustomerKey
      FullName
      PostCode
      City
      State
      Country
      Gender
      Occupation
      MaritalStatus
      EmailAddress

  OR

  Snowflake
    dbo.Customer
      CustomerKey
      GeographyKey
      FullName
      Gender
      Occupation
      MaritalStatus
      EmailAddress
    dbo.Geography
      GeographyKey
      PostCode
      City
      State
      Country

  NB: both are denormalized, one more than the other
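
The two layouts hold the same attributes, so a snowflake can be flattened back into the star shape with a join. A minimal sketch, assuming SQLite as a stand-in for SQL Server; the view name CustomerStar and the sample row values are made up for illustration, while the table and column names follow the slide.

```python
import sqlite3

# SQLite stands in for SQL Server here; only the table and column names
# come from the slide. CustomerStar and the sample values are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Geography (
        GeographyKey INTEGER PRIMARY KEY,
        PostCode TEXT, City TEXT, State TEXT, Country TEXT
    );
    CREATE TABLE Customer (
        CustomerKey INTEGER PRIMARY KEY,
        GeographyKey INTEGER REFERENCES Geography (GeographyKey),
        FullName TEXT, Gender TEXT, Occupation TEXT,
        MaritalStatus TEXT, EmailAddress TEXT
    );
    -- Flatten the snowflake back into the single-table star layout.
    CREATE VIEW CustomerStar AS
    SELECT c.CustomerKey, c.FullName, g.PostCode, g.City, g.State, g.Country,
           c.Gender, c.Occupation, c.MaritalStatus, c.EmailAddress
    FROM Customer AS c
    JOIN Geography AS g ON g.GeographyKey = c.GeographyKey;
""")
conn.execute(
    "INSERT INTO Geography VALUES (1, 'EC1A', 'London', 'England', 'United Kingdom')"
)
conn.execute(
    "INSERT INTO Customer VALUES "
    "(1243, 1, 'John Smith', 'M', 'Clerical', 'S', 'john@example.com')"
)

# Column 1 of each table_info row is the column name.
star_columns = [row[1] for row in conn.execute("PRAGMA table_info(CustomerStar)")]
star_rows = conn.execute(
    "SELECT CustomerKey, FullName, City, Country FROM CustomerStar"
).fetchall()
```

The view exposes the same ten columns as the star version of dbo.Customer, which is one way to see that the choice is about storage and processing rather than content.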
Primary Keys

  Use smallest possible integer as surrogate
  primary key
    Primary key is a “row identifier”
    Multiple row “versions” are possible
    “None” and “Unknown” special values are useful
  Do NOT use business/source system keys
  Clustered primary key is OK for dimensions
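
One way to pick the "smallest possible integer" is to compare the expected row count against the range of each SQL Server integer type. A minimal sketch; the function name, the growth_factor headroom and the needs_negative_keys switch are assumptions for illustration, not part of the talk.

```python
# SQL Server integer types and the largest value each can hold, narrowest first.
INTEGER_TYPES = [
    ("tinyint", 255),
    ("smallint", 32_767),
    ("int", 2_147_483_647),
    ("bigint", 9_223_372_036_854_775_807),
]


def smallest_key_type(expected_rows, growth_factor=4, needs_negative_keys=True):
    """Return the narrowest integer type that fits the expected key range.

    growth_factor is hypothetical headroom for new rows and row versions.
    tinyint is unsigned in SQL Server, so it is skipped when negative
    special keys (such as -1 and -2) are needed.
    """
    required = expected_rows * growth_factor
    for name, max_value in INTEGER_TYPES:
        if name == "tinyint" and needs_negative_keys:
            continue
        if required <= max_value:
            return name
    raise ValueError("key range exceeds bigint")


small_dimension = smallest_key_type(50)            # a few dozen members
customer_dimension = smallest_key_type(100_000)    # a typical customer table
```

Row versions count as rows here, so a dimension that keeps history needs more headroom than its member count alone suggests.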
Dimension Indexes

  Dimension processing queries of the form:
    SELECT DISTINCT .... FROM ....


  WHERE (filter) clauses never used
  WHERE (join) clauses are used in snowflake
  dimensions

  Non-processing queries may end up in SQL
    ROLAP dimensions
    Direct to SQL queries
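
The shape of a processing query explains why indexes rarely help: with no WHERE filter, the whole table is read once per attribute. A minimal sketch, assuming SQLite as a stand-in for SQL Server; processing_query is a hypothetical helper and the SQL that Analysis Services actually generates will differ in detail.

```python
import sqlite3


def processing_query(table, columns):
    """Build a query of the SELECT DISTINCT form, with no WHERE filter."""
    return f"SELECT DISTINCT {', '.join(columns)} FROM {table}"


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customer "
    "(CustomerKey INTEGER PRIMARY KEY, FullName TEXT, City TEXT, Country TEXT)"
)
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?, ?, ?)",
    [
        (1243, "John Smith", "London", "United Kingdom"),
        (1244, "Mary Jones", "Glasgow", "United Kingdom"),
    ],
)

# One query per attribute; each one scans the full dimension table.
country_sql = processing_query("Customer", ["Country"])
countries = conn.execute(country_sql).fetchall()
cities = sorted(
    conn.execute(processing_query("Customer", ["City", "Country"])).fetchall()
)
```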
Null Handling in Dimensions

  By default NULL converts to 0 or an empty
  string
  NULL attribute keys can invoke special
  “Unknown Member” handling
  Prefer to create a specific “Unknown” row
    CustomerKey   FullName     City      Country
    -1            Unknown      Unknown   Unknown
    -2            None         None      None
    1243          John Smith   London    United Kingdom
    1244          Mary Jones   Glasgow   United Kingdom
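
With explicit rows in place, the fact load can map every incoming record to a real dimension key instead of a NULL. A minimal sketch; the function name, the business key format and the reading of "None" as "no customer supplied" versus "Unknown" as "customer not found" are assumptions for illustration.

```python
UNKNOWN_KEY = -1  # a source value was supplied but has no matching dimension row
NONE_KEY = -2     # the source supplied no value at all


def lookup_customer_key(business_key, key_map):
    """Translate a source-system key into a surrogate key, never returning NULL."""
    if business_key is None:
        return NONE_KEY
    return key_map.get(business_key, UNKNOWN_KEY)


# Hypothetical business keys mapped to the surrogate keys shown on the slide.
key_map = {"C-1001": 1243, "C-1002": 1244}

known = lookup_customer_key("C-1001", key_map)
unmatched = lookup_customer_key("C-9999", key_map)
missing = lookup_customer_key(None, key_map)
```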
Dimension Attributes

  Attributes have keys, names (and values)
  Integer attribute keys are smaller and faster
  Keys must be unique
                  Attribute       Key        Name         (Value)
                  Year            2009       CY 2009      2009
                  Month           4          April        4
                  Month of Year   20090400   April 2009   4


  SELECT [Month] as [Month],
          [Month] + ' ' + [Year] as [Month of Year]
  FROM dbo.Time
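
The Month of Year key in the table above (20090400) is unique across years because it folds the year into the month number. A minimal sketch of that arithmetic; the function names are hypothetical, and the trailing two zero digits are assumed to be a day placeholder.

```python
MONTH_NAMES = (
    "January", "February", "March", "April", "May", "June", "July",
    "August", "September", "October", "November", "December",
)


def month_of_year_key(year, month):
    """Integer attribute key of the form YYYYMM00, unique per calendar month."""
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    return year * 10000 + month * 100


def month_of_year_name(year, month):
    """Display name for the same member, for example 'April 2009'."""
    return f"{MONTH_NAMES[month - 1]} {year}"


key = month_of_year_key(2009, 4)
name = month_of_year_name(2009, 4)
```

A plain Month key of 4 would collide between April 2009 and April 2010, which is why the compound key is needed for the Month of Year attribute.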
Slowly Changing Dimensions

  PK = row identifier
  Multiple rows = multiple versions
  Add effective dating columns
    Which can be exposed as new dimensional attributes

  dbo.Customer
    CustomerKey
    FullName
    PostCode
    City
    State
    Country
    Gender
    Occupation
    MaritalStatus
    EmailAddress
    EffectiveFrom (smalldatetime)
    EffectiveTo (smalldatetime)
    CurrentFlag (tinyint)
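
Recording a change under this design means closing the current row version and adding a new one with its own surrogate key. A minimal in-memory sketch; the SourceCustomerId column (used to tie versions of one customer together), the use of None for an open-ended EffectiveTo and the function name are assumptions for illustration.

```python
from datetime import datetime


def apply_change(rows, source_id, changes, changed_at):
    """Expire the current version of a customer and append a new version."""
    current = next(
        r for r in rows
        if r["SourceCustomerId"] == source_id and r["CurrentFlag"] == 1
    )
    # Close the old version.
    current["EffectiveTo"] = changed_at
    current["CurrentFlag"] = 0
    # Copy it, apply the changed attributes, and give it a new row identifier.
    new_row = {
        **current,
        **changes,
        "CustomerKey": max(r["CustomerKey"] for r in rows) + 1,
        "EffectiveFrom": changed_at,
        "EffectiveTo": None,
        "CurrentFlag": 1,
    }
    rows.append(new_row)
    return new_row


customers = [{
    "CustomerKey": 1243, "SourceCustomerId": "C-1001",
    "FullName": "John Smith", "City": "London",
    "EffectiveFrom": datetime(2008, 1, 1), "EffectiveTo": None, "CurrentFlag": 1,
}]
new_version = apply_change(
    customers, "C-1001", {"City": "Glasgow"}, datetime(2009, 4, 1)
)
```

Facts loaded before the change keep pointing at key 1243 and facts loaded after it use the new key, so history is preserved without touching the fact table.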
Facts in Relational Terms

  Keys
  Indexing
  Partitioning
  Processing

  Internet Sales
    Sales Amount
    Order Quantity
    Tax Amount
    Unit Price
    Transaction Count

  Consider Row and Page compression
Fact Keys and Indexes

  Is a surrogate/primary key required?
  Beware the clustered index/primary key
    Prefer the date FK as the clustered index


  Add foreign keys WITH NOCHECK

  Indexes are usually not useful
    Unless processing degenerate dimensions
    Or servicing ROLAP/direct to SQL queries
Fact Partitioning – Why?

  Parallel processing
  Only process most recent data
  Multiple storage engine threads during query
  Archive off data
  Multiple aggregation strategies



  NB: Partitions require Enterprise Edition
Fact Partitioning – Guidelines

  Partition when fact tables are 50-100GB+
  Ideal partition size 2M-20M rows
  Less than 1000 partitions per measure group
    This wins over partition size


  Prefer to partition over time
  Cannot aggregate higher than partition grain

  Align AS and SQL partitions!
    Calculated time keys become very useful
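
These guidelines can be turned into a rough sizing check. A minimal sketch, assuming evenly distributed data and integer date keys of the form YYYYMMDD; the function names and the list of candidate grains are hypothetical, while the row and partition limits are the ones quoted above.

```python
# Candidate time grains, coarsest first, with partitions per year.
GRAINS_PER_YEAR = [("year", 1), ("quarter", 4), ("month", 12), ("day", 365)]
MAX_PARTITIONS = 1000       # stay below this per measure group
MAX_ROWS = 20_000_000       # upper end of the ideal partition size


def choose_grain(rows_per_year, years_retained):
    """Pick the coarsest grain whose partitions fit within MAX_ROWS.

    The partition count limit wins: if a finer grain would reach
    MAX_PARTITIONS, the previous (coarser) grain is kept even though
    its partitions are larger than the ideal size.
    Returns (grain, partition_count, rows_per_partition).
    """
    best = None
    for grain, per_year in GRAINS_PER_YEAR:
        partitions = per_year * years_retained
        if partitions >= MAX_PARTITIONS:
            break
        best = (grain, partitions, rows_per_year // per_year)
        if best[2] <= MAX_ROWS:
            break
    return best


def month_partition_of(date_key):
    """Map a calculated date key (YYYYMMDD) to a monthly partition id (YYYYMM)."""
    return date_key // 100


typical = choose_grain(120_000_000, 5)
very_large = choose_grain(10_000_000_000, 5)
```

Because the same integer arithmetic gives the partition id on both sides, the SQL partition boundaries and the AS partition source queries can be derived from one calculated time key and stay aligned.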
Fact Storage

   MOLAP, ROLAP or HOLAP

   [Diagram: Source Data, Facts and Aggregations, placed in Relational or Multidimensional storage]
Proactive Caching

  Cube = “Cache”
  Automatic invalidation of cube
  Automatic rebuild of cube

  [Diagram: Query and SQL Query paths, each with a "Valid?" check]
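
The "cube as cache" idea can be illustrated with a toy model. A deliberately simplified sketch: the class and method names are hypothetical, and real proactive caching settings (such as notification types and latency intervals) are not modelled.

```python
class ProactiveCache:
    """Toy model: the cube is a cache over the relational source."""

    def __init__(self, load_from_source):
        self._load = load_from_source  # callable that reads the source data
        self._cube = None              # None means the cache is not valid

    def notify_source_changed(self):
        """Automatic invalidation: drop the cached cube."""
        self._cube = None

    def rebuild(self):
        """Automatic rebuild: reload the cube from the source."""
        self._cube = self._load()

    def query(self):
        """Answer from the cube when valid, otherwise go to the source."""
        if self._cube is not None:
            return self._cube, "cube"
        return self._load(), "sql"


source = {"sales": 100}
cache = ProactiveCache(lambda: dict(source))
cache.rebuild()
before_change = cache.query()

source["sales"] = 150
cache.notify_source_changed()
during_rebuild = cache.query()   # cube invalid, so the source is queried

cache.rebuild()
after_rebuild = cache.query()
```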
Quick Storage Engine Tuning

  Ensure attribute relations are implemented
  Turn on query log
  Run Usage Based Optimisation (UBO) wizard
© 2007 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.
The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation.
MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
