Exchange 2010 storage improvements

A deck covering Exchange 2010 storage improvements, built and extended from some of the Microsoft Ignite decks.

  • Currently 2TB
    Moving to 8TB
    Random IO not getting quicker
    15K RPM, 10K RPM, 7.2K RPM
    Density is getting better, so can read more data in the same time
    Flash (SSD): didn’t take that bet
    Optimised for spinning media for E14
    Expensive, so use as cache in SAN
  • JBOD, i.e. 1 disk per database and log set
    RAID-less: disks will fail
    3+ copies
  • 2007 is roughly the same as Exchange 4.0
    One database and then a couple of really large tables
    Message table and attachments: all messages per database
    Message folder table per mailbox, which does all the views
    This gives the benefit of single instance storage: one copy in the message table with pointers from the message folder table
    Random IO!
    2010 schema is changed massively
    Only one table per database
    Now data is specific to the mailbox, so it can be kept sequential for quick retrieval from the same area of disk
    Now a message view table instead of secondary indexes
  • Really important in reduction of IO
    Update the view when the user views!
  • Page size is the smallest unit of IO
    Bigger page means fewer small IOs for a single message read
    2007: random layout of data on disk means 3 IOs for a 20K message
    2010: pull the same 20K message and get the message header and body on one page
    This makes a huge difference to IO
    Page size will be fine for handling large messages: 12-15K is the mean message size currently
  • As you add larger page sizes and lay things out for sequential IO, the DB grows by 20%
    Same as the OST grew in SP2 for Office 2007
    Now compress message headers and text/HTML bodies; limited to these for speed
    Can bring the database back to the same size as 2007, or even less if the bulk of messages are HTML
    Now have many more tables and fewer, bigger pages
    There is also cache compression
    When you pull a 32KB page (the smallest element of Exchange data) that holds only 16KB of data, the free space is compressed so that only 16KB of cache is used
  • Can do coalescing when pages are not next to each other
    2007 needs 3 IOs to get pages off disk: random IO
    2010 brings all five up: a stream of IO
    Then evict the middle pages
  • Cleanup was done using online maintenance and defrag
    This has changed: cleanup is done when tombstone or dumpster cleanup happens
    Page zeroing happens automatically; because it occurs when the write is being done anyway, there is no additional IO
    2003 and 2007 are great for compaction
    2007 SP1 changed this slightly to reduce IO during the maintenance window
    2010: this has changed a lot; it is done at run time as space is seen
    Contiguity has never been a concern until now: compaction has never worried about contiguity, only about making the database small
    2010 makes trade-offs on size, as we’ve mentioned, to ensure contiguity; analysis happens continuously
    DB checksumming
  • Utility MSFT built to track contiguity of the new DB
    This is showing a message folder table of the inbox on 2007
    This is massively random
    2010 is contiguous, as pages are laid out sequentially, so that reading a huge folder full of 10000 items is quick and easy!


  • 1. Exchange 2010 Storage Improvements
    Nathan Winters – Exchange MVP
  • 2. Agenda
    A Brief History of Exchange Storage
    The new ethos
    Feature Deep Dive
  • 3. History
  • 4. History
    ESE/JET Blue
    IOPS – Random IO application
    Why? – Small Expensive drives
    1.6GB disk $400 in 1996
    SCSI 2GB and 4GB 100 IOPS
    Single Instance Storage
    Clustering with Shared Storage
    Backup an issue
    Single Point of Failure
    32 bit
    Not enough RAM
    RAM limited number of users per server
  • 5. History - Exchange 2007
    Big improvements in Exchange Server 2007
    Reduce storage input/output (I/O) (70%)
    Use large amounts of memory (64 bit)
    Increased page size (4 kilobyte (KB) -> 8 KB)
    Lower storage costs
    Support large mailboxes (> 1 gigabyte (GB))
    Provide fast search (CI)
    Continuous replication (log shipping)
    High Availability (HA) + fast recovery
    Eliminate single points of failure
  • 6. New Ethos
  • 7. Email Usage
    Radicati seeing 165 mails per day growing to 230 over next couple of years
    Users used to large free storage
    5GB 3 years of mail
    Triage once per year to archive
    Not once per day!
    Mail available through all clients
    Cached Mode/Performance issues
    High Item counts – 5000, 20000, 100000
  • 8. Disk Technology
    Currently 2TB
    Moving to 8TB
    Random IO not getting quicker
    15K RPM, 10K RPM, 7.2K RPM
    Density is getting better so can read more data in the same time
    Flash – SSD – Didn’t take that bet
    Optimised for spinning media for E14
    Expensive – so use as Cache in SAN
  • 9. Exchange Server 2010 Storage Vision
    IO Reduction
    Sequential IO
    SATA/Tier 2 Disk Optimization
    Large, Fast, Low-cost Mailboxes
    Storage Design Flexibility
    RAID-less Storage (JBOD)
  • 10. Exchange Server 2010 HA Storage Design Flexibility
    DAS (SAS)
    DAS (SATA)
    HA = Shared Storage Clustering
    +1.0 IOPS/Mailbox
    3.5” 15K 146GB FC Disks
    RAID10 for DB & Logs
    Dedicated Spindles
    Multi-path (HBAs, FC Switches, SAN array controllers)
    Backup = Streaming off active
    Fast Recovery = Hardware VSS (Snapshots/Clones)
    HA = CCR
    .33 IOPS/Mailbox
    2.5” 146GB 10K SAS Disks
    RAID5 for DB
    RAID10 for Logs
    SAS Array Controller (w/ BBU)
    Backup = VSS Snapshot
    Fast Recovery = CCR
    HA = DAG (2 DB copies)
    .11 IOPS/Mailbox
    3.5” 2TB 7.2K SATA/SAS Disks
    RAID10 for DB & Logs
    SAS Array Controller (w/ BBU)
    Backup = Optional/VSS
    Fast Recovery = Database Failover
    HA = DAG (3+ DB copies)
    .11 IOPS/Mailbox
    3.5” 2TB 7.2K SATA/SAS Disks
    1 DB = 1 Disk
    Backup = Optional/VSS
    Fast Recovery = Database Failover
    More options to reduce storage cost
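
The IOPS/Mailbox figures on this slide can be turned into percentage reductions. The sketch below uses only the three figures quoted above (1.0, .33, .11) and labels each by the HA design it appears with; the dictionary keys and the helper name `reduction` are illustrative, not from the deck.

```python
# IOPS-per-mailbox figures quoted on the slide, keyed by the HA design they appear with.
iops_per_mailbox = {
    "shared storage clustering": 1.0,
    "CCR": 0.33,
    "DAG": 0.11,
}


def reduction(before: float, after: float) -> float:
    """Fractional drop in IOPS per mailbox when moving from `before` to `after`."""
    return 1.0 - after / before


clustering_to_ccr = reduction(iops_per_mailbox["shared storage clustering"],
                              iops_per_mailbox["CCR"])
ccr_to_dag = reduction(iops_per_mailbox["CCR"], iops_per_mailbox["DAG"])
overall = reduction(iops_per_mailbox["shared storage clustering"],
                    iops_per_mailbox["DAG"])

print(f"clustering -> CCR: {clustering_to_ccr:.0%}")
print(f"CCR -> DAG:        {ccr_to_dag:.0%}")
print(f"overall:           {overall:.0%}")
```

Each step works out at roughly two thirds, and the overall drop at roughly nine tenths, which is broadly in line with the rounded headline figures used elsewhere in the deck.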
  • 11. JBOD/RAID-less Storage: Now An Option
    JBOD : 1 disk = 1 database (with logs)
    Requires Exchange Server 2010 High Availability (3+ DB Copies)
    Annual Disk Failure Rate (AFR) = 5%
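
A rough way to see why JBOD is paired with 3+ database copies is to work through the 5% AFR. This is a back-of-the-envelope sketch only: it assumes disk failures are independent, and the fleet size of 100 disks is a made-up example. The "all copies fail" figure is pessimistic, since it counts failures anywhere in the same year, whereas a failed copy would normally be reseeded well before another disk fails.

```python
AFR = 0.05  # annual disk failure rate quoted on the slide


def expected_failures_per_year(disk_count: int, afr: float = AFR) -> float:
    """Expected number of failed disks per year in a fleet of `disk_count` disks."""
    return disk_count * afr


def prob_all_copies_fail_in_year(copies: int, afr: float = AFR) -> float:
    """Chance that the disk under every copy fails within the same year.

    Assumes independent failures and ignores reseeding, so it overstates
    the real risk of losing a database.
    """
    return afr ** copies


fleet_failures = expected_failures_per_year(100)  # hypothetical 100-disk fleet
two_copy_risk = prob_all_copies_fail_in_year(2)
three_copy_risk = prob_all_copies_fail_in_year(3)

print(f"expected failures per year in 100 disks: {fleet_failures:.1f}")
print(f"2 copies, all disks fail in one year: {two_copy_risk:.4%}")
print(f"3 copies, all disks fail in one year: {three_copy_risk:.4%}")
```

Disk failures become routine events at this rate, so the design relies on database copies, not RAID, to absorb them.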
  • 12. Exchange Server 2010 HA
    Simplified mailbox High Availability and disaster recovery with new unified platform
    New York
    San Jose
    Mailbox Server
    Mailbox Server
    Mailbox Server
    Replicate databases to remote datacenter
    Recover quickly from disk and database failures
    Evolution of continuous replication technology (database mobility)
    Easier than traditional clustering to deploy and manage
    Allows each database to have 16 replicated copies
    Provides full redundancy of Exchange roles on as few as two servers
  • 13. Deep Dive
  • 14. Exchange 2010 Features
    Move to Sequential IO
    Change Table structure
    Lazy View
    Page size 32KB
    Database Compression (LVC)
    Read/Write Coalescing
    Database Contiguity
    Cache Compression
    Storage Groups Gone
    Single Point of Failure Gone
    Optimised for huge mailboxes
  • 15. Random vs. Sequential Disk IO
    Random IO
    Disk head has to move to process subsequent IO
    Head movement = High IO latency
    Seek Latency limits IO (IOPS)
    Sequential IO
    Disk head does not move to process subsequent IO
    Stationary head = low IO latency
    Disk RPM speed limits I/O per second (IOPS)
    Disk Head
    7.2K SATA Disk (20ms Latency)
    Random = 50 IOPS
    Sequential = +300 IOPS
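
The random figure on this slide follows directly from the latency: if each IO has to wait for the previous one to finish, a disk can complete 1000 / latency_ms IOs per second. A minimal sketch, assuming one outstanding IO at a time (the function names are illustrative):

```python
def iops_from_latency(latency_ms: float) -> float:
    """IOs per second when each IO takes `latency_ms` and IOs are issued one at a time."""
    return 1000.0 / latency_ms


def latency_from_iops(iops: float) -> float:
    """Average time per IO in milliseconds implied by a given IOPS figure."""
    return 1000.0 / iops


random_iops = iops_from_latency(20.0)              # 20 ms latency from the slide
sequential_latency_ms = latency_from_iops(300.0)   # 300 IOPS from the slide

print(f"random: {random_iops:.0f} IOPS")
print(f"sequential: about {sequential_latency_ms:.1f} ms per IO")
```

Read the other way, 300 sequential IOPS implies only a few milliseconds per IO, which is the gain from not moving the head.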
  • 16. IO Reduction: Store Table Architecture
    Per Database
    Per Folder
    Exchange Server 2007
    Secondary Indexes used for Views
    Per Database
    Per Mailbox
    Per View
    Exchange Server 2010
    New store schema = no more single instance storage within a database
  • 17. Exchange 2007
    Nickel & Dime Approach
    Many, random, IOs (1 per update)
    DB I/O
    M1 arrives
    M2 arrives
    M1 flagged
    M3 arrives
    M2 deleted
    User uses OWA/Outlook Online and
    switches to this view
    Exchange 2010
    Pay to Play Approach
    Fewer, sequential, IOs (1 per view)
    Store Schema Changes: Lazy View Updates
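
The difference between the two approaches can be sketched with a toy model. This is not how the store is implemented; the classes `EagerView` and `LazyView`, and the idea of counting one "view IO" per write to the view, are assumptions made purely to illustrate the slide's sequence of events.

```python
from dataclasses import dataclass, field


def apply_change(items: set, op: str, msg: str) -> None:
    """Apply one mailbox event to the set of messages shown in a view."""
    if op == "arrive":
        items.add(msg)
    elif op == "delete":
        items.discard(msg)
    # "flag" changes a property of a message already in the view.


@dataclass
class EagerView:
    """Toy 'nickel and dime' model: the view is written on every change."""
    items: set = field(default_factory=set)
    io_count: int = 0

    def change(self, op: str, msg: str) -> None:
        apply_change(self.items, op, msg)
        self.io_count += 1  # one view IO per update

    def open(self) -> list:
        return sorted(self.items)


@dataclass
class LazyView:
    """Toy 'pay to play' model: changes queue up until the view is opened."""
    items: set = field(default_factory=set)
    pending: list = field(default_factory=list)
    io_count: int = 0

    def change(self, op: str, msg: str) -> None:
        self.pending.append((op, msg))  # no view IO yet

    def open(self) -> list:
        if self.pending:
            for op, msg in self.pending:
                apply_change(self.items, op, msg)
            self.pending.clear()
            self.io_count += 1  # one view IO when the user looks
        return sorted(self.items)


events = [("arrive", "M1"), ("arrive", "M2"), ("flag", "M1"),
          ("arrive", "M3"), ("delete", "M2")]

eager, lazy = EagerView(), LazyView()
for op, msg in events:
    eager.change(op, msg)
    lazy.change(op, msg)

eager_result = eager.open()
lazy_result = lazy.open()
print(eager.io_count, lazy.io_count, lazy_result)
```

Both views end up showing the same messages; the lazy one simply defers the work until the user switches to the view.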
  • 18. IO Reduction: Database Page Size Increased to 32 KB
    Exchange Server 2007 DB Read 20 KB Message
    3 Read IOs
    8 KB Pages
    Exchange Server 2010 DB Read 20 KB Message
    1 Read IO
    32 KB Pages
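
The page counts on this slide are a ceiling division of message size by page size. A minimal sketch, ignoring per-page overhead and assuming the message starts at the beginning of a page (`pages_needed` is an illustrative name):

```python
import math


def pages_needed(message_kb: float, page_kb: float) -> int:
    """Number of database pages a message of `message_kb` spans."""
    return math.ceil(message_kb / page_kb)


pages_2007 = pages_needed(20, 8)    # 8 KB pages
pages_2010 = pages_needed(20, 32)   # 32 KB pages

print(f"8 KB pages:  {pages_2007} pages")
print(f"32 KB pages: {pages_2010} page")
```

With 8 KB pages the three pages may sit in different places on disk, so each can cost its own read; with 32 KB pages the whole message fits in one.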
  • 19. Mitigate DB Space Growth: Database Compression
    Problem: Store schema change, space hints, B+Tree defrag and 32 KB page size combine to increase DB file size by 20%
    Solution: Growth is 100% mitigated by Database Compression
    Targeted compression for message headers and text/html bodies (7bit/Express)
    DB Space Analysis
    DB File Size Comparison
    Msg Views
    32KB Pages
    1 Database, 750 x 250MB mailboxes
    RTF = RTF Compressed, Mix = 77% HTML, 15% RTF, 8% Text
    Avg. Message size = ~50KB
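
As a worked figure for "100% mitigated": a file that has grown by 20% has to shrink by a little less than 20% to get back to where it started. The sketch below is plain arithmetic on the slide's 20% figure; `required_compression_saving` is an illustrative name.

```python
def required_compression_saving(growth: float) -> float:
    """Fraction by which a grown file must shrink to return to its original size."""
    return 1.0 - 1.0 / (1.0 + growth)


saving = required_compression_saving(0.20)  # 20% growth from the slide
print(f"saving needed to offset 20% growth: {saving:.1%}")
```

So compression needs to recover about one sixth of the enlarged database to cancel the growth completely.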
  • 20. IO Reduction: Read IO Gap Coalescing
    Exchange Server 2007 DB Read Behavior
    3 Read IOs
    Exchange Server 2010 DB Read Behavior
    1 Read IO
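
Gap coalescing can be sketched as merging nearby page requests into one larger read and discarding the pages that were not asked for. This is an illustration of the idea, not the ESE algorithm: the `max_gap` parameter, the function name and the example page numbers are all assumptions.

```python
def coalesce_reads(wanted_pages, max_gap=1):
    """Group wanted page numbers into read ranges, reading through small gaps.

    Returns a list of (first_page, last_page, discarded_pages) tuples, one per IO.
    Gaps of up to `max_gap` unwanted pages are read and then thrown away.
    """
    wanted = set(wanted_pages)
    runs = []
    for page in sorted(wanted):
        if runs and page - runs[-1][1] - 1 <= max_gap:
            runs[-1][1] = page          # extend the current read over the gap
        else:
            runs.append([page, page])   # start a new read
    return [
        (first, last, [p for p in range(first, last + 1) if p not in wanted])
        for first, last in runs
    ]


separate = coalesce_reads([1, 3, 5], max_gap=0)   # no coalescing
coalesced = coalesce_reads([1, 3, 5], max_gap=1)  # read through single-page gaps

print(len(separate), "reads without coalescing")
print(len(coalesced), "read with coalescing, discarding pages", coalesced[0][2])
```

Three wanted pages with single-page gaps become one five-page read, and the two unwanted pages in the middle are dropped afterwards.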
  • 21. IO Reduction: Maintain Contiguity Over Time
    New Database Maintenance Architecture:
    Database B+Tree Defragmentation (aka OLD2):
    Background/throttled process that maintains space and contiguity of database tables
  • 22. IO Reduction: Database Contiguity Results
    Exchange Server 2007 Message Header Table (aka MFT)
    DB Page Numbers
    Random deletes at the tail
    Exchange Server 2010 Message Header Table (aka MsgHeader)
    *Production/Dogfood database analysis
    Blue = contiguous (good)
    Red = fragmented (bad)
  • 23. Summary
  • 24. Exchange IO Trend
    +90% Reduction!
  • 25. Putting It All Together: Mailboxes/Disk
    Exchange Server 2010 storage improvements cannot be quantified in IOPS reductions alone
    +4X Mailboxes/Disk!
    250 MB Mailbox Size, 3MB DB Cache/user, 12 x 7.2k SATA disks (DB/Logs on same spindles), Loadgen Outlook 2007 Online Very Heavy Profile, measured at <20ms RPC Average latency
  • 26. Summary
    Exchange Server 2010 store has…
    Reduced DB IOPS by +70%...again!
    Optimized for large mailboxes (+10 GB) and 100K item counts
    Optimized for large/slow/low-cost disks (SATA/Tier2)
    Made JBOD/RAID-less storage a viable option
    Enables unmatched storage flexibility to push storage Capex costs down
    Provides many more backup/DR options