RAC+ASM: Stories to Share
RAC+ASM: Lessons learned after 2 years in production

Managing over 70 databases for 4 major customers, running almost every possible combination of ASM, RAC, NetApp and NFS, I have some good stories to share.

Success, failure and gotchas: this presentation condenses years of experience into 45 minutes of major highlights.



RAC+ASM: Stories to Share – Presentation Transcript

  • RAC+ASM 3 years in production Stories to share
    Presented by: Christo Kutrovsky
  • Who Am I
    2
    • Oracle ACE
    • 10 years in Oracle field
    • Joined Pythian 2003
    • Part of Pythian Consulting Team
    • Special projects
    • Performance tuning
    • Critical services
    “oracle pinup”
  • Pythian Facts
    Founded in 1997
    90 employees
    120 customers worldwide
    20 customers with more than $1 billion in revenue
    5 offices in 5 countries
    10 years profitable private company
  • What Pythian does
    Pythian provides database and application infrastructure services.
  • Agenda
    5
    • 2-node RAC
    • ASMLIB with multipathing
    • Migrating to new servers with ASM
    • Thin provisioning
    • ASM + restores = danger
    • Device naming conventions
    • spfile location
    • JBOD configuration
  • 6
    2 Node RAC for High Availability
  • 2 Node RACs for HA
    7
    • Two RAC nodes
    • 13 databases
    • Dev databases
    • Shutdown databases (and ASM) on node1
    • Perform maintenance
    • Unplug the interconnect cable
    • What happens?
  • 2-node RAC
    8
    [Diagram: Node 1 and Node 2, each with a VIP, running instances SID_A1/SID_B1 and SID_A2/SID_B2, joined by the interconnect and attached over Fibre Channel to the ASM diskgroup (DG) and OCR/voting disks]
  • 2-node RAC
    9
    [Same diagram, repeated as an animation frame]
  • 10
    [Diagram: the same 2-node RAC with the interconnect unplugged; Node 1 reports “I can’t see Node 2” and Node 2 reports “I can’t see Node 1”, while both can still reach the ASM diskgroup and OCR/voting disks over Fibre Channel]
  • One is not Quorum
    11
    • 50% chance your working node gets restarted
    • Depends on clusterware version
    • Who will shoot the other guy first
  • One is not Quorum
    12
    • Conclusion?
    • Turn off clusterware when you have only 2 nodes and are performing maintenance
    • Upgrade to a more predictable clusterware
    • Lowest ‘leader’ always survives
    • Add a 3rd tie-breaker node
    • doesn’t have to run a database, just clusterware (observer)
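The maintenance sequence above can be sketched as a small runbook (a sketch only: the clusterware home path is an assumption for illustration, and exact syntax varies by clusterware version):

```shell
# Before single-node maintenance on a 2-node cluster
# (run as root; /u01/app/crs is an assumed CRS_HOME)
/u01/app/crs/bin/crsctl stop crs      # stop clusterware cleanly
/u01/app/crs/bin/crsctl disable crs   # keep it down across reboots

# After maintenance is done
/u01/app/crs/bin/crsctl enable crs
/u01/app/crs/bin/crsctl start crs
```

With clusterware stopped on both nodes, neither node can evict the other during the maintenance window.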
  • One is not Quorum
    13
    Production cases, what happens if
    • All Network dies on one node?
    • All disk dies on one node?
  • 14
    ASMLIB with Multipathing
  • Building ASMLIB devices when multipathing is present
    15
    • Devices used for creating asmlib
    • /dev/emcpowerc1
    • /dev/mapper/raid10_data_disk
    • Devices used to create asmdiskgroup
    • ASMLIB
    • The reboot changes everything
    • ASMLIB re-discovers the devices without multipath
    • Difficult to diagnose
  • Visual
    16
    /dev/mapper/data1
    /dev/mapper/data2
    /dev/sdb
    /dev/sdc
    /dev/sdd
    /dev/sde
    HBA1
    HBA2
    LUN_1
    LUN_2
  • Building ASMLIB devices when multipathing is present
    17
    • Do not use ASMLIB
    • If you have to (why?)
    • Must set up “ORACLEASM_SCANORDER”
    • asm_diskstring parameter
    • Permissions
    • Udev files
    • Boot/startup script
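If ASMLIB must stay, the scan-order fix lives in /etc/sysconfig/oracleasm. A fragment (the multipath and single-path prefixes shown are typical examples and must match your own device names):

```shell
# /etc/sysconfig/oracleasm (fragment)
# Scan device-mapper/multipath devices first...
ORACLEASM_SCANORDER="dm multipath"
# ...and never bind to the underlying single-path sd devices
ORACLEASM_SCANEXCLUDE="sd"
```

Without this, a reboot lets ASMLIB stamp its labels onto the single-path /dev/sd* devices, silently bypassing multipathing.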
  • Removing ASMLIB
    18
    • Why
    • Extra layer
    • Requires new driver for every new kernel
    • Can cause downtime if not careful
    • ASMLIB header is the same as ASM DISK header
    • Just has extra field for ASMLIB name
    • Disks can be accessed directly, without ASMLIB, without having to drop/recreate them
  • Removing ASMLIB
    19
    • Unmount all affected diskgroups
    • Change or set asm_diskstring
    • Remount diskgroups via new paths
    • Can be done in rolling fashion in RAC
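On each ASM instance in turn, the rolling change might look like this (a sketch; the diskgroup name and discovery path are examples):

```sql
-- connected to the local ASM instance
ALTER DISKGROUP data DISMOUNT;

-- point discovery at the raw multipath partitions instead of ASMLIB labels
ALTER SYSTEM SET asm_diskstring = '/dev/mapper/*p1';

-- remount via the new paths; repeat node by node for a rolling change
ALTER DISKGROUP data MOUNT;
```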
  • 20
    SAN Migration
  • Migrating from EMC to 3PAR
    21
    • New SAN
    • New concept
    • Thin provisioning
    • A big project
    • Or not
  • Add/drop/go home
    22
    • No brainer
    • Thin provisioning rocks
    • SA adds disks
    • Add disk to diskgroup
    • Drop all old disks
    • Wait
    • Never be paged on space
  • 23
    Server Migration
  • Server migration
    24
    • Current setup
    • 2 nodes RAC with ASM
    • New servers
    • Better, Faster, Stronger
    • Fastest (effort wise) way to migrate, with minimal downtime
    • Possible with zero downtime
  • Server migration options
    25
    • Create standby on new server
    • Requires extra copy of data
    • Add the new nodes, drop existing ones
    • Possible clusterware issues
    • Move the LUNs
    • Easy
    • New servers tested
  • LUN Migration
    26
    • Install clusterware and create RAC database with same name
    • Test hardware / wiring / configuration
    • Migrate
    • Stop production
    • Re-assign LUNs
    • Start production
  • 27
    ASM Restore creates database black hole
  • ASM + Same host restore = DANGER
    28
    • Production database
    • Diskgroup +PROD
    • Snapshot database
    • Diskgroup +SNAP
    • Rebuild monthly via duplicate database
    • Except this one time…
  • The concept
    29
    • “SNAP” backups are not taken
    • If “SNAP” ever needs restoring, simply re-create it from the given “PROD” backup
    • Independent from Production
  • Restore with ASM
    30
    • Restore FRA files into separate directory
    • Startup SNAP instance
    • Catalog backup files
    • Restore into SNAP diskgroup
    • The missing piece? “restore” writes into the original file location
    • Must use “SET NEWNAME FOR DATAFILE” in the RUN block
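In RMAN syntax that is a SET NEWNAME inside the RUN block (a sketch; the datafile numbers and diskgroup name are examples). Without it, RESTORE writes each file back to its original location, which in this story was the live production datafile:

```sql
RUN {
  -- redirect each restored datafile into the SNAP diskgroup
  SET NEWNAME FOR DATAFILE 1 TO '+SNAP';
  SET NEWNAME FOR DATAFILE 2 TO '+SNAP';
  RESTORE DATABASE;
  SWITCH DATAFILE ALL;  -- make the control file point at the new copies
}
```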
  • Restore with ASM – the result
    31
    • Unrecoverable corruption on production database
    • Lost about 3-4 hours of changes
    • If this had been a filesystem and not ASM, no corruption would have occurred
  • Corruption – what happened
    32
    [Diagram: the SGA caches blocks of 5, 5 and 2 rows; redo records “BLK1 add Row 6” and “BLK3 add Row 3”; on disk, the partially overwritten datafile holds blocks of 5, 5, 2 and 2 rows, while the original datafile held 5, 5, 5 and 5 rows]
  • Corruption – what should’ve happened
    33
    [Diagram: with the same redo (“BLK1 add Row 6”, “BLK3 add Row 3”, then “BLK3 add Row 6”), the SGA should still cache blocks of 5, 5 and 5 rows; on disk, the partially overwritten datafile holds 5, 5, 2 and 2 rows, the original datafile 5, 5, 5 and 5]
  • Corruption – what happened
    34
  • Corruption
    35
    • Why wouldn’t this have happened with a filesystem?
    • File names are just pointers to data streams
    • If a file is re-created, a new data stream is associated with it
    • Processes that have the file currently open still use the old data stream
    • This is why “undelete” is possible
    • My blog about undeleting files
  • Corruption
    36
    [Diagram: Process 1 opens “file 1”; the name File 1 points to data Stream X1]
  • Corruption – recreate File 1
    37
    [Diagram: after “file 1” is re-created, the name File 1 points to a new Stream X2, while Process 1 still holds the old Stream X1 open]
  • 38
    Device names convention causes user error
  • Device naming conventions
    39
    • Using /dev/mapper/<name>
    • ASM uses <name>p1 – the first partition
    • Permissions set script uses: “*p1”
    • Then came /dev/mapper/backup1
    • First partition is: /dev/mapper/backup1p1
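The trap is plain glob matching: the device name backup1 itself happens to end in “p1”, so a wildcard meant for first partitions also matches the whole, unpartitioned disk. A minimal bash sketch:

```shell
#!/usr/bin/env bash
# "*p1" is meant to match first partitions like data1p1 and redop1,
# but the whole-disk name "backup1" also ends in "p1" and matches too.
for dev in data1p1 data2p1 redop1 backup1 backup1p1; do
  [[ $dev == *p1 ]] && echo "matched: $dev"
done
# prints all five names, including the unpartitioned backup1
```

A clear delimiter in the name (such as “-part1”) avoids the collision entirely.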
  • Device naming conventions
    40
    • V$ASM_DISK
    PATH HEADER_STATUS
    --------------------------- -------------
    /dev/mapper/backup1 CANDIDATE
    /dev/mapper/redop1 MEMBER
    /dev/mapper/backup1p1 MEMBER
    /dev/mapper/data2p1 MEMBER
    /dev/mapper/data1p1 MEMBER
  • Naming conventions
    41
    [Diagram: the whole disk shows as ADDED to the diskgroup while its Partition 1 is already IN USE]
  • New convention
    42
    • Now we use generic names, since we do re-assign disks
    • We also use a prefix and suffix with a clear delimiter
    /dev/mapper/asm-raid5-dev01-part1
  • 43
    spfile location in RAC
  • spfile location
    44
    • Intended configuration
    • init.ora contains only: spfile=‘+ASM_DSKGRP/dbname.spfile’
    • no local spfile
  • Changing parameters in masses
    45
    • create pfile=‘your_initials.ora’ from spfile;
    • edit
    • create spfile=‘+ASM_DSK/spfile’ from pfile=‘ck.ora’
  • What not to do
    46
    • create pfile from spfile
    • edit
    • create spfile from pfile;
  • Result
    47
    • One node uses local spfile
    • Other(s) uses global spfile
    • Parameter changes on the “BAD” node are sent to the other nodes
    • not persistent on GOOD nodes
    • persistent on the BAD node
    • Parameter changes on GOOD nodes have the reverse behaviour
  • 48
    Adding ASM disks crashes databases
  • Adding disks
    49
    • Must be visible on all servers
    • Otherwise your diskgroup gets dismounted on the nodes that don’t see the disk
    • All databases using this diskgroup crash
  • ASM add disk process
    50
    Is the disk visible locally?
    Initialize disk header, add it to diskgroup
    Notify all nodes to rescan disks and add the new disk
    If one or more nodes cannot see the disk, raise error
    Dismount diskgroup on all nodes not seeing the new disk
  • 51
    ASM with JBOD welcomes simplicity
  • JBOD Configuration
    52
    • Linux data warehouse
    • 10 TB space
    • 28 disks of 430/285 GB
    • All redundancy/striping provided by ASM
  • JBOD Configuration
    53
    • Simplicity
    • No ASMLIB
    • Straight devices
    • Naming convention – use only one partition, and make it partition 4
    • /dev/sd*4
    • is the ASM partition
    • is the permissions wildcard
    • is the asm_diskstring
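With straight devices, permissions can be pinned with a udev rule instead of ASMLIB (a sketch; the device range, owner and group are examples):

```
# /etc/udev/rules.d/99-oracle-asm.rules (example)
# Partition 4 of every data disk belongs to the oracle user
KERNEL=="sd[c-q]4", OWNER="oracle", GROUP="dba", MODE="0660"
```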
  • Testing your speed
    54
    • Verify read speed of each device
    • Verifies each device is performing as expected
    • Verify read speed from all devices
    • Verify your total bandwidth
    • Verify read speed from all devices, towards the end of the device
    • Disk read speed is not linear
  • Read Speed of a single disk
    55
    * Courtesy google image search
  • Testing your speed
    56
    • One device at a time
    for dsk in /dev/sd[c-q]; do echo $dsk; dd if=$dsk of=/dev/null iflag=direct bs=2M count=100; done
    • All devices (total bandwidth)
    for dsk in /dev/sd[c-q]; do echo $dsk; dd if=$dsk of=/dev/null iflag=direct bs=2M count=100 & done; wait
    • Test end speed
    • Add skip=x to the dd command
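The end-of-disk variant just adds skip= to move the read towards the last blocks. Sketched here against a scratch file rather than a real /dev/sd device, so it is safe to run anywhere (sizes are illustrative):

```shell
#!/usr/bin/env bash
set -e
# 16 MiB scratch file standing in for a disk
dd if=/dev/zero of=/tmp/fakedisk bs=1M count=16 2>/dev/null

# read 4 MiB starting 12 MiB in, i.e. towards the end of the "device";
# on a real disk the inner tracks there are measurably slower
dd if=/tmp/fakedisk of=/dev/null bs=1M count=4 skip=12

rm -f /tmp/fakedisk
```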
  • Sample output
    57
    /dev/sdc
    100+0 records in
    100+0 records out
    209715200 bytes (210 MB) copied, 1.60325 seconds, 131 MB/s
    /dev/sdd
    100+0 records in
    100+0 records out
    209715200 bytes (210 MB) copied, 1.60188 seconds, 131 MB/s
    /dev/sde
    100+0 records in
    100+0 records out
    209715200 bytes (210 MB) copied, 1.60067 seconds, 131 MB/s
    /dev/sdf
    100+0 records in
    100+0 records out
    209715200 bytes (210 MB) copied, 1.59928 seconds, 131 MB/s
    /dev/sdg
    100+0 records in
    100+0 records out
  • JBOD configuration
    58
    • Disk adding/removal is very easy
    • Add disks in bulk: alter diskgroup XXX add disk ‘/dev/sd[c-q]4’;
    • Performance rocks
    • controller speed
    • Diagnostics are easy
    • iostat -x 5 /dev/sd*4
    • Manageability is easy
    • 1 diskgroup – no filenames, no mountpoints
  • Final Thoughts
    59
    • RAC for HA requires 3 nodes
    • ASM
    • Keep it simple
    • Reduce layers
    • Runs fast
    • Still need to be careful
  • 60
    The End
    Thank You
    Questions?
    I blog at
    http://www.pythian.com/news/author/kutrovsky/
  • 61
    Transition/Section Marker