High Availability Options for Modern Oracle Infrastructures

Today's enterprise architect has a bewildering array of choices when it comes to building a highly available infrastructure to run Oracle. This presentation considers approaches using the Oracle technology layer, resilient virtualisation (Oracle and other vendors), hardware clustering and storage replication. It covers the core Oracle Database and Fusion Middleware products and, based on practical experience, aims to give attendees a broad picture of alternatives with their pros and cons.

Delivered on 5 December 2011 at UKOUG 2011 by Simon Haslam and Julian Dyke.

Published in: Technology

Transcript

  • 1. High Availability Options for Modern Oracle Infrastructures
      Simon Haslam (Veriton Limited) and Julian Dyke (juliandyke.com)
      ©2011 Veriton Limited juliandyke.com
  • 2. Simon Haslam / Veriton
      – Specialised consultant & Oracle Partner, established for 15 years
      – Oracle Fusion Middleware (Java EE, SSO, OAM, OID, clustering)
      – ADF Applications (esp. strategy & admin)
      – Database & related technologies (Solaris/Linux, load balancers, firewalls, …)
  • 3. Julian Dyke / juliandyke.com
      – Independent database consultant specialising in Oracle performance tuning and HA, including RAC and Data Guard
  • 4. Agenda
      1. High Availability Outline
      2. Generic HA
      3. Database HA
      4. Middleware HA
      5. Summary
  • 5. High Availability Definition
      Wikipedia: “High availability is a system design approach and associated service implementation that ensures a prearranged level of operational performance will be met during a contractual measurement period”
      http://en.wikipedia.org/wiki/High_availability
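To make the "prearranged level" in that definition concrete, here is a minimal sketch that converts an availability target into a downtime budget for a measurement period. The function name, the chosen targets and the one-year period are illustrative assumptions, not figures from the presentation.

```python
def downtime_budget_hours(availability_pct: float, period_hours: float) -> float:
    """Hours of downtime permitted in the period at the given availability target."""
    if not 0.0 <= availability_pct <= 100.0:
        raise ValueError("availability_pct must be between 0 and 100")
    return period_hours * (1.0 - availability_pct / 100.0)


HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years (assumed period)

# Illustrative targets only; a real SLA defines its own target and period.
for target in (99.0, 99.9, 99.99):
    budget = downtime_budget_hours(target, HOURS_PER_YEAR)
    print(f"{target}% over a year allows about {budget:.2f} hours of downtime")
```

The same function works for any contractual period, for example a 30-day month of 720 hours.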
  • 6. Corollary
      “Paradoxically, adding more components to an overall system design can undermine efforts to achieve high availability. That is because complex systems inherently have more potential failure points and are more difficult to implement correctly”
      http://en.wikipedia.org/wiki/High_availability
  • 7. Complexity is the enemy of availability
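The corollary can be illustrated with the textbook availability arithmetic for components in series and in parallel. This is a sketch under the assumption that components fail independently; the availability figures are made up for illustration and do not come from the slides.

```python
from math import prod


def series_availability(parts):
    """Every component must be up, so the availabilities multiply."""
    return prod(parts)


def parallel_availability(parts):
    """The service is up if at least one redundant component is up.

    Assumes failures are independent, which real systems rarely guarantee.
    """
    return 1.0 - prod(1.0 - a for a in parts)


single_server = 0.99                                 # assumed figure
pair = parallel_availability([single_server, single_server])

# The redundant pair only helps if the extra component that manages
# failover is itself reliable (both figures below are assumptions).
with_good_manager = series_availability([pair, 0.999])
with_poor_manager = series_availability([pair, 0.98])

print(f"single server:             {single_server:.4f}")
print(f"pair + reliable manager:   {with_good_manager:.4f}")
print(f"pair + unreliable manager: {with_poor_manager:.4f}")
```

With these assumed numbers the pair behind an unreliable failover component ends up less available than the single server it replaced, which is the point slides 6 and 7 make.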
  • 8. Contrast HA with Disaster Recovery
      • DR triggered by catastrophic loss of primary data centre (i.e. all or nothing)
      • Cost of running a DR site means that more often now it has a semi-active, or even fully active, role
      • WANs/MANs are getting faster & more affordable
      • => techniques for HA & DR are merging
  • 9. HA covers failures of…
      • Hardware (the most common use case)
        – e.g. server failure
        – Note: within servers many components are redundant (power supplies, disks, sometimes controllers, NICs/HBAs/HCAs, even memory & processors)
      • Software
        – unresponsive components
  • 10. HA does not protect against…
      • Loss of data centre (fire, flood, power, etc)
      • Human error
      [Images: Buncefield, UK, Dec. 2005; http://simpsons.wikia.com/wiki/Barney_Gumble]
  • 11. Typical Requirements for HA
      • Business:
        – An assured level of availability (probably different between LOBs/applications)
        – Environment isolation (‘it’s ours’)
        – Reduced capital expenditure (esp. licences)
      • IT:
        – low maintenance
        – standard construction
        – low complexity
        – easy to monitor and troubleshoot
  • 12. From the ‘Old’ Days to Today
      [Diagram labels: Servers + Storage, Servers + Storage; Servers, Servers, Shared Storage]
  • 13. Just because something is big doesn’t mean it can’t fail!
      [Diagram labels: Virtual Server, Virtual Server, Cloud, Shared Storage]
  • 14. High Availability
      • HA = as available as your business needs
      • Makes things more complicated
      • List of HA approaches we’ve used or just seen… not necessarily complete
  • 15. Agenda
      1. High Availability Outline
      2. Generic HA
      3. Database HA
      4. Middleware HA
      5. A Look Ahead & Summary
  • 16. Generic HA techniques
      • Active/Passive Clusters
      • Virtualisation Clusters
      • Storage Replication
  • 17. Active / Passive aka Cold Failover Cluster
      • The oldest form of HA
      • Primary plus standby server(s)
      • Only one server ever active at once
      • Active/Passive solutions available from 3rd party vendors, operating system vendors and Oracle
      • A/P plus P/A, or A/P plus -/A for test not unusual
      • Advantages
        – Simplicity
        – Software cost
      • Disadvantages
        – Hardware cost/power
        – Failover time (depending on reqs.)
  • 18. Active / Passive
      [Diagram labels: Primary, Standby, Shared Storage]
  • 19. Active / Passive + - / Active
      [Diagram labels: Primary, Dev/Test, Primary, Standby, Production, Shared Storage]
      * Note about prod vs test storage
  • 20. Virtualisation HA
      • Relocating virtual machine
        – suspend, move, resume
      • Automatic relocation
        – Move contents of vRAM to target host
        – E.g. vMotion, OVM live migration
      • Advantages
        – Generic across all IT services
        – Appears simple
      • Disadvantages
        – Underlying products don’t know what’s happening
        – Support if it all goes wrong
  • 21. Storage (bit out of scope, but…)
      • Replication can be done various ways
        – SAN/NAS provider, e.g. EMC SRDF, RecoverPoint, ZFS
        – Virtualisation provider, e.g. VMware Storage vMotion
        – OS provider, e.g. DRBD
        – Probably lots of others…
      • Advantages
        – Generic
        – Elegance in simplicity
      • Disadvantages
        – May be expensive, especially if need to license both ends
        – May be new technology
        – Probably sensitive to network stability (latency, throughput)
        – “Under the covers” technique the Oracle products don’t know about
        – Manual failover? Typically invoking DR procedure.
  • 22. Agenda
      1. High Availability Outline
      2. Generic HA
      3. Database HA
      4. Middleware HA
      5. A Look Ahead & Summary
  • 23. Active / Passive – Database Cluster
      • Protects against server failure
      • Does not protect against site failure
      • Consists of
        – Two servers; one active and one passive
        – Database files on shared storage
        – Heartbeat network to monitor cluster health
      • Under normal operation
        – Database instances run on active server
      • On server failure
        – Passive server becomes active server
        – Cluster manager fails across all instances to new active server
      ©2011 Julian Dyke
  • 24. Active / Passive Database Cluster
      [Diagram labels, before and after: SERVER1, SERVER2; instances A, B, C; CLUSTER MANAGER; STORAGE; SITE1]
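The behaviour described on slides 23 and 24 can be sketched as a toy model: a cluster manager watches heartbeats and, when the active server goes quiet, moves every instance to the passive server. All class names, method names and the 10-second timeout are hypothetical; this is not how any particular cluster product is implemented, and real cluster managers also deal with fencing, shared storage and restart ordering.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str
    instances: list = field(default_factory=list)
    last_heartbeat: float = 0.0


class ClusterManager:
    """Toy active/passive manager driven by explicit timestamps."""

    def __init__(self, active: Server, passive: Server, timeout_s: float = 10.0):
        self.active = active
        self.passive = passive
        self.timeout_s = timeout_s  # assumed value, purely illustrative

    def heartbeat(self, server: Server, now: float) -> None:
        server.last_heartbeat = now

    def check(self, now: float) -> bool:
        """Fail over if the active server's heartbeat is stale.

        Returns True when a failover was performed.
        """
        if now - self.active.last_heartbeat <= self.timeout_s:
            return False
        # Move all instances to the passive server, then swap roles.
        self.passive.instances.extend(self.active.instances)
        self.active.instances.clear()
        self.active, self.passive = self.passive, self.active
        return True


server1 = Server("SERVER1", ["A", "B", "C"])
server2 = Server("SERVER2")
manager = ClusterManager(active=server1, passive=server2)

manager.heartbeat(server1, now=0.0)
manager.heartbeat(server2, now=0.0)
manager.check(now=5.0)                 # heartbeat still fresh, nothing happens
manager.heartbeat(server2, now=20.0)   # SERVER1 has stopped sending heartbeats
manager.check(now=20.0)                # stale heartbeat triggers the failover
print(manager.active.name, manager.active.instances)
```

Passing timestamps explicitly keeps the sketch deterministic; a real monitor would read a clock and run the check on a timer.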
  • 25. Active / Passive Cluster
      • Examples
        – Veritas
        – IBM HACMP
        – HP Serviceguard
        – Sun Cluster
      • Advantages
        – Administered by system administrators
        – Only requires Oracle licence on active server
      • Disadvantages
        – Administered by system administrators
        – Under-utilization of hardware
        – Cluster manager requires licence
        – Maximum 10 days per calendar year on unlicensed server
      • Still popular with large users
      • Some customers downgrading from RAC to active/passive to reduce costs
  • 26. Oracle Clusterware HA Cluster
      • Protects against server failure
      • Does not protect against site failure
      • Consists of
        – Two (or more) servers
        – Database files on shared storage - ASM
        – Application files on shared storage - ACFS
        – Private network to manage cluster
      • Under normal operation
        – Instances run on preferred servers
      • On server failure
        – Clusterware fails across instances from failed server to surviving server
  • 27. Oracle Clusterware HA Cluster
      [Diagram labels, before and after: SERVER1, SERVER2; instances A, B, C; ASM / ACFS; ORACLE CLUSTERWARE; STORAGE]
  • 28. Oracle Clusterware HA Cluster
      • Advantages
        – Administered by database administrators
        – Based on known and trusted technology stack (Oracle RAC)
        – Better utilization of hardware during normal operations
        – Supports non-Oracle applications
      • Disadvantages
        – Administered by database administrators
        – May require additional licences for Oracle Clusterware, ACFS, Oracle RDBMS
      • Still relatively rarely implemented
      • Licensing confused by new Oracle Cloud File System product
  • 29. Oracle RAC Cluster
      • Protects against server failure
      • Does not protect against site failure
      • Consists of
        – Two (or more) servers
        – Database files on shared storage – ASM
        – Application files on shared storage – additional cost
        – Private network to manage cluster
      • Under normal operation
        – Instances run on preferred servers
      • On server failure
        – Instances on failed server are lost
        – Instances on surviving server remain
  • 30. Oracle RAC Cluster
      [Diagram labels, before and after: SERVER1, SERVER2; instances A, B, C; ASM; ORACLE CLUSTERWARE; STORAGE]
  • 31. Oracle RAC Cluster
      • Advantages
        – Administered by database administrators
        – Known and trusted technology stack
        – Better utilization of hardware during normal operations
        – Instances can scale across multiple servers
      • Disadvantages
        – Administered by database administrators
        – Database must be licensed on each server
        – May require additional licenses for Oracle RAC option
        – Scaling may affect performance
      • Business-as-usual clustering solution
      • Foundation of Exadata and Oracle Database Appliance
      • Complex to implement, but well understood and reliable in most cases
  • 32. Oracle RAC One-Node
      • Protects against server failure
      • Does not protect against site failure
      • Consists of
        – Two (or more) servers
        – Database files on shared storage – ASM
        – Private network to manage cluster
      • Under normal operation
        – Instances run on preferred servers
      • On server failure
        – Clusterware fails across instances from failed server to surviving server
  • 33. Oracle RAC One-Node
      [Diagram labels, before and after: SERVER1, SERVER2; instances A, B, C; ASM / ACFS; ORACLE CLUSTERWARE; STORAGE]
  • 34. Oracle RAC One-Node
      • Advantages
        – Administered by database administrators
        – Known and trusted technology stack
        – Database can be unlicensed on one server
        – Can be converted into Oracle RAC cluster
      • Disadvantages
        – Administered by database administrators
        – Requires additional RAC one-node licences
        – Under-utilization of hardware
        – Maximum 10 days per calendar year on unlicensed server
      • Really just another licensing option
      • Rarely deployed in my experience
  • 35. Data Guard Physical Standby
      • Protects against server failure and site failure
      • Consists of
        – Two data centres in physically separate locations
        – Servers and storage at each location
        – Network between data centres
      • Under normal operation
        – Instances run on primary servers
        – Database changes transported from primary server to standby server
        – Database changes applied to standby server
      • On server failure
        – Instances failed over from failed server to standby server
  • 36. Data Guard Physical Standby
      [Diagram labels, before and after: SERVER1, SERVER2; instances A, B, C; STORAGE, STORAGE; SITE1, SITE2]
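The transport-then-apply flow on slide 35 can be pictured with a deliberately simplified model. The classes and functions below are hypothetical and have nothing to do with Oracle's actual redo formats, APIs or protection modes; the sketch only shows why a change that has not yet been transported is absent from the standby after a failover.

```python
from collections import deque


class Primary:
    def __init__(self):
        self.data = {}
        self.pending = deque()  # changes made but not yet shipped to the standby

    def write(self, key, value):
        self.data[key] = value
        self.pending.append((key, value))


class Standby:
    def __init__(self):
        self.data = {}
        self.received = deque()  # changes shipped but not yet applied

    def apply_all(self):
        while self.received:
            key, value = self.received.popleft()
            self.data[key] = value


def transport(primary: Primary, standby: Standby) -> None:
    """Ship every pending change from the primary to the standby."""
    while primary.pending:
        standby.received.append(primary.pending.popleft())


def failover(standby: Standby) -> dict:
    """Apply everything already received; the standby then holds the surviving data."""
    standby.apply_all()
    return standby.data


primary, standby = Primary(), Standby()
primary.write("order-1", "paid")
transport(primary, standby)            # order-1 reaches the standby
primary.write("order-2", "paid")       # primary fails before this is shipped
print(failover(standby))               # only order-1 survives in this model
```

In this model the gap between a write and its transport is the exposure window; how a real configuration bounds that window depends on how it is set up, which the slides do not detail.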
  • 37. Data Guard Physical Standby
      • Advantages
        – Protects against site failure
        – Known and trusted technology
        – Does not require heartbeat network
        – Does not require shared storage
        – Failover can be automated using Data Guard Broker
      • Disadvantages
        – Both sites must be licensed
        – Requires Enterprise Edition database licences
        – Under-utilization of hardware and licences
        – Applications must be available at both sites
        – Failover process may be complex – requires testing
      • Easily the most popular DR configuration
      • Relatively simple to implement and very reliable when correctly configured
  • 38. Active Data Guard
      • Protects against site and server failure
      • Consists of
        – Two data centres in physically separate locations
        – Storage at each location
        – Network between data centres
      • Under normal operation
        – Read-write instance runs on primary server
        – Redo transported and applied to standby server
        – Standby server open for read-only operations
        – Read-consistency maintained on standby server
      • On site failure
        – Read-write instance failed over to standby server
  • 39. Active Data Guard
      [Diagram labels, before and after: SERVER1, SERVER2; instance A; STORAGE, STORAGE; SITE1, SITE2]
  • 40. Active Data Guard
      • Advantages
        – Similar to Data Guard Physical Standby
        – Better utilization of hardware
        – Additional read-only capacity
        – Changes available on standby server in near real-time
        – Changes only applied on primary server => reduced contention
      • Disadvantages
        – Similar to Data Guard Physical Standby
        – Requires Active Data Guard licenses at both sites
        – Failover may result in reduced capacity
      • Simpler architecture to implement than RAC
      • Performance monitoring and tuning difficult on standby database
      • Many sites implementing caching functionality in application tier
  • 41. Extended RAC Cluster
      • Protects against site and server failure
      • Consists of
        – Two data centres in physically separate locations
        – Shared storage at each location
        – Network between data centres
        – Storage network between data centres
      • Under normal operation
        – Instances run on all servers
        – Database changes are written to storage at both data centres
      • On site failure
        – Instances on failed site are lost
        – Instances remain at surviving site
  • 42. Extended RAC Cluster
      [Diagram labels, before: SERVER1, SERVER2, SERVER3, SERVER4; instances A, B, C, D (four of each); STORAGE, STORAGE; SITE1, SITE2]
  • 43. Extended RAC Cluster
      [Diagram labels, after: SERVER1, SERVER2, SERVER3, SERVER4; instances A, B, C, D (two of each); STORAGE, STORAGE; SITE1, SITE2]
  • 44. Extended RAC Cluster
      • Advantages
        – Better utilization of hardware and licences
        – Applications maintained at both locations
        – Reduced failover testing required
      • Disadvantages
        – May require RAC licences at both sites
        – Additional I/O may impact performance
        – Increased latencies may impact performance
        – Complex solution requires additional management skills
      • Oracle commitment to solution is dubious
  • 45. And there’s more…
      • Oracle Restart
        – Clusterware / ASM on a single server
      • Replication
        – Database links / Remote queries
        – Materialized Views
        – Advanced Queuing
        – Oracle Streams
        – GoldenGate
  • 46. Agenda
      1. High Availability Outline
      2. Generic HA
      3. Database HA
      4. Middleware HA
      5. Summary
  • 47. Types of Middleware Data (11g+)
      • Binaries
        – Read only ($MW_HOME, $ORACLE_HOME)
      • Configuration/logs (inc deployed apps)
        – Read/write ($DOMAIN_HOME, $ORACLE_INSTANCE)
      • State data
        – Java Session
        – JMS messages
        – JTA transactions
      • Application data(?)
  • 48. State data in memory (& on disk)…
      • Java Session objects
        – stay in memory (e.g. contents of my basket)
        – very common (historical – JVM size)
        – replicate to other WebLogic servers using either WebLogic clustering or Coherence*Web
      • JMS messages
        – Java messages (e.g. reserve this item in warehouse)
        – can choose to store on filesystem or in database
      • JTA transactions
        – Java transactions (e.g. checkout)
        – NEW! WebLogic 12c can choose to store in database
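As a rough picture of what replicating in-memory session state buys you, the sketch below keeps each session on two servers so that the basket survives the loss of one of them. The `SessionCluster` class and the server names are hypothetical; this is neither WebLogic's nor Coherence*Web's actual replication algorithm, only the general idea of a primary copy plus one secondary copy.

```python
class SessionCluster:
    """Toy in-memory session store with one secondary copy per session."""

    def __init__(self, names):
        self.stores = {name: {} for name in names}  # per-server session memory
        self.down = set()

    def live(self):
        return [name for name in self.stores if name not in self.down]

    def put(self, session_id, basket):
        live = self.live()
        if not live:
            raise RuntimeError("no servers available")
        # Write to the first live server and copy to the next one, if any.
        for name in live[:2]:
            self.stores[name][session_id] = list(basket)

    def get(self, session_id):
        for name in self.live():
            if session_id in self.stores[name]:
                return self.stores[name][session_id]
        return None

    def fail(self, name):
        self.down.add(name)
        self.stores[name].clear()  # in-memory state is lost with the server


cluster = SessionCluster(["wls1", "wls2", "wls3"])
cluster.put("session-42", ["book", "lamp"])
cluster.fail("wls1")
print(cluster.get("session-42"))   # still available from the secondary copy
```

The sketch never re-replicates after a failure, so losing the second copy as well loses the session; that simplification is deliberate.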
  • 49. Active / Passive vs Active / Active
      • Active / Active more common in middleware tier
        – Lightweight servers (cf. database)
        – Processes more likely to fail
        – Low interaction between users
        – Active / active used for horizontal scalability
  • 50. WLS 11g A/A + A/P
      [Diagram labels: Load Balancing or Web Tier; Managed Server(s), Managed Server(s); VIP; Admin Server; Node Mgr, Node Mgr; Shared Storage]
      Note: I prefer to have Admin Servers on a separate management node
  • 51. Active / Passive CFC & ASCRS
      • Oracle Clusterware
        – Around since Oracle Database 10g
        – (CRS code base much more mature)
      • 10g: You must install with everything listening on VIP
      • 11g: ‘transform’ steps
        – ASCRS is new “wrapper” (uses Clusterware 11.1), but its future is unclear to me
      • See my UKOUG 2010 presentation: Building Active/Passive Clusters with Oracle Fusion Middleware 11g
        http://www.veriton.co.uk/content/haslam_events.shtml
  • 52. Active / Passive CFC
      [Diagram labels: VIP; iAS, OC4J, OPMN; Primary, Standby; Shared Storage]
  • 53. WLS Whole Service/Server Migration
      • Service or Server running against VIP
      • Node Manager co-ordinates service or server restart with Admin Server
  • 54. Whole Server Migration
      [Diagram labels: VIP; WLS; Primary, Standby; Node Mgr, Node Mgr; AS; Shared Storage]
  • 55. HA for Layered Products
      • More difficult
      • Mainly application level clustering (e.g. OIM, OAM)
      • Legacy products: little, or product-specific, options
        – Chunks of C code
      • Newer products:
        – SOA/BPM 11g uses Coherence for HA
        – Needs to co-ordinate with database failover
      Note: 10g AS Guard has gone – more generic approach now ☺
  • 56. Agenda
      1. High Availability Outline
      2. Generic HA
      3. Database HA
      4. Middleware HA
      5. Summary
  • 57. Summary
      • Hardware HA – traditional, simple active/passive
      • Database HA – Oracle products
      • Virtualisation HA – treat with caution
      • Middleware HA – review in ‘WebLogic world’
  • 58. Thanks for listening!
      • Simon Haslam – Twitter: @simon_haslam, Blog: http://simonhaslam.co.uk
      • Julian Dyke – info@juliandyke.com, Twitter: @julian_dyke, Blog: http://juliandyke.wordpress.com