UTHOC2 - Under The Hood of Oracle Clusterware 2.0 - Grid Infrastructure by Alex Gorbachev Pythian

Under the Hood of Oracle Clusterware 2.0 - updated version of the slides, covering Oracle Database 11.2

Published in: Technology

Transcript

  • 1. Under the Hood of Oracle Clusterware 2.0: Grid Infrastructure, Alex Gorbachev, 29 August, 2012
  • 2. Agenda
      • Place of Grid Infrastructure in Oracle RAC
      • Node membership and evictions
      • Clusterware architecture & startup sequence
      • Resource management and troubleshooting
    © 2009/2010 Pythian
  • 3. Agenda [Chart: "The more you understand, the less you need to memorize"; y-axis: Need to memorize (Low to High); x-axis: Understanding (Shallow to In-depth)]
  • 4. Single Instance Oracle Database [Diagram: the APP sends Query/DML/DDL to the INSTANCE on the SERVER. The instance comprises memory (SGA, PGA) and processes (PMON, SMON, LGWR, etc., plus multiple shadow processes). It reads and writes the Database: datafiles, controlfiles, redo logs, flashback logs, change tracking, etc.] © 2012 Pythian
  • 5. Single Instance Oracle Database [Diagram: APP, SERVER, INSTANCE, Database]
  • 6. Oracle RAC Database [Diagram: APP; SERVER 1 with INSTANCE 1; Database]
  • 7. Oracle RAC Database [Diagram: APP; SERVER 1 and SERVER 2 with INSTANCE 1 and INSTANCE 2; Database]
  • 8-10. Oracle RAC Database [Diagram: APP; SERVER 1, SERVER 2 and SERVER 3 with INSTANCE 1, INSTANCE 2 and INSTANCE 3; Database]
  • 11. RAC looks simple. Eh?
  • 12-13. Role of Grid Infrastructure [Diagram: three nodes, each labeled OS, VIP, Listener, Service, Instance, ASM, Grid Infrastructure; interconnect between the nodes; storage access to shared storage holding the OCR and voting disk]
  • 14-16. [Diagram: two nodes, each with OS, Clusterware, CSSD and OPROCD; interconnect]
  • 17-18. [Diagram: two nodes, each with OS, Clusterware, CSSD and OPROCD; interconnect; voting disk]
  • 19. [Diagram: two nodes, each with OS, Clusterware, CSSD and OPROCD; interconnect; voting disk. Caption: Shoot The Other Node In The Head]
  • 20. [Diagram: two nodes, each with OS, Clusterware, CSSD and OPROCD; VIP, RACG, EVMD and CRSD shown once; interconnect; voting disk]
  • 21-23. [Diagram: two nodes, each with OS, Clusterware, CSSD and OPROCD; interconnect; voting disk]
  • 24. [Diagram: OS, Clusterware and OPROCD shown once; CSSD shown twice; interconnect; voting disk]
  • 25. [Diagram: OS, Clusterware and OPROCD shown once; CSSD shown twice; interconnect; voting disk. Caption: Ask The Other Node To Reboot Itself (c) known quote]
  • 26. 11gR2 Grid Infrastructure: CSSD attempts graceful shutdown
  • 27. [Diagram: two nodes, each with OS, Clusterware, CSSD and OPROCD, one CSSD drawn split as "CS SD"; interconnect; voting disk]
  • 28. [Diagram: two nodes, each with OS, Clusterware, CSSD and OPROCD, one CSSD drawn split as "CS SD"; CSSD Monitor/Agent shown once; interconnect; voting disk]
  • 29. [Diagram: OS, Clusterware, CSSD Monitor/Agent, CSSD and OPROCD shown once; interconnect; voting disk]
  • 30-32. [Diagram: OS, Clusterware and OPROCD shown once; CSSD shown twice; interconnect; voting disk]
  • 33. [Diagram: OS and Clusterware shown once; CSSD shown twice; OPROCD/CSSD Mon and OPROCD; interconnect; voting disk]
  • 34. [Diagram: OS, Clusterware and CSSD shown once; OPROCD/CSSD Mon and OPROCD; interconnect; voting disk]
  • 35-37. [Diagram: two nodes, each with OS, Clusterware, CSSD and OPROCD; interconnect; voting disk]
  • 38. [Diagram: OS, Clusterware and OPROCD shown once; CSSD shown twice; interconnect; voting disk]
  • 39. [Diagram: two nodes, each with OS, Clusterware, CSSD and OPROCD; interconnect; voting disk]
  • 40-41. [Diagram: two nodes, each with OS, Clusterware, CSSD and OPROCD; interconnect]
  • 42. [Diagram: two CSSD labels; interconnect]
  • 43. 11gR2 Grid Infrastructure: CSSD attempts graceful shutdown
  • 44. [Diagram: two nodes, each with OS, Clusterware, Instance, CSSD and OPROCD; LMON shown once; interconnect]
  • 45-48. [Diagram: two nodes, each with OS, Clusterware, Instance, CSSD and OPROCD; LMON shown once; label "member kill"; interconnect]
  • 49. [Diagram: two nodes, each with OS, Clusterware, Instance, CSSD and OPROCD; LMON shown once; label "member kill"; interconnect. Caption: Eviction by escalation of a member kill]
  • 50. [Diagram: OS, Clusterware, Instance, LMON, CSSD and OPROCD shown once; label "member kill"; interconnect. Caption: Eviction by escalation of a member kill]
  • 51. [Diagram: two nodes, each with OS, Clusterware, CSSD and OPROCD; interconnect; voting disk]
  • 52-54. [Diagram: two nodes, each with OS, Clusterware, CSSD and OPROCD; interconnect; voting disk. Label: 11gR2 Intelligent Platform Management Interface]
  • 55. [Diagram: two nodes, each with OS, Clusterware, CSSD and OPROCD; interconnect; voting disk. Label: Exadata Fencing]
  • 56. Grid Infrastructure Startup
      • 10g / 11gR1
          • Linux & UNIX inittab: init.cssd, init.evmd, init.crsd
          • Linux & UNIX init.d: init.crs
          • Windows Services
      • 11gR2
          • Linux & UNIX inittab: init.ohasd run
          • Linux & UNIX init.d: ohasd start
          • Windows Services
  • 57. Startup in Linux & Unix
      [root@cheese2 ~]# ps -fe | grep init. | grep -v grep
      root 4283 1 0 02:52 ? 00:00:00 /bin/sh /etc/init.d/init.ohasd run
      [root@cheese2 ~]# tail -1 /etc/inittab
      h1:35:respawn:/etc/init.d/init.ohasd run >/dev/null 2>&1 </dev/null
      [root@cheese2 ~]# ls -l /etc/rc3.d/*ohasd*
      lrwxrwxrwx 1 root root 17 Sep 15 02:02 /etc/rc3.d/K15ohasd -> /etc/init.d/ohasd
      lrwxrwxrwx 1 root root 17 Sep 15 02:02 /etc/rc3.d/S96ohasd -> /etc/init.d/ohasd
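The inittab entry on this slide follows the standard sysvinit layout of four colon-separated fields (id, runlevels, action, process). As a minimal sketch, the Python snippet below splits that entry into its fields. The names `InittabEntry` and `parse_inittab_entry` are illustrative only and are not part of any Oracle or system tool.

```python
from typing import NamedTuple


class InittabEntry(NamedTuple):
    """One /etc/inittab line: id:runlevels:action:process."""
    id: str
    runlevels: str
    action: str
    process: str


def parse_inittab_entry(line: str) -> InittabEntry:
    # Split on the first three colons only: the process field may itself
    # contain colons, so it keeps everything after the third one.
    fields = line.rstrip("\n").split(":", 3)
    if len(fields) != 4:
        raise ValueError(f"not an inittab entry: {line!r}")
    return InittabEntry(*fields)


# The entry shown on the slide.
entry = parse_inittab_entry(
    "h1:35:respawn:/etc/init.d/init.ohasd run >/dev/null 2>&1 </dev/null"
)
print(entry.id, entry.runlevels, entry.action)
print(entry.process)
```

Read this way, the entry tells init to run `init.ohasd run` in runlevels 3 and 5 and, because of `respawn`, to restart it whenever it exits.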
  • 58. Pre 11gR2 Clusterware Startup [Timeline diagram, axis t]
      • /etc/oracle/scls_scr/{host}/root/cssrun
      • /etc/oracle/scls_scr/{host}/root/crsstart: enable, disable
      • init.crs start
      • init.cssd autostart
      • init.cssd oprocd: oprocd
      • init.cssd oclsomon: oclsomon.bin
      • init.cssd oclsvmon: oclsvmon.bin
      • init.cssd daemon: ocssd.bin
      • init.cssd fatal
      • init.evmd run: evmd.bin
      • init.crsd run: crsd.bin
  • 59. Grid Infrastructure Startup [Timeline diagram, axis t, built up over slides 59-76]
  • 60. Grid Infrastructure Startup: init.ohasd run
  • 61-62. Grid Infrastructure Startup: adds /etc/oracle/scls_scr/{host}/root/ohasdrun
  • 63. Grid Infrastructure Startup: adds /etc/init.d/ohasd
  • 64-65. Grid Infrastructure Startup: adds /etc/oracle/scls_scr/{host}/root/ohasdstr (enable, disable)
  • 66. Grid Infrastructure Startup: back to init.ohasd run and /etc/oracle/scls_scr/{host}/root/ohasdrun only
  • 67. Grid Infrastructure Startup: adds OHAS
  • 68. Grid Infrastructure Startup: adds Ora Agent
  • 69. Grid Infrastructure Startup: adds Root Agent
  • 70. Grid Infrastructure Startup: adds CSSD Agent
  • 71. Grid Infrastructure Startup: adds CSSD Monitor
  • 72. Grid Infrastructure Startup: adds CSS
  • 73. Grid Infrastructure Startup: adds ACFS Drivers, CTSS, Disk Monitor, CRS
  • 74. Grid Infrastructure Startup: adds ASM, EVM, GPnP, GIPC, MDNS
  • 75. Grid Infrastructure Startup: adds CRS Root Agent (VIP, SCAN IP, Network, GNS, ACFS Registry)
  • 76. Grid Infrastructure Startup: adds CRS Ora Agent (Database, Instance, Listener, Services, Diskgroups, ONS, eONS, SCAN Listener)
  • 77. Grid Infrastructure Startup (static slide) [Timeline diagram, axis t]
      • init.ohasd run
      • OHAS
      • Ora Agent, Root Agent, CSSD Agent, CSSD Monitor
      • ASM, EVM, GPnP, GIPC, MDNS
      • ACFS Drivers, CTSS, Disk Monitor, CRS
      • CSS
      • CRS Root Agent: VIP, SCAN IP, Network, GNS, ACFS Registry
      • CRS Ora Agent: Database, Instance, Listener, Services, Diskgroups, ONS, eONS, SCAN Listener
  • 78. Grid Infrastructure Log Files: $GRID_HOME/log/{hostname}/
      • alert<host>.log
      • ohasd
      • crsd
      • cssd
      • agent/ohasd/oraagent_oracle
      • agent/ohasd/oracssdagent_root
      • agent/ohasd/oracssdmonitor_root
      • agent/ohasd/orarootagent_root
      • agent/crsd/oraagent_oracle
      • agent/crsd/orarootagent_root
      • ctssd
      • diskmon
      • gipcd
      • gnsd
      • gpnpd
      • mdnsd
      • racg
  • 79. Oracle Cluster Registry
      • Repository for all shared configuration data
          • Except the OCR location itself
      • OCR is accessed mostly read-only
          • Every component reads OCR
          • OCR is written only by CRS
              • only from a single OCR master node
      • 11gR2 - Oracle Local Registry (OLR)
          • managed by ohasd
  • 80. DEMO Interconnect Failure
      • Simulate with “ifconfig eth1 down” on node 2
      • Both nodes notice the loss
      • Nodes race to evict each other
          • from voting disk => 2 equal sub-clusters (cohorts)
          • the cohort with the lowest leader # survives
          • the leader is the node with the lowest # in its sub-cluster
      • Winner evicts the other node
          • by setting a kill-block in the voting disk
      • 11gR2 new feature: CSSD does a clean restart
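The tie-break rule on this slide can be sketched in a few lines of Python. This is an illustration of the rule as stated, not Oracle's implementation: the function names are hypothetical, nodes are represented by their node numbers, and the preference for a larger cohort over a smaller one is an assumption that the slide does not state (it only covers two equal cohorts).

```python
from typing import FrozenSet, Iterable


def cohort_leader(cohort: FrozenSet[int]) -> int:
    # Per the slide: the leader is the node with the lowest number
    # in its sub-cluster.
    return min(cohort)


def surviving_cohort(cohorts: Iterable[FrozenSet[int]]) -> FrozenSet[int]:
    # Assumption (not on the slide): a larger cohort beats a smaller one.
    # Per the slide: among equal cohorts, the one with the lowest
    # leader number survives.
    return min(cohorts, key=lambda c: (-len(c), cohort_leader(c)))


# Two-node cluster split by an interconnect failure: node 1 vs node 2.
split = [frozenset({2}), frozenset({1})]
winner = surviving_cohort(split)
evicted = sorted(n for c in split if c != winner for n in c)
print("survives:", sorted(winner), "evicted:", evicted)
```

For the two-node demo this selects node 1 as the survivor, so node 2 is the one that gets evicted.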
  • 81. DEMO Voting Disk Failure
      • Simulate by taking the storage interface down, or on the NFS server
      • CSSD detects that voting disk I/O is stale
          • disktimeout setting - 200 seconds by default
      • CSSD starts eviction
      • 11gR2 new feature: CSSD does a clean restart
  • 82. DEMO CSSD is not healthy
      • Simulate using kill -STOP <cssd.bin pid> (and try kill -9)
      • Another node observes NHB (network heartbeat) loss
      • After misscount seconds => attempt eviction
          • but CSSD is frozen and can’t commit suicide
      • CSSD Monitor detects the CSSD timeout
          • Commits suicide
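The two timeouts from these demos, misscount for network heartbeats and disktimeout for voting disk I/O, amount to staleness checks against a clock. The sketch below only illustrates that idea. The 200-second disktimeout default comes from the slides; the 30-second misscount default is an assumption, and `NodeView` and `eviction_reasons` are hypothetical names. The real CSSD logic is considerably more involved.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NodeView:
    """What one node currently knows about a peer (times in seconds)."""
    last_network_heartbeat: float
    last_voting_disk_io: float


def eviction_reasons(view: NodeView, now: float,
                     misscount: float = 30.0,     # assumed default, not from the slides
                     disktimeout: float = 200.0   # default quoted on the slides
                     ) -> List[str]:
    reasons = []
    # Network heartbeat (NHB) not seen for at least misscount seconds.
    if now - view.last_network_heartbeat >= misscount:
        reasons.append("network heartbeat missing for misscount seconds")
    # Voting disk I/O not seen for at least disktimeout seconds.
    if now - view.last_voting_disk_io >= disktimeout:
        reasons.append("voting disk I/O stale for disktimeout seconds")
    return reasons


# A peer whose CSSD was frozen 45 seconds ago, e.g. with kill -STOP.
frozen = NodeView(last_network_heartbeat=100.0, last_voting_disk_io=100.0)
print(eviction_reasons(frozen, now=145.0))
```

With these numbers only the heartbeat check fires, which matches the order of events on the slide: the network heartbeat loss is noticed long before the disk timeout would be reached.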
  • 83. Homework Host sick - CPU stalled
      • Used to simulate by pausing OPROCD
          • kill -STOP <oprocd pid>
          • sleep 1 or 2
          • kill -CONT <oprocd pid>
      • OPROCD is now a thread in CSSD Monitor
          • kill -STOP {cssdmonitor.bin} ; sleep 1 ; kill -CONT {cssdmonitor.bin}
          • Doesn’t produce any visible results! Is there still OPROCD?
  • 84. DEMO Startup troubleshooting
      • Break before starting up
          • Interconnect, voting disk, Grid Home missing
      • Check processes using “ps -fe | grep init”
      • Check syslog (/var/log/messages)
          • boot sequence
      • Clusterware log files
      • if *.bin processes are running already
          • crsctl check crs
          • crsctl status resource -t -init
  • 85. 11gR2 Cluster Resources
      • Introduced resource types
          • APPLICATION was the only supported resource type in 11gR1
      • Resource types using resource agents
      • Cluster and local resources
      • Sophisticated resource dependencies (stop & start)
          • hard
          • weak
          • attraction
          • pullup
          • dispersion
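One way to picture hard start dependencies is as a graph in which a resource may start only after everything it depends on. The Python sketch below derives a valid start order with a topological sort. The resource names and the dependency map are hypothetical, chosen to resemble the resources named in these slides, and only the hard dependency kind is modeled; weak, attraction, pullup and dispersion are not.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical hard start dependencies:
# resource -> resources that must be started first.
hard_start_deps = {
    "network": set(),
    "vip": {"network"},
    "listener": {"vip"},
    "asm": set(),
    "diskgroup": {"asm"},
    "database": {"diskgroup"},
    "service": {"database"},
}


def start_order(deps):
    # static_order() yields each resource only after all of its
    # prerequisites, and raises CycleError on circular dependencies.
    return list(TopologicalSorter(deps).static_order())


order = start_order(hard_start_deps)
print(order)
```

Several orders are valid for this map, so the exact output may vary; what is guaranteed is that each resource appears after the resources it depends on.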
  • 86. Grid Infrastructure Processes [Timeline diagram, axis t; legend: CRS Managed Resources, OHAS Internal Resources]
      • ora.cluster_interconnect.haip
      • init.ohasd run
      • OHAS
      • Ora Agent, Root Agent, CSSD Agent, CSSD Monitor
      • ASM, EVM, GPnP, GIPC, MDNS
      • ACFS Drivers, CTSS, Disk Monitor, CRS
      • CSS
      • CRS Root Agent: VIP, SCAN IP, Network, GNS, ACFS Registry
      • CRS Ora Agent: Database, Instance, Listener, Services, Diskgroups, ONS, eONS, SCAN Listener
  • 87. Troubleshooting Something Down
      • OHASD up?
      • OHASD Agents up?
      • Internal resources up?
      • CRSD Agents up?
      • Managed resources up?
          • Listener & VIP
          • Database & ASM instance
          • Services
      • Have the nodes rebooted?
      • Have resources re-started?
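The checklist on this slide walks the stack from the bottom up, so the first layer that is down is the place to start looking. A minimal Python sketch of that walk follows. The layer names mirror the slide, `first_layer_down` is a hypothetical helper, and the status map is filled in by hand here; collecting real status (for example from crsctl output) is outside the sketch.

```python
from typing import Dict, Optional

# Layers in the order the slide asks about them.
LAYERS = [
    "OHASD",
    "OHASD agents",
    "internal resources",
    "CRSD agents",
    "managed resources",
]


def first_layer_down(status: Dict[str, bool]) -> Optional[str]:
    # A layer missing from the status map is treated as down.
    for layer in LAYERS:
        if not status.get(layer, False):
            return layer
    return None  # everything is up


observed = {
    "OHASD": True,
    "OHASD agents": True,
    "internal resources": True,
    "CRSD agents": False,
    "managed resources": False,
}
print(first_layer_down(observed))
```

In this example the managed resources are down as well, but the walk stops at the CRSD agents, since anything above a failed layer cannot be expected to be up.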
  • 88. 11gR2 Grid Infrastructure References
      • Oracle Clusterware Administration and Deployment Guide
      • MOS 1053147.1: 11gR2 Clusterware and Grid Home - What You Need to Know
      • MOS 1050908.1: How to Troubleshoot Grid Infrastructure Startup Issues
      • MOS 1053970.1: Troubleshooting 11.2 Grid Infrastructure Installation Root.sh Issues
      • MOS 1050693.1: Troubleshooting 11.2 Clusterware Node Evictions (Reboots)
      • MOS 942166.1: How to Proceed from Failed 11gR2 Grid Infrastructure Installation