
Ireland OUG Meetup May 2017


Slides from the Ireland OUG Meetup May 2017

Published in: Data & Analytics


  1. DataGuard with RAC and Multi-tenant experiences & Oracle Analytics Cloud Service
     Thursday 12th January, 2017
  2. Thursday 14th September
     SQL & PL/SQL Development
  3. Multi-Tenant Dataguard: A Bug’s Tale…
  4. Who I am
     Simon Holt
     • DBA for 25+ years, started with version 6.0.36
     • OUG Tech SIG chair and Meet-ups co-organiser
     • DBA architect at ESB Networks
  5. Oracle SuperCluster at ESB Networks
     Half Rack (with extra Storage Cells)
     • Two T5-8 Compute Nodes with 4 x 16 Core CPUs
     • 1 TB Memory per node
     • 6 Exadata High Capacity Storage Cells, 288 TB raw
     • High Speed InfiniBand Network
     • 3 x 36-port InfiniBand switches
     • 1 x 48-port 1-Gigabit Ethernet Switch
     • ZS3 Storage 80 TB raw
     • 4 x LDOM (2 for DB, 2 for App)
  6. ESB Networks Architecture: HA SPARC SuperClusters with GeoCluster
     [Architecture diagram. Two sites in a Geo-Cluster, linked by Active Dataguard and ZFS Replication. Each site shows a SPARC SuperCluster, a Platinum Support Gateway, an Oracle Secure Backup Media Server, an Oracle Secure Backup Admin Server, a tape library, OEM, and InfiniBand and Ethernet networks.]
     Each SPARC SuperCluster:
     • Two T5-8 Compute Nodes
     • 4 x 16 Core CPUs per Node
     • 1 TB Memory per node
     • 6 Exadata High Capacity Storage Cells, 288 TB raw
     • High Speed InfiniBand Network
  7. Temporary Tablespace Tantrum
     Oracle Data Guard Concepts and Administration, 10.3.1 Adding a Data File or Creating a Tablespace:
     The STANDBY_FILE_MANAGEMENT database initialization parameter controls whether the addition of a data file to the primary database is automatically propagated to a physical standby database.
     If the STANDBY_FILE_MANAGEMENT database parameter on the physical standby database is set to AUTO, any new data files created on the primary database are automatically created on the physical standby database.
     But... this does not apply to TEMPFILE creation, so newly created tempfiles will not exist on the standby.
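     The gap described above can be checked with two queries. This is a minimal sketch, assuming a physical standby that is open read-only (Active Data Guard), so that the dictionary views are available; it only reports state and changes nothing.

     ```sql
     -- Run on the physical standby (assumed to be open read-only).

     -- 1. How are data file additions on the primary handled here?
     SELECT name, value
     FROM   v$parameter
     WHERE  name = 'standby_file_management';

     -- 2. Which temporary tablespaces exist in the dictionary
     --    but have no tempfile on this database?
     SELECT ts.tablespace_name
     FROM   dba_tablespaces ts
     WHERE  ts.contents = 'TEMPORARY'
     AND    NOT EXISTS (SELECT 1
                        FROM   dba_temp_files tf
                        WHERE  tf.tablespace_name = ts.tablespace_name);
     ```

     In a multitenant database the second query would need to be run in each PDB, since DBA_ views show only the current container.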
  8. Adding a Temporary Tablespace to Primary
     On CREATE at the primary, the tablespace meta-data is transported in redo; the file creation instruction is not.
     At the standby, a tablespace operation like CREATE or DROP results in:
     ORA-16000: database or pluggable database open for read-only access.
     • But the meta-data for the tablespace is already there, so no CREATE is necessary!
     • Would need to add a tempfile to the standby tablespace to match it up with the old Primary.

     Primary
       CREATE TEMPORARY TABLESPACE temp_test TEMPFILE '+TEST_DATA' SIZE 100M
         EXTENT MANAGEMENT LOCAL UNIFORM SIZE 1M;

       SELECT tablespace_name FROM dba_tablespaces WHERE tablespace_name = 'TEMP_TEST';

       TABLESPACE_NAME
       -----------------------------
       TEMP_TEST

       SELECT file_name AS name, bytes FROM dba_temp_files WHERE tablespace_name = 'TEMP_TEST';

       NAME                                    BYTES
       -------------------------------- ---------------
       +TEST_DATA/…/temp_test.269.943312435    1048576

     Standby
       SELECT tablespace_name FROM dba_tablespaces WHERE tablespace_name = 'TEMP_TEST';

       TABLESPACE_NAME
       -----------------------------
       TEMP_TEST

       SELECT file_name AS name, bytes FROM dba_temp_files WHERE tablespace_name = 'TEMP_TEST';

       no rows selected

     Requires an ALTER TABLESPACE .. ADD TEMPFILE command.
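     The ALTER TABLESPACE .. ADD TEMPFILE step might look like the following. It is a sketch only: it reuses the TEMP_TEST tablespace and the '+TEST_DATA' disk group from the slide, and the 100M size is copied from the primary's CREATE statement rather than taken from any sizing rule.

     ```sql
     -- Run on the standby. The tablespace already exists in the dictionary
     -- (it arrived via redo), so only the tempfile has to be added.
     ALTER TABLESPACE temp_test ADD TEMPFILE '+TEST_DATA' SIZE 100M;

     -- Confirm the tempfile is now present.
     SELECT file_name, bytes
     FROM   dba_temp_files
     WHERE  tablespace_name = 'TEMP_TEST';
     ```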
  9. The Set-Up
     Default temporary tablespaces in PDBs were under-sized, unaligned with ASM extents, smallfile, and not locally managed.
     To rectify this, we need to “shuffle” the default temporary tablespace in each PDB. A default temporary tablespace can’t be dropped, so follow the usual procedure:
       1. Create a temporary temporary tablespace X
       2. Change the database default temporary tablespace from TEMP to X
       3. Drop TEMP tablespace
       4. Recreate TEMP tablespace as desired
       5. Alter database default temporary tablespace to TEMP
       6. Drop temporary temporary tablespace X
     So in our case, run the following in each PDB:
       CREATE TEMPORARY TABLESPACE temp_shuffle TEMPFILE SIZE 10G;
       ALTER DATABASE &1 DEFAULT TEMPORARY TABLESPACE temp_shuffle;
       DROP TABLESPACE temp;
       CREATE BIGFILE TEMPORARY TABLESPACE temp TEMPFILE SIZE 10G
         AUTOEXTEND ON NEXT 1G MAXSIZE 128G
         EXTENT MANAGEMENT LOCAL UNIFORM SIZE 4M;
       ALTER DATABASE &1 DEFAULT TEMPORARY TABLESPACE temp;
       DROP TABLESPACE temp_shuffle;
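     Before and after a shuffle like this, it can help to confirm which tablespace is the default and how each temporary tablespace is defined. The queries below are a hedged sketch using standard dictionary views; they are not part of the original script.

     ```sql
     -- Run inside each PDB.

     -- Which temporary tablespace is currently the default?
     SELECT property_value AS default_temp_tablespace
     FROM   database_properties
     WHERE  property_name = 'DEFAULT_TEMP_TABLESPACE';

     -- Storage characteristics of every temporary tablespace:
     -- bigfile or smallfile, extent management, and allocation type.
     SELECT tablespace_name, bigfile, extent_management, allocation_type
     FROM   dba_tablespaces
     WHERE  contents = 'TEMPORARY';
     ```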
  10. The Discovery
      Standby dictionary consistent with Primary: all TEMP tablespaces have the same storage characteristics.
      Would fully expect to add suitable tempfiles to the re-created TEMP tablespaces at the standby when it becomes the new primary after a switchover.
      However, when we do the switchover… DISASTER…!
      Old Standby / New Primary Database
      • Fails to mount CDB
      • All PDBs returning an ORA-600 [kcffo_pdb_tempfiles: pdbid] error
      • Both RAC instances crash, terminated by SMON.
      Old Primary / New Standby
      • Completed role transition without error, now in “PHYSICAL STANDBY” role.
  11. The Data Gathering
      Sample message from the alert log:
        ORACLE Instance LIVENMS1 (pid = 37) - Error 600 encountered while recovering transaction (24, 10).
        Errors in file /u01/app/oracle/diag/rdbms/dca_livenms/LIVENMS1/trace/LIVENMS1_smon_12686.trc:
        ORA-00600: internal error code, arguments: [kcffo_pdb_tempfiles: pdbid], [204], [5], [], [], [], [], [], [], [], [], []
        2017-01-31 16:43:40.108000 +00:00
        Dumping diagnostic data in directory=[cdmp_20170131164340], requested by (instance=1, osid=4294979982 (SMON)), summary=[abnormal process termination].
      • The above message is repeated for every PDB.
      • SMON retries a few times, then the instance crashes.
      • Both instances do this independently.
      • Noted that it appears to be crashing during instance recovery by SMON.
      SMON trace files contain a series of ORA-600s with the following pattern:
        ORA-00600: internal error code, arguments: [kcffo_pdb_tempfiles: pdbid], [204], [5], [], [], [], [], [], [], [], [], []
        ORA-00600: internal error code, arguments: [kcffo_pdb_tempfiles: pdbid], [204], [5], [], [], [], [], [], [], [], [], []
        ORA-00600: internal error code, arguments: [krt_drop_oftf_01], [204], [2], [2], [12], [12], [3], [5], [], [], [], []
  12. The Investigation
      Support asked us to re-create the tempfiles for the PDB temporary tablespaces.
      • Not possible: the CDB attempts to MOUNT but then the instance crashes, so no commands are possible.
      • PDBs had their state saved at open, so they attempt to open automatically on database start and fail as described.
      • If the PDBs had not been set to open automatically, the database would have reached MOUNT state. (But it would still have hit this bug!)
      Support suggested it could be Bug 20686344: ORA-00600 [KRT_DROP_OFTF_01] ON CONVERTING TO SNAPSHOT STANDBY IN CONTAINER DB … but were not wholly convinced…
      Support suggested setting hidden parameter _smu_debug_mode = 1024
      • Would stop SMON transaction recovery, and hopefully let the database open
      • Further investigation / transaction clearing could then be done
      • Didn’t allow open, but did allow the CDB to get to MOUNT state and stay there!
      • Database role found to be “PRIMARY”.
      New bug opened to get Development involved with the investigation
      • Oracle Development advised applying the patch for bug 24601586
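      The saved PDB state and the database role mentioned above can be inspected as sketched below. This assumes a 12.1.0.2 or later CDB. The first two statements need an open primary, so they only help if run before a switchover, not on a crashing instance. As the slide notes, discarding saved state would not have avoided the bug; it only changes how far startup gets.

      ```sql
      -- In CDB$ROOT on an open primary: which PDBs have a saved open state?
      SELECT con_name, instance_name, state
      FROM   dba_pdb_saved_states;

      -- Optional: stop PDBs from opening automatically at the next startup.
      -- On RAC an INSTANCES clause may also be wanted; omitted here.
      ALTER PLUGGABLE DATABASE ALL DISCARD STATE;

      -- Works in MOUNT state: confirm the role after a role transition.
      SELECT database_role, open_mode
      FROM   v$database;
      ```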
  13. The Investigation (II)
      Asked to check the contents of X$KCCPDB (controlfile PDB records) and X$KCCTF (controlfile tempfile records):
        select CON_ID, PDBTFP from X$KCCPDB;
        select CON_ID, TFNUM from X$KCCTF;
      PDBTFP and TFNUM should be the same for each CON_ID. In our case, they didn’t match up at all.
      In the controlfile PDB records (X$KCCPDB):
        CON_ID  PDBTFP
             3       5   (pdb#3 points at tempfile#5)
             4       3   (pdb#4 points at tempfile#3)
             5       6   (pdb#5 points at tempfile#6)
      In the controlfile tempfile records (X$KCCTF):
        TFNUM  CON_ID
            3       3   (tempfile#3 points at pdb#3)
            4       4   (tempfile#4 points at pdb#4)
            5       5   (tempfile#5 points at pdb#5)
            6       5   (tempfile#6 points at pdb#5)
      Support action plan once the cause is positively identified:
      • Recreate the controlfile to fix the pointers
      • Use ALTER TABLESPACE ... ADD TEMPFILE to re-create the tempfile at the new primary.
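      The two lookups can be combined into one query that lists only the suspicious pointers. This is a sketch under an assumption the slide does not spell out: that PDBTFP holds a tempfile number whose X$KCCTF record should carry the same CON_ID as the PDB. X$ tables are undocumented, so the column meanings are taken from the slide alone.

      ```sql
      -- Connect AS SYSDBA in CDB$ROOT with the controlfile mounted.
      -- Lists PDB records whose tempfile pointer leads to a tempfile record
      -- owned by a different container, or to no record at all.
      SELECT p.con_id AS pdb_con_id,
             p.pdbtfp AS pdb_points_at_tempfile,
             t.con_id AS tempfile_points_at_pdb
      FROM   x$kccpdb p
      LEFT JOIN x$kcctf t
             ON t.tfnum = p.pdbtfp
      WHERE  t.con_id IS NULL
      OR     t.con_id <> p.con_id;
      ```

      Under this reading, the figures on the slide would flag pdb#3 (tempfile#5 belongs to pdb#5) and pdb#4 (tempfile#3 belongs to pdb#3).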
  14. The Bug and Solution
      Development identified the situation as an exact match for:
        Bug 24601586: ORA-600 [kcffo_pdb_tempfiles: pdbid] or ORA-600 [krt_drop_oftf_01] on Multitenant Database after Flashback Database
      Listed for the Flashback Database event, but it also applies where temp files are dropped / recreated on the Primary in PDBs.
      Advised to apply the patch for bug 24601586 at both Standby and Primary:
      • Would prevent the error happening again, but not cure the actual situation.
      • Control file inconsistencies introduced by the bug cause the ORA-600.
      Remedy for the current situation confirmed as:
      • Recreate the controlfile to fix the pointers
      • Use ALTER TABLESPACE ... ADD TEMPFILE to re-create the tempfile at the new primary.
  15. Lessons Learned
      Apply patch 24601586, or a bundle containing it, before any PDB temporary tablespace maintenance at the primary.
      Note 24601586.8 contains a brief bug description, but does not reference DROP / CREATE, only Flashback.
      Check ORA-600 errors: if you get [krt_drop_oftf_01] and the last two arguments do not match, you are hitting this bug, because in the past there has been either a flashback database, or temp tablespaces dropped and recreated.
        [krt_drop_oftf_01], [204], [2], [2], [12], [12], [3], [5]
      If the situation already exists:
      • Apply patch 24601586
      • Re-create the controlfile at the new primary:
        • Backup to trace to generate the CREATE CONTROLFILE statement
        • Add NORESETLOGS
        • Comment out the standby redo log section (otherwise returns ORA-01967: invalid option for CREATE CONTROLFILE)
        • Execute the CREATE CONTROLFILE script
      • On successful open:
        • For each PDB, use ALTER TABLESPACE <x> ADD TEMPFILE to create files
        • Re-create standby redo logs
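      The recovery steps above could be sketched as follows. Everything specific here is an assumption for illustration: the trace file path, the PDB name pdb1, the tempfile sizing (copied from the earlier TEMP definition), and the standby log thread and size. Omitting file names assumes Oracle Managed Files, as in the original script.

      ```sql
      -- 1. Generate the CREATE CONTROLFILE script (works in MOUNT state).
      --    The path is hypothetical.
      ALTER DATABASE BACKUP CONTROLFILE TO TRACE
        AS '/tmp/create_controlfile.sql' NORESETLOGS;

      -- 2. Edit the script by hand: comment out the standby redo log section
      --    (it raises ORA-01967 otherwise), then run it from NOMOUNT.

      -- 3. After a successful open, add a tempfile in each PDB.
      ALTER SESSION SET CONTAINER = pdb1;        -- hypothetical PDB name
      ALTER TABLESPACE temp ADD TEMPFILE SIZE 10G
        AUTOEXTEND ON NEXT 1G MAXSIZE 128G;

      -- 4. Back in the root, re-create the standby redo logs.
      --    Thread and size are placeholders; match the online redo logs.
      ALTER SESSION SET CONTAINER = CDB$ROOT;
      ALTER DATABASE ADD STANDBY LOGFILE THREAD 1 SIZE 1G;
      ```

      Step 3 is repeated for every PDB, and step 4 for every thread and log group required.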
