Direct SGA access without SQL

  • Notice that columns A and C are missing from x$foo. Not all values in a structure used in the SGA are made visible via SQL.
  • If we mapped the structure onto memory or dumped it to a file, we could find the different elements.
  • Oracle doesn’t always expose all the fields in a structure. If the gaps between offsets are bigger than the field sizes, the underlying structure holds other information that isn’t exposed in that X$ table. (In this case those addresses are exposed, but in different X$ tables.)

    1. 1. Oaktable Jonathan Lewis and ORACLE_TRACE Oracle_Trace crashes my Database I start the SGA attach by searching every offset Anjo Kolk says James Morle wrote a program using x$ksmmem I show James my first draft using x$ksmmem James is baffled by why Im hard coding offsets James says the offsets are in some X$ table I search, turn up a mail by Jonathan Lewis on x$kqfco Goldmine – all the offsets Thanks Mogens Nogard! Thanks to TomKytes Decimal to Hex
    2. 2. http://oraperf.sourceforge.net
    3. 3. Direct Oracle SGA Memory Access: reading data directly from Oracle’s shared memory segment using C code. Wednesday, February 20, 2013
    4. 4. SGA on UNIX [diagram: background processes (SMON, PMON, CKPT, DBWR, LGWR, ARCH, Snnn, Dnnn, Pnnn) and the oracle and sqlplus processes, all attached to the SGA (Shared Pool, Database Buffer Cache, Redo Log Buffer) in machine memory]
    5. 5. SGA on NT [diagram: the same components (Snnn, Dnnn, Pnnn, CKPT, SMON, PMON, DBWR, LGWR, ARCH; Shared Pool, Database Buffer Cache, Redo Log Buffer) inside the oracle process space in machine memory, with sqlplus outside]
    6. 6. What is the SGA Memory Cache Often Used Data Rapid Access Shareable Concurrently Access
    7. 7. SGA 4 main regions Fixed information – Users info – Database statistics – X$dual – etc Data block cache SQL cache ( library cache/shared pool) Redo log buffer
    8. 8. How is the SGA info Used? Automatically – data blocks cached – Log buffer – Sql cache – Updates of system and user statistics User Queries – User info v$session – System info v$parameter – Performance statistics v$sysstat, v$latch, v$system_event – Buffer cache headers, x$bh
    9. 9. Why Direct Access with C? Reading hidden information: sort info on version 7; OPS locking info on version 8; contents of data blocks (only the headers are visible in X$). Access while the database is hung. High-speed access: sampling user waits to catch ephemeral data; scanning the LRU chain in X$bh; statistically approximating statistics, e.g. SQL statistics per user. Low overhead.
    10. 10. Database Slow or Hung. Often happens at the largest sites, where cutting-edge support is expected. Shared pool errors (ORA-4031). Archiver or log file switch hangs. Hang bugs. Library cache latch contention. ORA-00379: no free buffers available in buffer pool DEFAULT
    11. 11. Statistical Sampling By Rapidly Sampling SQL statistics and the users who have the statistics open, one can see how much work a particular user does with a particular SQL statement
    12. 12. Low Overhead Marketing Appeal Clients are sensitive about their production databases Heisenberg uncertainty affect – less overhead less affect monitoring has on performance which we are monitoring
    13. 13. SGA made visible through x$tables Most of the SGA is not visible X$KSMMEM Exception, Raw Dump of SGA Information Externalized through X$ tables Useful or Necessary information is Externalized Externalized publicly through V$ Tables
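    14. 14. Machine Memory [diagram: the SGA within machine memory, starting at 0x80000000]
    15. 15. Buffer Cache Graphic [diagram: SGA at 0x80000000 divided into Fixed Area, Buffer Cache, Shared Pool, Log Buffer]
    16. 16. Fixed Area [diagram: SGA starts at 0x80000000; X$KSUSECST (user waits) starts at 0x85251EF4]
    17. 17. X$KSUSECST [diagram: 170 records of 2328 bytes each, starting at 0x85251EF4: Row 1, Row 2, Row 3, …]
    18. 18. X$KSUSECST Record [diagram: one record in X$KSUSECST, 2328 bytes long, with offset 1276 marked]
    19. 19. X$KSUSECST Fields [diagram: offset 1276 Seq #, 1278 Event #, 1280 p1, 1284 p2, 1288 p3]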
    20. 20. Externalization of C structs: X$ tables
        If structure foo was externalized in an X$ table:
        SQL> describe x$foo
        Column Name                    Type
        ------------------------------ --------
        ADDR                           RAW(8)
        INDX                           NUMBER
        ID                             NUMBER
        B                              NUMBER
    21. 21. SGA is One Large C Struct
        struct foo {
            int id;
            int A;
            int B;
            int C;
        };
        struct foo foo[N];
    22. 22. Struct C code
        #include <stdio.h>
        #include <string.h>   /* memset */
        #include <fcntl.h>    /* open */
        #include <unistd.h>   /* write, close */
        #define N 20
        /* structure definition: */
        struct foo {
            int id;
            int a;
            int b;
            int c;
        };
        /* end structure definition */
    23. 23. Struct Record
        int main(void)
        {
            struct foo foo[N];
            int fptr;
            /* zero out memory of struct */
            memset(foo, 0, sizeof(foo));
            foo[0].id = 1;   /* row 0 */
            foo[0].a = 12;
            foo[0].b = 13;
            foo[0].c = 13;
    24. 24. Struct Write to File
            foo[1].id = 2;   /* row 1 */
            foo[1].a = 22;
            foo[1].b = 23;
            foo[1].c = 24;
            /* write to file, simulate SGA */
            if ((fptr = open("foo.out", O_WRONLY | O_CREAT, 0777)) < 0)
                return -1;
            write(fptr, foo, sizeof(foo));
            close(fptr);
            return 0;
        }
    25. 25. Simulate SGA with a File: write(fptr, foo, sizeof(foo));
    26. 26. Simulate SGA with a File [diagram: field offsets within the file, memory address increasing from left to right]
                 Row 0             Row 1
        field    ID    A    B    C   ID    A   …
        bits      0   32   64   96  128  160
        bytes     0    4    8   12   16   20
        hex       0    4    8    C   10   14
        oct       0    4   10   14   20   24
    27. 27. Struct File Contents
        $ ./foo
        $ ls -l foo.out
        -rw-r--r-- joe dba 320 Feb 10 19:41 foo.out
        int = 32 bits
        int = 4 bytes
        20 entries * 4 int * 4 bytes/int = 320 bytes
    28. 28. od – octal dump
        $ od -l foo.out
        0000000  1 12 13 13
        0000020  2 22 23 24
        0000040  0  0  0  0
        *
        0000500
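    29. 29. Struct File Contents. Column 1 is the address (the offset into the file), which od prints in octal by default: 0000020 is byte 16 and 0000500 is byte 320. Column 2 is the ID. Column 3 is field A. Column 4 is field B. Column 5 is field C.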
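The same lookup that od makes visible can be done in C with nothing but a row size and a field offset. This is a minimal sketch, assuming the struct foo layout from the slides (four 4-byte ints, no padding); `read_field` is a hypothetical helper name and is not part of the original program.

```c
#include <stddef.h>
#include <string.h>

/* Same layout as the struct foo written to foo.out in the slides. */
struct foo {
    int id;
    int a;
    int b;
    int c;
};

/*
 * Hypothetical helper: fetch one int from a raw byte image (a file
 * buffer here, the SGA later) given the row size, the row number and
 * the byte offset of the field inside the row.
 */
int read_field(const unsigned char *image, size_t row_size,
               size_t row, size_t offset)
{
    int value;

    /* memcpy avoids any alignment assumption about the image */
    memcpy(&value, image + row * row_size + offset, sizeof value);
    return value;
}
```

With the data from the slides, `read_field(image, sizeof(struct foo), 1, offsetof(struct foo, b))` picks up field B of row 1, which is the value 23 seen in the od output.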
    30. 30. X$ tables ? Ok, x$foo =~ foo[20] How do I get a list of x$ tables? Where is each X$ located? V$Fixed_Tables
    31. 31. V$Fixed_Table – list of X$ tablesSQL> desc v$fixed_table;Name Null? Type----------------------------------------- -------- -----------------NAME VARCHAR2(30)OBJECT_ID NUMBERTYPE VARCHAR2(5)TABLE_NUM NUMBER
    32. 32. Graphic: X$ Addresses [diagram: SGA starting at 0x80000000, with an X$???? table at an unknown address 0x8???????]
    33. 33. V$Fixed_Table
        spool addr.sql
        select 'select addr, '''||name||''' from '||name||' where rownum < 2;'
        from v$fixed_table
        where name like 'X%'
        /
        spool off
        @addr.sql
    34. 34. Example: finding the address
        select addr, 'X$KSUSE'
        from X$KSUSE
        where rownum < 2;
    35. 35. X$ layout
        6802B244 X$KSLEMAP
        6802B7EC X$KSLEI
        6820B758 X$KSURU
        6820B758 X$KSUSE – v$session
        6820B758 X$KSUSECST – v$session_wait
        6820B758 X$KSUSESTA – v$sesstat
        6820B758 X$KSUSIO
        6826FBD0 X$KSMDD
        6831EA0C X$KSRCHDL
    36. 36. What's in these X$ views? V$ views are documented. V$ views are often based on X$ tables. The map from V$ to X$ is described in V$Fixed_View_Definition.
    37. 37. V$Fixed_View_Definition
        SQL> desc V$Fixed_View_Definition
        Name                                Type
        ----------------------------------- --------------
        VIEW_NAME                           VARCHAR2(30)
        VIEW_DEFINITION                     VARCHAR2(4000)
    38. 38. Definition of V$Session_Wait
        SQL> select VIEW_DEFINITION from V$FIXED_VIEW_DEFINITION
             where view_name='GV$SESSION_WAIT';
        VIEW_DEFINITION
        -----------------------------------------------------------------------
        select s.inst_id, s.indx, s.ksussseq, e.kslednam,
               e.ksledp1, s.ksussp1, s.ksussp1r,
               e.ksledp2, s.ksussp2, s.ksussp2r,
               e.ksledp3, s.ksussp3, s.ksussp3r,
               decode(s.ksusstim, 0, 0, -1, -1, -2, -2,
                      decode(round(s.ksusstim/10000), 0, -1, round(s.ksusstim/10000))),
               s.ksusewtm,
               decode(s.ksusstim, 0, 'WAITING', -2, 'WAITED UNKNOWN TIME',
                      -1, 'WAITED SHORT TIME', 'WAITED KNOWN TIME')
        from x$ksusecst s, x$ksled e
        where bitand(s.ksspaflg,1)!=0
          and bitand(s.ksuseflg,1)!=0
          and s.ksussseq!=0
          and s.ksussopc=e.indx
    39. 39. The Fields in X$ tables. OK, I've picked an X$ table and I've got the starting address. Now, how do I get the fields?
    40. 40. X$KQFTA Kernel Query Fixed_view Table INDX use to find column information KQFTANAM X$ table names
    41. 41. X$KQFCO Kernel Query Fixed_view Column KQFCOTAB Join with X$KQFTA.INDX KQFCONAM Column name KQFCOOFF Offset from beginning of the row KQFCOSIZ Columns size in bytes
    42. 42. X$KSUSECST Fields1276 1278 1280 1284 1288 Address Seq # Event # p1 p2 p32 2 4 4 4 BYTES
    43. 43. SGA Contents in Resume In resume: Oracle takes the C structure defining the SGA and maps it onto a shared memory segment Memory address Increasing0x800000 0Fixed SGA Buffer Redo Library Cache Buffer Cache Oracle provides access to some of the SGA contents via X$ tables
    44. 44. **** Procedure *****1. Choose a V$ view2. Find base X$ Tables for v$ view3. Map X$ fields to V$ fields4. Get address of X$ table in SGA5. Get the size of each record in X$ table6. Get the number of records in X$ table7. Get offsets for each desired field in X$ table8. Get the base address of SGA
    45. 45. 1) V$SESSION_WAIT Example List of all users waiting Detailed information on the waits Data is ephemeral Useful in Bottleneck diagnostics High sampling rate candidate Event 10046 captures this info Good table for SGA sampling
    46. 46. V$SESSION_WAIT Description
        SQL> desc v$session_wait
        Name                                      Type
        ----------------------------------------- --------------------------
        SID                                       NUMBER
        SEQ#                                      NUMBER
        EVENT                                     VARCHAR2(64)
        P1TEXT                                    VARCHAR2(64)
        P1                                        NUMBER
        P1RAW                                     RAW(4)
        P2TEXT                                    VARCHAR2(64)
        P2                                        NUMBER
        P2RAW                                     RAW(4)
        P3TEXT                                    VARCHAR2(64)
        P3                                        NUMBER
        P3RAW                                     RAW(4)
        WAIT_TIME                                 NUMBER
        SECONDS_IN_WAIT                           NUMBER
        STATE                                     VARCHAR2(19)
    47. 47. V$SESSION_WAIT Short
        SQL> desc v$session_wait
        Name                         Type
        ---------------------------- -------------
        SID                          NUMBER
        SEQ#                         NUMBER
        EVENT                        VARCHAR2(64)
        P1                           NUMBER
        P2                           NUMBER
        P3                           NUMBER
    48. 48. V$FIXED_VIEW_DEFINITION
        Gives mappings of V$ views to X$ tables
        SQL> select VIEW_DEFINITION from V$FIXED_VIEW_DEFINITION
             where view_name='V$SESSION_WAIT';
    49. 49. V$SESSION_WAIT View Definition
        VIEW_DEFINITION
        ---------------------------------------------------------------------
        select s.inst_id, s.indx, s.ksussseq, e.kslednam,
               e.ksledp1, s.ksussp1, s.ksussp1r,
               e.ksledp2, s.ksussp2, s.ksussp2r,
               e.ksledp3, s.ksussp3, s.ksussp3r,
               round(s.ksusstim / 10000),
               s.ksusewtm,
               decode(s.ksusstim, 0, 'WAITING', -2, 'WAITED UNKNOWN TIME',
                      -1, 'WAITED SHORT TIME', 'WAITED KNOWN TIME')
        from x$ksusecst s, x$ksled e
        where bitand(s.ksspaflg,1)!=0
          and bitand(s.ksuseflg,1)!=0
          and s.ksussseq!=0
          and s.ksussopc=e.indx
    50. 50. View Definition Short
        VIEW_DEFINITION
        ---------------------------------------------------------------------
        select s.indx, s.ksussseq, e.kslednam, s.ksussp1, s.ksussp2, s.ksussp3
        from x$ksusecst s, x$ksled e
        where s.ksussopc=e.indx
    51. 51. 2) V$SESSION_WAIT Based on X$KSUSECST
        VIEW_DEFINITION
        ----------------------------------------------------
        select indx, ksussseq, ksussopc, ksussp1, ksussp2, ksussp3
        from x$ksusecst
    52. 52. Equivalent SQL Statements
        select indx, ksussseq, ksussopc, ksussp1, ksussp2, ksussp3
        from x$ksusecst
        is equivalent to
        select sid, seq#, event, p1, p2, p3
        from v$session_wait
        Note: x$ksusecst.ksussopc is the event #. x$ksled.kslednam is the list of event names, where x$ksled.indx = x$ksusecst.ksussopc.
    53. 53. 3) V$ to X$ Field Mapping
    54. 54. 4) Get base SGA address for X$ table
        Find the location of X$KSUSECST in the SGA
        SQL> select addr from x$ksusecst where rownum < 2;
        ADDR
        --------
        85251EF4
    55. 55. 5) Find the Size of Each Record
        SQL> select ((to_dec(e.addr)-to_dec(s.addr))) row_size
             from (select addr from x$ksusecst where rownum < 2) s,
                  (select max(addr) addr from x$ksusecst where rownum < 3) e;
        ROW_SIZE
        ----------------
                    2328
    56. 56. 6) Find the Number of Records in the structure
        SQL> select count(*) from x$ksusecst;
        COUNT(*)
        --------------
                   170
    57. 57. Get Offsets for Each Desired Field in X$ table
        SQL> select c.kqfconam field_name, c.kqfcooff offset, c.kqfcosiz sz
             from x$kqfco c, x$kqfta t
             where t.indx = c.kqfcotab
               and t.kqftanam='X$KSUSECST'
             order by offset;
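The row-size arithmetic is just a subtraction of two hex addresses, so it can be checked outside the database as well. The sketch below assumes the addresses arrive as the hex strings that an X$ ADDR column prints; `row_size_from_addrs` is a hypothetical helper name. The second address used in the note after the code, 8525280C, is derived here from the slide's figures (85251EF4 plus 2328 bytes) and is not a value shown in the original.

```c
#include <stdlib.h>

/*
 * Row size = address of the second row minus address of the first,
 * both given as hex strings such as "85251EF4".
 * Returns 0 if either string is not valid hex or the order is wrong.
 */
unsigned long long row_size_from_addrs(const char *first_hex,
                                       const char *second_hex)
{
    char *end1, *end2;
    unsigned long long first = strtoull(first_hex, &end1, 16);
    unsigned long long second = strtoull(second_hex, &end2, 16);

    if (end1 == first_hex || *end1 != '\0' ||
        end2 == second_hex || *end2 != '\0' || second < first)
        return 0;
    return second - first;
}
```

Calling `row_size_from_addrs("85251EF4", "8525280C")` gives 2328, matching the ROW_SIZE reported by the query.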
    58. 58. X$KQFTA – X$ Table Names. List of X$ tables. INDX: use to find column information. KQFTANAM: X$ table names. To get column information, join with X$KQFCO: X$KQFTA.INDX = X$KQFCO.KQFCOTAB
    59. 59. X$KQFCO – X$ Table Columns. List of all the columns in X$ tables. KQFCOTAB: join with X$KQFTA.INDX. KQFCONAM: column name. KQFCOOFF: offset from beginning of the row. KQFCOSIZ: column size in bytes.
    60. 60. Field Offsets
        FIELD_NAME                         OFFSET         SZ
        ------------------------------ ---------- ----------
        ADDR                                    0          4
        INDX                                    0          4
        KSUSEWTM                                0          4
        INST_ID                                 0          4
        KSSPAFLG                                1          1
        KSUSSSEQ                             1276          2
        KSUSSOPC                             1278          2
        KSUSSP1                              1280          4
        KSUSSP1R                             1280          4
        KSUSSP2                              1284          4
        KSUSSP2R                             1284          4
        KSUSSP3                              1288          4
        KSUSSP3R                             1288          4
        KSUSSTIM                             1292          4
        KSUSENUM                             1300          2
        KSUSEFLG                             1308          4
    61. 61. What are all the fields at OFFSET 0? These are all calculated values and not stored explicitly in the SGA. ADDR: memory address. INDX: record number, like rownum. INST_ID: database instance ID. KSUSEWTM: calculated field.
    62. 62. Unexposed Fields. What happens between OFFSET 1 and 1276? • Unexposed fields • Sometimes exposed elsewhere, in our case in V$SESSION and V$SESSTAT
    63. 63. Fields at Same Address. Why do some fields start at the same address? KSUSSP1 and KSUSSP1R are at the same address. They are the equivalent of V$SESSION_WAIT.P1 and V$SESSION_WAIT.P1RAW: the same data, exposed once as decimal and once as hex.
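One way to spot unexposed regions is to walk the (offset, size) pairs from X$KQFCO in offset order and look for bytes that no listed column covers. The following is a rough sketch of that idea; `struct field` and `largest_gap` are hypothetical names, and the input is assumed to be sorted by offset as in the query output.

```c
#include <stddef.h>

struct field {
    size_t offset;   /* KQFCOOFF */
    size_t size;     /* KQFCOSIZ */
};

/*
 * Scan fields sorted by offset and report the largest run of bytes
 * that no listed field covers. Returns the gap length in bytes and
 * stores the gap's starting offset in *gap_start.
 */
size_t largest_gap(const struct field *f, size_t n, size_t *gap_start)
{
    size_t covered = 0;   /* first byte not yet covered by any field */
    size_t best = 0;

    *gap_start = 0;
    for (size_t i = 0; i < n; i++) {
        if (f[i].offset > covered && f[i].offset - covered > best) {
            best = f[i].offset - covered;
            *gap_start = covered;
        }
        if (f[i].offset + f[i].size > covered)
            covered = f[i].offset + f[i].size;
    }
    return best;
}
```

Fed the distinct offset and size pairs from the field-offset listing, this reports a 1272-byte gap starting at offset 4. The start is 4 rather than 2 only because the calculated columns are listed at offset 0 with size 4; either way the gap ends at 1276, where the wait fields begin.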
    64. 64. 7) Offsets of Fields
    65. 65. 8) Get Base SGA Address
        SQL> select addr from x$ksmmem where rownum < 2;
        ADDR
        --------------
        80000000
    66. 66. Results X$KSUSECST
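    67. 67. Machine Memory [diagram: the SGA within machine memory, starting at 0x80000000]
    68. 68. Fixed Area [diagram: SGA starts at 0x80000000; X$KSUSECST (user waits) starts at 0x85251EF4]
    69. 69. X$KSUSECST [diagram: 170 records of 2328 bytes each, starting at 0x85251EF4: Row 1, Row 2, Row 3, …]
    70. 70. X$KSUSECST Record [diagram: one record in X$KSUSECST, 2328 bytes long, with offset 1276 marked]
    71. 71. X$KSUSECST Fields [diagram: offset 1276 Seq #, 1278 Event #, 1280 p1, 1284 p2, 1288 p3]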
    72. 72. Attaching to the SGA UNIX System Call “shmat”To attach to shared memory Unix as a system call void *shmat( int shmid, const void *shmaddr, int shmflg );
    73. 73. ID and Address arguments to “shmat” The arguments are: shmid – shared memory identifier specified shmaddr – starting address of the shared memory shmflg - flagsThe argument shmflg can be set to SHM_RDONLY . To avoid any possible data corruption the SGA should only be attached read only.The arguments shmid and shmaddr need to be set to Oracle’s SGA id and address.
    74. 74. Finding Oracle SGA’s ID andAddressUse ORADEBUG to find the SGA idSQL> oradebug setmypidStatement processed.SQL> oradebug ipcInformation written to trace file. 
    75. 75. Finding Trace FileSQL> show parameters user_dumpNAME VALUE----------------------- --------------------------------user_dump_dest /u02/app/oracle/admin/V901/udumpSQL> exit$ cd /u02/app/oracle/admin/V901/udump$ ls -ltr | tail -1-rw-r----- usupport dba Aug 24 18:01 v901_ora_23179.trc
    76. 76. Finding SHMID in Trace File$ vi v901_ora_23179.trc… Total size 004456c Minimum Subarea size 00000000 Area Subarea Shmid Stable Addr Actual Addr 0 0 34401 0080000000 0080000000…
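Reading the shmid out of the trace by eye works, but the data line is regular enough to parse. Here is a small sketch, assuming the line has exactly the five columns shown in the slide (area, subarea, shmid, stable address, actual address); `parse_ipc_line` is a hypothetical helper, and real trace files may differ by Oracle version.

```c
#include <stdio.h>

/*
 * Parse one data line laid out as
 *   Area Subarea Shmid Stable-Addr Actual-Addr
 * with decimal ids and hex addresses.
 * Returns 1 and fills *shmid and *addr on success, 0 otherwise.
 */
int parse_ipc_line(const char *line, int *shmid, unsigned long long *addr)
{
    int area, subarea, id;
    unsigned long long stable, actual;

    if (sscanf(line, "%d %d %d %llx %llx",
               &area, &subarea, &id, &stable, &actual) != 5)
        return 0;
    *shmid = id;
    *addr = actual;   /* the address the segment is really mapped at */
    return 1;
}
```

For the line in the slide this yields shmid 34401 and address 0x80000000, the two values passed to shmat.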
    77. 77. Attaching to the SGAShmid 34401Shmaddr 0x80000000Shmflg SHM_RDONLYThe SGA attach call in C would be:Shmat(34401, 0x80000000, SHM_RDONLY);This call needs to be executed as a UNIX user who has read permission to the Oracle SGA
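The read-only attach can be tried safely without touching an Oracle instance. The sketch below creates its own private System V segment (an assumption made purely for demonstration; the real program attaches to Oracle's existing segment by id and address), writes through a normal attach, and reads back through a second attach made with SHM_RDONLY. `shm_readonly_roundtrip` is a hypothetical name.

```c
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/*
 * Create a private segment, store `value` in it through a writable
 * mapping, then read it back through a second, read-only attach.
 * Returns the value read, or -1 if any System V call fails.
 */
int shm_readonly_roundtrip(int value)
{
    int result = -1;
    int shmid = shmget(IPC_PRIVATE, sizeof(int), IPC_CREAT | 0600);

    if (shmid < 0)
        return -1;

    int *rw = shmat(shmid, NULL, 0);
    if (rw != (void *)-1) {
        *rw = value;
        /* same flag the SGA reader uses: no write access */
        const int *ro = shmat(shmid, NULL, SHM_RDONLY);
        if (ro != (void *)-1) {
            result = *ro;
            shmdt(ro);
        }
        shmdt(rw);
    }
    shmctl(shmid, IPC_RMID, NULL);   /* remove the demo segment */
    return result;
}
```

Passing NULL as the address lets the kernel choose where to map the demo segment. The SGA reader instead passes Oracle's own address, so that pointers stored inside the SGA remain valid in the attaching process.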
    78. 78. C Code Headers
        #include <stdio.h>
        #include <stdlib.h>   /* atoi, exit */
        #include <unistd.h>   /* sleep */
        #include <sys/ipc.h>
        #include <sys/shm.h>
        #include <errno.h>
        #include "events.h"
        events.h is for translating the event #s into event names
    79. 79. events.h
        spool events.h
        select 'char event[][100]={' from dual;
        select '"'||name||'",' from v$event_name;
        select '"" };' from dual;
        spool off
    80. 80. Define Base Addresses and Sizes
        /* SGA BASE ADDRESS */
        #define SGA_BASE 0x80000000
        /* START ADDR of KSUSECST (V$SESSION_WAIT) */
        #define KSUSECST_ADDR 0x85251EF4
        /* NUMBER of ROWS/RECORDS in KSUSECST */
        #define SESSIONS 150
        /* SIZE in BYTES of a ROW in KSUSECST */
        #define RECORD_SZ 2328
    81. 81. Define Offsets to Fields
        #define KSUSSSEQ 1276 /* sequence # */
        #define KSUSSOPC 1278 /* event # */
        #define KSUSSP1R 1280 /* p1 */
        #define KSUSSP2R 1284 /* p2 */
        #define KSUSSP3R 1288 /* p3 */
    82. 82. Set Up Variables
        int main(int argc, char **argv)
        {
            int shmid;
            void *shmaddr;
            char *current_addr;
            unsigned int p1r, p2r, p3r;   /* the p1..p3 fields are 4 bytes each */
            unsigned int i, seq, evn;
    83. 83. Attach to SGA
            /* ATTACH TO SGA */
            if (argc < 2) {
                printf("usage: %s shmid\n", argv[0]);
                exit(1);
            }
            shmid = atoi(argv[1]);
            shmaddr = (void *)SGA_BASE;
            if (shmat(shmid, shmaddr, SHM_RDONLY) == (void *)-1) {
                printf("shmat: error attaching to SGA\n");
                exit(1);
            }
    84. 84. Set Up Sampling Loop
            /* LOOP OVER ALL SESSIONS until CANCEL */
            while (1) {
                /* set current address to beginning of table */
                current_addr = (char *)KSUSECST_ADDR;
                sleep(1);
                printf("\033[H\033[J");   /* clear screen */
                /* print page heading */
                printf("%4s %8s %-20.20s %10s %10s %10s\n",
                       "sid", "seq", "wait", "p1", "p2", "p3");
    85. 85. Loop over all Sessions
                for (i = 0; i < SESSIONS; i++) {
                    seq = *(unsigned short *)(current_addr + KSUSSSEQ);
                    evn = *(short *)(current_addr + KSUSSOPC);
                    p1r = *(unsigned int *)(current_addr + KSUSSP1R);
                    p2r = *(unsigned int *)(current_addr + KSUSSP2R);
                    p3r = *(unsigned int *)(current_addr + KSUSSP3R);
                    if (evn != 0) {
                        printf("%4u %8u %-20.20s %10X %10X %10X\n",
                               i, seq, event[evn], p1r, p2r, p3r);
                    }
                    current_addr += RECORD_SZ;
                }
            }
        }
    86. 86. Output
        $ sga_read_session_wait 34401
        sid      seq wait                         p1         p2         p3
          0    40582 pmon timer                  12C          0          0
          1    40452 rdbms ipc message           12C          0          0
          2    43248 rdbms ipc message           12C          0          0
          3    24706 rdbms ipc message           12C          0          0
          4      736 smon timer                  12C          0          0
          5       88 rdbms ipc message         2BF20          0          0
          8      178 SQL*Net message from    6265710          1          0
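The decoding step of the sampling loop can be exercised against an ordinary buffer, which is handy for checking offsets before attaching to a live SGA. This is a sketch only: the offsets and row size are the ones from the slides, while `struct wait_sample` and `decode_wait` are hypothetical names, and fields are read in the host's native byte order.

```c
#include <stdint.h>
#include <string.h>

/* Offsets and row size taken from the slides. */
enum {
    RECORD_SZ = 2328,
    KSUSSSEQ  = 1276,   /* sequence #, 2 bytes */
    KSUSSOPC  = 1278,   /* event #, 2 bytes */
    KSUSSP1R  = 1280,   /* p1, 4 bytes */
    KSUSSP2R  = 1284,   /* p2, 4 bytes */
    KSUSSP3R  = 1288    /* p3, 4 bytes */
};

struct wait_sample {
    uint16_t seq;
    uint16_t event;
    uint32_t p1, p2, p3;
};

/* Decode row `sid` of a KSUSECST-style table that starts at `table`. */
struct wait_sample decode_wait(const unsigned char *table, size_t sid)
{
    const unsigned char *row = table + sid * RECORD_SZ;
    struct wait_sample w;

    memcpy(&w.seq,   row + KSUSSSEQ, sizeof w.seq);
    memcpy(&w.event, row + KSUSSOPC, sizeof w.event);
    memcpy(&w.p1,    row + KSUSSP1R, sizeof w.p1);
    memcpy(&w.p2,    row + KSUSSP2R, sizeof w.p2);
    memcpy(&w.p3,    row + KSUSSP3R, sizeof w.p3);
    return w;
}
```

Using fixed-width types keeps the reads at 2 and 4 bytes regardless of whether the reader is compiled as a 32-bit or 64-bit program.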
    87. 87. Pitfalls Byte Swapping 32 bit vs 64 bit Multiple Shared Memory Segments Segmented Memory Addresses are "unsigned int" Misaligned Access
    88. 88. Little Endian vs Big Endian
        Are low-order bytes stored first, or high-order bytes?
        A byte is 8 bits: 00000000–11111111 binary, 0–255 decimal, 0x00–0xFF hex.
        Big endian is "normal": most significant byte first.
        In ASCII, the word "byte" is stored as b = 62, y = 79, t = 74, e = 65 (hex).
        echo byte | od -x
        – big endian: b y t e → 6279 7465
        – little endian, i.e. byte swapped (Linux, OSF, Sequent, ?): y b e t → 7962 6574
    89. 89. Byte Swap Example
        Short = 2 bytes, i.e. 16 bits
        Goal: get the flag in the "second" byte
        #ifdef __linux
            uflg = *(short *)((int)sga_address) >> 8;
        #else
            uflg = *(short *)((int)sga_address);
        #endif
    90. 90. Byte Swap
        Big endian:    00000000 00000001
        Little endian: 00000001 00000000
        Solution: push the value over 8 places to the right, i.e. >>8
    91. 91. 64 bit vs 32 bit
        SQL> desc x$ksmmem
        Name                                  Type
        ------------------------------------- ---------
        ADDR                                  RAW(4)
        INDX                                  NUMBER
        INST_ID                               NUMBER
        KSMMMVAL                              RAW(4)
        RAW(4) -> 32 bit
        RAW(8) -> 64 bit
    92. 92. Segmented Memory
        x$ksuse can be discontiguous
        Workaround:
        select 'int users[]={' from dual;
        select '0x'||addr||',' from x$ksuse;
        select '0x0};' from dual;
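An alternative to conditional shifts is to assemble multi-byte values from individual bytes, which gives the same result on any host. The helpers below are an illustrative sketch with hypothetical names, shown as one possible way to handle byte order rather than the approach used in the original program.

```c
#include <stdint.h>

/* Interpret two bytes as a 16-bit value, most significant byte first. */
uint16_t read_u16_be(const unsigned char *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Interpret two bytes as a 16-bit value, least significant byte first. */
uint16_t read_u16_le(const unsigned char *p)
{
    return (uint16_t)((p[1] << 8) | p[0]);
}

/* Returns 1 on a little-endian host, 0 on a big-endian one. */
int host_is_little_endian(void)
{
    uint16_t probe = 1;

    return *(const unsigned char *)&probe == 1;
}
```

For the bytes 'b' (0x62) and 'y' (0x79), `read_u16_be` gives 0x6279 and `read_u16_le` gives 0x7962, which are the two od -x readings from the endianness slide.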
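Once the row addresses are generated into an array ending in 0x0, the reader walks that array instead of adding a fixed row size. A minimal sketch of the walk, assuming a 2-byte field and a zero-terminated address list; `read_u16_from_rows` is a hypothetical name, and the addresses are held as `uintptr_t` rather than the `int` of the generated `users[]` array so that the sketch also works on 64-bit hosts.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Read a 16-bit field at `offset` from every row address in `rows`,
 * stopping at the 0 sentinel or after `max` rows.
 * Returns the number of rows read; values are stored in `out`.
 */
size_t read_u16_from_rows(const uintptr_t *rows, size_t offset,
                          uint16_t *out, size_t max)
{
    size_t n = 0;

    while (n < max && rows[n] != 0) {
        memcpy(&out[n], (const unsigned char *)rows[n] + offset,
               sizeof out[n]);
        n++;
    }
    return n;
}
```

Because each row is located by its own address, the rows can sit anywhere in memory and in any order.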
    93. 93. Misaligned Access. Some platforms seg fault when addressing misaligned bytes; depending on the platform, reads need to start on even byte addresses or on multiples of 4 bytes. [diagram: bytes numbered 1 2 3 4 5 6 7 8]
    94. 94. x$ksusecst Record: What's Missing? [diagram: one 2328-byte record in X$KSUSECST, with the region before offset 1276 marked ??? ???]
    95. 95. Select Addr from X$? where Rownum<2;
        6802B244 X$KSLEMAP
        6802B7EC X$KSLEI
        6820B758 X$KSURU
        6820B758 X$KSUSE – v$session
        6820B758 X$KSUSECST – v$session_wait
        6820B758 X$KSUSESTA – v$sesstat
        6820B758 X$KSUSIO
        6826FBD0 X$KSMDD
        6831EA0C X$KSRCHDL
    96. 96. x$ksuse Record Contains x$ksusecst [diagram: one 2328-byte record shared by x$ksuse (v$session), x$ksusesta (v$sesstat) and x$ksusecst (v$session_wait), with offsets 236 and 1276 marked]
    97. 97. Getting v$sesstat addresses
        select '#define ' || upper(translate(s.name, ' :-()/*', '________'))
               || ' ' || to_char(c.kqfcooff + STATISTIC# * 4)
        from x$kqfco c, x$kqfta t, v$statname s
        where t.indx = c.kqfcotab
          and (t.kqftanam='X$KSUSESTA')
          and c.kqfconam='KSUSESTV'
          and kqfcooff > 0
        order by c.kqfcooff
        /
    98. 98. User Drilldown Query: 4 joins
        select w.sid sid, w.seq# seq, w.event event,
               w.p1raw p1, w.p2raw p2, w.p3raw p3,
               w.SECONDS_IN_WAIT ctime,
               s.sql_hash_value sqlhash, s.prev_hash_value psqlhash,
               st.value cpu
        from v$session s, v$sesstat st, v$statname sn, v$session_wait w
        where w.sid = s.sid
          and st.sid = s.sid
          and st.statistic# = sn.statistic#
          and sn.name = 'CPU used when call started'
          and w.event != 'SQL*Net message from client'
        order by w.sid;
    99. 99. Other Fun Stuff. The next example is output from an SGA program that follows the LRU of the buffer cache. The program demonstrates the:
        • insertion point of the LRU
        • cold end of the LRU
        • hot end of the LRU
        • full table scan insertion point
    100. 100. LRU HOT
    101. 101. LRU COLD
