The life of a query (oracle edition)


Published in: Technology

  1. 1. Jonah H. Harris Session #399 The Life of a Query (Oracle Edition)
  2. 2. Speaker Qualifications <ul><li>Currently the Sr. DBA at </li></ul><ul><li>Oracle DBA/developer starting with Oracle V7 </li></ul><ul><li>Database Internals Software Architect for 3 years </li></ul><ul><li>Hobby is Researching Oracle Internals </li></ul><ul><li>Speaker for IOUG, VOUG, NYOUG, SEOUC </li></ul><ul><li>Technical Reviewer for IOUG SELECT &amp; several O'Reilly books on SQL </li></ul><ul><li>Blog about Oracle technology </li></ul><ul><li>Developed patches for open-source databases: PostgreSQL, InnoDB, SAP DB, Firebird </li></ul>
  3. 3. Disclaimer <ul><li>This is my hobby </li></ul><ul><li>I’ve never been an Oracle insider </li></ul><ul><li>The material in this presentation has been based on years of researching Oracle internals as well as analyzing network traffic and trace files. </li></ul><ul><li>Do your own research! </li></ul><ul><li>Use at your own risk! </li></ul>
  4. 4. What’s in this for me? <ul><li>Get a better idea of what's happening behind-the-scenes. </li></ul><ul><li>Use this information to help troubleshoot issues </li></ul><ul><li>Use this information to help optimize queries or the system itself. </li></ul>
  5. 5. Oracle Architecture
  6. 6. Oracle Architecture <ul><li>It has layers! </li></ul><ul><li>Each layer depends on the layer beneath it </li></ul><ul><li>Developed primarily in the C programming language with platform-specific optimizations in assembly language </li></ul><ul><li>Controls all facets of query execution, utility processes, etc. </li></ul>
  7. 7. Oracle Kernel Layers
  8. 8. Connection <ul><li>The client (for example, SQL*Plus) requests a connection to a server (OCI/UPI-&gt;NL [Network Layer]) </li></ul><ul><li>Network Naming (NN layer) resolves the network address/port to connect to (TNSNAMES, etc.) and determines which Network Transport (NT layer) adapter to use (TCP/IP, SDP, ...) </li></ul><ul><li>Oracle Client software configures itself to perform communication using the Transparent Network Substrate (NS layer) protocol on the selected transport. </li></ul><ul><ul><li>The transport itself uses operating system dependent (OSD layer) code to handle portability differences between OSes like Windows, Linux, and various UNIX flavors. </li></ul></ul><ul><li>Client connects to the listener </li></ul>
  9. 9. Listener Communication <ul><li>The Listener accepts the connection request and validates that the service requested by the client is available and that the client has access to it (via host-based access control) </li></ul><ul><li>The Listener tells the client to reconnect to a different port or resend the request. </li></ul><ul><li>The Listener creates another Oracle process/thread and passes the connection information to it. </li></ul>
  10. 10. Handshaking <ul><li>The new process/thread accepts the client request for service and begins to perform handshaking </li></ul><ul><ul><li>Client and Server negotiate Additional Network Options (ANO) such as authentication, encryption, data integrity, etc. </li></ul></ul><ul><ul><li>Client and server negotiate a common version of the protocol to use; if none are compatible, you'll generally see things like ORA-03134: Connections to this server version are no longer supported </li></ul></ul>
  11. 11. Authentication <ul><li>Client sends User Name, Terminal Name, Machine Name, Program Name, ... </li></ul><ul><li>Server responds with challenge/response </li></ul><ul><ul><li>Server sends client an encrypted key via AUTH_SESSKEY </li></ul></ul><ul><ul><li>Client decrypts AUTH_SESSKEY using the user's password, and then uses the secret to encrypt and hash the user's password and sends the result back to the server as AUTH_PASSWORD </li></ul></ul><ul><ul><li>Server confirms valid/invalid password and proceeds accordingly. </li></ul></ul>
  12. 12. Session Creation <ul><li>After successful authentication, finish session creation and perform relevant actions </li></ul><ul><ul><li>Add session to global internal structures </li></ul></ul><ul><ul><li>Execute AFTER LOGON triggers </li></ul></ul><ul><li>Once session creation is complete, the server process waits for input from the client. </li></ul>
  13. 13. Client Sends a Query <ul><li>User issues a query UPDATE mytable SET foo = 'baz' WHERE foo = 'bar'; </li></ul><ul><li>Oracle Client packages up the query text into a specific Two-Task Interface (TTI) subpacket (OALL*) with relevant flags set </li></ul><ul><li>Oracle Client encapsulates the subpacket into a TNS data packet </li></ul><ul><li>Client sends packet to the server. </li></ul>
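The encapsulation steps on this slide can be sketched as two nested headers. This is a toy illustration only: the field layout, `DATA_PACKET`, and `CALL_SUBPACKET` are assumptions made up for the example and do not reproduce the real TNS or TTI wire format.

```python
import struct

# Hypothetical, simplified framing; NOT the real TNS/TTI wire format.
DATA_PACKET = 6      # assumed packet-type code for a data packet
CALL_SUBPACKET = 3   # assumed function code for the query subpacket


def build_subpacket(sql: str) -> bytes:
    """Prefix the query text with a function code and its byte length."""
    text = sql.encode("utf-8")
    return struct.pack(">BH", CALL_SUBPACKET, len(text)) + text


def build_data_packet(payload: bytes) -> bytes:
    """Wrap a subpacket in an outer header: total length, then packet type."""
    header_len = struct.calcsize(">HB")
    return struct.pack(">HB", header_len + len(payload), DATA_PACKET) + payload


def parse_data_packet(packet: bytes) -> str:
    """Server side: strip the outer header, then the subpacket header."""
    total_len, packet_type = struct.unpack_from(">HB", packet, 0)
    if total_len != len(packet) or packet_type != DATA_PACKET:
        raise ValueError("not a well-formed data packet")
    code, text_len = struct.unpack_from(">BH", packet, 3)
    if code != CALL_SUBPACKET:
        raise ValueError("unexpected subpacket")
    return packet[6:6 + text_len].decode("utf-8")
```

Building a packet from the slide's UPDATE statement and parsing it again returns the original query text.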
  14. 14. Server Receives the Query <ul><li>Server receives TNS data packet </li></ul><ul><li>Server checks to see if data contains a subpacket </li></ul><ul><li>Server parses the subpacket </li></ul><ul><li>Server performs the Oracle Programmatic Interface (OPI) function actions based on the subpacket </li></ul><ul><ul><li>Parse </li></ul></ul><ul><ul><li>Execute </li></ul></ul><ul><ul><li>Fetch </li></ul></ul><ul><ul><li>(about 170+ others) </li></ul></ul>
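One way to picture the "perform the function based on the subpacket" step is a dispatch table. The sketch below is an assumption about the general shape, not Oracle's code: the handler names, the string function codes, and the session dictionary are all hypothetical.

```python
from typing import Callable, Dict


def opi_parse(session: dict, arg: str) -> str:
    # Remember the statement so later calls can act on it.
    session["cursor"] = {"sql": arg, "executed": False}
    return "parsed"


def opi_execute(session: dict, arg: str) -> str:
    session["cursor"]["executed"] = True
    return "executed"


def opi_fetch(session: dict, arg: str) -> str:
    if not session["cursor"]["executed"]:
        raise RuntimeError("fetch before execute")
    return "fetched"


# Hypothetical function codes; the real server has many more entries.
HANDLERS: Dict[str, Callable[[dict, str], str]] = {
    "PARSE": opi_parse,
    "EXECUTE": opi_execute,
    "FETCH": opi_fetch,
}


def dispatch(session: dict, function_code: str, arg: str = "") -> str:
    """Look up the handler for a subpacket's function code and run it."""
    handler = HANDLERS.get(function_code)
    if handler is None:
        raise ValueError(f"unknown function code: {function_code}")
    return handler(session, arg)
```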
  15. 15. Parse the Query <ul><li>Oracle performs a hash based on the query text and performs a lookup in the Library Cache </li></ul><ul><li>If Oracle has already parsed the query and has a plan for it, re-use the plan (generally) </li></ul><ul><li>If no plan is found, perform a &quot;hard&quot; parse </li></ul><ul><ul><li>Tokenize the query </li></ul></ul><ul><ul><li>Parse the query using a recursive descent parser </li></ul></ul><ul><ul><ul><li>Started by Bob Miner and still contains a little of his code </li></ul></ul></ul><ul><ul><li>Build a parse tree representation of the statement </li></ul></ul><ul><ul><ul><li>Includes objects referenced in the query </li></ul></ul></ul><ul><ul><li>Perform some type checking and data type coercion </li></ul></ul><ul><li>Does the query make sense syntactically? </li></ul><ul><li>Does the query make sense semantically? </li></ul>
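The hash-then-lookup flow above can be sketched as a small cache keyed on the exact statement text. The class name, the SHA-256 hash, and the placeholder "plan" string are assumptions for illustration; the hash function Oracle actually uses is not stated in the slides.

```python
import hashlib


class LibraryCache:
    """Toy plan cache: a soft parse on a hit, a hard parse on a miss."""

    def __init__(self) -> None:
        self._plans = {}
        self.hard_parses = 0
        self.soft_parses = 0

    def get_plan(self, sql_text: str) -> str:
        # Hash the statement text exactly as submitted.
        key = hashlib.sha256(sql_text.encode("utf-8")).hexdigest()
        plan = self._plans.get(key)
        if plan is not None:
            self.soft_parses += 1
            return plan
        self.hard_parses += 1
        plan = self._hard_parse(sql_text)
        self._plans[key] = plan
        return plan

    def _hard_parse(self, sql_text: str) -> str:
        # Stand-in for tokenizing, parsing, and optimizing.
        tokens = sql_text.split()
        return f"PLAN[{tokens[0].upper()}:{len(tokens)} tokens]"
```

Because the key in this sketch is computed from the raw text, two statements that differ only in letter case or spacing hash to different keys and each pays for its own hard parse.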
  16. 16. Optimize the Query <ul><li>Goal is to determine how to best execute the query </li></ul><ul><li>Query optimization is mathematical </li></ul><ul><ul><li>Discrete Math (Set Theory, Logic, Graph Theory, ...) </li></ul></ul><ul><ul><li>Application of Relational Algebra </li></ul></ul><ul><ul><li>Plan costs are based on weighted graph calculations </li></ul></ul><ul><ul><li>Best plan is defined as having the lowest cost calculation </li></ul></ul><ul><li>Optimization Process is </li></ul><ul><ul><li>Query Rewrite </li></ul></ul><ul><ul><li>Cost-based Query Optimization </li></ul></ul><ul><ul><li>Execution Plan Generation </li></ul></ul>
  17. 17. Rewrite the Query <ul><li>Transform to canonical form </li></ul><ul><li>Perform some basic optimizations </li></ul><ul><ul><li>Constant folding </li></ul></ul><ul><ul><li>Transitive Closure </li></ul></ul><ul><li>Determine whether the query (in whole or in part) can be optimized to use Materialized Views </li></ul>
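A minimal sketch of the two basic optimizations named above, assuming expressions are stored as `(op, left, right)` tuples and predicates as simple column equalities. The representation is hypothetical and chosen only to keep the example short.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold(expr):
    """Constant folding: evaluate subtrees whose operands are all literals."""
    if not isinstance(expr, tuple):
        return expr  # leaf: a number or a column name
    op, left, right = expr
    left, right = fold(left), fold(right)
    if isinstance(left, (int, float)) and isinstance(right, (int, float)):
        return OPS[op](left, right)
    return (op, left, right)


def transitive_closure(equalities, constants):
    """From a = b and a = 7, derive b = 7; repeat until nothing new appears."""
    known = dict(constants)
    changed = True
    while changed:
        changed = False
        for a, b in equalities:
            for src, dst in ((a, b), (b, a)):
                if src in known and dst not in known:
                    known[dst] = known[src]
                    changed = True
    return known
```

For example, `salary + 2 * 50` folds to `salary + 100`, and `a.id = b.id AND b.id = c.id AND a.id = 7` yields the extra predicates `b.id = 7` and `c.id = 7`, which can open up additional access paths.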
  18. 18. Determine Access Paths <ul><li>Are there any indexes for columns specified in the predicate? </li></ul><ul><li>If there are multiple indexes, build a plan for each </li></ul><ul><li>If we are using an index, can we use that to optimize our join algorithm (hash, nested loop, external, ...) </li></ul>
  19. 19. Perform Cost-Based Optimization <ul><li>The goal is to determine </li></ul><ul><ul><li>Which access methods to use (Index, FTS, etc.) </li></ul></ul><ul><ul><li>Optimal Join Order </li></ul></ul><ul><li>Uses a branch-and-bound algorithm to build plans by: </li></ul><ul><ul><li>Performing join permutation (Join Ordering) </li></ul></ul><ul><ul><li>Determining which joins are not needed (Join Elimination) </li></ul></ul><ul><ul><li>Applying partitioning rules (Partition Elimination/Pruning) </li></ul></ul><ul><ul><li>Attempting to push down selection in joins (to reduce the number of rows at each step) </li></ul></ul><ul><ul><li>Determining which indexes are available for use in the plan and whether that changes the join algorithm (hash, nested loop, ...) </li></ul></ul><ul><ul><li>Calculate the I/O and CPU costs for each step of the plan </li></ul></ul><ul><li>Choose plan with the best overall graph weight. </li></ul>
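The branch-and-bound idea can be sketched for join ordering alone. The cost model here is an assumption: left-deep plans only, with cost defined as the sum of intermediate result sizes, and made-up selectivities. Oracle's real costing of I/O and CPU is far more detailed.

```python
def best_join_order(cards, selectivity):
    """Search left-deep join orders, abandoning any partial plan whose
    cost already matches or exceeds the best complete plan found so far.

    cards:       {table_name: row_count}
    selectivity: {frozenset({a, b}): fraction of row pairs that join}
    """
    best = {"cost": float("inf"), "order": None}

    def extend(order, rows, cost):
        if cost >= best["cost"]:
            return  # bound: costs only grow, so this branch cannot win
        if len(order) == len(cards):
            best["cost"], best["order"] = cost, tuple(order)
            return
        for table in sorted(cards):
            if table in order:
                continue
            sel = 1.0
            for joined in order:
                sel *= selectivity.get(frozenset((joined, table)), 1.0)
            new_rows = rows * cards[table] * sel
            extend(order + [table], new_rows, cost + new_rows)

    for first in sorted(cards):
        extend([first], cards[first], 0.0)
    return best["order"], best["cost"]
```

With R = 1000 rows, S = 100, T = 10, an R-S selectivity of 0.01 and an S-T selectivity of 0.1, the search settles on joining S and T first and R last, because starting with R and T would produce a 10,000-row cross product.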
  20. 20. Join Permutation <ul><li>Build graphs for joins in different orders </li></ul><ul><ul><li>Join R S = (R, S) = (S, R) </li></ul></ul><ul><ul><li>Join R S T = ((R, S), T) = (R, (S, T)) </li></ul></ul><ul><li>Join is a binary Relational Algebra operation </li></ul><ul><li>Follows mathematical rules for associativity and commutativity </li></ul>
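Commutativity and associativity are what make reordering safe: every ordering returns the same rows. A small sketch, assuming relations are lists of dictionaries and the join is a natural join on shared column names (both choices are illustrative):

```python
def natural_join(r, s):
    """Join two relations (lists of dict rows) on their shared column names."""
    out = []
    for a in r:
        for b in s:
            shared = a.keys() & b.keys()
            if all(a[c] == b[c] for c in shared):
                out.append({**a, **b})
    return out


def as_set(relation):
    """Order-independent view of a relation, for comparing results."""
    return {frozenset(row.items()) for row in relation}
```

Comparing `as_set(natural_join(R, S))` with `as_set(natural_join(S, R))`, or the two groupings of a three-way join, gives equal sets, so the optimizer is free to pick whichever order is cheapest.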
  21. 21. Join Elimination <ul><li>Generally follows the process </li></ul><ul><ul><li>If a table is joined but none of its attributes (fields) are projected (in the select-list), consider it for join removal </li></ul></ul><ul><ul><li>If the table being joined is redundant due to PK/FK integrity constraints, it can be considered for removal. </li></ul></ul><ul><ul><li>If the table being joined is redundant due to unique constraints, it can be considered for removal. </li></ul></ul><ul><li>If all conditions are met, it can be removed from the query plan </li></ul>
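The checklist above can be written as a single predicate. The field names are hypothetical, and the `filtered` check is an added assumption (a table that is filtered on still affects the result even if nothing from it is projected); the slides do not spell out the exact rules Oracle applies.

```python
from dataclasses import dataclass


@dataclass
class JoinedTable:
    name: str
    projected_columns: int      # columns of this table in the select-list
    filtered: bool              # assumption: referenced elsewhere in the predicate
    joined_on_unique_key: bool  # join column is a PK or unique key of this table
    fk_guarantees_match: bool   # a FK on the other side guarantees a matching row


def can_eliminate(t: JoinedTable) -> bool:
    """True when the join can neither add, remove, nor duplicate rows."""
    return (
        t.projected_columns == 0
        and not t.filtered
        and t.joined_on_unique_key
        and t.fk_guarantees_match
    )
```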
  22. 22. Partition Elimination <ul><li>Based on the concept of constraint exclusion </li></ul><ul><li>Drops (prunes) all partitions which could not possibly be used in our plan </li></ul><ul><ul><li>If you have range-based partitions for Q1, Q2, Q3, and Q4 and query data BETWEEN '01-JAN-09' AND '15-JAN-09', the optimizer knows that the data could not exist in partitions Q2, Q3, and Q4 by excluding them based on their constraint (Q2 &gt;= '2009-04-01' and Q2 &lt; '2009-07-01', ...) </li></ul></ul>
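Constraint exclusion for range partitions reduces to an interval-overlap test. The sketch below assumes calendar-quarter partitions for 2009 with an inclusive lower bound and exclusive upper bound; the `PARTITIONS` table and `prune` function are illustrative names.

```python
from datetime import date

# Assumed partition bounds: [start, end) for each calendar quarter of 2009.
PARTITIONS = {
    "Q1": (date(2009, 1, 1), date(2009, 4, 1)),
    "Q2": (date(2009, 4, 1), date(2009, 7, 1)),
    "Q3": (date(2009, 7, 1), date(2009, 10, 1)),
    "Q4": (date(2009, 10, 1), date(2010, 1, 1)),
}


def prune(partitions, lo, hi):
    """Keep only partitions whose [start, end) range overlaps the
    inclusive query range [lo, hi]; everything else cannot hold a match."""
    return [
        name
        for name, (start, end) in partitions.items()
        if start <= hi and lo < end
    ]
```

A query for 1 to 15 January 2009 keeps only `Q1`; a range that straddles the end of March keeps `Q1` and `Q2`.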
  23. 23. Selection Pushdown <ul><li>To try to reduce the cost of a join, see if the number of rows in R or S can be reduced prior to the join by pushing down parts of the predicate to those individual tables. </li></ul><ul><ul><li>One area where this is beneficial is in nested loop joins, whose basic cost is determined by cardinality(R) * cardinality(S) </li></ul></ul><ul><ul><li>Another area where this is beneficial is in hash joins, where the server can reduce the cardinality of one of the tables significantly. This results in less CPU and memory required to build the hash table. </li></ul></ul>
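The nested-loop case can be made concrete by counting row comparisons with and without pushdown. The table shapes and column names (`id`, `r_id`, `foo`, `val`) are assumptions for the example.

```python
def join_then_filter(r, s):
    """Naive plan: compare every pair, apply the filter during the join."""
    comparisons, out = 0, []
    for a in r:
        for b in s:
            comparisons += 1
            if a["id"] == b["r_id"] and a["foo"] == "bar":
                out.append((a["id"], b["val"]))
    return out, comparisons


def filter_then_join(r, s):
    """Pushdown plan: shrink R with the filter first, then join."""
    reduced = [a for a in r if a["foo"] == "bar"]
    comparisons, out = 0, []
    for a in reduced:
        for b in s:
            comparisons += 1
            if a["id"] == b["r_id"]:
                out.append((a["id"], b["val"]))
    return out, comparisons
```

With 100 rows in R (10 of them matching `foo = 'bar'`) and 200 rows in S, the naive plan performs 100 * 200 = 20,000 comparisons while the pushdown plan performs 10 * 200 = 2,000 and returns the same rows.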
  24. 24. Build an Execution Plan <ul><li>Based on the best query plan, transform the plan into an execution plan for executing the query. </li></ul>
  25. 25. Execute the Plan <ul><li>For each node in the execution plan, perform the respective operation. </li></ul><ul><ul><li>Table Scan </li></ul></ul><ul><ul><li>Index Scan </li></ul></ul><ul><ul><li>Join </li></ul></ul><ul><ul><li>Qualification </li></ul></ul><ul><ul><li>... </li></ul></ul>
  26. 26. Execute the Plan (Index Node) <ul><li>Use the index to find all rows where foo = 'bar'. </li></ul><ul><li>Open the index </li></ul><ul><li>Perform a search on the B*Tree </li></ul><ul><li>Once the index entry is found, locate the data in the heap (table data) using the ROWID. </li></ul>
  27. 27. Perform B*Tree Search <ul><li>FUNCTION btree_search (x, k): i := 1; WHILE i &lt;= n[x] AND k &gt; key<sub>i</sub>[x] DO i := i + 1; END WHILE; IF i &lt;= n[x] AND k = key<sub>i</sub>[x] THEN RETURN (x, i); END IF; IF leaf[x] THEN RETURN NIL; ELSE kcbget(c<sub>i</sub>[x]); RETURN btree_search(c<sub>i</sub>[x], k); END IF; </li></ul>
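A runnable rendering of the pseudocode above, using zero-based indexes. The `Node` class is a hypothetical in-memory stand-in: following a child reference takes the place of the `kcbget` call that would fetch the child block through the buffer cache.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    keys: List[int]
    children: List["Node"] = field(default_factory=list)

    @property
    def leaf(self) -> bool:
        return not self.children


def btree_search(x: Node, k: int) -> Optional[Tuple[Node, int]]:
    """Return (node, index) for key k, or None if k is absent."""
    i = 0
    # Skip past every key smaller than k.
    while i < len(x.keys) and k > x.keys[i]:
        i += 1
    if i < len(x.keys) and k == x.keys[i]:
        return x, i
    if x.leaf:
        return None
    # In the server this is where the child block would be read in.
    return btree_search(x.children[i], k)
```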
  28. 28. Retrieving a Block (kcbg*()) <ul><li>Check the Buffer Cache </li></ul><ul><li>If the block isn't in the buffer cache, read it from disk (sfrfb()-System File Read File Block) </li></ul><ul><li>If the block is in the buffer cache, and no one has altered it without committing, use it. </li></ul><ul><li>If the block is in the buffer cache, and someone has altered it without committing, build a before image of the block from UNDO and use it (kcbgtcr()-Kernel Cache Buffer GeT for Consistent Read). </li></ul>
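The three cases on this slide can be sketched from the point of view of a reading session. This is a deliberately simplified model: the class and attribute names are hypothetical, blocks are dictionaries, and SCNs, latching, and the writer's own view of its changes are ignored.

```python
class BufferCache:
    """Toy buffer cache that serves readers a consistent image of a block."""

    def __init__(self, disk, undo):
        self.disk = disk      # block_id -> contents on disk
        self.undo = undo      # block_id -> {column: before-image value}
        self.buffers = {}     # block_id -> [contents, has_uncommitted_change]
        self.physical_reads = 0

    def get(self, block_id):
        if block_id not in self.buffers:
            # Cache miss: read the block from disk.
            self.physical_reads += 1
            self.buffers[block_id] = [dict(self.disk[block_id]), False]
        contents, uncommitted = self.buffers[block_id]
        if not uncommitted:
            return contents
        # Someone changed the block without committing: rebuild the
        # before image by applying undo to a copy of the current buffer.
        return {**contents, **self.undo[block_id]}

    def update(self, block_id, column, value):
        self.get(block_id)  # make sure the block is cached
        contents = self.buffers[block_id][0]
        self.undo.setdefault(block_id, {}).setdefault(column, contents[column])
        contents[column] = value
        self.buffers[block_id][1] = True
```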
  29. 29. Updating the Data <ul><li>Once the data has been found, update it </li></ul><ul><li>Acquire a row-level lock by placing an entry in the interested transaction list (ITL). If the row is already locked, wait (or don't wait depending on what the user requested) </li></ul><ul><li>The server generates an UNDO/REDO record containing the change vector for the record (ktugur()-Kernel Transaction Undo Generate Undo and Redo) </li></ul><ul><ul><li>UNDO contains the column foo with value bar </li></ul></ul><ul><ul><li>REDO contains the column foo with value baz </li></ul></ul><ul><li>The Oracle server returns a packet to the client regarding success/failure of the statement. </li></ul>
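The undo and redo entries for the example statement can be sketched as before and after images of the changed column. The list-based logs and function names are assumptions; real change vectors carry much more than a column name and value.

```python
def apply_update(row, column, new_value, undo_log, redo_log):
    """Record the before image (undo) and after image (redo), then change the row."""
    undo_log.append((column, row[column]))
    redo_log.append((column, new_value))
    row[column] = new_value


def rollback(row, undo_log):
    """Restore before images in reverse order of the changes."""
    while undo_log:
        column, old_value = undo_log.pop()
        row[column] = old_value
```

For the slide's UPDATE, the undo entry holds `('foo', 'bar')` and the redo entry holds `('foo', 'baz')`: undo is enough to reverse the change, and redo is enough to replay it.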
  30. 30. Committing the Data <ul><li>The client sends a commit message to the server, which the server processes as before. Because it's a command, it does not have to go through query planning. </li></ul><ul><li>Flush the REDO/UNDO data to disk up to the point of the commit </li></ul><ul><li>Increment the SCN </li></ul><ul><li>Fast Commit and Delayed Block Cleanout </li></ul><ul><ul><li>In Fast Commit mode, Oracle does not clean the ITL it used on the last transaction as a part of commit. </li></ul></ul><ul><ul><ul><li>The next request to read the block will check whether the transactions in the ITL are still in progress; if they are not, there's no reason to get a consistent read version of the block. </li></ul></ul></ul><ul><ul><ul><li>If the next request is DML, it will itself perform ITL cleanup for the old transaction. </li></ul></ul></ul>
  31. 31. Fetching the Data <ul><li>The client sends the server a fetch request for N rows from the cursor. </li></ul><ul><li>The server marshals the data to be sent over the network </li></ul><ul><li>The server sends as many packets as are necessary to contain the data </li></ul><ul><li>The client reads the data, unmarshals it, performs any necessary encoding changes, and returns it to the application. </li></ul>
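The fetch round trip can be sketched as "take N rows, serialize, deserialize". The length-prefixed layout below is a made-up format for illustration, not Oracle's row encoding, and all values are treated as strings.

```python
import struct
from itertools import islice


def fetch(cursor, n):
    """Server side: pull at most n rows from the open cursor."""
    return list(islice(cursor, n))


def marshal(rows):
    """Encode rows as: column count, then (length, UTF-8 bytes) per value."""
    out = bytearray()
    for row in rows:
        out += struct.pack(">H", len(row))
        for value in row:
            data = value.encode("utf-8")
            out += struct.pack(">H", len(data)) + data
    return bytes(out)


def unmarshal(buf):
    """Client side: decode the byte stream back into tuples of strings."""
    rows, pos = [], 0
    while pos < len(buf):
        (ncols,) = struct.unpack_from(">H", buf, pos)
        pos += 2
        row = []
        for _ in range(ncols):
            (size,) = struct.unpack_from(">H", buf, pos)
            pos += 2
            row.append(buf[pos:pos + size].decode("utf-8"))
            pos += size
        rows.append(tuple(row))
    return rows
```

Each call to `fetch` advances the cursor, so a second request picks up where the first one stopped.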
  32. 32. <ul><li>Questions? </li></ul>
  33. 33. Thank You <ul><li>Fill out evaluation </li></ul><ul><ul><li>The Life of a Query </li></ul></ul><ul><ul><li>Session #399 </li></ul></ul><ul><li>Further Information </li></ul><ul><ul><li>[email_address] </li></ul></ul>