Parsing strange v3


  1. Parsing Strange: URL to SQL to HTML
     Hal Stern
     Headshot by Richard Stevens
  2. Why Do You Care?
     • Database performance = user experience
     • A little database expertise goes a long way
     • Taxonomies for more than sidebar lists
     • Custom post types
     • WordPress as a powerful CMS >> blog
     • Change default behaviors
     • Defy the common wisdom
     • Integrate other content sources/filters
     WordCamp NYC 2010
  3. Flow of Control
     • Web server URL manipulation: real file or permalink URL?
     • URL to query variables: what to display? Tag? Post? Category?
     • Query variables to SQL generation: how exactly to get that content?
     • Template file selection: how will content be displayed?
     • Content manipulation
  4. Whose File Is This?
     • User URL request passed to web server
     • Web server checks the .htaccess file in the WP install root
     • Other .htaccess files may interfere
     • Basic rewriting rules: if the requested file or directory doesn’t exist, start WordPress via index.php

         <IfModule mod_rewrite.c>
         RewriteEngine On
         RewriteBase /whereyouputWordPress/
         RewriteCond %{REQUEST_FILENAME} !-f
         RewriteCond %{REQUEST_FILENAME} !-d
         RewriteRule . /whereyouputWordPress/index.php [L]
         </IfModule>
  5. Example Meta Fail: 404 Not Found
     • Accessing a broken image URL gives unintended results: no 404 page! Example: myblog/images/not-a-pic.jpg
     • Web server can’t find the file, assumes it’s a permalink, hands it to WP
     • WP can’t interpret it, so defaults to the home page
     • Directory layout:
         myblog/
         myblog/wp-content (etc)
         myblog/images
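
The two RewriteCond tests, and the broken-image case, can be sketched in plain PHP. This only illustrates the decision the web server makes, not how Apache implements it; `ps_route_request()` and its parameters are hypothetical names.

```php
<?php
// Hypothetical helper mirroring the two RewriteCond tests:
// real files and directories are served directly, everything else goes to WordPress.
function ps_route_request(string $docRoot, string $requestPath): string
{
    $target = rtrim($docRoot, '/') . '/' . ltrim($requestPath, '/');
    if (is_file($target) || is_dir($target)) {
        return $requestPath;   // the -f or -d test succeeds: no rewrite
    }
    return '/index.php';       // no such file or directory: start WordPress
}
```

Under this logic a request for a missing image such as `/images/not-a-pic.jpg` is indistinguishable from a permalink, which is why it ends up in WordPress rather than producing the web server's own 404.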
  6. What Happens Before The Loop
     • Parse URL into a query
     • Set conditionals & select templates
     • Execute the query & cache results
     • Run the Loop:

         <?php
         if (have_posts()) :
           while (have_posts()) :
             the_post();
             // loop content
           endwhile;
         endif;
         ?>
  7. Examining the Query String
     • SQL passed to MySQL is in the WP_Query object’s request element
     • Brute force: edit the theme’s footer.php to see the main loop’s query for the displayed page

         <?php
         global $wp_query;
         echo "SQL for this page ";
         echo $wp_query->request;
         echo "<br>";
         ?>
  8. “Home Page” Query Deconstruction

         SELECT SQL_CALC_FOUND_ROWS wp_posts.*
         FROM wp_posts
         WHERE 1=1
           AND wp_posts.post_type = 'post'
           AND (wp_posts.post_status = 'publish' OR wp_posts.post_status = 'private')
         ORDER BY wp_posts.post_date DESC
         LIMIT 0, 10

     • Get all fields from the posts table, but limit the number of returned rows
     • Only get posts, and only those that are published or private to the user
     • Sort the results by date in descending order
     • Return up to 10 rows, starting at record 0
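
The `LIMIT 0, 10` clause is where paging shows up in the SQL: the first number is the offset, the second the page size. A minimal sketch of that arithmetic, assuming 10 posts per page; `ps_limit_clause()` is a hypothetical helper, not a WordPress function.

```php
<?php
// Hypothetical helper: build the LIMIT clause for a given page of results.
function ps_limit_clause(int $page, int $perPage): string
{
    $page   = max(1, $page);             // page numbers start at 1
    $offset = ($page - 1) * $perPage;    // rows to skip before this page
    return "LIMIT {$offset}, {$perPage}";
}

echo ps_limit_clause(1, 10), "\n";  // LIMIT 0, 10  (the home page query)
echo ps_limit_clause(3, 10), "\n";  // LIMIT 20, 10
```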
  9. Query Parsing
     • parse_request() method of the WP class extracts query variables from the URL
     • Execute rewrite rules
     • Pick off ?p=67 style HTTP GET variables
     • Match permalink structure
     • Match keywords like “author” and “tag”
     • Match custom post type slugs
  10. Query Variables to SQL
      • Query type: post by title, posts by category or tag, posts by date
      • Variables for the query
      • Slug values for category/tags
      • Month/day numbers
      • Explicit variable values
      • post_type variable has been around for a while; CPT queries fill in new values
  11. Simple Title Slug Parsing
      • Rewrite matches root of permalink, extracts tail of URL as a title slug
      • Example URL: /2010/premio-sausage

          SELECT wp_posts.* FROM wp_posts
          WHERE 1=1
            AND YEAR(wp_posts.post_date) = '2010'
            AND wp_posts.post_name = 'premio-sausage'
            AND wp_posts.post_type = 'post'
          ORDER BY wp_posts.post_date DESC
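
The step from URL to query variables can be sketched as a table of regular expressions mapped to query-string templates, which is the general shape WordPress uses for its rewrite rules. The two rules and the `ps_parse_path()` helper below are made up for illustration; the real rule set is generated from the permalink settings and is much larger.

```php
<?php
// Illustrative rule table: regex => query-string template.
$rules = [
    '^tag/([^/]+)/?$'        => 'tag=$matches[1]',
    '^([0-9]{4})/([^/]+)/?$' => 'year=$matches[1]&name=$matches[2]',
];

// Hypothetical helper: return the query variables for the first matching rule.
function ps_parse_path(string $path, array $rules): array
{
    $path = trim($path, '/');
    foreach ($rules as $regex => $template) {
        if (preg_match('#' . $regex . '#', $path, $matches)) {
            // Replace each $matches[N] placeholder with the captured group.
            $query = preg_replace_callback(
                '/\$matches\[(\d+)\]/',
                fn ($m) => urlencode($matches[(int) $m[1]] ?? ''),
                $template
            );
            parse_str($query, $vars);
            return $vars;
        }
    }
    return [];   // nothing matched
}
```

With these rules, `/2010/premio-sausage` yields `year=2010` and `name=premio-sausage`, which line up with the `YEAR(...)` and `post_name` conditions in the generated SQL.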
  12. CPT Query Variables
      • Register CPT with a custom query variable: 'query_var' => 'ebay'
      • Variable works in URLs like built-ins
      • Variable value matches CPT title slug
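
A minimal sketch of registering a custom post type with its own query variable. `register_post_type()` and the `init` action are WordPress APIs; the "ebay" type, its label, and the `ps_` function names are assumptions for this example. The registration is guarded so the file also loads outside WordPress.

```php
<?php
// Arguments for a hypothetical "ebay" custom post type.
function ps_ebay_cpt_args(): array
{
    return [
        'label'     => 'eBay Listings',
        'public'    => true,
        'query_var' => 'ebay',   // lets ?ebay=<title-slug> select a single listing
    ];
}

function ps_register_ebay_cpt(): void
{
    register_post_type('ebay', ps_ebay_cpt_args());
}

// Hook in only when running inside WordPress.
if (function_exists('add_action')) {
    add_action('init', 'ps_register_ebay_cpt');
}
```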
  13. WordPress Meta Data
      • Common DB mechanics for all meta data
      • Categories, tags, custom taxonomies
      • Normalized down to 3 tables
      • Terms: word strings and their slugs
      • Taxonomies: collections of terms
      • Relationships: terms attached to posts
      • It’s so simple it gets really complex. Really.
  14. Graphs and JOIN Operations
      • WordPress maps tags and categories 1:N to posts (each term in many posts)
      • You need to punch MySQL to handle this
      • INNER JOIN builds intermediate tables on common key values
      • Following a link in a graph is equivalent to an INNER JOIN on tables of linked items
  15. WordPress Taxonomy Tables
      • Term relationships table maps N terms to each post
      • Term taxonomy maps N terms to each taxonomy
      • Term table has slugs for URL mapping
      • Tables and key columns:
          wp_posts:               ID, …, post_date, …, post_content
          wp_term_relationships:  object_id, term_taxonomy_id
          wp_term_taxonomy:       term_taxonomy_id, term_id, taxonomy, description
          wp_terms:               term_id, name, slug
  16. Taxonomy Lookup
      • Example URL: /tag/premio

          SELECT SQL_CALC_FOUND_ROWS wp_posts.*
          FROM wp_posts
          INNER JOIN wp_term_relationships
            ON (wp_posts.ID = wp_term_relationships.object_id)
          INNER JOIN wp_term_taxonomy
            ON (wp_term_relationships.term_taxonomy_id = wp_term_taxonomy.term_taxonomy_id)
          INNER JOIN wp_terms
            ON (wp_term_taxonomy.term_id = wp_terms.term_id)
          WHERE 1=1
            AND wp_term_taxonomy.taxonomy = 'post_tag'
            AND wp_terms.slug IN ('premio')
            AND wp_posts.post_type = 'post'
            AND (wp_posts.post_status = 'publish' OR wp_posts.post_status = 'private')
          GROUP BY wp_posts.ID
          ORDER BY wp_posts.post_date DESC
          LIMIT 0, 10
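
To see why following graph links is the same as an INNER JOIN, the three taxonomy tables can be walked by hand over in-memory arrays. The rows and the `ps_post_ids_for_tag()` helper are invented for illustration; MySQL does the equivalent work far more efficiently with indexes.

```php
<?php
// Made-up rows shaped like the three taxonomy tables.
$terms = [
    ['term_id' => 1, 'name' => 'Premio',  'slug' => 'premio'],
    ['term_id' => 2, 'name' => 'Sausage', 'slug' => 'sausage'],
];
$termTaxonomy = [
    ['term_taxonomy_id' => 10, 'term_id' => 1, 'taxonomy' => 'post_tag'],
    ['term_taxonomy_id' => 11, 'term_id' => 2, 'taxonomy' => 'post_tag'],
    ['term_taxonomy_id' => 12, 'term_id' => 1, 'taxonomy' => 'category'],
];
$termRelationships = [
    ['object_id' => 100, 'term_taxonomy_id' => 10],
    ['object_id' => 101, 'term_taxonomy_id' => 10],
    ['object_id' => 102, 'term_taxonomy_id' => 11],
    ['object_id' => 103, 'term_taxonomy_id' => 12],
];

// Hypothetical helper: slug -> term -> term_taxonomy -> relationships -> post IDs.
function ps_post_ids_for_tag(string $slug, array $terms, array $tt, array $rel): array
{
    $ids = [];
    foreach ($terms as $term) {
        if ($term['slug'] !== $slug) {
            continue;
        }
        foreach ($tt as $tax) {
            if ($tax['term_id'] !== $term['term_id'] || $tax['taxonomy'] !== 'post_tag') {
                continue;
            }
            foreach ($rel as $link) {
                if ($link['term_taxonomy_id'] === $tax['term_taxonomy_id']) {
                    $ids[$link['object_id']] = true;   // keyed by ID, so each post appears once
                }
            }
        }
    }
    return array_keys($ids);
}
```

Each nested loop corresponds to one `INNER JOIN ... ON` clause, and keying the result by post ID plays the role of `GROUP BY wp_posts.ID`.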
  17. More on Canonical URLs
      • Canonical URLs improve SEO
      • WordPress is really good about generating 301 redirects for non-standard URLs
      • Example: when a URL doesn’t appear to match a permalink, WordPress guesses the intended post
      • Uses “LIKE title%” in the WHERE clause
      • Matches “title” as an initial substring, with % as the wildcard
  18. Modifying the Query
      • Brute force isn’t necessarily good
      • Using query_posts() ignores all previous parsing, runs a new SQL query
      • Filter query_vars
      • Change default parsing (convert any day to a week’s worth of posts, for example)
      • Actions parse_query & parse_request
      • Access WP_Query object before execution
      • is_xx() conditionals are already set
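
A minimal sketch of adjusting the parsed query instead of re-running it with query_posts(). The `query_vars` filter and `parse_query` action are the WordPress hooks named on the slide; the `ps_region` variable, the page size of 25, and the function names are assumptions for this example.

```php
<?php
// Filter callback: allow an extra public query variable.
// "ps_region" is a made-up variable name.
function ps_add_query_var(array $vars): array
{
    $vars[] = 'ps_region';
    return $vars;
}

// Action callback: $query is the WP_Query object, after parsing but before the SQL runs.
// Note that parse_query fires for every WP_Query, not only the main loop.
function ps_more_posts_on_tag_archives($query): void
{
    if ($query->is_tag) {                    // conditionals are already set here
        $query->set('posts_per_page', 25);   // adjust the query rather than replace it
    }
}

// Register only when loaded inside WordPress.
if (function_exists('add_filter')) {
    add_filter('query_vars', 'ps_add_query_var');
    add_action('parse_query', 'ps_more_posts_on_tag_archives');
}
```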
  19. SQL Generation Filters
      • posts_where: more explicit control over query variable to SQL grammar mapping
      • posts_join: add or modify JOIN operations for other graph relationships
      • Many other filters: change grouping of results, change ordering of results
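
A minimal sketch of a `posts_where` callback. The filter and the `$wpdb->posts` table-name property are WordPress APIs; the rule itself (only show posts with comments on tag archives) and the function name are invented for illustration.

```php
<?php
// Filter callback for posts_where: $where is the SQL fragment built so far
// (it begins with " AND ..."), $query is the WP_Query being run.
function ps_only_discussed_posts(string $where, $query): string
{
    global $wpdb;   // WordPress database object; $wpdb->posts holds the posts table name
    if ($query->is_tag) {
        $where .= " AND {$wpdb->posts}.comment_count > 0";
    }
    return $where;
}

// Register only when loaded inside WordPress.
if (function_exists('add_filter')) {
    // Priority 10, and 2 accepted arguments so the callback also receives $query.
    add_filter('posts_where', 'ps_only_discussed_posts', 10, 2);
}
```

The appended condition contains no user input; anything derived from the request would need to go through `$wpdb->prepare()` first.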
  20. Custom Post Types
      • Change SQL WHERE clause on post type: wp_posts.post_type = 'ebay'
      • Add new rewrite rules for URL parsing similar to category & tag
      • Set slug in CPT registration array: 'rewrite' => array("slug" => "ebay"),
      • Watch out for competing, overwritten or unflushed rewrite entries:

          <?php
          echo "<pre>";
          print_r(get_option('rewrite_rules'));
          echo "</pre>";
          ?>
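
Dumping every rewrite rule produces a long list, so it can help to narrow the output to one slug. A small sketch, assuming the stored rules are an array keyed by regex; `ps_rules_for_slug()` is a hypothetical helper.

```php
<?php
// Hypothetical helper: keep only the rewrite rules whose regex starts with "<slug>/".
function ps_rules_for_slug(array $rules, string $slug): array
{
    return array_filter(
        $rules,
        fn ($regex) => strpos((string) $regex, $slug . '/') === 0,
        ARRAY_FILTER_USE_KEY
    );
}

// Inside WordPress, print the rules that apply to the "ebay" slug.
if (function_exists('get_option')) {
    $rules = get_option('rewrite_rules');   // may not be an array if rules were never generated
    print_r(ps_rules_for_slug(is_array($rules) ? $rules : [], 'ebay'));
}
```

An empty result right after registering the post type usually points at unflushed rules; calling `flush_rewrite_rules()` once, or re-saving the permalink settings, regenerates them.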
  21. Applications
      • Stylized listings
      • Category sorted alphabetically
      • Use posts as listings of resources (jobs, clients, events): good CPT application
      • Custom URL slugs
      • Add rewrite rules to match slug and set query variables
      • Joining other social graphs
      • Suggested/related content
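
The alphabetical category listing can be sketched with the `posts_orderby` filter, which receives the ORDER BY clause without the keyword. The filter is a WordPress hook; the function name and the choice to apply it to every category archive are assumptions.

```php
<?php
// Filter callback for posts_orderby: sort category archives by title instead of date.
function ps_alphabetical_categories(string $orderby, $query): string
{
    global $wpdb;   // $wpdb->posts holds the posts table name
    if ($query->is_category) {
        return "{$wpdb->posts}.post_title ASC";
    }
    return $orderby;   // leave every other query untouched
}

// Register only when loaded inside WordPress.
if (function_exists('add_filter')) {
    add_filter('posts_orderby', 'ps_alphabetical_categories', 10, 2);
}
```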
  22. Template File Selection
      • is_x() conditionals set in query parsing
      • Used to drive template selection
      • is_tag() looks for tag-slug, tag-id, then tag
      • Full search hierarchy in Codex
      • template_redirect action
      • Called in the template loader
      • Add actions to override defaults
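
The tag lookup order can be sketched as a list of candidate file names tried in sequence. Both helpers are hypothetical and only model the order given on the slide; the real hierarchy has further fallbacks (such as archive.php) before index.php.

```php
<?php
// Hypothetical helper: candidate templates for a tag archive, most specific first.
function ps_tag_template_candidates(string $slug, int $id): array
{
    return ["tag-{$slug}.php", "tag-{$id}.php", 'tag.php'];
}

// Hypothetical helper: pick the first candidate that exists in the theme.
function ps_pick_template(array $candidates, array $themeFiles): string
{
    foreach ($candidates as $file) {
        if (in_array($file, $themeFiles, true)) {
            return $file;
        }
    }
    return 'index.php';   // last resort; intermediate fallbacks are skipped in this sketch
}
```

For a theme that ships `tag.php` and `tag-7.php`, a tag with slug `premio` and ID 7 would resolve to `tag-7.php` under this model, because no `tag-premio.php` exists.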
  23. HTML Generation
      • Done in the_post() method
      • Raw content retrieved from MySQL
      • Shortcodes interpreted
      • CSS applied
      • Some caching plugins generate and store HTML, so YMMV
  24. Why Do You Care?
      • User experience improvement
      • JOINs are expensive
      • Large post table & repetitive SELECTs = slow
      • Running query once keeps cache warm
      • Category, permalink, title slug choices matter
      • More CMS, less “blog”
      • Alphabetical sort
      • Adding taxonomy/social graph elements
  25. Resources
      • Core files where SQL stuff happens: query.php, post.php, canonical.php, rewrite.php
      • Template loader search path
  26. Contact
      Hal Stern
      @freeholdhal
      Other Projects: