Parsing strange v1.1


WordCamp Chicago developer track talk on "URL to SQL to HTML", explaining how WordPress URL parsing and content selection works.

Published in: Technology
Slide notes
  • The LIMIT clause caps the number of rows returned, so you don’t tax MySQL with immense queries (SMOF has 600 post entries). SQL_CALC_FOUND_ROWS asks MySQL to also count how many rows would have matched without the LIMIT. WHERE 1=1 is for building compound WHERE clauses: every later condition is appended with AND, so there’s no degenerate case. Type is post versus revision; status is publish or private versus draft or trash.
  • Look at rewrite.php and canonical.php (more on that later). The default terms “tag” and “category” can be changed in the Settings/Permalinks section of the Dashboard.
  • Pages and posts have separate namespaces. What about parent pages? In this example the permalink structure is %year%/%title%.
  • Three joins are needed to connect the related tables:
      • Get all of the terms that have a slug of “premio”, and find out what taxonomies they’re in.
      • Keep the taxonomy entries that are post tags, and collect their term_taxonomy ids (the post tags with slug “premio”).
      • Get all of the posts associated with those ids in term_relationships.
      • Order the final table by post date, starting with the most recent (0) and getting 10 of them.
  • We don’t want multiple URLs pointing to the same page, so canonical parsing cleans them up.

    1. Parsing Strange: URL to SQL to HTML
         Hal Stern
         (headshot by Richard Stevens)
    2. Why Do You Care?
         • Database performance = user experience
             • A little database expertise goes a long way
             • Use taxonomies for more than sidebar lists
         • WordPress is a powerful CMS
             • Change default behaviors
             • Defy the common wisdom
             • Integrate other content sources/filters
    3. Disclaimers
         • I’m somewhat social, for Jersey
         • I’m (old) old school
         • If using PHP echo gives you hives… take a Benadryl now
         • If “INNER JOIN” makes you giggle, you’re in the wrong session/conference/fantasy
         • I suck at art and design
    4. Flow of Control
         • Web server URL manipulation
             • Real file or permalink URL?
         • URL to query variables
             • What to display? Tag? Post? Category?
         • Query variables to SQL generation
             • How exactly to get that content?
         • Template file selection
             • How will content be displayed?
         • Content manipulation
    5. Whose File Is This?
         • User URL request passed to web server
         • Web server checks the .htaccess file
             • WP install root
             • Other .htaccess files may interfere
         • Basic rewriting rules: if the requested file or directory doesn’t exist, start WordPress via index.php
         <IfModule mod_rewrite.c>
         RewriteEngine On
         RewriteBase /whereyouputWordPress/
         RewriteCond %{REQUEST_FILENAME} !-f
         RewriteCond %{REQUEST_FILENAME} !-d
         RewriteRule . /whereyouputWordPress/index.php [L]
         </IfModule>
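
The two RewriteCond tests and the RewriteRule on this slide amount to a small decision procedure. The following is a minimal sketch in plain PHP, not how Apache implements it: the filesystem is modelled as two arrays, and the function name `route_request` and the sample paths are assumptions made up for illustration.

```php
<?php
// Hypothetical model of the rewrite rules: real files and directories are
// served directly, everything else is handed to WordPress's index.php.
function route_request(string $path, array $files, array $dirs): string
{
    if (in_array($path, $files, true)) {   // the "!-f" condition fails: a real file
        return $path;
    }
    if (in_array($path, $dirs, true)) {    // the "!-d" condition fails: a real directory
        return $path;
    }
    return '/index.php';                   // both conditions hold: start WordPress
}

// Made-up filesystem contents for the demo.
$files = ['/wp-content/themes/demo/style.css', '/index.php'];
$dirs  = ['/wp-admin'];

echo route_request('/wp-content/themes/demo/style.css', $files, $dirs), "\n";
echo route_request('/2010/premio-sausage', $files, $dirs), "\n";
```

The stylesheet request is answered by the web server alone; the permalink-style request never matches a file, so it reaches WordPress for parsing.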
    6. What Happens Before The Loop
         • Parse URL into a query
         • Set conditionals & select templates
         • Execute the query & cache results
         • Run the Loop:
         <?php
         if (have_posts()) :
             while (have_posts()) :
                 the_post();
                 // loop content
             endwhile;
         endif;
         ?>
    7. Examining the Query String
         • SQL passed to MySQL is in the WP_Query object’s request element
         • Brute force: edit the theme’s footer.php to see the main loop’s query for the displayed page
         <?php
         // $wp_query holds the main query for the page being rendered
         global $wp_query;
         echo "SQL for this page: ";
         echo $wp_query->request;
         echo "<br>";
         ?>
    8. “Home Page” Query Deconstruction
         SELECT SQL_CALC_FOUND_ROWS wp_posts.*
         FROM wp_posts
         WHERE 1=1
           AND wp_posts.post_type = 'post'
           AND (wp_posts.post_status = 'publish' OR wp_posts.post_status = 'private')
         ORDER BY wp_posts.post_date DESC
         LIMIT 0, 10
         • Get all fields from the posts table; SQL_CALC_FOUND_ROWS also counts the total matches, so paging still works when LIMIT caps the returned rows
         • Only get posts, and those that are published or private to the user
         • Sort the results by date in descending order
         • Return results starting at record 0, up to 10 rows
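
To see why the query opens with WHERE 1=1, it helps to build one by hand. This is a minimal sketch under stated assumptions, not WordPress's own query builder: `build_posts_query` is a hypothetical helper, and the page-to-offset arithmetic is the usual paging convention rather than something shown on the slide.

```php
<?php
// Hypothetical helper: every condition is appended as " AND ...", so the
// list may be empty without producing invalid SQL after "WHERE 1=1".
function build_posts_query(array $conditions, int $page, int $perPage): string
{
    $where = '';
    foreach ($conditions as $condition) {
        $where .= ' AND ' . $condition;
    }
    $offset = ($page - 1) * $perPage;   // page 1 starts at record 0
    return 'SELECT SQL_CALC_FOUND_ROWS wp_posts.* FROM wp_posts WHERE 1=1'
        . $where
        . ' ORDER BY wp_posts.post_date DESC'
        . " LIMIT $offset, $perPage";
}

$conditions = [
    "wp_posts.post_type = 'post'",
    "(wp_posts.post_status = 'publish' OR wp_posts.post_status = 'private')",
];
echo build_posts_query($conditions, 1, 10), "\n";   // the home page query
echo build_posts_query([], 3, 10), "\n";            // no conditions, third page
```

With no conditions at all, the statement is still valid SQL, which is the degenerate case the 1=1 guard exists for.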
    9. Query Parsing
         • parse_request() method of the WP class extracts query variables from the URL
         • Execute rewrite rules
         • Pick off ?p=67 style HTTP GET variables
         • Match permalink structure
         • Match keywords like “author” and “tag”
    10. Query Variables to SQL
         • Query type: post by title, posts by category or tag, posts by date
         • Variables for the query
             • Slug values for category/tags
             • Month/day numbers
             • Explicit variable values: ?p=67 for post_id
    11. Simple Title Slug Parsing
         • Rewrite matches root of permalink, extracts tail of URL as a title slug
         /2010/premio-sausage
         SELECT wp_posts.*
         FROM wp_posts
         WHERE 1=1
           AND YEAR(wp_posts.post_date) = '2010'
           AND wp_posts.post_name = 'premio-sausage'
           AND wp_posts.post_type = 'post'
         ORDER BY wp_posts.post_date DESC
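
The step from URL to query variables can be sketched as an ordered list of patterns where the first match wins. This is an illustrative model only: the three patterns and the function `parse_path` are assumptions, and WordPress's generated rule table is much longer. The variable names `tag`, `category_name`, `year` and `name` are the query variables WordPress uses for these cases.

```php
<?php
// Hypothetical rewrite table for a year-plus-title permalink structure.
// Each pattern maps to the query variables its capture groups fill in.
$rules = [
    '#^tag/([^/]+)/?$#'        => ['tag'],
    '#^category/([^/]+)/?$#'   => ['category_name'],
    '#^([0-9]{4})/([^/]+)/?$#' => ['year', 'name'],
];

function parse_path(string $path, array $rules): array
{
    $path = trim($path, '/');
    foreach ($rules as $pattern => $names) {
        if (preg_match($pattern, $path, $matches)) {
            // First matching rule wins; its captures become query variables.
            return array_combine($names, array_slice($matches, 1));
        }
    }
    return [];   // no rule matched
}

print_r(parse_path('/2010/premio-sausage', $rules));
print_r(parse_path('/tag/premio', $rules));
```

The first call yields a year and a title slug, which correspond to the YEAR(...) and post_name conditions in the SQL on this slide.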
    12. Graphs and JOIN Operations
         • WordPress treats tags and categories as “terms”, mapped 1:N to posts
         • Relational databases aren’t ideal for this
         • INNER JOIN builds intermediate tables on common key values
         • Following a link in a social graph is equivalent to an INNER JOIN on tables of linked items
    13. WordPress Taxonomy Tables
         • Term relationships table maps N:1 terms to each post
         • Term taxonomy maps slugs 1:N to taxonomies
         • Term table has slugs for URL mapping
         wp_posts:               post_id, …, post_date, …, post_content
         wp_term_relationships:  object_id, term_taxonomy_id
         wp_term_taxonomy:       term_taxonomy_id, term_id, taxonomy, description
         wp_terms:               term_id, name, slug
    14. Taxonomy Lookup
         /tag/premio
         SELECT SQL_CALC_FOUND_ROWS wp_posts.*
         FROM wp_posts
         INNER JOIN wp_term_relationships
           ON (wp_posts.ID = wp_term_relationships.object_id)
         INNER JOIN wp_term_taxonomy
           ON (wp_term_relationships.term_taxonomy_id = wp_term_taxonomy.term_taxonomy_id)
         INNER JOIN wp_terms
           ON (wp_term_taxonomy.term_id = wp_terms.term_id)
         WHERE 1=1
           AND wp_term_taxonomy.taxonomy = 'post_tag'
           AND wp_terms.slug IN ('premio')
           AND wp_posts.post_type = 'post'
           AND (wp_posts.post_status = 'publish' OR wp_posts.post_status = 'private')
         GROUP BY wp_posts.ID
         ORDER BY wp_posts.post_date DESC
         LIMIT 0, 10
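
The three joins can be followed by hand on a few rows. The sketch below walks the tables one hop at a time in plain PHP (7.4 or later, for arrow functions). It illustrates what the joins select, not how MySQL executes them. All rows are made up, `posts_for_tag` is a hypothetical name, and the post type and status filters are left out for brevity.

```php
<?php
// Made-up rows standing in for the four tables.
$terms = [
    ['term_id' => 1, 'slug' => 'premio'],
    ['term_id' => 2, 'slug' => 'grill'],
];
$termTaxonomy = [
    ['term_taxonomy_id' => 10, 'term_id' => 1, 'taxonomy' => 'post_tag'],
    ['term_taxonomy_id' => 11, 'term_id' => 1, 'taxonomy' => 'category'],
    ['term_taxonomy_id' => 12, 'term_id' => 2, 'taxonomy' => 'post_tag'],
];
$termRelationships = [
    ['object_id' => 100, 'term_taxonomy_id' => 10],
    ['object_id' => 101, 'term_taxonomy_id' => 11],
    ['object_id' => 102, 'term_taxonomy_id' => 10],
    ['object_id' => 102, 'term_taxonomy_id' => 12],
];
$posts = [
    ['ID' => 100, 'post_date' => '2010-03-01'],
    ['ID' => 101, 'post_date' => '2010-04-01'],
    ['ID' => 102, 'post_date' => '2010-05-01'],
];

function posts_for_tag(string $slug, array $terms, array $termTaxonomy, array $termRelationships, array $posts): array
{
    // wp_terms: which term ids carry this slug?
    $termIds = array_column(
        array_filter($terms, fn($t) => $t['slug'] === $slug),
        'term_id'
    );
    // wp_term_taxonomy: of those, which rows are post tags?
    $taxonomyIds = array_column(
        array_filter($termTaxonomy, fn($r) => $r['taxonomy'] === 'post_tag'
            && in_array($r['term_id'], $termIds, true)),
        'term_taxonomy_id'
    );
    // wp_term_relationships: which objects (posts) point at those rows?
    $postIds = array_unique(array_column(
        array_filter($termRelationships, fn($r) => in_array($r['term_taxonomy_id'], $taxonomyIds, true)),
        'object_id'
    ));
    // wp_posts: fetch them, newest first, at most 10 (ORDER BY ... LIMIT 0, 10).
    $found = array_filter($posts, fn($p) => in_array($p['ID'], $postIds, true));
    usort($found, fn($a, $b) => strcmp($b['post_date'], $a['post_date']));
    return array_slice($found, 0, 10);
}

print_r(array_column(posts_for_tag('premio', $terms, $termTaxonomy, $termRelationships, $posts), 'ID'));
```

Post 101 uses “premio” as a category rather than a tag, so the taxonomy = 'post_tag' condition filters it out. The array_unique call plays the role of GROUP BY wp_posts.ID.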
    15. More on Canonical URLs
         • Canonical URLs improve SEO
         • WordPress is really good about generating 301 redirects for non-standard URLs
         • Example: URL doesn’t appear to match a permalink, so WordPress does prediction
             • Use “LIKE title%” in WHERE clause
             • Matches “title” as initial substring with % wildcard
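
The effect of a “LIKE title%” match can be sketched without a database as a prefix test over stored slugs. This is a stand-in under stated assumptions: `guess_slug` and the slug list are hypothetical. The comparison here is case-sensitive, while MySQL’s LIKE depends on the column collation.

```php
<?php
// Hypothetical stand-in for "WHERE post_name LIKE 'premio%'": return the
// first stored slug that begins with the requested text.
function guess_slug(string $requested, array $slugs): ?string
{
    foreach ($slugs as $slug) {
        if ($requested !== '' && strncmp($slug, $requested, strlen($requested)) === 0) {
            return $slug;
        }
    }
    return null;   // nothing to redirect to; the request stays a 404
}

// Made-up slugs for the demo.
$slugs = ['grilling-season', 'premio-sausage', 'premio-sausage-redux'];
echo guess_slug('premio', $slugs), "\n";
```

A request for a truncated slug is resolved to the first full slug that starts with it, which is the target the 301 redirect would point at.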
    16. Modifying the Query
         • Brute force isn’t necessarily good
             • Using query_posts() ignores all previous parsing, runs a new SQL query
         • Filter query_vars
             • Change default parsing (convert any day to a week’s worth of posts, for example)
         • Actions parse_query & parse_request
             • Access WP_Query object before execution
             • is_xx() conditionals are already set
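
A query_vars filter is the lightest of these hooks: WordPress passes the list of recognised public query variable names, and the callback returns a modified list. The sketch below assumes a made-up variable called `listing_type`, and the function name is hypothetical. The guard lets the file run both inside WordPress and standalone.

```php
<?php
// Callback for the query_vars filter: receives the recognised variable
// names and returns the list with one extra, made-up name appended.
function demo_add_query_var(array $vars): array
{
    $vars[] = 'listing_type';
    return $vars;
}

if (function_exists('add_filter')) {
    // Inside WordPress: hook the callback into URL parsing.
    add_filter('query_vars', 'demo_add_query_var');
} else {
    // Standalone: show the effect on a short sample list.
    print_r(demo_add_query_var(['p', 'name', 'tag']));
}
```

Once the name is on the list, a request such as ?listing_type=jobs survives parsing and can be read back later, for example in a parse_query action.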
    17. SQL Generation Filters
         • posts_where
             • More explicit control over query variable to SQL grammar mapping
         • posts_join
             • Add or modify JOINs for other graph-like relationships
         • Many other filters
             • Change grouping of results
             • Change ordering of results
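
As one example of the “change ordering” case, the posts_orderby filter receives the ORDER BY clause (without the keywords) together with the query object. The sketch below sorts category listings by title. The function name is hypothetical, the default wp_ table prefix is assumed, and the anonymous class is only a stand-in for WP_Query so the file runs outside WordPress.

```php
<?php
// Callback for the posts_orderby filter: swap date order for title order,
// but only when the query is a category listing.
function demo_alphabetical_categories(string $orderby, $query): string
{
    if (is_object($query) && method_exists($query, 'is_category') && $query->is_category()) {
        return 'wp_posts.post_title ASC';   // assumes the default wp_ prefix
    }
    return $orderby;   // leave every other listing alone
}

if (function_exists('add_filter')) {
    // Inside WordPress: priority 10, callback takes 2 arguments.
    add_filter('posts_orderby', 'demo_alphabetical_categories', 10, 2);
} else {
    // Standalone demo with a stand-in for WP_Query.
    $fake = new class {
        public function is_category() { return true; }
    };
    echo demo_alphabetical_categories('wp_posts.post_date DESC', $fake), "\n";
}
```

Filtering the clause keeps the rest of the parsed query intact, unlike calling query_posts() and running a second query.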
    18. Applications
         • Stylized listings
             • Category sorted alphabetically
             • Use posts as listings of resources (jobs, clients, events)
         • Custom URL slugs
             • Add rewrite rules to match slug and set query variables
         • Joining other social graphs
             • Suggested/related content
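
A custom URL slug comes down to one extra rewrite rule that maps a pattern onto query variables. This sketch uses WordPress’s add_rewrite_rule() function. The URL prefix `listings`, the variable `listing_type` and the function name are made up for illustration, and the standalone branch only demonstrates what the pattern captures.

```php
<?php
// Hypothetical rule: /listings/jobs/ -> index.php?listing_type=jobs
const DEMO_RULE_REGEX = '^listings/([^/]+)/?$';
const DEMO_RULE_QUERY = 'index.php?listing_type=$matches[1]';

function demo_register_listing_rule(): void
{
    // 'top' places the rule ahead of the built-in rules.
    add_rewrite_rule(DEMO_RULE_REGEX, DEMO_RULE_QUERY, 'top');
}

if (function_exists('add_action')) {
    // Inside WordPress: register on init. The variable name must also be
    // allowed through the query_vars filter, and the rules flushed once.
    add_action('init', 'demo_register_listing_rule');
} else {
    // Standalone: show which query string the rule would produce.
    preg_match('#' . DEMO_RULE_REGEX . '#', 'listings/jobs', $matches);
    echo str_replace('$matches[1]', $matches[1], DEMO_RULE_QUERY), "\n";
}
```

The rule only sets a query variable. Turning that variable into SQL is still up to a query filter or action.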
    19. Template File Selection
         • is_x() conditionals set in query parsing
             • Used to drive template selection
             • is_tag() looks for tag-slug, tag-id, then tag
             • Full search hierarchy in Codex
         • template_redirect action
             • Called in the template loader
             • Add actions to override defaults
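
The tag lookup order can be sketched as a list of candidate file names tried in turn. Both helper functions are hypothetical, and the theme contents are invented. The archive.php and index.php fallbacks follow the usual template hierarchy and are not stated on the slide.

```php
<?php
// Candidate templates for a tag archive, most specific first.
function tag_template_candidates(string $slug, int $id): array
{
    return ["tag-$slug.php", "tag-$id.php", 'tag.php', 'archive.php', 'index.php'];
}

// Return the first candidate that exists in the theme.
function pick_template(array $candidates, array $themeFiles): string
{
    foreach ($candidates as $file) {
        if (in_array($file, $themeFiles, true)) {
            return $file;
        }
    }
    return 'index.php';   // every theme is expected to ship this file
}

// Made-up theme contents for the demo.
$theme = ['index.php', 'archive.php', 'tag.php', 'single.php'];
echo pick_template(tag_template_candidates('premio', 7), $theme), "\n";
```

Adding a tag-premio.php file to the theme would take over that one tag without touching the generic tag.php template.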
    20. HTML Generation
         • Done in the_post() method
             • Raw content retrieved from MySQL
             • Short codes interpreted
             • CSS applied
         • Some caching plugins generate and store HTML, so YMMV
    21. Why Do You Care?
         • User experience improvement
             • JOINs are expensive
             • Large table, repetitive SELECTs = slow
             • Running query once keeps cache warm
             • Category, permalink, title slug choices matter
         • More CMS, less “blog”
             • Alphabetical sort
             • Adding taxonomy/social graph elements
    22. Blunt Self-Promotion
         • Brad and David are here, too
         • They’re significantly more WP literate than me
         • This slide was more helpful yesterday
    23. Resources
         • Core files where SQL stuff happens
             • query.php
             • post.php
             • canonical.php
         • Template loader search path
    24. Contact
         Hal Stern
         @freeholdhal