mod_rewrite bootcamp, Ohio LInux 2011

4,938 views

Published on

Published in: Technology, Design
0 Comments
2 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
4,938
On SlideShare
0
From Embeds
0
Number of Embeds
2
Actions
Shares
0
Downloads
56
Comments
0
Likes
2
Embeds 0
No embeds

No notes for slide
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • mod_rewrite bootcamp, Ohio LInux 2011

    1. 1. mod_rewrite Boot CampBending mod_rewrite to your will Rich Bowen Apache Software Foundation / rbowen@apache.org
    2. 2. Agenda • Common tasks with mod_rewrite • A few advanced rewrite rules • Some things you didn’t know mod_rewrite could do
    3. 3. Quick intro • The basics ...
    4. 4. Although, it is complex ``The great thing about mod_rewrite is it gives you all the configurability and flexibility of Sendmail. The downside to mod_rewrite is that it gives you all the configurability and flexibility of Sendmail. -- Brian Behlendorf
    5. 5. And let’s not forget voodoo! `` Despite the tons of examples and docs, mod_rewrite is voodoo. Damned cool voodoo, but still voodoo. -- Brian Moore
    6. 6. RewriteRule • Apply a regex to a request • Modify a request in some way • Alter aspects of the request in addition to the URI
    7. 7. RewriteRule Syntax RewriteRule PATTERN TARGET • Which means ... • If the request looks like THIS, send it HERE instead
    8. 8. RewriteRule Syntax RewriteRule PATTERN TARGET • The pattern is a regular expression • A substring match • Describes what the request might look like
    9. 9. RewriteRule Syntax RewriteRule PATTERN TARGET • The target is a URI, or perhaps a URL, or maybe a file path, depending on context
    10. 10. RewriteRule Syntax RewriteRule PATTERN TARGET [X] • The rule may be modified by one or more flags
    11. 11. A word about SEO • “Pretty” URLs don’t guarantee search engine ranking • Much of the “SEO” industry relies on misinformation and fear • Content causes people to link to you, which, in turn, drives search engine rankings • Don’t believe people who claim that they can guarantee the top spot on Google. They’re lying.
    12. 12. When not to • The most important thing is to know when not to use mod_rewrite • Be aware of the tools to do things that mod_rewrite is commonly misused for
    13. 13. [R] Redirect /one http://example.com/two
    14. 14. [R] RedirectMatch [mM]onkey http://example.com/ape
    15. 15. Alias Alias /images /var/upload/img
    16. 16. AliasMatch • Like Alias, but with regular expressions
    17. 17. [P] • ProxyPass • ProxyPassReverse
    18. 18. Vocabulary • We’re going to start with a very small vocabulary, and work up from there • Most of the time, this vocabulary is all that you’ll need
    19. 19. . • . matches any character • “a.b” matches acb, axb, a@b, and so on • It also matches Decalb and Marbelized
    20. 20. Repetition • + means that something needs to appear one or more times: ‘a+’ matches ‘a’, ‘aa’, ‘aaa’, and ‘Mondrian’ • * matches zero or more. ‘a*’ matches ‘a’, ‘aa’, and the empty string (‘’) • ? means that the match is optional. ‘colou?r’ matches ‘color’ and ‘colour’
    21. 21. Anchors • ^ means “starts with”. ‘^a’ matches ‘alpha’ and ‘arnold’ • $ means “ends with”. ‘a$’ matches ‘alpha’ and ‘stella’ • ‘^$’ matches an empty string • ‘^’ matches every string (every string has a start)
    22. 22. ( ) - Grouping • ( ) allows you to group several characters into one thingy, and apply other modifiers to it. • “(ab)+” matches ababababababab
    23. 23. ( ) - Backreferences • ( ) allows you to capture a match so that you can use it later (called a backreference) • It might be called $1 or %1 depending on the context • The second match is called $2 (or %2) and so on
    24. 24. [] • [ ] defines a “character class” • [abc] matches a or or b or c • “c[uoa]t” matches cut, cot, or cat • It also matches cote • It does not match coat
    25. 25. NOT • In mod_rewrite regular expressions, ! negates any match • In a character class, ^ negates the character class • [^ab] matches any character except for a or b.
    26. 26. DEMO GOES HERE • Demo of several regular expressions goes here • Note to self: Reggy and RegExhibit, suggested examples are in ~/Documents/ regex_examples.txt
    27. 27. Flags • Flags can modify the behavior of a RewriteRule • I used a flag in the example, and didn’t tell you what it meant
    28. 28. Default • Default is to treat the rewrite target as a file path • If the target starts in http:// or https:// then it is treated as a URL, and a [R] is assumed (Redirect) • In a .htaccess file, or in <Directory> scope, the file path is assumed to be relative to that scope
    29. 29. RewriteRule flags • [Flag] appears at end of RewriteRule • More than one flag separated by commas - eg [R,L,NE] (no spaces) • There’s *lots* of flags
    30. 30. Escape Backreferences • [B] (There’s no long version) • Causes backreferences to be escaped to preserve special characters RewriteRule ^(.*)$ index.php?show=$1 RewriteRule ^(.*)$ index.php?show=$1 [B] • The former will collapse escaped characters (convert %2b to + for example) while the latter will preserve them (stay as %2b) • Consider, for example, a query string of "z and x/y"
    31. 31. Cookie • [CO=NAME:Value:Domain[:lifetime[:path]] • Long form [cookie=...] • Sets a cookieRewriteRule ^/index.html - [CO=frontdoor:yes:.example.com] In this case, the default values for path (”/”) and lifetime (”session”) are assumed.
    32. 32. Discard Path Info • [DPI] or [discardpathinfo] • Discards additional PathInfo bits that may have accumulated via rewriting
    33. 33. Env • [E=var:val] • Long form [env=...] • Sets environment variable • Note that most of the time, SetEnvIf works just fine RewriteRule .(gif|jpg|png)$ - [env=image:1] CustomLog /var/log/access_log combined env=!image
    34. 34. Forbidden • [F] or [Forbidden] forces a 403 Forbidden response • Consider mod_security instead for pattern-based URL blocking RewriteEngine On RewriteRule (cmd|root).exe - [F] You could use this in conjunction with [E] to avoid logging that stuffRewriteRule (cmd|root).exe - [F,E=dontlog:1]CustomLog /var/log/apache/access_log combined env=!dontlog
    35. 35. Handler • [H=application/x-httpd-php] • Forces the use of a particular handler to handle the resulting URL RewriteEngine On RewriteRule !. - [H=application/x-httpd-php]
    36. 36. Last • [L] indicates that you’ve reached the end of the current ruleset • Any rules following this will be considered as a completely new ruleset • It’s a good idea to use it, even when it would otherwise be default behavior. It helps make rulesets more readable.
    37. 37. [L] in .htaccess files • [L] means to hand the resulting URI back to the URI-mapping process. • Part of this process may result in the .htaccess file being invoked again. • Thus, in .htaccess files, [L] can initiate loops. • Use RewriteCond to avoid loops (see examples later)
    38. 38. Proxy • [P] rules are served through a proxy subrequest • mod_proxy must be installed for this flag to work RewriteEngine On RewriteRule (.*).(jpg|gif|png) http://images.example.com$1.$2 [P]
    39. 39. Passthrough • [PT] or [passthrough] • Hands it back to the URL mapping phase • Treat this as though this was the original request
    40. 40. QSAppend • [QSA] or [qsappend] appends to the query string, rather than replacing it. • [QSD] Discards the query string (new in 2.4)
    41. 41. Redirect • [R] or [redirect] forces a 302 Redirect • Note that in this case, the user will see the new URL in their browser • This is the default behavior when the target starts with http:// or https://
    42. 42. Redirect • Can designate a different redirect status code with [R=305] • Can even do [R=404] if you want.
    43. 43. Skip • [S=n] or [skip=n] skips the next n RewriteRules • Can be used for negation of a block of rules - as a sort of inverse RewriteCond # Don’t run these rules if its an image RewriteRule ^/images - [S=4]
    44. 44. RewriteCond and [S] • Problem: I want this RewriteCond to apply to all of the next 4 rules:# Block OneRewriteCond %{HTTP_HOST} example.comRewriteRule .gif$ /something1.htmlRewriteRule .jpg$ /something2.htmlRewriteRule .ico$ /something3.htmlRewriteRule .png$ /something4.html# Block TwoRewriteRule ^ - http://something.else.com/ [R]
    45. 45. RewriteCond and [S] • DOESN’T WORK# Block OneRewriteCond %{HTTP_HOST} example.comRewriteRule .gif$ /something1.htmlRewriteRule .jpg$ /something2.htmlRewriteRule .ico$ /something3.htmlRewriteRule .png$ /something4.html# Block TwoRewriteRule ^ - http://something.else.com/ [R]
    46. 46. RewriteCond and [S] • Solution: Use an [S] flag as a GoTo statement# Block OneRewriteCond !%{HTTP_HOST} example.comRewriteRule ^ - [S=4]RewriteRule .gif$ /something1.htmlRewriteRule .jpg$ /something2.htmlRewriteRule .ico$ /something3.htmlRewriteRule .png$ /something4.html# Block TwoRewriteRule ^ - http://something.else.com/ [R]
    47. 47. Type • [T=text/html] • Forces the Mime type on the resulting URL • Good to ensure that file-path redirects are handled correctlyRewriteRule ^(.+.php)s$ $1 [T=application/x-httpd-php-source]
    48. 48. Others • [NC] or [nocase] makes the RewriteRule case insensitive • [NE] or [NoEscape] - Don’t URL- escape the results • [NS] - Don’t run on subrequests
    49. 49. Infrequently used • [C] or [Chain] - Rules are considered as a whole. If one fails, the entire chain is abandoned • [N] or [Next] - Start over at the top. Useful for rules that run in a while loop RewriteRule (.+)-(.+) $1_$2 [N]
    50. 50. Examples • Examples follow ...
    51. 51. Mapping path to QS Mapping http://example.com/one/two to http://example.com/index.php?x=one&y=two
    52. 52. Step one • First, the easy bit - map the path information to arguments:RewriteEngine OnRewriteRule ^/(.*)/(.*) /index.php?one=$1&two=$2 [PT]
    53. 53. Details • Starts with slash:RewriteEngine OnRewriteRule ^/(.*)/(.*) /index.php?one=$1&two=$2 [PT]
    54. 54. Details • Followed by some stuffRewriteEngine OnRewriteRule ^/(.*)/(.*) /index.php?one=$1&two=$2 [PT]
    55. 55. Details • Then another slashRewriteEngine OnRewriteRule ^/(.*)/(.*) /index.php?one=$1&two=$2 [PT]
    56. 56. Details • And some more stuffRewriteEngine OnRewriteRule ^/(.*)/(.*) /index.php?one=$1&two=$2 [PT]
    57. 57. Details • The first bit becomes $1RewriteEngine OnRewriteRule ^/(.*)/(.*) /index.php?one=$1&two=$2 [PT]
    58. 58. Details • The second bit becomes $2RewriteEngine OnRewriteRule ^/(.*)/(.*) /index.php?one=$1&two=$2 [PT]
    59. 59. Details • And [PT] ensures that php gets a whack at itRewriteEngine OnRewriteRule ^/(.*)/(.*) /index.php?one=$1&two=$2 [PT]
    60. 60. [PT] • Passthrough • Sends the URI back to the URL-mapping engine • Ensures that things like Aliases, Redirects are honored • Ensures that handlers (like PHP) fire
    61. 61. Unfortunately ... • The rule is not precise enough • It’s rather prone to matching the wrong things
    62. 62. Better ... • .* is too greedy. Use something more specific • [^/] matches all “not slash” charactersRewriteEngine OnRewriteRule ^/([^/]*)/([^/]*) /index.php?one=$1&two=$2 [PT]
    63. 63. Better ... • Thus, for /one/two/three, $1 is ‘one’ and $2 is ‘two’ • ‘three’ would be silently discardedRewriteEngine OnRewriteRule ^/([^/]*)/([^/]*) /index.php?one=$1&two=$2 [PT]
    64. 64. Legitimate files • Requests for real files would run afoul of this • /images/toad.gif would be mapped to / index.php?one=images&two=toad.gif • That’s not what we want • What is to be done? Alas and alack!
    65. 65. Ignore files, directories • If it’s not a file • And not a directoryRewriteEngine OnRewriteCond %{REQUEST_FILENAME} !-fRewriteCond %{REQUEST_FILENAME} !-dRewriteRule ^/([^/]*)/([^/]*) /index.php?one=$1&two=$2 [PT]
    66. 66. Existing Aliases, etc • RewriteRule runs before things like Aliases and Redirects • May need to explicitly exempt themRewriteEngine OnRewriteCond %{REQUEST_FILENAME} !-fRewriteCond %{REQUEST_FILENAME} !-dRewriteCond %{REQUEST_URI} !^/(icons|errors|styles|js)RewriteRule ^/([^/]*)/([^/]*) /index.php?one=$1&two=$2 [PT]
    67. 67. Existing Aliases, etc • RewriteRule runs before things like Aliases and Redirects • May need to explicitly exempt themRewriteEngine OnRewriteCond %{REQUEST_FILENAME} !-fRewriteCond %{REQUEST_FILENAME} !-dRewriteCond %{REQUEST_URI} !^/(icons|errors|styles|js)RewriteRule ^/([^/]*)/([^/]*) /index.php?one=$1&two=$2 [PT]
    68. 68. .htaccess • Remember that in .htaccess files or <Directory> scope, you’ll need to remove additional path informationRewriteEngine OnRewriteCond /var/www/%{REQUEST_FILENAME} !-fRewriteCond /var/www/%{REQUEST_FILENAME} !-dRewriteCond %{REQUEST_URI} !^(icons|errors|styles|js)RewriteRule ^([^/]*)/([^/]*) index.php?one=$1&two=$2 [PT]
    69. 69. QSA • If you need to preserve existing query string arguments, use QSA:RewriteEngine OnRewriteCond %{REQUEST_FILENAME} !-fRewriteCond %{REQUEST_FILENAME} !-dRewriteCond %{REQUEST_URI} !^/(icons|errors|styles|js)RewriteRule ^/([^/]*)/([^/]*) /index.php?one=$1&two=$2 [PT,QSA]
    70. 70. QSA • Query String Append (qsappend) • Existing query string is preserved • New stuff is tacked on to the end
    71. 71. FallbackResource • In 2.2.12 and later, use FallbackResource instead • Says "send all requests here, unless they map to a real resource" FallbackResource index.php
    72. 72. Virtual Hosts • Dynamic name-based vhosts
    73. 73. Consider mod_vhost_alias • Always try to avoid mod_rewrite if possible • If there’s another way, it’s pretty much guaranteed to be more efficient.
    74. 74. But ... • mod_vhost_alias has some unfortunate shortcomings • (That’s the polite way to say it)
    75. 75. Vhosts with mod_rewrite • First, you’ll need the hostname • RewriteRule doesn’t have access to the hostname • You’ll need to use RewriteCond for this
    76. 76. Hostname • Snag the first part of the hostname • Copy the entire request into a file pathRewriteEngine OnRewriteCond %{HTTP_HOST} (.*).example.com [NC]RewriteRule (.*) /home/%1/www$1
    77. 77. [NC] • NC matches in a case-insensitive mannerRewriteEngine OnRewriteCond %{HTTP_HOST} (.*).example.com [NC]RewriteRule (.*) /home/%1/www$1
    78. 78. Details • This will need to go in a wildcard virtual host • ServerAlias *.example.com • Must have that in a wildcard DNS record, too • The request uri starts with / so $1 does tooRewriteEngine OnRewriteCond %{HTTP_HOST} (.*).example.com [NC]RewriteRule (.*) /home/%1/www$1
    79. 79. Aliases • RewriteRules run prior to Aliases • Exclude that from the rewriteAlias /icons/ /var/www/icons/RewriteEngine OnRewriteCond %{REQUEST_URI} !^/icons/RewriteCond %{HTTP_HOST} (.*).example.com [NC]RewriteRule (.*) /home/%1/www$1
    80. 80. Using [S] as a “goto” • RewriteCond applies ONLY to the RewriteRule immediately following it • What if you want a RewriteCond to apply to multiple rules? • Use the [S] flag to create a logical block
    81. 81. RewriteCond • What you want:RewriteEngine OnRewriteCond %{REQUEST_FILENAME} !-f# Do BOTH of the followingRewriteRule ^/icons/(.*) /var/www/icons/$1RewriteRule (.*) /home/bob/www$1 • Unfortunately, that’s not what that does ...
    82. 82. [S] • Instead, reverse the RewriteCond and use [S]RewriteEngine OnRewriteCond %{REQUEST_FILENAME} -f# Skip the next two rules ...RewriteRule ^ - [S=2]RewriteRule ^/icons/(.*) /var/www/icons/$1 [L]RewriteRule (.*) /home/bob/www$1 [L]
    83. 83. [S]RewriteEngine OnRewriteCond %{REQUEST_FILENAME} -f# Skip the next two rules ...RewriteRule ^ - [S=2]RewriteRule ^/icons/(.*) /var/www/icons/$1 [L]RewriteRule (.*) /home/bob/www$1 [L] • [L] (last) says “do it now”. If the first rule runs, the second one won’t.
    84. 84. URL Handler • aka “rewrite everything” • All non-file requests go to handler.php
    85. 85. Two usual approaches: • Rewrite the request as a query string • Just rewrite, and let the handler figure it out • I like the second approach a lot more, but it’s less common.
    86. 86. Rewrite as query string • The original request is passed to the handler as a query string - we already did this example earlierRewriteEngine OnRewriteCond %{REQUEST_FILENAME} !-dRewriteCond %{REQUEST_FILENAME} !-fRewriteRule (.*) /index.php?q=$1 [PT,L,QSA]
    87. 87. DetailsRewriteEngine OnRewriteCond %{REQUEST_FILENAME} !-dRewriteCond %{REQUEST_FILENAME} !-fRewriteRule (.*) /index.php?q=$1 [PT,L,QSA] • Don’t rewrite requests that already map to valid files • This should handle images, css, js, existing html files, etc
    88. 88. Details • Will NOT protect Aliases - you’ll need to explicitly exclude those with RewriteCondRewriteEngine OnRewriteCond %{REQUEST_FILENAME} !-dRewriteCond %{REQUEST_FILENAME} !-fRewriteCond %{REQUEST_URI} !^/(icons|cgi-bin)RewriteRule (.*) /index.php?q=$1 [PT,L,QSA]
    89. 89. Or ... • Better: Rewrite to the handler, let it figure it outRewriteEngine OnRewriteCond %{REQUEST_FILENAME} !-dRewriteCond %{REQUEST_FILENAME} !-fRewriteCond %{REQUEST_URI} !^/(icons|cgi-bin)RewriteRule (.*) /index.php [PT,L,QSA]
    90. 90. Handler • For example:<?php$uri = $_SERVER[‘REQUEST_URI’];$parts = explode(‘/’, $uri);// ... etc?>
    91. 91. Basic RewriteMap • 1-1 mapping using RewriteMap • httxt2dbm and dbm rewrite maps
    92. 92. RewriteMap • Creates a rewrite map or function • Simplifies complicated RewriteRule directives • Can call an external source for the rewrite logic
    93. 93. Internal RewriteMap • RewriteMap int • touper • tolower • escape • unescape # Lower-case all requests RewriteMap lc int:tolower RewriteRule (.*) ${lc:$1}
    94. 94. mod_speling # Lower-case all requests RewriteMap lc int:tolower RewriteRule (.*) ${lc:$1} • If you’re trying to make URLs case insensitive, mod_speling might be what you’re looking for • CheckSpelling On
    95. 95. RewriteMap txt • Large number of static 1-1 mappings dogs.txt doberman /dogs.php?breed=278 poodle /dogs.php?breed=78 collie /dogs.php?breed=98 terrier /dogs.php?breed=148 mutt /dogs.php?breed=2 alsatian /dogs.php?breed=113
    96. 96. RewriteMap txt: • Map http://example.com/dog/poodle to the correct URLRewriteMap dogs txt:/etc/dogs.txtRewriteRule ^/dog/(.*) ${dogs:$1}
    97. 97. Default value • What if it doesn’t match anything?RewriteMap dogs txt:/etc/dogs.txtRewriteRule ^/dog/(.*) ${dogs:$1|/index.php}
    98. 98. Updates • mod_rewrite will reload the file if the mdate is updated • Otherwise, contents of file are cached in memory
    99. 99. Caveat • File is unindexed - lookups are slow
    100. 100. Faster • Convert the text file to a dbm • Indexed - faster lookups
    101. 101. httxt2dbm • Converts to a dbm
    102. 102. httxt2dbm httxt2dbm -i dogs.txt -o dogs.map
    103. 103. dbm • Change previous config for dbmRewriteMap dogs dbm:/etc/dogs.mapRewriteRule ^/dog/(.*) ${dogs:$1|/index.php}
    104. 104. RewriteMap prg • Simple Perl-based RewriteMap
    105. 105. Caveats • One copy running - all child processes wait for it • Should use a RewriteLock to avoid contention • Use only as a last resort
    106. 106. The basics RewriteEngine On RewriteMap name prg:/var/www/bin/map.pl RewriteRule (.*) ${name:$1}
    107. 107. Map script • Request comes in on STDIN • Do something useful with it • Response output on STDOUT #!/usr/bin/perl while ( $req = <STDIN> ) { $resp = something_useful( $req ); print $resp; }
    108. 108. Standard example • Convert “-” to “_” everywhere in a URI • Necessary because RewriteRule doesn’t have a “global replace” flag • Note that there’s a better way to do this. (I’ll show you in a moment.)
    109. 109. dash2score • Run it on any URI which contains a dash:RewriteMap d2s prg:/www/bin/dash2score.plRewriteRule (.*-.*) ${d2s:$1} [R]
    110. 110. dash2score.pl • The script:#!/usr/bin/perl$| = 1;while ( $req = <STDIN> ) { $req =~ s/-/_/g; print $req;}
    111. 111. dash2score.pl • Turn off buffering#!/usr/bin/perl$| = 1;while ( $req = <STDIN> ) { $req =~ s/-/_/g; print $req;}
    112. 112. dash2score.pl • Loop for the lifetime of the server#!/usr/bin/perl$| = 1;while ( $req = <STDIN> ) { $req =~ s/-/_/g; print $req;}
    113. 113. dash2score.pl • Replace “-” with “_” globally#!/usr/bin/perl$| = 1;while ( $req = <STDIN> ) { $req =~ s/-/_/g; print $req;}
    114. 114. dash2score.pl • Print the resulting string#!/usr/bin/perl$| = 1;while ( $req = <STDIN> ) { $req =~ s/-/_/g; print $req;}
    115. 115. Better way • Now I will show you a more excellent way • The [N] flag actually does have the occasional use # Replace “-” with “_” and start over RewriteRule (.*)-(.*) $1_$2 [N]
    116. 116. So ... • It’s a simple example • But not particularly practical • Useful examples are too complicated to fit on the screen • See also the dbd: method ...
    117. 117. dbd: • New in 2.3 • Store rewrite maps in a sql databaseDBDriver mysqlDBDParams host=localhost,user=bob,pass=larry,dbname=rewriteRewriteMap mymap “dbd: select dest from rewrite where uri = %s”
    118. 118. Look elsewhere • If the file isn’t here, look there • Smarter than just a 404 error page • Useful when you’ve got files stored in a couple possible locations
    119. 119. -f • Check for the existence of a file • If it’s not there, look somewhere else <Directory /var/www/one> RewriteEngine On RewriteCond /var/www/one/%{REQUEST_FILENAME} !-f RewriteRule (.*) /var/www/two/$1 </Directory>
    120. 120. <Directory> • In a <Directory> everything is relative to the local path <Directory /var/www/one> RewriteEngine On RewriteCond /var/www/one/%{REQUEST_FILENAME} !-f RewriteRule (.*) /var/www/two/$1 </Directory>
    121. 121. Or, more than one place<Directory /var/www/one>RewriteEngine OnRewriteCond /var/www/one/%{REQUEST_FILENAME} -fRewriteRule (.*) /var/www/one/$1 [L]RewriteCond /var/www/two/%{REQUEST_FILENAME} -fRewriteRule (.*) /var/www/two/$1 [L]RewriteCond /var/www/three/%{REQUEST_FILENAME} -fRewriteRule (.*) /var/www/three/$1 [L]</Directory>
    122. 122. ErrorDocument • Note that you could accomplish the same thing with an ErrorDocument handler ErrorDocument 404 /handler/404.cgi • Or just fix your directory structure. Often the obvious solution is the right one.
    123. 123. Image theft • “Image theft” refers to folks including their images in their web pages • Uses your bandwidth, your copyright • Want to deny requests that don’t originate from your own pages • Can check the referer
    124. 124. Referer • Ensure that the referer comes from here • If the referer isn’t from here ... RewriteEngine On RewriteCond %{HTTP_REFERER} !myhost.com RewriteRule .(gif|jpg|png)$ - [NC,F]
    125. 125. Image request • And if the request was for an image ... RewriteEngine On RewriteCond %{HTTP_REFERER} !myhost.com RewriteRule .(gif|jpg|png)$ - [NC,F]
    126. 126. Forbidden • Don’t rewrite it, just fail the request RewriteEngine On RewriteCond %{HTTP_REFERER} !myhost.com RewriteRule .(gif|jpg|png)$ - [NC,F]
    127. 127. [NC] • upper- or lower-case RewriteEngine On RewriteCond %{HTTP_REFERER} !myhost.com RewriteRule .(gif|jpg|png)$ - [NC,F]
    128. 128. no Referer? • Ensure that there is a non-null referer • Some requests won’t send one RewriteEngine On RewriteCond %{HTTP_REFERER} !myhost.com RewriteCond %{HTTP_REFERER} . RewriteRule .(gif|jpg|png)$ - [NC,F]
    129. 129. [F] • This just forbids the request • What if you want to ... RewriteEngine On RewriteCond %{HTTP_REFERER} !myhost.com RewriteCond %{HTTP_REFERER} . RewriteRule .(gif|jpg|png)$ - [NC,F]
    130. 130. Another image • Display a “go away” image instead RewriteEngine On RewriteCond %{HTTP_REFERER} !myhost.com RewriteCond %{HTTP_REFERER} . RewriteRule .(gif|jpg|png)$ /images/goaway.gif [PT,L]
    131. 131. Another site • Or perhaps another site entirely RewriteEngine On RewriteCond %{HTTP_REFERER} !myhost.com RewriteCond %{HTTP_REFERER} . RewriteRule .(gif|jpg|png)$ http://other.site.com/images/x.jpg [R,L]
    132. 132. Where they came from • Or just back where they came from RewriteEngine On RewriteCond %{HTTP_REFERER} !myhost.com RewriteCond %{HTTP_REFERER} . RewriteRule .(gif|jpg|png)$ %{HTTP_REFERER} [R,L]
    133. 133. 2.4 • In 2.4 ... <FilesMatch .(gif|jpg|png)$ > <If "$req{Referer} !~ /myhost.com/"> Require all denied </If> </FilesMatch>
    134. 134. Canonical hostname • Enforce a particular hostname • Perhaps cookies require a particular hostname • Or perhaps it’s just preferred
    135. 135. Canonical hostname • If it’s NOT the preferred hostname <VirtualHost *:80> ServerName www.example.com ServerAlias example.com RewriteEngine On RewriteCond %{HTTP_HOST} !=www.example.com RewriteRule (.*) http://www.example.com$1 [R,L] </VirtualHost>
    136. 136. Canonical hostname • Then redirect it there <VirtualHost *:80> ServerName www.example.com ServerAlias example.com RewriteEngine On RewriteCond %{HTTP_HOST} !=www.example.com RewriteRule (.*) http://www.example.com$1 [R,L] </VirtualHost>
    137. 137. .htaccess • Or, if you have to put it in a .htaccess file: RewriteEngine On RewriteCond %{HTTP_HOST} !=www.example.com RewriteRule (.*) http://www.example.com/$1 [R,L]
    138. 138. Canonical hostname • May make sense to have two vhosts: <VirtualHost *:80> ServerName www.example.com # ... </VirtualHost> <VirtualHost *:80> ServerName example.com RedirectMatch /(.*) http://www.example.com/$1 </VirtualHost>
    139. 139. Canonical hostname • In 2.4: <If "$req{Host} != www.example.com"> RedirectMatch /(.*) http://www.example.com/$1 </If>
    140. 140. http2https • Redirect http requests to https • Require https on certain resources
    141. 141. Like canonical hostname • Special case of the previous rule • Sort of
    142. 142. HTTPS • If it’s not already HTTPSRewriteEngine OnRewriteCond %{HTTPS} !=onRewriteRule ^/?(.*) https://%{SERVER_NAME}/$1 [R,L]
    143. 143. HTTPS • Redirect itRewriteEngine OnRewriteCond %{HTTPS} !=onRewriteRule ^/?(.*) https://%{SERVER_NAME}/$1 [R,L]
    144. 144. Clever trick • Usable in .htaccess or main config • Slash is optionalRewriteEngine OnRewriteCond %{HTTPS} !=onRewriteRule ^/?(.*) https://%{SERVER_NAME}/$1 [R,L]
    145. 145. Just one directory • Some folks want one particular directory SSL • And everything else NOT SSL • I think this is silly, but the customer is always right
    146. 146. One directoryRewriteEngine OnRewriteRule %{HTTPS} !=onRewriteRule ^/?secure(.*) https://%{SERVER_NAME}/secure$1 [R,L]RewriteRule %{HTTPS} =onRewriteCond %{HTTP_REQUEST} !^/?secureRewriteRule ^/?(.*) http://%{SERVER_NAME}/$1 [R,L]
    147. 147. One directoryRewriteEngine OnRewriteRule %{HTTPS} !=onRewriteRule ^/?secure(.*) https://%{SERVER_NAME}/secure$1 [R,L]RewriteRule %{HTTPS} =onRewriteCond %{HTTP_REQUEST} !^/?secureRewriteRule ^/?(.*) http://%{SERVER_NAME}/$1 [R,L]
    148. 148. Caveat • HTTPS pages containing non-HTTPS content (images, css, js) will generate browser errors. • Ensure that these files are available via HTTP, HTTPS
    149. 149. In 2.4<If "%{HTTPS} != on > RedirectMatch (.*) https://example.com$1</If>
    150. 150. Proxy • Using rewriterules to force Proxy
    151. 151. Images elsewhereRewriteEngine OnRewriteRule (.*.(jpg|gif|png)) http://images.example.com$1 [P]ProxyPassReverse / http://images.example.com/
    152. 152. Server Migration • Proxy 404 requests through to old serverRewriteEngine OnRewriteCond %{REQUEST_URI} !-URewriteRule (.*) http://old.server$1 [P]
    153. 153. -U • -U (is existing URL, via subrequest) • Checks whether or not TestString is a valid URL, accessible via all the servers currently-configured access controls for that path. This uses an internal subrequest to do the check, so use it with care - it can impact your servers performance!RewriteEngine OnRewriteCond %{REQUEST_URI} !-URewriteRule (.*) http://old.server$1 [P]
    154. 154. lighttpd for static • php stuff here, everything else on lighttpdRewriteEngine OnRewriteCond %{REQUEST_URI} !.php$RewriteRule (.*) http://x.example.com$1 [P]ProxyPassReverse / http://x.example.com/
    155. 155. ProxyPassReverse • Redirects usually contain the hostname • Ensures that Redirects come from here instead of from thereRewriteEngine OnRewriteCond %{REQUEST_URI} !.php$RewriteRule (.*) http://x.example.com$1 [P]ProxyPassReverse / http://x.example.com/
    156. 156. Links in HTML • If the HTML itself has links to the internal/ backend site, you can rewrite these using mod_substitute or mod_proxy_html • Examples later
    157. 157. PHP when no file ext • Force files with no file extension to be handled by php
    158. 158. • Allows you to have URLs without the annoying “.php” on the end.RewriteEngine OnRewriteRule !. - [H=application/x-httpd-php]
    159. 159. • Doesn’t contain a dotRewriteEngine OnRewriteRule !. - [H=application/x-httpd-php]
    160. 160. • Don’t rewrite itRewriteEngine OnRewriteRule !. - [H=application/x-httpd-php]
    161. 161. • Force it to use the php handlerRewriteEngine OnRewriteRule !. - [H=application/x-httpd-php]
    162. 162. Use PATH_INFO • Now you can have URLs like http://example.com/handler/arg1/arg2 • Use $_SERVER[PATH_INFO] to grab the additional bits of the request
    163. 163. mod_negotiation • Might be able to do the same thing with mod_negotiation • Options +MultiViews
    164. 164. Reading the RewriteLog RewriteLog logs/rewrite.log RewriteLogLevel 9 • Can’t go in a .htaccess file • Needs to be in global or vhost scope
    165. 165. Reading the RewriteLog RewriteLog logs/rewrite.log RewriteLogLevel 9 • Location of the log file • Relative to ServerRoot
    166. 166. Reading the RewriteLog RewriteLog logs/rewrite.log RewriteLogLevel 9 • Number from 0 to 9 • 9 most verbose • Below 3 not particularly useful
    167. 167. Log entries • Let’s look at a few log entries:
    168. 168. • I recommend ignoring this bit:203.167.144.193 - - [05/Nov/2007:22:29:53 --0500][wooga.drbacchus.com/sid#b93592c0][rid#b95c6c98/initial] (3) applying pattern ^/books?/(.+) to uri /favicon.ico
    169. 169. In fact, before we go on • I actually use a piped lot handler to remove this superfluous stuff • Like so ...RewriteLog |/usr/local/bin/rewrite_log_pipeRewriteLogLevel 9
    170. 170. RewriteLog |/usr/local/bin/rewrite_log_pipeRewriteLogLevel 9 • with ...#!/usr/bin/perl$|++;open (F, ">>/tmp/rewrite");select F;while (<>) { s/^.*((d).*)/$1/; print;}
    171. 171. #!/usr/bin/perl$|++;open (F, ">>/tmp/rewrite");select F;while (<>) { s/^.*((d).*)/$1/; print;} • Look for the (1) or (2) bit • drop everything before that
    172. 172. Results in:(4) RewriteCond: input=wooga.drbacchus.com pattern=!^wooga.drbacchus.com [NC] => not-matched(3) applying pattern wp-rss2.php to uri /index.php(3) applying pattern (journal/)?index.rdf to uri /index.php(3) applying pattern ^/wordpress/wp-comments to uri /index.php(3) applying pattern ^/perm/(.*) to uri /index.php(3) applying pattern ^/articles?/(.*) to uri /index.php(3) applying pattern ^/blog/(.*) to uri /index.php(3) applying pattern ^/book/(mod)?_?rewrite to uri /index.php(3) applying pattern ^/book/cookbook to uri /index.php(3) applying pattern ^/book/2.2 to uri /index.php(3) applying pattern ^/booklink/(.*) to uri /index.php(3) applying pattern ^/books?/(.+) to uri /index.php(1) pass through /index.php
    173. 173. Requested URI(4) RewriteCond: input=wooga.drbacchus.com pattern=!^wooga.drbacchus.com [NC] => not-matched(3) applying pattern wp-rss2.php to uri /index.php(3) applying pattern (journal/)?index.rdf to uri /index.php(3) applying pattern ^/wordpress/wp-comments to uri /index.php(3) applying pattern ^/perm/(.*) to uri /index.php(3) applying pattern ^/articles?/(.*) to uri /index.php(3) applying pattern ^/blog/(.*) to uri /index.php(3) applying pattern ^/book/(mod)?_?rewrite to uri /index.php(3) applying pattern ^/book/cookbook to uri /index.php(3) applying pattern ^/book/2.2 to uri /index.php(3) applying pattern ^/booklink/(.*) to uri /index.php(3) applying pattern ^/books?/(.+) to uri /index.php(1) pass through /index.php
    174. 174. Patterns applied(4) RewriteCond: input=wooga.drbacchus.com pattern=!^wooga.drbacchus.com [NC] => not-matched(3) applying pattern wp-rss2.php to uri /index.php(3) applying pattern (journal/)?index.rdf to uri /index.php(3) applying pattern ^/wordpress/wp-comments to uri /index.php(3) applying pattern ^/perm/(.*) to uri /index.php(3) applying pattern ^/articles?/(.*) to uri /index.php(3) applying pattern ^/blog/(.*) to uri /index.php(3) applying pattern ^/book/(mod)?_?rewrite to uri /index.php(3) applying pattern ^/book/cookbook to uri /index.php(3) applying pattern ^/book/2.2 to uri /index.php(3) applying pattern ^/booklink/(.*) to uri /index.php(3) applying pattern ^/books?/(.+) to uri /index.php(1) pass through /index.php
    175. 175. None of them matched(4) RewriteCond: input=wooga.drbacchus.com pattern=!^wooga.drbacchus.com [NC] => not-matched(3) applying pattern wp-rss2.php to uri /index.php(3) applying pattern (journal/)?index.rdf to uri /index.php(3) applying pattern ^/wordpress/wp-comments to uri /index.php(3) applying pattern ^/perm/(.*) to uri /index.php(3) applying pattern ^/articles?/(.*) to uri /index.php(3) applying pattern ^/blog/(.*) to uri /index.php(3) applying pattern ^/book/(mod)?_?rewrite to uri /index.php(3) applying pattern ^/book/cookbook to uri /index.php(3) applying pattern ^/book/2.2 to uri /index.php(3) applying pattern ^/booklink/(.*) to uri /index.php(3) applying pattern ^/books?/(.+) to uri /index.php(1) pass through /index.php
    176. 176. And now • We can actually make some sense of what’s happening • Less inscrutable noise • Yes, it means something, but not to normal people
    177. 177. Examples(4) RewriteCond: input=wooga.drbacchus.com pattern=!^wooga.drbacchus.com [NC] => not-matched • This was the result of RewriteCond %{HTTP_HOST} !^wooga.drbacchus.com [NC]
    178. 178. Examples(4) RewriteCond: input=wooga.drbacchus.com pattern=!^wooga.drbacchus.com [NC] => not-matched • It shows what the input variable looked like RewriteCond %{HTTP_HOST} !^wooga.drbacchus.com [NC]
    179. 179. Examples(4) RewriteCond: input=wooga.drbacchus.com pattern=!^wooga.drbacchus.com [NC] => not-matched • And what pattern was applied RewriteCond %{HTTP_HOST} !^wooga.drbacchus.com [NC]
    180. 180. Examples(4) RewriteCond: input=wooga.drbacchus.com pattern=!^wooga.drbacchus.com [NC] => not-matched • As well as what happened RewriteCond %{HTTP_HOST} !^wooga.drbacchus.com [NC]
    181. 181. Another example (3) applying pattern ^/book/(mod)?_?rewrite to uri / index.php • Was a result of RewriteRule ^/book/(mod)?_?rewrite http://www.amazon.com/exec/obidos/asin/ 1590595610/drbacchus/ [R,L]
    182. 182. Again ... (3) applying pattern ^/book/(mod)?_?rewrite to uri / index.php • What was requested RewriteRule ^/book/(mod)?_?rewrite http://www.amazon.com/exec/obidos/asin/ 1590595610/drbacchus/ [R,L]
    183. 183. And ... (3) applying pattern ^/book/(mod)?_?rewrite to uri / index.php • What it was compared against RewriteRule ^/book/(mod)?_?rewrite http://www.amazon.com/exec/obidos/asin/ 1590595610/drbacchus/ [R,L]
    184. 184. Matched? (3) applying pattern ^/book/(mod)?_?rewrite to uri / index.php • If it matched, the next line will be the action log RewriteRule ^/book/(mod)?_?rewrite http://www.amazon.com/exec/obidos/asin/ 1590595610/drbacchus/ [R,L]
    185. 185. The whole thing(3) applying pattern ^/books?/(mod)?_?rewrite to uri /books/rewrite(2) rewrite /books/rewrite -> http://www.amazon.com/exec/obidos/asin/1590595610/drbacchus/(2) explicitly forcing redirect with http://www.amazon.com/exec/obidos/asin/1590595610/drbacchus/(1) escaping http://www.amazon.com/exec/obidos/asin/1590595610/drbacchus/ for redirect(1) redirect to http://www.amazon.com/exec/obidos/asin/1590595610/drbacchus/ [REDIRECT/302]
    186. 186. The match:(3) applying pattern ^/books?/(mod)?_?rewrite to uri /books/rewrite(2) rewrite /books/rewrite -> http://www.amazon.com/exec/obidos/asin/1590595610/drbacchus/(2) explicitly forcing redirect with http://www.amazon.com/exec/obidos/asin/1590595610/drbacchus/(1) escaping http://www.amazon.com/exec/obidos/asin/1590595610/drbacchus/ for redirect(1) redirect to http://www.amazon.com/exec/obidos/asin/1590595610/drbacchus/ [REDIRECT/302]
    187. 187. Followed by(3) applying pattern ^/books?/(mod)?_?rewrite to uri /books/rewrite(2) rewrite /books/rewrite -> http://www.amazon.com/exec/obidos/asin/1590595610/drbacchus/(2) explicitly forcing redirect with http://www.amazon.com/exec/obidos/asin/1590595610/drbacchus/(1) escaping http://www.amazon.com/exec/obidos/asin/1590595610/drbacchus/ for redirect(1) redirect to http://www.amazon.com/exec/obidos/asin/1590595610/drbacchus/ [REDIRECT/302]
    188. 188. [R](3) applying pattern ^/books?/(mod)?_?rewrite to uri /books/rewrite(2) rewrite /books/rewrite -> http://www.amazon.com/exec/obidos/asin/1590595610/drbacchus/(2) explicitly forcing redirect with http://www.amazon.com/exec/obidos/asin/1590595610/drbacchus/(1) escaping http://www.amazon.com/exec/obidos/asin/1590595610/drbacchus/ for redirect(1) redirect to http://www.amazon.com/exec/obidos/asin/1590595610/drbacchus/ [REDIRECT/302]
    189. 189. But it all runs together! • Look for: • (2) init rewrite engine with requested uri /atom/1 • ‘init rewrite engine’ shows where a new request started being rewritten
    190. 190. .htaccess • Can’t use RewriteLog in a .htaccess file • Test it on your dev server before moving to live server
    191. 191. New logging (2.3) • RewriteLog goes away • Replaced by an ErrorLog argument
    192. 192. ErrorLog ErrorLog alert rewrite:trace8 tail -f error_log | grep rewrite:
    193. 193. TRACERewriteEngine onRewriteCond %{REQUEST_METHOD} ^(TRACE|TRACK)RewriteRule ^ - [F]
    194. 194. Security scanners • Report this as a vulnerability • It isn’t • But it doesn’t hurt to shut them up
    195. 195. Better TraceEnable Off
    196. 196. Related modules • mod_substitute • mod_ext_filter • mod_proxy_html • mod_line_edit
    197. 197. How do I do that ... • Questions like “How do I do XYZ with mod_rewrite” often have the same answer • YOU DON’T • These modules are sometimes the right answer
    198. 198. mod_substitute • New in 2.2.8 • In-stream regex • Replace a string, or a pattern, in the output • Chain with other filters
    199. 199. mod_substitute • One directive: Substitute <Location /> AddOutputFilterByType SUBSTITUTE text/html Substitute s/ariel/verdana/ni </Location>
    200. 200. mod_substitute • n = treat as a fixed string • Default - treat as regex <Location /> AddOutputFilterByType SUBSTITUTE text/html Substitute s/ariel/verdana/ni </Location>
    201. 201. mod_substitute • i - Case insensitive match • Default - Case sensitive • n - string replacement, rather than regex <Location /> AddOutputFilterByType SUBSTITUTE text/html Substitute s/ariel/verdana/ni </Location>
    202. 202. mod_substitute • Replace ariel with verdana everywhere • Filter content as it passes through. Perhaps on a proxy server. <Location /> AddOutputFilterByType SUBSTITUTE text/html Substitute s/ariel/verdana/ni </Location>
    203. 203. More usefully ... • Replace hard-coded hostnames in HTML proxied from a back-end • s/intranet.local/www.corpsite.com/i
    204. 204. mod_ext_filter • Calls an external command to filter the stream • Hugely inefficient
    205. 205. mod_proxy_html • Rewrites HTML at the proxy • Swap hostnames for absolute URLs • Third-party module
    206. 206. mod_line_edit • Very similar to mod_substitute • Third-party module
    207. 207. mod_lua • Embed Lua in configuration file for complex processing • Undocumented at the moment
    208. 208. fin • http://slideshare.net/rbowen • http://wiki.apache.org/httpd • http://httpd.apache.org/docs/trunk/ • rbowen@apache.org • http://omniti.com/is/hiring

    ×