How do you remove default documents from the URL?

Jun 17, 2008 at 5:12 PM
Edited Jun 17, 2008 at 5:14 PM
In ISAPI Rewrite, you can remove the default files from the URL.

# remove index pages from URLs
RewriteRule (.*)/default.htm$ $1/ [I,RP]
RewriteRule (.*)/default.aspx$ $1/ [I,RP]
RewriteRule (.*)/index.htm$ $1/ [I,RP]
RewriteRule (.*)/index.html$ $1/ [I,RP]

How would you do this with Ionic's ISAPI Rewrite Filter (IIRF)?

Also, if possible, how can I rewrite this:
www.site1.com/files/page.aspx to www.site1.com

Thank you!
Coordinator
Jun 17, 2008 at 10:09 PM
Edited Jun 18, 2008 at 6:32 AM

I would do it the same way in IIRF. In IIRF, though, there is no RP option. RP = Redirect Permanently in ISAPI Rewrite. In IIRF you would use R=301. So...

# remove index pages from URLs
RewriteRule (.*)/default.htm$ $1/ [I,R=301]
RewriteRule (.*)/default.aspx$ $1/ [I,R=301]
RewriteRule (.*)/index.htm$ $1/ [I,R=301]
RewriteRule (.*)/index.html$ $1/ [I,R=301]

But I would also prepend the beginning-of-line anchor ( ^ ) for good form. You will also want to escape the dot character, since an unescaped dot matches any character in a regex. Finally, I would suggest consolidating these into a single rule, like so:

# remove index pages from URLs
# (the aspx alternative keeps default.aspx covered; it also matches index.aspx)
RewriteRule ^(.*)/(default|index)\.(html|htm|aspx)$ $1/ [I,R=301]
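
A quick way to sanity-check a consolidated pattern like this before putting it into the ini file is to replay it in a scripting language. The sketch below uses Python's re module as a stand-in for the regex engine in IIRF, so it only shows which URLs match and what $1/ would expand to; it does not emulate the filter itself. The function name redirect_target is made up for illustration, re.IGNORECASE plays the role of the [I] flag, and the pattern lists html, htm, and aspx so that all four original rules are covered.

```python
import re

# Stand-in for the consolidated rule; re.IGNORECASE mirrors the [I] flag.
INDEX_PAGE = re.compile(r"^(.*)/(default|index)\.(html|htm|aspx)$", re.IGNORECASE)


def redirect_target(url):
    """Return what "$1/" would expand to for a matching URL, or None if no match."""
    m = INDEX_PAGE.match(url)
    return m.group(1) + "/" if m else None


if __name__ == "__main__":
    samples = [
        "/default.htm",
        "/docs/Index.HTML",
        "/app/default.aspx",
        "/docs/index.html?x=1",   # query string present: pattern does not match
        "/docs/page.htm",         # not an index page: pattern does not match
    ]
    for url in samples:
        print(url, "->", redirect_target(url))
```

Note that the URL carrying a query string is not matched, because the pattern is anchored with $ directly after the extension.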

You also asked how to rewrite "www.site1.com/files/page.aspx" to "www.site1.com". A single example like that is a little tricky, because it does not say how general the rule should be. This rule would do it for exactly that URL:

RewriteRule ^/files/page\.aspx$  /  [L]

But maybe that is too specific for you. Maybe what you really want is this:

RewriteRule ^/files/(.+)\.aspx$  /  [L]

This says that any URL that begins with the /files segment, continues with one or more additional characters (which may span several segments, or even part of a query string), and ends with .aspx should get rewritten. So:

URL                               Result
/files/page.aspx                  REWRITE to /
/files/subdir/page.aspx           REWRITE to /
/files/something.aspx             REWRITE to /
/files                            NO REWRITE
/files/foo.php                    NO REWRITE
/files/foo.php?q=something.aspx   REWRITE to /
/files/something.aspx?q=data      NO REWRITE

Or maybe that is too generic, and what you really want is this:

RewriteRule ^/files/([^\.\/\?]+)\.aspx$  /  [L]

This says that any URL that begins with the /files segment, has exactly one additional segment, has no query string, and ends with an .aspx extension should get rewritten. So:

URL                               Result
/files/page.aspx                  REWRITE to /
/files/subdir/page.aspx           NO REWRITE
/files/something.aspx             REWRITE to /
/files                            NO REWRITE
/files/foo.php                    NO REWRITE
/files/foo.php?q=something.aspx   NO REWRITE
/files/foo.aspx?q=something.aspx  NO REWRITE
/files/foo.aspx?q=anything        NO REWRITE
/files/foo.aspx/                  NO REWRITE

You see, it is sometimes difficult to know which rule to write, given only one example of an input. In general, when designing rewrite rules, think of a number of test cases, both positive (rewrite occurs) and negative (no rewrite), and write your rule with those test cases in mind. That is why testDriver.exe is really useful: it lets you design and develop rules iteratively by testing exactly what you want. Keep in mind that "rewrite occurs" is not a complete answer. In some cases you also want to know what the request was rewritten to, not just that a rewrite occurred.
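
If testDriver.exe is not at hand, the same positive/negative test-case idea can be sketched in a few lines of Python. This is only an approximation: it uses Python's re module rather than the regex engine inside IIRF, and it checks only whether a pattern matches, not what the filter would do with the request. The two patterns are the /files rules from this post, the expected results are copied from the two tables, and the helper name failures is invented for the example.

```python
import re

# The two /files patterns from the rules above.
GENERIC = re.compile(r"^/files/(.+)\.aspx$")
STRICT = re.compile(r"^/files/([^\.\/\?]+)\.aspx$")

# Expected outcomes: True means "REWRITE to /", False means "NO REWRITE".
GENERIC_TABLE = {
    "/files/page.aspx": True,
    "/files/subdir/page.aspx": True,
    "/files/something.aspx": True,
    "/files": False,
    "/files/foo.php": False,
    "/files/foo.php?q=something.aspx": True,
    "/files/something.aspx?q=data": False,
}

STRICT_TABLE = {
    "/files/page.aspx": True,
    "/files/subdir/page.aspx": False,
    "/files/something.aspx": True,
    "/files": False,
    "/files/foo.php": False,
    "/files/foo.php?q=something.aspx": False,
    "/files/foo.aspx?q=something.aspx": False,
    "/files/foo.aspx?q=anything": False,
    "/files/foo.aspx/": False,
}


def failures(pattern, table):
    """Return the URLs whose match result differs from the expected value."""
    return [url for url, expected in table.items()
            if (pattern.search(url) is not None) != expected]


if __name__ == "__main__":
    print("generic rule mismatches:", failures(GENERIC, GENERIC_TABLE))
    print("strict rule mismatches:", failures(STRICT, STRICT_TABLE))
```

An empty list for a rule means every URL in its table behaved as expected. Adding a new row to a table is the cheap way to try out a "what about this URL?" question before editing the rule.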

You should also be careful to add test cases with "bad input", for example unexpected URLs or what you would consider "badly formed" URLs. So, in the above case, you would want to consider query strings. Even if the design of your web app never uses a query string, you want your rules to handle the case where one is present, because people may construct their own URLs.

You may wish www.site1.com/files/page.aspx?q=anything to also get rewritten, but the rules I gave above do not handle that case... 
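
One possible way to cover that case is to allow an optional query string after the extension. The pattern below is a hypothetical variant of the one-segment rule, not one of the rules given above, and it is checked here only with Python's re module as a stand-in, so treat it as a starting point to verify with testDriver.exe rather than a finished rule. The names WITH_QUERY and is_rewritten are invented for the example.

```python
import re

# Hypothetical variant of the one-segment rule:
# the trailing (\?.*)? tolerates an optional query string after .aspx.
WITH_QUERY = re.compile(r"^/files/([^\.\/\?]+)\.aspx(\?.*)?$")


def is_rewritten(url):
    """Return True if the variant pattern matches the URL."""
    return WITH_QUERY.match(url) is not None


if __name__ == "__main__":
    for url in [
        "/files/page.aspx",
        "/files/page.aspx?q=anything",
        "/files/subdir/page.aspx",
        "/files/foo.php?q=something.aspx",
        "/files/foo.aspx/",
    ]:
        print(url, "->", "REWRITE" if is_rewritten(url) else "NO REWRITE")
```

With this variant, the first two URLs match and the remaining three do not. Whether the query string should be carried over to the rewritten URL is a separate design decision that this sketch does not address.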

So the guidance is:  Test early, test often.

Jun 18, 2008 at 2:28 PM
Wow! That is awesome! Thanks for all the info!
I will work on it!


Coordinator
Jun 19, 2008 at 5:11 AM
ps: don't forget, IIRF is now donationware.
I am now accepting donations on behalf of my favorite charity.
If you find IIRF useful, consider donating.