Hex-Encoding in URLs

May 30, 2007 at 2:03 PM
Edited May 30, 2007 at 2:05 PM
I have a problem with ISAPI rewrite which I cannot fix. Here is the situation:

I have a server with a root directory called "xfiles". The files in this directory should only be accessible when a user is logged in.

So I wrote a rewrite rule redirecting files to a rights check script:

RewriteRule ^/xfiles/(.*) /check_files.asp?url=$1 [L]

Works okay; a direct URL to a file under /xfiles/ is redirected:

http://myserver/xfiles/15/test.doc

This URL is handled by the script and is therefore not accessible to users who are not logged in. So far so good ...

But if someone escapes a character in the directory part of the URL, e.g. the "x" in "xfiles":

http://myserver/%58files/17/a.txt

The rule doesn't match here and the file will be transferred. The same happens if someone uses uppercase characters, e.g. "xFiLeS": the rule doesn't apply.
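
To make the bypass concrete, here is a minimal Python sketch. It assumes the filter matches the pattern against the raw, still-encoded URL path (as described above), and it uses Python's re module only as a stand-in for the filter's regex engine.

```python
import re

# Pattern taken from the rule above, matched case-sensitively
# against the raw (still percent-encoded) URL path.
RULE = re.compile(r"^/xfiles/(.*)")

paths = [
    "/xfiles/15/test.doc",   # plain URL
    "/%58files/17/a.txt",    # first character percent-encoded
    "/xFiLeS/17/a.txt",      # mixed case
]

for path in paths:
    matched = RULE.match(path) is not None
    print(path, "->", "goes to check script" if matched else "rule skipped")
```

Only the first path is caught; the other two never reach the check script.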

Is there any way to unescape the URL before matching the rules, or to create a regex rule that covers the hex-encoded characters and is also case-insensitive?

I already wrote a rule like this:

RewriteRule ^/(x|%78|X|%58)(f|%66|F|%46)(i|%69|I|%49)(l|%6C|%6c|L|%4C|%4c)(e|%65|E|%45)(s|%73|S|%53)(/|%2F|%2f)(.*) /check_xfiles.asp?url=$8 [L]

But that doesn't seem very effective. There must be a better solution.
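
For what it's worth, a pattern like the one above does not have to be typed by hand. The following Python sketch generates it from a directory name. The helper names encoded_or_literal and char_alternatives are made up for illustration, and the sketch assumes case-insensitive matching is available, so that "%6c" and "%6C" are treated the same. It does not deal with double encoding or non-ASCII names.

```python
import re

def char_alternatives(ch: str) -> str:
    """Match one character literally or as the percent-encoding of either case."""
    codes = {"%{:02X}".format(ord(c)) for c in (ch.lower(), ch.upper())}
    return "(?:" + "|".join([re.escape(ch)] + sorted(codes)) + ")"

def encoded_or_literal(name: str) -> str:
    """Build a pattern fragment that matches `name` with any character encoded."""
    return "".join(char_alternatives(ch) for ch in name)

# The trailing slash may itself be encoded as %2F.
PATTERN = re.compile(
    "^/" + encoded_or_literal("xfiles") + "(?:/|%2F)(.*)",
    re.IGNORECASE,
)

for path in ["/xfiles/15/test.doc", "/%58files/17/a.txt", "/xFiLeS/17/a.txt"]:
    m = PATTERN.match(path)
    print(path, "->", m.group(1) if m else None)
```

The generated pattern is still long, but it is built from the directory name rather than maintained by hand.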

Any help will be great ...
May 30, 2007 at 4:01 PM
Edited May 30, 2007 at 4:03 PM
If you put [I] at the end of your rule, it'll ignore case.

Not sure about your escaping rule... but maybe a RewriteCond to verify that the directory exists, such as:

RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^/(?:\w+)/(.*)$ /check_files.asp?url=$1 [I,L]

But I'm not sure on that...just an idea.
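
A quick way to sanity-check the pattern in that rule is to try it in Python, whose re module is used here only as an approximation of the filter's regex engine. Two things show up: (?:\w+) is a non-capturing group, so the rest of the path is group 1 and there is no group 2, and \w does not match "%", so an encoded directory name is not matched by this pattern at all.

```python
import re

# Same pattern as in the rule above; re.IGNORECASE plays the role of [I].
PATTERN = re.compile(r"^/(?:\w+)/(.*)$", re.IGNORECASE)

m = PATTERN.match("/xfiles/15/test.doc")
print(PATTERN.groups)   # number of capturing groups in the pattern
print(m.groups())       # captured values for the plain URL

# "%" is not a word character, so the encoded form falls through.
print(PATTERN.match("/%58files/17/a.txt"))
```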
Jun 1, 2007 at 9:11 AM
OK, the [I] works. Thanks.

But your solution with the rewrite condition will not help. A directory check on "xfiles" or "%58files" will succeed in both cases, because both are the same after decoding.

What I need is a modifier like [I] for hex encodings, which tells the rule that a hex encoding and its equivalent character are the same.
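
To illustrate the behaviour I am after, here is a Python sketch of "decode first, then match". This is not an existing modifier; match_after_decoding is a hypothetical helper, and urllib.parse.unquote stands in for whatever decoding the filter would have to do.

```python
import re
from urllib.parse import unquote

# Simple rule pattern, with case-insensitive matching like [I].
RULE = re.compile(r"^/xfiles/(.*)", re.IGNORECASE)

def match_after_decoding(raw_path: str):
    """Percent-decode the whole path, then apply the rule pattern to the result."""
    return RULE.match(unquote(raw_path))

for path in ["/%58files/17/a.txt", "/xFiLeS/a%20b.txt", "/other/a.txt"]:
    m = match_after_decoding(path)
    print(path, "->", m.group(1) if m else None)
```

Note that the captured remainder is decoded as well, so it would have to be re-encoded before being placed into a query string.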
Coordinator
Jun 13, 2007 at 12:15 AM
I was going to add this as a work item, but I am not so sure it is a safe thing to do.

I think there are better ways, more direct ways, to get what you want. Essentially you want to disallow circumvention of your IIRF rules via URL encoding. But of course there is more than just IIRF rules that people could get around, using URL encoding.

If I were you I would disable (filter out) any URLs with arbitrary encoding in them. Many websites do this, and there are many tools that perform this task; for example, the MS tool known as UrlScan has this filtering. If you don't have UrlScan or some other scanner looking for encoded URLs, then you could include a rule in the IIRF ini file to detect them.

It may be the case that some sequences should be encoded in URLs, such as, perhaps, spaces. So your rule would have to allow for those, but disallow others.
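
As a rough sketch of "allow some sequences, disallow others", the Python snippet below flags any percent escape except an encoded space. The choice of %20 as the only allowed escape is an assumption for illustration, and has_disallowed_escape is a made-up name. The negative lookahead is standard regex syntax, but whether a given rewrite filter accepts it should be checked against its documentation.

```python
import re

# Any %XX escape is disallowed unless it is %20 (encoded space).
DISALLOWED = re.compile(r"%(?!20)[0-9A-Fa-f]{2}")

def has_disallowed_escape(path: str) -> bool:
    """Return True if the path contains an escape other than %20."""
    return DISALLOWED.search(path) is not None

for path in ["/docs/my%20file.doc", "/%58files/a.txt", "/plain/file.doc"]:
    print(path, "->", "blocked" if has_disallowed_escape(path) else "allowed")
```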

In any case I am going to reject this as a feature request for IIRF.
Jun 15, 2007 at 1:38 PM
Thank you for the answer, Chesso!

I cannot use UrlScan or something similar, because I only want to disallow hex encodings in the directory part of the URL. The filename afterwards should allow hex encodings for spaces, hyphens, etc.

You said I can include a rule in IIRF to detect the encodings? What would such a rule look like? Is it possible to write a rule that only checks up to the last forward slash (that is, only the directory part)?
Coordinator
Jun 20, 2007 at 10:18 PM
How about this?
RewriteRule ^/(.*%\d+.*)/([^\/]+) /PleaseDontUseEscapeSequencesInDirectoryNames.aspx?path=$1&file=$2    [L]

The result would be:

Incoming URL                                  Result
/NoEscaping/File1.doc                         (NO REWRITE)
/This%25Directory/Uses%25Escaping/File2.doc   /PleaseDontUseEscapeSequencesInDirectoryNames.aspx?path=This%25Directory/Uses%25Escaping&file=File2.doc
/This%Uses%Percent%Chars/File3.doc            (NO REWRITE)
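
The three cases above can be reproduced with a short Python sketch, using Python's re module as an approximation of the filter's regex engine and assuming the pattern is applied to the raw URL path. One thing to be aware of: %\d+ only requires a decimal digit after the percent sign, so an escape whose first hex digit is a letter (for example %C3) is not detected. STRICT_RULE below is a hypothetical variant using a two-digit hex class instead; it is not part of the rule posted above.

```python
import re

# Pattern from the rule above.
RULE = re.compile(r"^/(.*%\d+.*)/([^\/]+)")
# Hypothetical stricter variant: exactly two hex digits after "%".
STRICT_RULE = re.compile(r"^/(.*%[0-9A-Fa-f]{2}.*)/([^/]+)")

TARGET = "/PleaseDontUseEscapeSequencesInDirectoryNames.aspx?path={0}&file={1}"

def rewrite(path: str, rule=RULE):
    """Return the rewritten URL, or None when the rule does not match."""
    m = rule.match(path)
    return TARGET.format(m.group(1), m.group(2)) if m else None

for path in [
    "/NoEscaping/File1.doc",
    "/This%25Directory/Uses%25Escaping/File2.doc",
    "/This%Uses%Percent%Chars/File3.doc",
    "/plain/my%20file.doc",   # escape only in the filename
    "/%C3%A9/File4.doc",      # escape starting with a hex letter
]:
    print(path, "->", rewrite(path), "| strict:", rewrite(path, STRICT_RULE))
```

An escape that appears only in the filename is left alone by both patterns, which is the behaviour asked for.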