rewrite header from URL query string value

Oct 14, 2010 at 8:57 PM

I am attempting to rewrite the REMOTE_USER header with the value from the URL query string, so it would work like this:,%20First%20M

puts this value in the REMOTE_USER header:

Last, First M

I have tried several rules, but I can't figure out whether I have the RewriteHeader syntax wrong or the regex is incorrect:

RewriteHeader Remote-User: %{HTTP_URL} [=]$
RewriteHeader Remote-User: %{HTTP_URL} name=(+$)
RewriteHeader Remote-User: %{HTTP_URL} =([^&]+)
RewriteHeader Remote-User: %{HTTP_URL} ^=+$

From the logs it doesn't look like I am getting a value to write to the header:

Thu Oct 14 14:31:49 -  3920 - EvaluateRules: depth=0
Thu Oct 14 14:31:49 -  3920 - GetHeader_AutoFree: 'Remote-User:' = ''
Thu Oct 14 14:31:49 -  3920 - EvaluateRules: Rule 1: pattern: %{HTTP_URL}  subject: 
Thu Oct 14 14:31:49 -  3920 - EvaluateRules: Rule 1: -1 (No match)
Thu Oct 14 14:31:49 -  3920 - EvaluateRules: returning 0


Oct 14, 2010 at 10:01 PM

You can use RewriteHeader to put a value into a header, but you need to understand how it works. I suggest you re-examine the relevant documentation.

The directive tests the value of the header specified as the first parameter, against the pattern specified in the second parameter. In your case it is testing the value of the Remote-User header against %{HTTP_URL}.  (That seems illogical, but it is how you wrote your rule.) The value of the given header never matches %{HTTP_URL}, as expected, so the rule never fires.

Check the doc for a more detailed discussion of how RewriteHeader works, along with some examples. If you ALWAYS want to set a header, then you should use a .* pattern. But that can result in loops, so you should consider preceding your rule with an appropriate RewriteCond.
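
The matching behavior described in this reply can be modeled outside of IIRF. The following is only a rough Python sketch, not IIRF itself: the helper header_rule_fires is hypothetical, Python's re module stands in for the regex engine IIRF uses, and the text %{HTTP_URL} is assumed to be treated as a literal pattern, which is how it appears in the log above.

```python
import re

def header_rule_fires(header_value: str, pattern: str) -> bool:
    """Hypothetical model: a rule fires only when the pattern
    matches the CURRENT value of the named header."""
    return re.search(pattern, header_value) is not None

# Per the log above, the header is empty: 'Remote-User:' = ''
current_remote_user = ""

# The original rule used %{HTTP_URL} as the pattern (assumed literal here).
fires_original = header_rule_fires(current_remote_user, re.escape("%{HTTP_URL}"))

# ^$ matches only while the header is still empty.
fires_empty = header_rule_fires(current_remote_user, r"^$")
fires_empty_when_set = header_rule_fires("Last, First M", r"^$")

# .* matches any value, including one that a rule has already set.
fires_any = header_rule_fires(current_remote_user, r".*")
fires_any_when_set = header_rule_fires("Last, First M", r".*")

print(fires_original, fires_empty, fires_empty_when_set, fires_any, fires_any_when_set)
```

In this model, .* keeps matching after the header has a value, which is one way to picture why an unconditional rule can fire repeatedly. Whether and how IIRF re-evaluates rules is covered in its documentation, not by this sketch.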


Oct 14, 2010 at 11:47 PM

So the RewriteHeader directive takes the existing value of the Remote-User header (which is never set) and matches it against the expression, so it would make sense to use ^$ to always get a match and always set a value. I can then specify the replacement value from the URL, like %{HTTP_URL}. But I guess that means I need to first rewrite the URL to just the query string value, then rewrite the header based on the rewritten URL, then rewrite the URL back to the original... or just proxy the request at that point.

RewriteRule [=]$ $1 [U,NI]
RewriteHeader Remote-User: ^$ %{HTTP_URL} [NI]
RewriteRule %{HTTP_URL} .* %{HTTP_X_REWRITE_URL}
ProxyPass ^/(.*)$$1

is that correct?


Oct 15, 2010 at 1:32 PM

Your understanding of how RewriteHeader works is correct. But no, I don't recommend rewriting the URL twice just in order to use an interim result in a RewriteHeader.

I recommend you read the IIRF documentation. Based on your questions, I think you don't have a good understanding of the mechanics of how IIRF works. 

For example, a rule like:

RewriteRule [=]$ $1 [U,NI]

...makes no sense.  The $1 is a back-reference to a captured subpattern, and you have no subpattern in your regex. 

Rather than me explaining here, in the forums, what a regex is, what a back-ref is, and what a captured subpattern is, I direct you to the documentation for IIRF where all of this is explained in detail with examples.

I know it's complicated, and you are probably thinking "I just have a simple problem and can't you give me a simple solution?" But there are lots and lots of simple problems and I can't solve them all. I wrote the doc to help people solve their own problems.


Oct 18, 2010 at 3:33 PM

I understand what a regex is and how to match patterns.  I also understand what a backref is and about captured subpatterns.  What I am missing is how to assign a captured subpattern if I can't use the backref.  So, [=]$ should give me everything after the = in the URL and $1 should be the backref to that matched value.  So RewriteRule [=]$ $1 [U,NI] should capture the querystring value, assign that value as the URL, preserve the original URL in a new header, and stop iteration and move to the next directive.  Do I need to use a different regex syntax like ([=]+)$ to get a match on everything after the = as the matched subpattern?

Oct 18, 2010 at 8:40 PM

Hmm, well, as a regex, [=]$ matches any string that ends in an equals character.  First, that seems like a strange pattern for a URL, but of course that may be what you want.  Secondly, though, there's no capture of *anything* in that regex, because there is no set of parens denoting a capture group. Therefore $1 will always be empty, if that regex matches.
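
The difference between a pattern with and without a capture group can be checked in any regex tool. The following is a small Python sketch, with assumptions: the URL and the parameter name "name" are made up for illustration (the real URL from the question is not available), and Python's re module is used in place of IIRF's regex engine. The URL-decoding step only shows the value the question is after; it says nothing about how IIRF handles %20.

```python
import re
from urllib.parse import unquote

# Hypothetical URL; the path and the parameter name are assumptions.
url = "/app/page.aspx?name=Last,%20First%20M"

# [=]$ only matches a string that ENDS in '=', so it does not match here.
no_match = re.search(r"[=]$", url)

# Even when [=]$ does match, there are no parentheses, so nothing is captured.
ends_in_equals = re.search(r"[=]$", "/app/page.aspx?name=")
captured_groups = ends_in_equals.groups()

# Parentheses create a capture group; group(1) plays the role of $1.
with_group = re.search(r"=([^&]+)", url)
captured = with_group.group(1)

# Decode the percent-escapes to get the desired header value.
decoded = unquote(captured)

print(no_match, captured_groups, captured, decoded)
```

One difference to be aware of: Python raises an error if you ask for group 1 of a pattern that has no groups, whereas the reply above says that in IIRF $1 is simply empty.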

So, no, I still think you don't understand regex and captured subpatterns.  Either that or I am misunderstanding what you are doing.

If I'm right that you don't quite have it, then I have some suggestions for you: There's a nice section in the IIRF documentation with sample regular expressions, and an English-language description of how the regexes work, what pieces in the regex match which portions of the subject string, and so on. If you need something more basic, there's a website dedicated solely to regex, which I reference in the IIRF documentation. Also, I'll repeat the suggestion I make in the IIRF documentation: you can gain some understanding if you try dynamic regex matching tools; I provide URLs for those free tools in the documentation. With those tools, you can type in a regex and a string, and see if it matches. You can also examine the captured subpatterns. Really helpful when learning. Finally, there is a testdriver.exe tool shipped with IIRF which will allow you to test your ruleset against sets of URL patterns. The use of this tool is also described in the documentation.

Good luck.