Reverse Proxy and the IIRF input string

Topics: User Forum
Jan 20, 2010 at 11:22 AM
Edited Jan 20, 2010 at 12:01 PM

Hi

A very newbie question, I'm afraid.

I am not a coder, but I have an SBS server running IIS.

I want to use IIRF to do a reverse proxy. So my setup is:

 

Domain: www.mydomain.co.uk - which goes to my web server

LAN server: \\lanserver - which also runs IIS and where I have an intranet site (several, actually)

In DNS I have registered another URL that points to www.mydomain.co.uk, called, let's say, wiki.mydomain.co.uk

Of course, at the moment that opens my main website.

So I want to use wiki.mydomain.co.uk and get a response, by proxying, from \\lanserver at the necessary site (let's say //lanserver/wiki/default.html)

Also, I want wiki.mydomain.co.uk/page1.html to go to //lanserver/wiki/page1.html

I cannot figure out, before installation, what my rules should be such that I get it right first time and don't lose access to everything else by screwing the rules up!

I think that's mainly because I don't understand what the input string that the rule processes looks like, or how to retain the page name at the end.

I'm sure this would be a really useful common-use example to include in the documentation.

Can you help?

Many thanks indeed.

Nick

Coordinator
Jan 21, 2010 at 7:46 PM
Edited Jan 26, 2010 at 9:47 PM

Hello Nick, thanks for the good question.

Seems to me you want to proxy if and only if the host name used on the incoming request is wiki.mydomain.co.uk.  Is that right?

If so, the construct for that is something like this:

RewriteCond %{HTTP_HOST}  ^wiki\.mydomain\.co\.uk$ 
ProxyPass   ^/(.*)$       http://mylanserver/wiki/$1

Any URL path in a request that uses your special domain name is carried over to the request that IIRF proxies.
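To make the matching concrete, here is a minimal Python sketch of how the rule pair above behaves. It is only an illustration: Python's `re` module stands in for the PCRE engine IIRF uses, the function name `proxy_target` is hypothetical, and it assumes the ProxyPass pattern is tested against the URL path (for example `/page1.html`) rather than the full URL, with the host name checked separately by the RewriteCond.

```python
import re

# Stand-ins for the RewriteCond pattern and the ProxyPass pattern above.
HOST_PATTERN = re.compile(r"^wiki\.mydomain\.co\.uk$")
URL_PATTERN = re.compile(r"^/(.*)$")

def proxy_target(host, url_path):
    """Return the internal URL a request would be proxied to, or None."""
    if not HOST_PATTERN.search(host):
        return None  # condition is false, so the rule does not apply
    m = URL_PATTERN.search(url_path)
    if m is None:
        return None
    # m.group(1) plays the role of $1 in the ProxyPass replacement.
    return "http://mylanserver/wiki/" + m.group(1)

print(proxy_target("wiki.mydomain.co.uk", "/page1.html"))
print(proxy_target("www.mydomain.co.uk", "/page1.html"))
```

Under these assumptions the first call yields http://mylanserver/wiki/page1.html, so the page name is retained, and the second yields None, so requests for the main site are left alone.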

This will work just fine for the first request. A problem arises when links in the returned page (the one returned from http://mylanserver/wiki/index.html) reference locations in the "web directory" for mylanserver. These links can be relative or rooted. In the former case, a link like "foo.htm" in the returned page will be resolved by the browser to "http://wiki.mydomain.co.uk/foo.htm". If the user clicks on that link, the request matches the same ProxyPass rule, and it should just work.

But if the page also contains rooted link references, let's say, to something like "/wiki/foo.htm" (note the leading slash), then the browser receiving the proxied content will resolve that to "http://wiki.mydomain.co.uk/wiki/foo.htm", which will get proxied to http://mylanserver/wiki/wiki/foo.htm . (Note the doubling of the "wiki" path segment).

This problem occurs in particular with script, image, or css content, which is often stored in different directories on the web server.

NB: There is a similar problem with hard links that refer to server names.  It's possible to embed an <img> or <a> tag that specifies the server name mylanserver .  Obviously this server is not supposed to be exposed to the outside world; that's why you're doing the proxy step.  So you need to make sure there are no references to the server name anywhere in the web content on mylanserver.

To avoid the problem with rooted links, you need to either make sure all your references are relative, or deal with the rooted references explicitly in the IIRF rules. The former means combing through the web site on mylanserver and making sure all the links, stylesheets, img tags, and so on use relative references. This is usually pretty hard and tedious to do. The latter involves modifying the rule in the IIRF file, like this:

RewriteCond %{HTTP_HOST}         ^wiki\.mydomain\.co\.uk$ 
ProxyPass   ^/(?:wiki/)?(.*)$    http://mylanserver/wiki/$1

What this does is strip a leading "wiki/" segment, if present, from the URL path of any request that gets proxied.
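As a rough check of that behaviour, the sketch below compares the original pattern with the one containing the optional `(?:wiki/)?` group. It uses Python's `re` as a stand-in for IIRF's PCRE engine, and the names `NAIVE`, `TOLERANT` and `internal_url` are hypothetical.

```python
import re

NAIVE = re.compile(r"^/(.*)$")                # first version of the rule
TOLERANT = re.compile(r"^/(?:wiki/)?(.*)$")   # version with the optional group

def internal_url(pattern, url_path):
    """Apply one pattern and build the proxied URL from its capture group."""
    m = pattern.search(url_path)
    return None if m is None else "http://mylanserver/wiki/" + m.group(1)

for path in ("/foo.htm", "/wiki/foo.htm"):
    print(path, "->", internal_url(NAIVE, path), "|", internal_url(TOLERANT, path))
```

With the first pattern, "/wiki/foo.htm" maps to http://mylanserver/wiki/wiki/foo.htm (the doubled segment); with the second, both paths map to http://mylanserver/wiki/foo.htm.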

Alternatively, you can require /wiki as the first segment in the URL path, so all requests destined to be proxied would look like this: http://wiki.mydomain.co.uk/wiki/whatever.htm. That seems redundant to me, but you might want the URLs like that. If so, then use something like this:

RewriteCond %{HTTP_HOST}    ^wiki\.mydomain\.co\.uk$ 
ProxyPass   ^/wiki/(.*)$    http://mylanserver/wiki/$1

Any URL that has both wiki.mydomain.co.uk as the domain name and /wiki as the first segment in the URL path will be proxied. Any URL that misses either or both of those constraints will not be proxied.  In this case you won't have the problem with relative vs rooted links, because the external URL path is the same as the internal URL path:  http://wiki.mydomain.co.uk/wiki/foo refers to http://mylanserver/wiki/foo .  Any relative or rooted link will "just work", as long as the content within files in /mylanserver/wiki contains no references to directories outside of /mylanserver/wiki .
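The two constraints can be checked side by side with a small sketch. As before, Python's `re` is only a stand-in for IIRF's matching, and `is_proxied` is a hypothetical helper, not part of IIRF.

```python
import re

HOST_PATTERN = re.compile(r"^wiki\.mydomain\.co\.uk$")
PATH_PATTERN = re.compile(r"^/wiki/(.*)$")

def is_proxied(host, url_path):
    """True only when both the host name and the /wiki/ prefix match."""
    return bool(HOST_PATTERN.search(host) and PATH_PATTERN.search(url_path))

print(is_proxied("wiki.mydomain.co.uk", "/wiki/foo"))  # both constraints met
print(is_proxied("wiki.mydomain.co.uk", "/foo"))       # path prefix missing
print(is_proxied("www.mydomain.co.uk", "/wiki/foo"))   # host name differs
```

Only the first request satisfies both constraints; the other two would fall through to the normal site.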


Keep in mind that you have to configure IIRF as an Extension to get ProxyPass to work. This is explained in the doc, but some people skip that step. The installer, which is currently sort of unreliable, does not register IIRF as an Extension, yet. I'll be fixing this stuff soon, but for now you have to install it manually.

Jan 22, 2010 at 9:20 AM

Wow! Many thanks for the detailed reply.

I’ll have to read this more carefully than this first pass, but it seems to cover every option.

From this, I can probably work out exactly what part of the original URL makes it to the compare string and then I’ll understand it I’m sure.

I used to be quite at home with regex but it’s a long time since I used it. I’m sure I can pick it back up, but not knowing how the URL has been modified by the time it gets to ISAPI floored me.

This really is a comprehensive answer so many thanks for the great support!

I did read the extension bit so that’s fine – I’ll follow those instructions.

Jan 24, 2010 at 11:11 AM

Me again!

I tried to install IIRF, but fell at the Add Extension hurdle.

I followed the instructions to place the dll and set permissions, went to the website properties and Home Directory.

On the Config dialog, I clicked Add, entered the .iirf extension, browsed to the dll and selected it. I cleared the box to verify the file exists and clicked OK - and the dialog won't close. It just flickers and stays there.

I just cannot add the filter to the list of extensions. I can only Cancel out of that dialog, leaving it uninstalled.

Any ideas?

 

Jan 24, 2010 at 2:14 PM

I may have answered my own question. I added quotes round the filename and path after browsing to it, and it seems to have added it. I had come across a post from David Wang on MS that said long filenames must be quoted in IIS manager...

 

Now to test it.

Jan 24, 2010 at 5:23 PM

Back to the original question.

The filter is installed and working and it's checking incoming URLs. It never matches, and therefore always goes to the default server (site).

The log shows:

EvalCondition: Cond ${HTTP_HOST} ^wiki\.mydomain\.co\.uk$ => FALSE

So, how can I see what the actual string is, that is being compared here? The browser is getting to the server alright, so DNS out there must be correct. And when it gets there the IIRF filter is certainly processing it.

But why no match (unless there is an error in that regex)? It looks simple, depending on what ${HTTP_HOST} evaluates to...

Jan 26, 2010 at 11:35 AM

I have switched to your first rule, which you said should work for the very first request (just to keep it simple).

Here is an extract from the log immediately after the ini file is read (no errors) when a straight "http://wiki.mydomain.co.uk" is typed into the browser address bar:

Tue Jan 26 12:09:13 - 124276 - ReadSiteConfig: Done reading, found 1 rules (0 errors, 0 warnings) on 7 lines
Tue Jan 26 12:09:13 - 124276 - ReleaseOrExpireSiteConfig: site '/LM/W3SVC/1/ROOT' (era=3) (rc=0) (Expired=1) (ptr=0x04958288)...
Tue Jan 26 12:09:13 - 124276 - GetSiteConfig: Obtain  site '/LM/W3SVC/1/ROOT' (era=4) (rc=1) (Expired=0) (ptr=0x04959888)...
Tue Jan 26 12:09:13 - 124276 - HttpFilterProc: SF_NOTIFY_URL_MAP
Tue Jan 26 12:09:13 - 124276 - HttpFilterProc: cfg= 0x04959888
Tue Jan 26 12:09:13 - 124276 - HttpFilterProc: SF_NOTIFY_AUTH_COMPLETE
Tue Jan 26 12:09:13 - 124276 - DoRewrites
Tue Jan 26 12:09:13 - 124276 - GetHeader_AutoFree: 'url' = '/'
Tue Jan 26 12:09:13 - 124276 - GetHeader_AutoFree: 'method' = 'GET'
Tue Jan 26 12:09:13 - 124276 - DoRewrites: New Url, before decoding: '/'
Tue Jan 26 12:09:13 - 124276 - DoRewrites: Url (no decoding): '/'
Tue Jan 26 12:09:13 - 124276 - EvaluateRules: depth=0
Tue Jan 26 12:09:13 - 124276 - EvaluateRules: Rule 1 : 2 matches
Tue Jan 26 12:09:13 - 124276 - ReplaceServerVariables: in='${HTTP_HOST}' out='${HTTP_HOST}'
Tue Jan 26 12:09:13 - 124276 - GenerateReplacementString: result '${HTTP_HOST}'
Tue Jan 26 12:09:13 - 124276 - EvalCondition: Cond ${HTTP_HOST} ^wiki\.mydomain\.co\.uk$ => FALSE  ***this should match should it not?
Tue Jan 26 12:09:13 - 124276 - EvalConditionList: rule 1, FALSE, Rule does not apply
Tue Jan 26 12:09:13 - 124276 - EvaluateRules: returning 0
Tue Jan 26 12:09:13 - 124276 - DoRewrites: No Rewrite

The ini file now says:

RewriteLog d:\program files\iirf\iirf
RewriteLogLevel 4
StatusUrl /iirfStatus  RemoteOk

RewriteCond ${HTTP_HOST}    ^wiki\.mydomain\.co\.uk$
ProxyPass   ^/(.*)$    http://lanserver/sites/wiki/$1

What gets returned to the browser is the content at www.mydomain.co.uk/index.html (i.e. there is no redirection)

Here's what IIS log shows happening:

2010-01-26 12:09:13 192.168.16.2 GET /index.html - 80 86.162.116.211 - wiki.mydomain.co.uk 304 0
2010-01-26 12:09:13 192.168.16.2 GET /res/styles.css - 80 86.162.116.211 http://wiki.mydomain.co.uk/ wiki.mydomain.co.uk 304 0
2010-01-26 12:09:13 192.168.16.2 GET /res/up.gif - 80 86.162.116.211 http://wiki.mydomain.co.uk/ wiki.mydomain.co.uk 304 0

The three files above are from www.mydomain.co.uk/

Coordinator
Jan 26, 2010 at 9:46 PM

Yes -

I feel terrible about this.  Instead of ${HTTP_HOST}, please use %{HTTP_HOST} .  The first character should be %, not $.  My apologies - my prior suggestion used the wrong flag character.
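The log extract above is consistent with this: ReplaceServerVariables reports in='${HTTP_HOST}' out='${HTTP_HOST}', which suggests the text was never expanded and the pattern was tested against that literal string. A minimal sketch of the difference, using Python's `re` as a stand-in for IIRF's matching and taking the expanded value from the host name in the IIS log (an assumption about what %{HTTP_HOST} would produce):

```python
import re

HOST_PATTERN = re.compile(r"^wiki\.mydomain\.co\.uk$")

# What the log shows being tested when the condition starts with '$'.
unexpanded = "${HTTP_HOST}"
# What is assumed to be tested once '%{HTTP_HOST}' is expanded.
expanded = "wiki.mydomain.co.uk"

print(HOST_PATTERN.search(unexpanded) is not None)
print(HOST_PATTERN.search(expanded) is not None)
```

The first test fails and the second succeeds, which matches the FALSE result in the log and the expected behaviour after the fix.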

 

Jan 27, 2010 at 11:45 AM

Thanks Cheeso!

That's fixed the redirection (I think). I can now see that the proxy happens. I get an authentication error (401.2) and I'm not sure yet if that's due to server config or to the request being proxied (MS suggests it could be either).

Allowing anonymous access at the intranet server doesn't get me anywhere, except the code is now 401.4 - a custom ISAPI filter has denied the request.

I shall dig into the MS documentation on this. Because the LAN site is running SharePoint there may be something I have to do in SharePoint too.

 

Many thanks for all your help!