Apache HTTP Server can be configured in both a forward and reverse proxy (also known as gateway) mode.
mod_proxy and related modules implement a proxy/gateway for Apache HTTP Server, supporting a number of popular protocols as well as several different load balancing algorithms. Third-party modules can add support for additional protocols and load balancing algorithms.
The core module is mod_proxy, which provides basic proxy capabilities.
- An ordinary forward proxy is an intermediate server that sits between the client and the origin server.
- In order to get content from the origin server, the client sends a request to the proxy naming the origin server as the target.
- The proxy then requests the content from the origin server and returns it to the client.
- The client must be specially configured to use the forward proxy to access other sites.
Note: A typical usage of a forward proxy is to provide Internet access to internal clients that are otherwise restricted by a firewall. The forward proxy can also use caching (as provided by mod_cache) to reduce network usage.
- The forward proxy is activated using the ProxyRequests directive. Because forward proxies allow clients to access arbitrary sites through your server and to hide their true origin, it is essential to secure your server before activating a forward proxy, so that only authorized clients can use it.
- A reverse proxy (or gateway), by contrast, appears to the client just like an ordinary web server. No special configuration on the client is necessary.
- The client makes ordinary requests for content in the namespace of the reverse proxy. The reverse proxy then decides where to send those requests and returns the content as if it were itself the origin.
- A typical usage of a reverse proxy is to provide Internet users access to a server that is behind a firewall.
- Reverse proxies can also be used to balance load among several back-end servers or to provide caching for a slower back-end server.
- In addition, reverse proxies can be used simply to bring several servers into the same URL space.
Note: A reverse proxy is activated using the ProxyPass directive or the [P] flag to the RewriteRule directive. It is not necessary to turn ProxyRequests on in order to configure a reverse proxy.
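To make the contrast between the two modes concrete, here is a minimal sketch of each. It assumes Apache 2.4 access-control syntax (Require); the address range 192.168.0 and the host names are placeholders, not values from any real setup.

```apache
# Forward proxy sketch: ProxyRequests On, restricted to internal clients.
# The 192.168.0 range is a hypothetical internal network.
ProxyRequests On
<Proxy "*">
    Require ip 192.168.0
</Proxy>

# Reverse proxy sketch: ProxyRequests stays Off; ProxyPass alone
# maps part of the public URL space onto a back-end server.
# internal.example.com is a placeholder host name.
ProxyRequests Off
ProxyPass "/app/" "http://internal.example.com/"
```

In practice the two fragments would live in separate virtual hosts rather than side by side, since ProxyRequests can only have one value per server context.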
As with any modules, the first thing to do is to load them in httpd.conf (this is not necessary if we build them statically into Apache).
LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_http_module modules/mod_proxy_http.so
#LoadModule proxy_ftp_module modules/mod_proxy_ftp.so
#LoadModule proxy_connect_module modules/mod_proxy_connect.so
LoadModule headers_module modules/mod_headers.so
LoadModule deflate_module modules/mod_deflate.so
LoadFile /usr/lib/libxml2.so
LoadModule xml2enc_module modules/mod_xml2enc.so
LoadModule proxy_html_module modules/mod_proxy_html.so
For Windows users this is slightly different: you'll need to load libxml2.dll rather than libxml2.so, and you'll probably need to load iconv.dll and zlib.dll as prerequisites to libxml2 (you can download them from zlatkovic.com, the same site that maintains Windows binaries of libxml2). The LoadFile directive is the same.
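As a rough illustration of the Windows variant, the LoadFile lines might look like the following. The C:/path/to/ prefix is a placeholder for wherever the DLLs were unpacked. The prerequisites are listed first on the assumption that libxml2.dll needs them already loaded.

```apache
# Hypothetical install location; adjust to where the DLLs actually live.
# Prerequisites of libxml2 come first.
LoadFile C:/path/to/zlib.dll
LoadFile C:/path/to/iconv.dll
LoadFile C:/path/to/libxml2.dll
```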
Of course, you may not need all the modules. Two that are not required in our typical scenario are shown commented out above.
Having loaded the modules, we can now configure the Proxy. But before doing so, we have an important security warning:
Do not set "ProxyRequests On". Setting ProxyRequests On turns your server into an Open Proxy. There are 'bots scanning the Web for open proxies. When they find you, they'll start using you to route around blocks and filters to access questionable or illegal material. At worst, they might be able to route email spam through your proxy. Your legitimate traffic will be swamped, and you'll find your server getting blocked by things like family filters.
Of course, you may also want to run a forward proxy with appropriate security measures, but that lies outside the scope of this article. The author runs both forward and reverse proxies on the same server (but under different Virtual Hosts).
The fundamental configuration directive to set up a reverse proxy is ProxyPass. We use it to set up proxy rules for each of the application servers:
ProxyPass /app1/ http://internal1.example.com/
ProxyPass /app2/ http://internal2.example.com/
The [P] flag to mod_rewrite offers an alternative to ProxyPass, but this is more complex, and may in some instances degrade performance by making it impossible for Apache to use persistent proxy connections.
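For comparison only, a sketch of what the mod_rewrite alternative for the first application could look like. It assumes mod_rewrite is loaded in addition to mod_proxy and that the rule sits in server or virtual host context, where the pattern is matched against the full URL path.

```apache
# Sketch of the [P] alternative for /app1/ (requires mod_rewrite and mod_proxy).
RewriteEngine On
# Capture everything after /app1/ and proxy it to the back end.
RewriteRule "^/app1/(.*)$" "http://internal1.example.com/$1" [P]
```

The ProxyPass form remains the simpler choice unless the mapping needs conditions that only mod_rewrite can express.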
Now as soon as Apache re-reads the configuration (the recommended way to do this is with "apachectl graceful"), proxy requests will work, so http://www.example.com/app1/some-path maps to http://internal1.example.com/some-path as required.
However, this is not the whole story. ProxyPass just sends traffic straight through. So when the application servers generate references to themselves (or to other internal addresses), they will be passed straight through to the outside world, where they won't work.
For example, an HTTP redirection often takes place when a user (or author) forgets a trailing slash in a URL. So a request for http://www.example.com/app1/foo is proxied to http://internal1.example.com/foo, which generates the response:
HTTP/1.1 302 Found
Location: http://internal1.example.com/foo/
(etc)
But from the outside world, the net effect of this is a "No such host" error. The proxy needs to re-map the Location header to its own address space and return a valid URL:
HTTP/1.1 302 Found
Location: http://www.example.com/app1/foo/
The directive that enables such rewrites in the HTTP headers is ProxyPassReverse. The Apache documentation suggests the form:
ProxyPassReverse /app1/ http://internal1.example.com/
ProxyPassReverse /app2/ http://internal2.example.com/
However, there is a slightly more complex alternative form that I recommend as more robust:
<Location /app1/>
ProxyPassReverse /
</Location>
<Location /app2/>
ProxyPassReverse /
</Location>
Note: this currently fails due to a regression in mod_proxy. As a workaround, the ProxyPassReverse balancer:/// form does the right thing if you have a balancer (the three slashes are not a typo!). Without a balancer, please apply the patch from the bug report or use the other form.
The reason for recommending this is that a problem arises with some application servers. Suppose for example we have a redirect:
HTTP/1.1 302 Found
Location: /some/path/to/file.html
Under the original HTTP/1.1 specification (RFC 2616) this is a violation of the protocol and so should never happen: it only permits full URLs in Location headers (the later RFC 7231 relaxed this to allow relative references). However, it is also a source of much confusion, not least because the CGI spec has a similar Location header with different semantics where relative paths are allowed. There are a lot of broken servers out there! In this instance, the first form of ProxyPassReverse will return the incorrect response:
HTTP/1.1 302 Found
Location: /some/path/to/file.html
which, even allowing for error-correcting browsers, is outside the Proxy's address space and won't work. The second form fixes this to
HTTP/1.1 302 Found
Location: /app2/some/path/to/file.html
which is still not a full URL, but most browsers will resolve it correctly against the proxy's address.
If your backend server uses cookies, you may also need the ProxyPassReverseCookiePath and ProxyPassReverseCookieDomain directives. These are similar to ProxyPassReverse, but deal with the different form of cookie headers. These require mod_proxy from Apache 2.2 (recommended), or a patched version of 2.0.
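A minimal sketch of how the cookie directives might be applied to the first application follows. It assumes the back end sets cookies with Domain=internal1.example.com and Path=/, which is an illustrative guess rather than something stated above. Each directive takes the internal value first and the public value second.

```apache
# Scope the cookie rewrites to the /app1/ URL space.
<Location "/app1/">
    # Rewrite Domain=internal1.example.com to the public host name.
    ProxyPassReverseCookieDomain internal1.example.com www.example.com
    # Rewrite Path=/ to the path the application occupies on the proxy.
    ProxyPassReverseCookiePath / /app1/
</Location>
```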