I am using Apache 2.4.59 under Debian as a reverse proxy. I can't make it rewrite links in HTML at all, even though I tried everything I could find on various forums: SetOutputFilter, AddOutputFilter, inflate;proxy-html;deflate, specifying extra ProxyHTMLLinks, etc. Nothing rewrites the links inside the HTML.
I have now created a fully self-contained MWE (Apache config, a Makefile to run the server, and curl to fetch the page through the proxy) here: https://github.com/eudoxos/rproxy
The apache config contains:
ProxyRequests Off
ProxyPass /proxied/ http://localhost:8080/
ProxyPassReverse /proxied/ http://localhost:8080/
<Location /proxied/>
ProxyHTMLEnable On
ProxyHTMLLinks link href
AddOutputFilterByType inflate;proxy-html;substitute;deflate text/html
ProxyHTMLURLMap ^/ /proxied/
Substitute "s@Title@REPLACED TITLE@"
</Location>
where the Substitute rule is there only to test that the filter machinery is engaged.
The simple index.html
<!DOCTYPE HTML><HTML><head><meta charset="utf-8"><link rel="stylesheet" href="/style.css"><title>Main page</title></head><body><h1>Title</h1></body></HTML>
is returned with <h1>REPLACED TITLE</h1>, but <link … href="/style.css"> is intact (it should become <link … href="/proxied/style.css">).
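To make the expected result checkable rather than eyeballed, here is a minimal Python sketch. It is not part of the repository, and the names LinkHrefCollector and unrewritten_links are made up for illustration. It lists every <link href> that is still root-relative but lacks the /proxied/ prefix, which is what the proxy-html filter should have added:

```python
from html.parser import HTMLParser


class LinkHrefCollector(HTMLParser):
    """Collect the href attribute of every <link> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lowercase.
        if tag == "link":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


def unrewritten_links(html, prefix="/proxied/"):
    """Return root-relative <link href> values that do not start with prefix."""
    parser = LinkHrefCollector()
    parser.feed(html)
    return [h for h in parser.hrefs
            if h.startswith("/") and not h.startswith(prefix)]
```

Feeding it the body that curl gets back through the proxy should give an empty list once the rewriting works; with the output shown above it would still report /style.css.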
Analyzing the log output, I see the filters being run in order on the proxied index.html:
inflate:
[filter:trace4] Content-Type 'text/html' ...
[filter:trace4] ... matched 'text/html'
[filter:trace2] Content-Type condition for 'inflate' matched
proxy-html (but NO ACTION HAPPENS, why?):
[xml2enc:debug] AH01430: Content-Type is text/html
[xml2enc:debug] AH01434: Charset ISO-8859-1 not supported by libxml2; trying apr_xlate
[xml2enc:debug] AH01439: xml2enc: consuming 156 bytes from bucket
[xml2enc:debug] AH01441: xml2enc: converted 156/156 bytes
[filter:trace4] Content-Type 'text/html;charset=utf-8' ...
[filter:trace4] ... matched 'text/html'
[filter:trace2] Content-Type condition for 'proxy-html' matched
substitute (replaces the title via regex):
[filter:trace4] Content-Type 'text/html;charset=utf-8' ...
[filter:trace4] ... matched 'text/html'
[filter:trace2] Content-Type condition for 'substitute' matched
[substitute:trace8] Line read (140 bytes): <html><head><meta charset="utf-8"><link rel="stylesheet" href="/style.css"><title>Main page</title></head><body><h1>Title</h1></body></html>
[substitute:trace8] Replacing regex:'Title' by 'REPLACED TITLE'
[substitute:trace8] Matching found
[substitute:trace8] Result: 'REPLACED TITLE'
deflate:
[filter:trace4] Content-Type 'text/html;charset=utf-8' ...
[filter:trace4] ... matched 'text/html'
[filter:trace2] Content-Type condition for 'deflate' matched
You are welcome to run the test yourself locally. Any contribution/idea is appreciated.