Using Apache as a filtering web proxy

There are many kinds of web proxies, and many implementations of each kind. This howto covers the case where you want to filter the content of all web traffic. There are several fancy packages for doing this on your home computer (to filter out ads, popups, img bugs, etc.), such as middleman and filterproxy.

These packages are cool, but they are complicated, a little unstable (in my limited experience), and overkill for what I need. Instead, this howto uses a very simple apache2 setup that passes all content through a sed script.

Currently, I think the Debian packages needed to make this work exist only for apache2, although the same approach should work with apache1 as well.

 # apt-get install apache2 

Set the port the proxy will listen on in /etc/apache2/ports.conf:
bc. Listen 8080

Disable the unneeded sites and modules, then enable the modules we want:

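cd /etc/apache2
# remove every enabled site and module (on Debian these are normally just symlinks)
rm sites-enabled/*
rm mods-enabled/*
cd mods-enabled
# re-enable only the proxy module (proxy.load and proxy.conf) and mod_ext_filter
ln -s ../mods-available/proxy.* .
ln -s ../mods-available/ext_filter.load .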

Enable the proxy and set up the filter by editing /etc/apache2/mods-enabled/proxy.conf:

ExtFilterDefine my-filter mode=output intype=text/html cmd="/path/to/script"

<IfModule mod_proxy.c>
    ProxyRequests On
    <Proxy *>
        Order deny,allow
        Deny from all
        Allow from 127.0.0.1
        SetOutputFilter my-filter
    </Proxy>
    ProxyVia Off
</IfModule>

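What is this doing? ExtFilterDefine declares an output filter named my-filter that pipes the body of every text/html response through the command given in cmd. ProxyRequests On turns Apache into a forward proxy. The <Proxy *> block applies to all proxied requests: it allows access only from 127.0.0.1, so nobody else can use your proxy, and it sends the proxied content through my-filter. ProxyVia Off means Apache does not add its own Via: header to the requests and responses it relays.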

Create the script specified in ExtFilterDefine:

#!/bin/sed -f
s/capitalism/free society/g

Make sure that the script is executable and owned by www-data (the user Apache runs as on Debian).
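
Before pointing Apache at the script, it can help to check the sed commands on their own. The following is only a sketch: it writes a copy of the filter to a temporary file (a stand-in for your real /path/to/script) and feeds it a made-up HTML fragment on stdin, which is how mod_ext_filter hands content to an external filter.

```shell
#!/bin/sh
# Temporary copy of the filter; in practice this would be the file named in ExtFilterDefine.
filter=$(mktemp)
cat > "$filter" <<'EOF'
#!/bin/sed -f
s/capitalism/free society/g
EOF

# Pipe a small, invented HTML fragment through the filter.
# sed -f treats the "#!" line as a comment, so this does not depend on where sed is installed.
out=$(printf '<p>capitalism and more capitalism</p>\n' | sed -f "$filter")
echo "$out"   # prints: <p>free society and more free society</p>

rm -f "$filter"
```

If the output looks right here but not through the proxy, the problem is more likely in the Apache setup (path, permissions, content type) than in the sed commands.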

You could use any language for the script. Here we use sed, but perl would work too. This filter will replace “capitalism” with “free society”.
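
As an illustration of a slightly larger filter, here is a sketch written as a shell script that chains two sed expressions. The second expression is my own example, not part of the original setup: it tries to drop the tiny 1x1 tracking images mentioned in the introduction, and it only matches tags that sit on one line with width before height. The sample input and the tracker.example hostname are made up.

```shell
#!/bin/sh
# An external filter reads the response body on stdin and writes the result to stdout.
filter_html() {
    sed -e 's/capitalism/free society/g' \
        -e 's/<img[^>]*width="1"[^>]*height="1"[^>]*>//g'
}

# Try the filter on an invented fragment.
sample='<p>capitalism</p><img src="http://tracker.example/b.gif" width="1" height="1">'
result=$(printf '%s\n' "$sample" | filter_html)
echo "$result"   # prints: <p>free society</p>
```

To use something like this as the real filter, the script would end with a bare call to filter_html instead of the sample lines, so that it processes whatever Apache sends on stdin.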

See httpd.apache.org/docs-2.0/mod/mod_ext_f... for more information on external Apache filters.
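
Once Apache has been restarted with the configuration above, one way to check the whole chain is to fetch a page through the proxy from the same machine. This is a rough sketch that assumes curl is installed and that the proxy listens on 127.0.0.1:8080 as set in ports.conf; the helper name count_filtered is made up for this example.

```shell
#!/bin/sh
# Proxy address, assuming the Listen 8080 setting from ports.conf.
PROXY="http://127.0.0.1:8080"

# Fetch a URL through the proxy and count the lines that contain the replacement text.
# Which URL to test with is up to you.
count_filtered() {
    curl -s -x "$PROXY" "$1" | grep -c 'free society'
}
```

Call it as count_filtered followed by the plain http:// URL of a page that you know contains the word "capitalism". A count above zero suggests the filter is being applied. Pages fetched over https are tunnelled through the proxy rather than passed through the output filter, so they will not be rewritten.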