My blog has been moved to ariya.ofilabs.com.

Monday, June 21, 2010

proxy server with filtering feature

Besides exploring San Diego, I have been coding only intermittently. My apologies if I do not update X2 with fresh examples often enough.

Having said that, here is one network-related example: a minor tweak to the previous example of a Qt-based proxy server. Basically, it adds minimalistic URL filtering support, in the form of blacklisting URLs that start with certain predefined strings. The code is available in the usual place, the X2 repository, under the directory network/filterproxy.

While major browsers support some variant of content blocking, be it via an extension like AdBlock or as a feature built into the browser itself, this new filterproxy should work with any browser that supports a proxy. Alas, I did not bother to implement an AdBlock-compatible rule system because it would complicate the code. Again, consider this a proof of concept only. A challenging exercise would be to fully support the best-known subscription filters.

It is not unheard of that content filtering can dramatically improve your browsing experience. Because it cuts bandwidth usage, it translates to lower cost for those who are not lucky enough to have an unlimited data plan. But most importantly, throwing the garbage out of web pages definitely speeds up page loading. For this filterproxy example, I did a very unscientific benchmark and tested it with the Detik.com news site (now you know why the included blacklist.txt contains only some basic advertisement-laden sites). The screenshots below show the unfiltered version (left) and the filtered version (right). Notice also the whopping 40% bandwidth saving!

My promise was to post two variations of that simple proxy example. This counts as one of them, and when time allows me to clean up the other, you'll know it. Stay tuned and happy proxying!

1 comment:

Darmawan Salihun said...

Well, this is just a side note. "Heavy" content filtering via proxy is usually implemented with the help of an ICAP (RFC 3507) server. I'm not really sure if that kind of thing should be called deep packet inspection :-/