I've attempted the same thing and found there is very little existing
code to work from. You can take a look at WEBrick's httpproxy.rb, but I
found it hard to work out where I would place my "hooks" to reprocess
the content. I've got a partially functional proxy that I wrote from
the ground up, but it has trouble displaying certain pages. If you're
interested, I can post the code somewhere you can see it.
Hmm... I actually did this last week, and I found some example code on
the web pretty quickly (it was in Japanese, admittedly...). Here's
the simple AdBlock proxy I ran up whilst playing around (it uses the
pierceive adblock list). It returns an empty document for blocked
addresses, and replaces every img tag with the text [image], just as an
example of processing.
It's not meant to be feature-rich or even high-quality code, but it
does most of what you seem to want.
Paul.
#!/usr/bin/env ruby
require 'webrick/httpproxy'
require 'stringio'
require 'zlib'
require 'open-uri'
require 'iconv'
class AdBlocker
  def initialize
    reload
  end
  # (Re)read adblock.txt and rebuild the list of block patterns
  def reload
    bl = []
    File.foreach('adblock.txt') do |line|
      line.strip!
      # Skip blank lines, the [Adblock] header and ! comments
      next if line.empty? || line =~ /\[Adblock\]/ || line =~ /^!/
      if %r!^/.*/$! =~ line
        # /pattern/ entries are regexps; drop the enclosing slashes
        bl << Regexp.new(line[1...-1])
      else
        bl << line
      end
    end
    @block_list = bl
  end
  def blocked?(uri)
    @block_list.any? { |rx| uri.match(rx) }
  end
end
module WEBrick
  class RejectingProxyServer < HTTPProxyServer
    def service(req, res)
      if @config[:ProxyURITest].call(req.unparsed_uri)
        super(req, res)
      else
        blank(req, res)
      end
    end
    def blank(req, res)
      res.header['content-type'] = 'text/plain'
      res.header.delete('content-encoding')
      res.body = ''
    end
  end
end
class ProxyServer
···
  #
  # Handler that is called by the proxy to process each page
  #
  def handler(req, res)
    #p res.header
    # Nothing to rewrite for responses without a body (e.g. 304)
    return if res.body.nil?
    # Inflate content if it's gzipped
    if 'gzip' == res.header['content-encoding']
      res.header.delete('content-encoding')
      res.body = Zlib::GzipReader.new(StringIO.new(res.body)).read
    end
    res.body.gsub!(%r!<img[^>]*>!im, '[image]')
  end
  def uri_allowed(uri)
    b = @adblocker.blocked?(uri)
    #puts("--> URI #{b ? 'blocked' : 'allowed'}: #{uri}")
    return !b
  end
  def initialize
    @server = WEBrick::RejectingProxyServer.new(
      :BindAddress => '0.0.0.0',
      :Port => 8181,
      :ProxyVia => false,
      # :ProxyURI => URI.parse('http://localhost:8118/'),
      :ProxyContentHandler => method(:handler),
      :ProxyURITest => method(:uri_allowed)
    )
    @adblocker = AdBlocker.new
  end
  def start
    @server.start
  end
  def stop
    @server.shutdown
  end
end
#
# Create and start the server
#
ps = ProxyServer.new
%w[INT HUP].each { |signal| trap(signal) { ps.stop } }
ps.start
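If you want to poke at the two processing steps (block-list matching and the img rewrite) without starting a server, something like the following might do as a rough sketch. The helper names `parse_block_list`, `blocked?` and `strip_images` are made up for illustration and are not part of the script above or of WEBrick, and the example hosts and list entries are invented. One deliberate difference: plain list entries are matched here as literal substrings, whereas the script hands them to `String#match`, which treats them as regular expressions.

```ruby
require 'stringio'
require 'zlib'

# Hypothetical helper: turn Adblock-style list lines into patterns.
# Entries wrapped in slashes become Regexp objects; everything else is
# kept as a plain string.
def parse_block_list(lines)
  lines.each_with_object([]) do |raw, patterns|
    line = raw.strip
    next if line.empty? || line.include?('[Adblock]') || line.start_with?('!')
    if line.length > 2 && line.start_with?('/') && line.end_with?('/')
      patterns << Regexp.new(line[1...-1])
    else
      patterns << line
    end
  end
end

# True if any pattern matches the URI. Strings are compared as literal
# substrings (an assumption of this sketch, see the note above).
def blocked?(patterns, uri)
  patterns.any? do |pattern|
    pattern.is_a?(Regexp) ? pattern.match?(uri) : uri.include?(pattern)
  end
end

# Hypothetical helper mirroring the handler: inflate a gzipped body if
# needed, then replace every img tag with a text placeholder.
def strip_images(body, content_encoding = nil)
  if content_encoding == 'gzip'
    body = Zlib::GzipReader.new(StringIO.new(body)).read
  end
  body.gsub(/<img[^>]*>/im, '[image]')
end

patterns = parse_block_list([
  '[Adblock]',
  '! a comment line',
  '/banner\d+/',
  'ads.example.com'
])

puts blocked?(patterns, 'http://ads.example.com/a.gif')    # prints true
puts blocked?(patterns, 'http://example.org/banner42.png') # prints true
puts blocked?(patterns, 'http://example.org/index.html')   # prints false

gzipped = Zlib.gzip('<p>Hi <IMG src="a.png"> there</p>')
puts strip_images(gzipped, 'gzip') # prints <p>Hi [image] there</p>
```

Keeping these steps as plain functions makes them easy to check in irb before wiring them into :ProxyURITest and :ProxyContentHandler.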