An HTTPS system proxy with node

Thursday, January 2, 2014

Creating a simple HTTP proxy in node.js is super easy thanks to an excellent module – http-proxy. With only the following code you’ll have a proxy that can be used as a system-level or browser-level proxy:

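// Note: this uses the callback-style API of http-proxy 0.x.
var httpProxy = require('http-proxy');
var proxyServer = httpProxy.createServer(function (req,res,proxy) {
	// The Host header tells us where the client wants the request to go.
	var hostNameHeader = req.headers.host,
	hostAndPort = hostNameHeader.split(':'),
	host = hostAndPort[0],
	// Fall back to the default HTTP port when the header carries none.
	port = parseInt(hostAndPort[1], 10) || 80;
	proxy.proxyRequest(req,res, {
		host: host,
		port: port
	});
});

proxyServer.listen(8888);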
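
The Host header handling above is the part most likely to trip over unusual input, for example a bracketed IPv6 literal such as `[::1]:3000`, which a plain `split(':')` cuts in the wrong place. Below is a minimal sketch of a more defensive parser. The helper name `parseHostHeader` is hypothetical and not part of http-proxy; it only illustrates the idea.

```javascript
// Hypothetical helper (not part of http-proxy): split a Host header value
// into host and port, falling back to defaultPort when no port is given.
function parseHostHeader(hostHeader, defaultPort) {
  if (!hostHeader) {
    throw new Error('Missing Host header');
  }
  // A port separator is a colon that comes after any closing bracket of an
  // IPv6 literal, so compare the positions of the last ':' and the last ']'.
  const lastColon = hostHeader.lastIndexOf(':');
  const closingBracket = hostHeader.lastIndexOf(']');
  const hasPort = lastColon > closingBracket;
  const host = hasPort ? hostHeader.slice(0, lastColon) : hostHeader;
  const port = hasPort ? parseInt(hostHeader.slice(lastColon + 1), 10) : NaN;
  // NaN (missing or non-numeric port) falls through to the default.
  return { host: host, port: port || defaultPort };
}
```

Calling `parseHostHeader(req.headers.host, 80)` would slot in where the listing splits the header by hand. Note that this sketch keeps the brackets around an IPv6 host, which may or may not be what the code consuming the result expects.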

Adding support for HTTPS

The first thing we will need is a certificate that the TLS implementation will use for encryption. The server holds both the private and the public key of the certificate, so it can recover the clear text of a message encrypted with the public key. It’s common knowledge that asymmetric encryption is more expensive in terms of CPU cycles than its symmetric counterpart. That’s why an HTTPS connection uses asymmetric encryption only during the initial handshake, to exchange a session key between client and server in a secure manner. This session key is then used by both client and server to encrypt traffic with a symmetric algorithm.
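
To make the division of labour between the two kinds of encryption concrete, here is a minimal sketch using node's built-in `crypto` module. It is not a TLS implementation and real handshakes differ in many details (newer TLS versions agree on the key with Diffie-Hellman rather than encrypting it with RSA). The function names `encrypt` and `decrypt` and the choice of AES-256-GCM are assumptions made for the illustration.

```javascript
const crypto = require('crypto');

// The "server" owns an RSA key pair; the public half is what a certificate carries.
const { publicKey, privateKey } = crypto.generateKeyPairSync('rsa', {
  modulusLength: 2048
});

// Client side: pick a random session key and encrypt it with the public key.
// This is the only (expensive) asymmetric operation in the exchange.
const sessionKey = crypto.randomBytes(32);
const wrappedKey = crypto.publicEncrypt(publicKey, sessionKey);

// Server side: recover the session key with the private key.
const unwrappedKey = crypto.privateDecrypt(privateKey, wrappedKey);

// From here on both sides use a (cheap) symmetric cipher for the traffic.
function encrypt(key, plaintext) {
  const iv = crypto.randomBytes(12); // fresh nonce for every message
  const cipher = crypto.createCipheriv('aes-256-gcm', key, iv);
  const data = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  return { iv: iv, data: data, tag: cipher.getAuthTag() };
}

function decrypt(key, message) {
  const decipher = crypto.createDecipheriv('aes-256-gcm', key, message.iv);
  decipher.setAuthTag(message.tag);
  const plain = Buffer.concat([decipher.update(message.data), decipher.final()]);
  return plain.toString('utf8');
}
```

With this in place, `decrypt(unwrappedKey, encrypt(sessionKey, 'GET / HTTP/1.1'))` gives back the original string, because both sides ended up holding the same session key.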

Naturally, to get (or rather to buy) a proper certificate one has to be verified by a valid certification authority. Fortunately, for the sake of a demo we can generate a self-signed certificate. That’s really easy; a nice description is available on the Heroku help pages:

openssl genrsa -des3 -passout pass:x -out proxy-mirror.pass.key 2048
echo "Generated proxy-mirror.pass.key"

openssl rsa -passin pass:x -in proxy-mirror.pass.key -out proxy-mirror.key
rm proxy-mirror.pass.key
echo "Generated proxy-mirror.key"

openssl req -new -batch -key proxy-mirror.key -out proxy-mirror.csr -subj /CN=proxy-mirror/emailAddress=piotr.mionskowski@gmail.com/OU=proxy-mirror/C=PL/O=proxy-mirror
echo "Generated proxy-mirror.csr"

openssl x509 -req -days 365 -in proxy-mirror.csr -signkey proxy-mirror.key -out proxy-mirror.crt
echo "Generated proxy-mirror.crt"

http-proxy support for https

The http-proxy module supports various https configurations, for example passing traffic from https to http and vice versa. Unfortunately I couldn’t find a way to get it working without preconfiguring the target host and port – which was a problem while I was implementing proxy-mirror. Here is what curl prints out when trying to use such a proxy:

curl -vk --proxy https://localhost:8888/ https://pl-pl.facebook.com/
* timeout on name lookup is not supported
* About to connect() to proxy localhost port 8888 (#0)
* Trying 127.0.0.1...
* connected
* Connected to localhost (127.0.0.1) port 8888 (#0)
* Establish HTTP proxy tunnel to pl-pl.facebook.com:443
> CONNECT pl-pl.facebook.com:443 HTTP/1.1
> Host: pl-pl.facebook.com:443
> User-Agent: curl/7.26.0
> Proxy-Connection: Keep-Alive
>

I suspect the problem is inherently related to the way http-proxy uses the node.js core http(s) modules – I might be wrong here though. The workaround I’ve used in proxy-mirror is to listen for the CONNECT event on the http server and open a socket connection to a fake https server listening on a different port. Once the https connection is established, the fake https server’s handler proxies requests further. The advantage of this approach is that we can use the same proxy address for both http and https. Here is the code:

    var httpProxy = require('http-proxy'),
        fs = require('fs'),
        https = require('https'),
        net = require('net'),
        httpsOptions = {
            key: fs.readFileSync('proxy-mirror.key', 'utf8'),
            cert: fs.readFileSync('proxy-mirror.crt', 'utf8')
        };

    // Plain http requests are proxied directly, as before.
    var proxyServer = httpProxy.createServer(function (req, res, proxy) {
        console.log('will proxy request', req.url);
        var hostNameHeader = req.headers.host,
            hostAndPort = hostNameHeader.split(':'),
            host = hostAndPort[0],
            port = parseInt(hostAndPort[1], 10) || 80;
        proxy.proxyRequest(req, res, {
            host: host,
            port: port
        });
    });

    // CONNECT requests are tunnelled to the fake https server below.
    proxyServer.addListener('connect', function (request, socketRequest, bodyhead) {
        var srvSocket = net.connect(8889, 'localhost', function () {
            socketRequest.write('HTTP/1.1 200 Connection Established\r\n\r\n');
            srvSocket.write(bodyhead);
            srvSocket.pipe(socketRequest);
            socketRequest.pipe(srvSocket);
        });
    });

    // The fake https server terminates TLS and proxies the decrypted request further.
    var fakeHttps = https.createServer(httpsOptions, function (req, res) {
        var hostNameHeader = req.headers.host,
            hostAndPort = hostNameHeader.split(':'),
            host = hostAndPort[0],
            port = parseInt(hostAndPort[1], 10) || 443;

        proxyServer.proxy.proxyRequest(req, res, {
            host: host,
            port: port,
            changeOrigin: true,
            target: {
                https: true
            }
        });
    });

    proxyServer.listen(8888);
    fakeHttps.listen(8889);
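
One thing the listing leaves out is error handling on the two sockets of the tunnel. In node an `'error'` event without a listener is thrown, so a single reset connection can take the whole proxy down. Below is a minimal sketch of a more defensive CONNECT handler built only on node's core `http` and `net` modules. The helper name `attachConnectTunnel` is hypothetical, and port 8889 simply mirrors the fake https server above.

```javascript
const http = require('http');
const net = require('net');

// Hypothetical helper: tunnel every CONNECT request to a local
// TLS-terminating server listening on tunnelPort.
function attachConnectTunnel(server, tunnelPort) {
  server.on('connect', function (request, clientSocket, head) {
    let established = false;

    const upstream = net.connect(tunnelPort, 'localhost', function () {
      established = true;
      clientSocket.write('HTTP/1.1 200 Connection Established\r\n\r\n');
      upstream.write(head);
      upstream.pipe(clientSocket);
      clientSocket.pipe(upstream);
    });

    upstream.on('error', function () {
      if (established) {
        // The tunnel is already carrying TLS bytes; just drop it.
        clientSocket.destroy();
      } else {
        // The client is still waiting for an answer to its CONNECT request.
        clientSocket.end('HTTP/1.1 502 Bad Gateway\r\n\r\n');
      }
    });

    clientSocket.on('error', function () {
      upstream.destroy();
    });
  });
  return server;
}

// Usage: attach the handler to a plain http server (not listening yet).
const server = attachConnectTunnel(http.createServer(), 8889);
```

Tracking `established` matters because once the `200 Connection Established` line has gone out, anything written to the client socket is interpreted as part of the TLS stream, so an HTTP error response would only confuse the client.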

HTML5 WebSocket support

The above code still has problems handling WebSockets. This is because browsers, according to the spec, change the way they establish the initial connection when they detect that they are behind an http proxy. As you can read on Wikipedia, many existing http proxy implementations suffer from this. I still haven’t figured out how to handle this scenario elegantly with http-proxy and node.js; when I do I will post my findings here.

Building an HTTP sniffer with node.js

Wednesday, January 1, 2014

Whether to inspect a server response, optimize network usage or just fiddle with a new REST API, it almost always makes sense to use an application that can display http requests and responses for investigation. On Windows there is the great Fiddler. While it’s possible to use Fiddler on other platforms with virtualization software, I think it’s overkill. Fortunately there are alternatives. Charles looks like the most advanced one, offering most if not all of the features available in Fiddler. There is also the somewhat less popular HTTP Scoop. Neither Charles nor HTTP Scoop is free, but in my opinion they are worth the price, especially if used often. Command line lovers might find that mitmproxy suits their needs. If you only need basic features, ngrok might serve you well. To dive a bit deeper and see http traffic on the tcp level, Wireshark is indispensable.

As you can see, there are plenty of tools available to help understand what is happening on the http level. As I was learning about http caching, a question popped into my head: how hard would it be to actually build a simple http sniffer?

proxy-mirror – a simple http inspector

As it turned out, it’s actually not that hard to build one, and thus proxy-mirror came to be. Of course this is a very simplistic http sniffer – nowhere near the tools I mentioned above in terms of both features and reliability – but it works (at least in most scenarios). It’s open source and I learned a couple of new things about HTTP while implementing it – more on that in future posts. Here are some screenshots of the tool:

As you might have guessed from the screenshots, proxy-mirror is a web application. Right now it’s not very easy to try it out – the instructions are in the Readme – I’ll try to fix that soon.

How it works

I wanted to run proxy-mirror on many platforms and rely on a foundation that has great support both for http and for building web applications. I picked node.js over Java or Ruby mainly because I wanted to sharpen my skills in it.

You can think of proxy-mirror as 2 logical components. The first is a regular http proxy that you can use by configuring your browser or system. I didn’t want to focus on building that first, so I used a great node module called http-proxy. With its help you can have a system-wide proxy in a couple of minutes. The http proxy component emits events whenever a request or response goes through it. Those events are consumed by the second logical component – a web application built with express.js. The information about http traffic received from the events is then pushed through socket.io to the browser, where a simple SPA built with my beloved AngularJS displays it.
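
The flow between the two components can be sketched with nothing but node's built-in `EventEmitter`. This is an illustration of the idea, not proxy-mirror's actual code: the names `ProxyEvents`, `collectSessions` and the event names are all hypothetical, and the `push` callback merely stands in for whatever sends data to the browser (socket.io in proxy-mirror's case).

```javascript
const EventEmitter = require('events');

// Hypothetical stand-in for the proxy component: it emits an event
// whenever a request or a response passes through.
class ProxyEvents extends EventEmitter {
  recordRequest(id, method, url) {
    this.emit('request', { id: id, method: method, url: url });
  }
  recordResponse(id, statusCode) {
    this.emit('response', { id: id, statusCode: statusCode });
  }
}

// Hypothetical stand-in for the web application side: it pairs each
// response with its request and pushes the resulting session onwards.
function collectSessions(proxyEvents, push) {
  const sessions = new Map();
  proxyEvents.on('request', function (request) {
    const session = { id: request.id, request: request, response: null };
    sessions.set(request.id, session);
    push('session:start', session);
  });
  proxyEvents.on('response', function (response) {
    const session = sessions.get(response.id);
    if (!session) {
      return; // a response for a request we never saw
    }
    session.response = response;
    push('session:end', session);
  });
  return sessions;
}

// Usage: record what would be pushed to the browser.
const pushed = [];
const proxyEvents = new ProxyEvents();
const sessions = collectSessions(proxyEvents, function (name, session) {
  pushed.push([name, session.id]);
});
proxyEvents.recordRequest(1, 'GET', 'http://example.com/');
proxyEvents.recordResponse(1, 200);
```

Keeping the proxy ignorant of who listens to its events is what lets the two components stay separate: the proxy only emits, and the web application decides how to store and display the request/response pairs.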

Features

Right now proxy-mirror has only a couple of features:

  • http and https support – although the latter one requires additional setup
  • simple session list – a grid with list of request/response pairs that you can inspect
  • a detailed view – both for request and response that can display headers, message body and a preview right in the browser

Although it is still a very simple (and buggy) application, I actually found that it has most of the features I frequently use – probably except for filtering. Maybe someday it will become a reasonable alternative to Charles - at least for me.

I think building an http sniffer is a great exercise in the process of learning how http(s) works. I guess the famous saying about the journey being more important than the destination fits nicely here.