
Dynamic Nginx Router... in Go!


We needed a specialized load balancer at Nitro. After some study, Mihai Todor and I built a solution that leverages Nginx, the Redis protocol, and a Go-based request router where Nginx does all the heavy lifting and the router carries no traffic itself. This solution has worked great in production for the last year. Here’s what we did and why we did it.

Why?

The new service we were building would sit behind a pool of load balancers and was going to do some expensive calculations, so it would do some local caching. To make the most of that cache, we wanted to send requests for the same resources to the same host whenever one was available.

There are a number of off-the-shelf ways to solve this problem. A non-exhaustive list of possibilities includes:


  • Using cookies to maintain session stickiness
  • Using a header to do the same
  • Stickiness based on source IP
  • HTTP redirects to the correct instance

This service would be hit several times per page load, so HTTP redirects were not viable for performance reasons. The rest of those solutions work well as long as all the inbound requests pass through the same load balancer. If, on the other hand, your frontend is a pool of load balancers, you need to either share state between them or implement more sophisticated routing logic. We weren't interested in the design changes needed to share state between load balancers at the time, so we opted for more sophisticated routing logic for this service.

Our Architecture

It probably helps to explain our motivation if you understand a bit about our architecture.

We have a pool of frontend load balancers, and instances of the service are deployed on Mesos, so they may come and go depending on scale and resource availability. Getting a list of hosts and ports into the load balancer is not an issue; that's already core to our platform.

Because everything is running on Mesos, and we have a simple way to define and deploy services, adding any new service is a trivial task.

On top of Mesos, we run gossip-based Sidecar everywhere to manage service discovery. Our frontend load balancers are Lyft’s Envoy backed by Sidecar’s Envoy integration. For most services that is enough. The Envoy hosts run on dedicated instances but the services all move between hosts as needed, directed by Mesos and the Singularity scheduler.

The Mesos nodes for the service under consideration here would have disks for local caching.

Design

Looking at the problem, we decided we really wanted a consistent hash ring. Nodes could come and go as needed, and only the requests being served by the departing nodes would be re-routed; all the remaining nodes would continue to serve their open sessions. We could easily back the consistent hash ring with data from Sidecar (you could substitute Mesos or K8s here). Sidecar health checks nodes, so we could rely on a node being available if Sidecar showed it as alive.
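To make that concrete, here is a minimal sketch of a consistent hash ring in Go. This is an illustration rather than Ringman's actual code; the vnode count and all the names are made up for the example:

package ring

import (
	"fmt"
	"hash/crc32"
	"sort"
	"sync"
)

// HashRing maps keys onto nodes. Each node is hashed onto the ring at
// several virtual points so load spreads evenly and only the keys owned
// by a departing node get remapped.
type HashRing struct {
	mu     sync.RWMutex
	points []uint32          // sorted hash points on the ring
	owners map[uint32]string // hash point -> node ("host:port")
}

const vnodes = 128 // virtual points per node (illustrative)

func New() *HashRing {
	return &HashRing{owners: make(map[uint32]string)}
}

// AddNode hashes the node onto the ring at vnodes points.
func (r *HashRing) AddNode(node string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	for i := 0; i < vnodes; i++ {
		h := crc32.ChecksumIEEE([]byte(fmt.Sprintf("%s-%d", node, i)))
		r.owners[h] = node
		r.points = append(r.points, h)
	}
	sort.Slice(r.points, func(i, j int) bool { return r.points[i] < r.points[j] })
}

// RemoveNode drops a node's points from the ring.
func (r *HashRing) RemoveNode(node string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	kept := r.points[:0]
	for _, p := range r.points {
		if r.owners[p] == node {
			delete(r.owners, p)
		} else {
			kept = append(kept, p)
		}
	}
	r.points = kept
}

// GetNode returns the node owning key: the first ring point at or after
// the key's hash, wrapping around past the last point.
func (r *HashRing) GetNode(key string) (string, error) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	if len(r.points) == 0 {
		return "", fmt.Errorf("ring is empty")
	}
	h := crc32.ChecksumIEEE([]byte(key))
	i := sort.Search(len(r.points), func(i int) bool { return r.points[i] >= h })
	if i == len(r.points) {
		i = 0
	}
	return r.owners[r.points[i]], nil
}

The virtual points are what keep re-routing minimal: when a node leaves, only keys whose nearest clockwise point belonged to it move, and they spread roughly evenly across the remaining nodes.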

We then needed to bolt the consistent hash ring to something that could direct traffic to the right node. That something would need to receive each request, identify the resource in question, and pass the request to the exact instance of the service that was prepped to handle that resource.

Of course, the resource identification is easily handled by a URL and any load balancer can take those apart to handle simple routing. So we just needed to tie that to the consistent hash and we’d have a solution.

You could do this in Lua in Nginx, and possibly in HAProxy with Lua as well. But no one at Nitro is a Lua expert, and libraries implementing the pieces we needed were not obviously available. Ideally the routing logic would be in Go, which is already a critical language in our stack and well supported.

Nginx has a rich ecosystem, though, and a little thinking outside the box turned up a couple of interesting Nginx plugins. The first of these is the nginx-eval-module by Valery Kholodkov. It allows you to make a call from Nginx to an endpoint and then evaluate the result into an Nginx variable. Among other possible uses, the significance for us is that it lets you dynamically decide which endpoint should receive a proxy_pass. That's what we wanted to do: you make a call from Nginx to somewhere, you get a result, and then you make a routing decision based on that value.

You could implement the recipient of that request as an HTTP service that returns nothing but a string containing the hostname and port of the destination service endpoint. That service would maintain the consistent hash ring and tell Nginx where to route the traffic for each request. But making a separate HTTP request, even if it were always contained on the same node, is a bit heavy. The entire expected body of the reply would be something like the string 10.10.10.5:23453. With HTTP, we'd be passing headers in both directions that would vastly exceed the size of the response.
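For scale, the whole HTTP version would be only a few lines of Go. This is a sketch of the road not taken; lookupNode is a stand-in for the hash ring lookup:

package main

import (
	"io"
	"log"
	"net/http"
)

// lookupNode stands in for the consistent hash lookup; this stub always
// returns the same node so the sketch is self-contained.
var lookupNode = func(key string) (string, error) {
	return "10.10.10.5:23453", nil
}

func main() {
	http.HandleFunc("/lookup", func(w http.ResponseWriter, r *http.Request) {
		node, err := lookupNode(r.URL.Query().Get("key"))
		if err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		io.WriteString(w, node) // the entire useful payload: "host:port"
	})
	// Simple, but every lookup drags full HTTP headers both ways,
	// dwarfing that tiny body.
	log.Fatal(http.ListenAndServe(":10109", nil))
}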

So I started to look at other protocols supported by Nginx. Memcache protocol and Redis protocol are both supported. Of those, the best supported from a Go service is Redis. So that was where we turned.
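The difference is obvious on the wire. With the Redis protocol, an entire lookup exchange is only a few dozen bytes (hypothetical key, with the RESP framing written out):

 *2\r\n$3\r\nGET\r\n$8\r\nsome-key\r\n      <- request: GET some-key
 $16\r\n10.10.10.5:23453\r\n               <- reply: a single bulk string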

There are two Redis modules for Nginx, and one of them is suitable for use with the nginx-eval-module. The best Go library for serving the Redis protocol is Redeo. It implements a really simple handler mechanism, much like the stdlib http package: any Redis protocol command invokes a handler function, and handlers are really simple to write. Alas, Redeo only supports a newer Redis protocol than the Nginx plugin could handle. So I dusted off my C skills and patched the Nginx plugin to use the newest Redis protocol encoding.
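To give a feel for Redeo's handler mechanism, the server bootstrap looks roughly like this. This is a sketch assuming the v1-era Redeo API (the same one the handler code below uses); the PING handler is an illustrative extra that makes smoke testing with redis-cli easy:

package main

import (
	"log"

	"github.com/bsm/redeo"
)

func main() {
	// The listen address is illustrative; our router listens on 10109
	// (see the Nginx upstream config below).
	srv := redeo.NewServer(&redeo.Config{Addr: "0.0.0.0:10109"})

	srv.HandleFunc("ping", func(out *redeo.Responder, _ *redeo.Request) error {
		out.WriteInlineString("PONG") // "+PONG" simple-string reply
		return nil
	})

	// The GET and SELECT handlers from the Implementation section
	// below get registered here the same way.

	log.Fatal(srv.ListenAndServe())
}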

So the solution we ended up with is:

 [Internet] -> [Envoy] -> [Nginx] -(2)--> [Service endpoint]
                             \
                          (1) \ (redis proto)
                               \
                                -> [Go router]

The call comes in from the Internet, hits an Envoy node, then an Nginx node. The Nginx node (1) asks the router where to send it, and then (2) Nginx passes the request to the endpoint.

Implementation

We built a library in Go to manage our consistent hash ring, backed either by Sidecar or by Hashicorp's Memberlist library. We called that library Ringman. We then bolted that library into a service which serves Redis protocol requests via Redeo.
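As an illustration of the Memberlist-backed mode, here is roughly how membership events can be mirrored into a ring like the HashRing sketched earlier (same package assumed). The EventDelegate interface is Memberlist's real API, but this wiring is a sketch, not Ringman's exact code:

package ring

import (
	"log"

	"github.com/hashicorp/memberlist"
)

// ringEvents mirrors cluster membership changes into the hash ring.
type ringEvents struct {
	ring *HashRing
}

func (e *ringEvents) NotifyJoin(n *memberlist.Node)   { e.ring.AddNode(n.Address()) }
func (e *ringEvents) NotifyLeave(n *memberlist.Node)  { e.ring.RemoveNode(n.Address()) }
func (e *ringEvents) NotifyUpdate(n *memberlist.Node) {}

// JoinCluster starts a gossip member whose join/leave events keep the
// ring in sync with the live nodes.
func JoinCluster(ring *HashRing, seeds []string) (*memberlist.Memberlist, error) {
	cfg := memberlist.DefaultLANConfig()
	cfg.Events = &ringEvents{ring: ring}
	ml, err := memberlist.Create(cfg)
	if err != nil {
		return nil, err
	}
	if _, err := ml.Join(seeds); err != nil {
		log.Printf("initial join failed: %s", err)
	}
	return ml, nil
}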

Only two Redis commands are required: GET and SELECT. We chose to implement a few more commands for debugging purposes, including INFO, which can reply with any server state you'd like. Of the two required commands, we can safely ignore SELECT, which picks the Redis DB used by subsequent calls: we just accept it and do nothing. GET, which does all the work, was easy to implement. Here's the entire function to serve the Ringman endpoint over Redis with Redeo. Nginx passes the URL it received, and we return the endpoint from the hash ring.

// Route GETs through the hash ring: the key is the URL path extracted
// by Nginx, the reply is the "host:port" that should serve it.
srv.HandleFunc("get", func(out *redeo.Responder, req *redeo.Request) error {
	if len(req.Args) != 1 {
		return req.WrongNumberOfArgs()
	}
	node, err := ringman.GetNode(req.Args[0])
	if err != nil {
		log.Errorf("Error fetching key '%s': %s", req.Args[0], err)
		return err
	}

	out.WriteString(node) // bulk string reply, e.g. "10.10.10.5:23453"
	return nil
})
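The SELECT handling mentioned above is even shorter. Since there is only one logical database, the handler just acknowledges and ignores the argument (a sketch, assuming the same Redeo responder API):

srv.HandleFunc("select", func(out *redeo.Responder, _ *redeo.Request) error {
	// Nginx's Redis client may issue SELECT before GET; we have no
	// separate DBs, so reply "+OK" and carry on.
	out.WriteInlineString("OK")
	return nil
})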

These handlers are called by Nginx using the following config:

# Nginx configuration for the Go router proxy.
# Relies on the nginx-eval, ngx_http_redis, and
# http_stub_status modules.

error_log /dev/stderr;
pid       /tmp/nginx.pid;
daemon    off;

worker_processes 1;

events {
  worker_connections  1024;
}

http {
  access_log   /dev/stdout;

  include     mime.types;
  default_type  application/octet-stream;

  sendfile       off;
  keepalive_timeout  65;

  upstream redis_servers {
    keepalive 10;

    # Local (on-box) instance of our Go router
    server services.nitro.us:10109;
  }

  server {
    listen      8010;
    server_name localhost;

    resolver 127.0.0.1;

    # Grab the filename/path and then rewrite to /proxy. Can't do the
    # eval in this block because it can't handle a regex path.
    location ~* /documents/(.*) {
      set $key $1;

      rewrite ^ /proxy;
    }

    # Take the $key we set, do the Redis lookup and then set
    # $target_host as the return value. Finally, proxy_pass
    # to the URL formed from the pieces.
    location /proxy {
      eval $target_host {
        set $redis_key $key;
        redis_pass redis_servers;
      }

      #add_header "X-Debug-Proxy" "$uri -- $key -- $target_host";

      proxy_pass "http://$target_host/documents/$key?$args";
    }

    # Used to health check the service and to report basic statistics
    # on the current load of the proxy service.
    location ~ ^/(status|health)$ {
      stub_status on;
      access_log  off;
      allow 10.0.0.0/8;    # Allow anyone on private network
      allow 172.16.0.0/12; # Allow anyone on Docker bridge network
      allow 127.0.0.0/8;   # Allow localhost
      deny all;
    }

    error_page   500 502 503 504  /50x.html;
    location = /50x.html {
      root   html;
    }
  }
}

We deploy Nginx and the router in containers, and they run on the same hosts, so call overhead between them is very low.

We build Nginx like this:

./configure --add-module=plugins/nginx-eval-module \
      --add-module=plugins/ngx_http_redis \
      --with-cpu-opt=generic \
      --with-http_stub_status_module \
      --with-cc-opt="-static -static-libgcc" \
      --with-ld-opt="-static"

make -j8

Performance

We’ve tested the performance of this extensively, and in our environment we see about 0.2-0.3ms response times on average for a round trip from Nginx to the Go router over the Redis protocol. Since the median response time from the upstream service is about 70ms, this is a negligible delay.

A more complex Nginx config might be able to do more sophisticated error handling. Reliability after a year in service has been extremely good, and performance has been constant.

Wrap-Up

If you have a similar need, you can re-use most of the components. Just follow the links above to the actual source code. If you are interested in adding support for K8s or Mesos directly to Ringman, that would be welcome.

This solution started out sounding a bit like a hack and in the end has been a great addition to our infrastructure. Hopefully it helps someone else solve a similar problem.

