Advanced Load Balancing and Sticky Sessions with Ambassador, Envoy and Kubernetes
As we wrote in the Ambassador 0.52 release notes, we have recently added early access support for advanced ingress load balancing and session affinity in the Ambassador API gateway, which is based on the underlying production-hardened implementations within the Envoy Proxy.
It’s been my experience that how load balancing is implemented within Kubernetes Services is not always intuitive. For example, if you are using HTTP/2 or gRPC, then using a Layer 7 aware load balancer like Ambassador can make a big difference to your service level indicators (SLIs). Being able to bypass the Kubernetes Services kube-proxy, which implements load balancing at Layer 4, in order to communicate directly with Pod endpoints, will positively impact performance and load distribution. In addition, being able to implement “sticky sessions” using a token other than a client IP (for example, a cookie or header) also supports additional session affinity use cases.
Kubernetes Service Networking 101
In a typical Kubernetes cluster, requests that are sent to a Kubernetes Service are routed by a component named kube-proxy. Somewhat confusingly, kube-proxy isn’t a proxy in the classic sense, but a process that implements a virtual IP for a Service via iptables rules. This architecture adds complexity to routing: not only does it introduce a small amount of latency, but iptables wasn’t designed for this routing use case, and therefore your load balancing strategy is limited to the “round robin” algorithm.
The Kubernetes community does realise this is an issue, and as such there is a new Service proxy mode, IP Virtual Server (IPVS) or “ipvs”, that is being developed to offer additional functionality. However, this feature is still listed as beta within the Kubernetes docs.
While the current implementation within Kubernetes is somewhat complex, this approach has one overwhelming advantage for Ambassador users: simplicity. Service discovery and load balancing are delegated to Kubernetes, and testing the routing with common tools such as curl is straightforward.
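To make the earlier point about HTTP/2 and gRPC more concrete, here is a minimal simulation of the difference between choosing a backend once per connection (the Layer 4 behaviour) and choosing one per request (the Layer 7 behaviour). The Pod names and both function names are made up for illustration, and the sketch models neither kube-proxy nor Envoy in any detail; it only shows why a single long-lived, multiplexed connection can pin all of its traffic to one Pod.

```python
import random
from collections import Counter
from itertools import cycle

PODS = ["pod-a", "pod-b", "pod-c"]  # hypothetical backend Pods


def connection_level(num_connections, requests_per_connection, seed=0):
    """Layer 4 style: a backend is chosen once, when the connection opens."""
    rng = random.Random(seed)
    hits = Counter()
    for _ in range(num_connections):
        pod = rng.choice(PODS)
        # Every request multiplexed over this connection lands on the same Pod.
        hits[pod] += requests_per_connection
    return hits


def request_level(num_connections, requests_per_connection):
    """Layer 7 style: a backend is chosen for every request (round robin)."""
    next_pod = cycle(PODS)
    hits = Counter()
    for _ in range(num_connections * requests_per_connection):
        hits[next(next_pod)] += 1
    return hits


if __name__ == "__main__":
    # One long-lived HTTP/2 or gRPC connection carrying 300 requests.
    print("per connection:", dict(connection_level(1, 300)))
    print("per request:   ", dict(request_level(1, 300)))
```

With one connection, the per-connection strategy sends all 300 requests to a single Pod, while the per-request strategy spreads them 100 each across the three Pods. With many short-lived connections the two approaches look much more alike, which is why this effect is easy to miss with HTTP/1.1 traffic.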
Endpoint Routing and Load Balancing
In Ambassador 0.52, we introduced a new set of controls for load balancing. These controls are opt-in, so if you don’t change anything, you’ll get the standard Kubernetes-based load balancing behavior. Setting the AMBASSADOR_ENABLE_ENDPOINTS environment variable within your Ambassador Deployment enables the new controls. Specifically:
- Ambassador will watch all Kubernetes Endpoints for state changes, instead of just Kubernetes Services.
- Ambassador can then be configured to use different load balancing algorithms and route directly to Kubernetes Endpoints, bypassing kube-proxy.
Here’s a sample mapping where we add the load_balancer annotation:
apiVersion: ambassador/v1
kind: Mapping
name: qotm_mapping
prefix: /qotm/
service: qotm
load_balancer:
  policy: round_robin
Note that the default load balancing policy can also be set globally with annotations in the Ambassador module.
Session Affinity: a.k.a “Sticky Sessions”
In addition to the default round_robin policy, Ambassador 0.52 now supports session affinity, or “sticky sessions”, through the underlying Envoy ring_hash and Maglev load balancing policies.
Configuring sticky sessions makes Ambassador route requests to the same backend service in a given session. In other words, requests in a session are served by the same Kubernetes pod. Ambassador lets you configure session affinity based on the following parameters in an incoming request:
- Cookie
- Header
- Source IP
The Ambassador documentation goes into much more detail on the configuration options with these new controls, but here is an example that implements sticky sessions by asking the client to set a cookie named sticky-cookie with an expiration of 60 seconds in response to the first request if the cookie is not already present:
apiVersion: ambassador/v1
kind: Mapping
name: qotm_mapping
prefix: /qotm/
service: qotm
load_balancer:
  policy: ring_hash
  cookie:
    name: sticky-cookie
    ttl: 60s
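As a rough mental model of the cookie flow configured above, the following sketch shows the decision made for each request: reuse the cookie value if the client sent one, otherwise mint a value and ask the client to store it. This is an illustration only, not how Ambassador or Envoy implement it. The `route` and `pick_pod` helpers, the Pod names, the use of a UUID as the cookie value, and the simple modulo hash are all assumptions made for the example.

```python
import hashlib
import uuid

PODS = ["pod-a", "pod-b", "pod-c"]  # hypothetical backend Pods


def pick_pod(key: str) -> str:
    """Map a hash key to a Pod (plain modulo here, not Envoy's ring hash)."""
    digest = hashlib.sha256(key.encode()).digest()
    return PODS[int.from_bytes(digest[:8], "big") % len(PODS)]


def route(cookies: dict, cookie_name: str = "sticky-cookie", ttl_seconds: int = 60):
    """Return (pod, set_cookie_header_or_None) for a single request."""
    value = cookies.get(cookie_name)
    set_cookie = None
    if value is None:
        # First request of a session: generate a value (assumed to be a UUID
        # here) and ask the client to send it back on later requests.
        value = uuid.uuid4().hex
        set_cookie = f"{cookie_name}={value}; Max-Age={ttl_seconds}"
    return pick_pod(value), set_cookie


if __name__ == "__main__":
    first_pod, header = route({})  # no cookie yet, so one is issued
    value = header.split(";")[0].split("=", 1)[1]
    second_pod, header2 = route({"sticky-cookie": value})
    print(first_pod == second_pod, header2)  # prints: True None
```

The same shape applies to the header and source IP options: only the place the hash key is read from changes.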
The Envoy load balancer policy documentation provides good guidance on the use cases for Ring Hash vs Maglev, but in a nutshell the building of the token lookup table and host selection times are faster with Maglev (especially when load balancing over a large number of hosts), with the penalty that it is less stable than ring hash: more hash keys move to a different host when hosts are removed from the cluster. The Envoy docs do make a note of calling out that if you are using Redis, then Maglev is very likely a superior drop-in replacement for ring hash, and this may be worth considering in the context of other distributed data stores or middleware.
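For readers who have not met consistent hashing before, here is a minimal ring hash sketch showing the "stability" property discussed above. It is a generic textbook version, not Envoy's implementation: the `HashRing` class, the SHA-256 hash, the 100 virtual nodes per host, and the Pod names are all assumptions made for the example.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Stable 64-bit hash of a string."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


class HashRing:
    """Textbook consistent-hash ring with virtual nodes."""

    def __init__(self, hosts, vnodes=100):
        # Each host is placed on the ring at `vnodes` pseudo-random points.
        self._ring = sorted(
            (_hash(f"{host}#{i}"), host) for host in hosts for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def pick(self, key: str) -> str:
        # Walk clockwise to the first point after the key, wrapping at the end.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]


def moved_keys(keys, before, after):
    """Keys whose host differs between two rings."""
    return [k for k in keys if before.pick(k) != after.pick(k)]


if __name__ == "__main__":
    hosts = ["pod-a", "pod-b", "pod-c"]  # hypothetical Pods
    keys = [f"session-{i}" for i in range(1000)]
    before = HashRing(hosts)
    after = HashRing(hosts[:2])  # pod-c removed from the cluster
    moved = moved_keys(keys, before, after)
    print(f"{len(moved)} of {len(keys)} sessions changed Pod")
    print("all moved sessions were on pod-c:",
          all(before.pick(k) == "pod-c" for k in moved))
```

On this ring, removing a host only relocates the sessions that were pinned to that host; everything else stays where it was. Maglev gives up some of that stability in exchange for faster table builds and lookups.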
Some Gotchas
A few Ambassador users have reached out to us about strange results they are seeing in load balancing benchmarks they have run. It is worth noting that nuances in the threading model of the underlying Envoy Proxy mean that traffic received under certain conditions may not be as evenly distributed as you would expect. In addition, as the Twilio team discovered, the multiplexing of HTTP/2 streams means that under certain TCP connection failure conditions you may see unexpected penalties in performance.
Learning More
The networking implementation within Kubernetes is more complex than it might first appear, and also somewhat more limited than many engineers understand. Matt Klein has put together a very informative blog post, “Introduction to modern network load balancing and proxying”, which provides a great foundation for understanding key concepts. There is also a series of additional posts that explain why organisations have chosen to use Layer 7 aware proxies to load balance ingress traffic, such as Bugsnag, Geckoboard, and Twilio.
We’re Keen to Hear Your Feedback
As this is early-access functionality we are keen to get feedback, answer any questions you may have, and also to listen to your comments.
As usual, you can also ask any questions you may have via Twitter (@getambassadorio) or Slack, or raise issues via GitHub.