Francisco Mejia 08/31/2018

Iterating Towards a More Scalable Ingress

Shopify, the leading cloud-based, multi-channel commerce platform, is growing at an incredibly fast pace. Since the beginning of 2016, the number of merchants on the platform increased from 375,000 to 600,000+. As the platform scales, we face new and exciting challenges such as implementing Shopify’s Pod architecture and future proofing our cloud storage usage. Shopify’s infrastructure relies heavily on Kubernetes to serve millions of requests every minute. An essential component of any Kubernetes cluster is its ingress, the first point of entry in a cluster that routes incoming requests to the corresponding services. The ingress controller implementation we adopted at the beginning of the year is ingress-nginx, an open source project.

Before ingress-nginx, we used Google Cloud Load Balancer Controller (glbc). We moved away from glbc because it underperformed for Shopify's workloads: we observed poor load balancing and request queueing, particularly during deployments. Shopify currently deploys around 40 times per day without scheduling downtime. At the time we identified these problems, glbc wasn’t endpoint aware while ingress-nginx was. Endpoint awareness lets the ingress implement its own load balancing instead of relying on the one Kubernetes Services provide through kube-proxy. These reasons, together with the NGINX expertise Shopify acquired through running and maintaining its NGINX (supercharged with Lua) edge load balancers, led the Edgescale team to migrate the ingress on our Kubernetes clusters from glbc to ingress-nginx.

Even though we now leverage endpoint awareness through ingress-nginx to enhance our load balancing solution, there are still additional performance issues that arise at our scale. The Edgescale team, which is in charge of architecting, building and maintaining Shopify’s edge infrastructure, began contributing optimizations to the ingress-nginx project to ensure it performs well at Shopify’s scale and as a way to give back to the ingress-nginx community. This post focuses on the dynamic configuration optimization we contributed to the project which allowed us to reduce the number of NGINX reloads throughout the day.

Now’s the perfect time to introduce myself 😎— my name is Francisco Mejia, and I’m a Production Engineering Intern on the Edgescale team. One of my major goals for this internship was to learn and become familiar with Kubernetes at scale, but little did I know that I would spend most of my internship contributing to a Kubernetes project!

One of the first performance bottlenecks we identified when using ingress-nginx was the high frequency of NGINX reloads during application deployments. Whenever application deployments occurred on the cluster, we observed increased latencies for end users, which led us to investigate and find a solution to this problem.

NGINX uses a configuration file to store the active endpoints for every service it routes traffic to. During deployments to our clusters, Pods running the older version are killed and replaced with Pods running the updated version. Each of these endpoint changes means the configuration file has to be regenerated and NGINX reloaded, so a single deployment may trigger multiple reloads as the controller receives updates for the endpoint changes. Any time NGINX reloads, it reads the configuration file into memory, starts new worker processes and signals the old worker processes to shut down gracefully.

Although NGINX reloads gracefully, reloads are still detrimental from a performance perspective. While the old worker processes shut down, memory consumption increases because old and new workers run side by side, and keepalive connections and load balancing state are reset. Clients that previously had open keepalive connections with the old worker processes now need to open new connections with the new worker processes. In addition, opening connections at a faster rate means that the server needs to allocate more resources to handle connection requests. We addressed this issue by introducing dynamic configuration to the ingress controller.

To reduce the number of NGINX reloads when deployments occur, we added the ability for ingress-nginx to update application endpoints by maintaining them in memory, eliminating the need to regenerate the configuration file and reload NGINX. We accomplished this by creating an HTTP endpoint inside NGINX using lua-nginx-module that receives endpoint configuration updates from the ingress controller and modifies an internal Lua shared dictionary that stores the endpoint configuration for all services. This mechanism enabled us to skip NGINX reloads during deployments and significantly improve request latencies, especially during deploys.
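To make the idea concrete, here is a minimal Python sketch of the decision the controller has to make on every change: reload NGINX, or push the new endpoints to the in-memory store. The `Backend` type and the `plan_update` function are hypothetical names invented for this illustration, and the rule used (reload only when the set of backends changes) is a simplifying assumption, not the actual ingress-nginx logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backend:
    """Hypothetical model: one service and the Pod endpoints behind it."""
    name: str
    endpoints: frozenset  # e.g. frozenset({"10.0.0.1:8080"})


def plan_update(old: dict, new: dict) -> str:
    """Decide how to apply a change from `old` to `new`.

    Both arguments map backend name -> Backend. Assumption for this
    sketch: if the set of backends is unchanged, only endpoints differ,
    so an in-memory update is enough and no reload is needed.
    """
    if old == new:
        return "noop"      # nothing changed
    if set(old) != set(new):
        return "reload"    # a backend was added or removed
    return "dynamic"       # only endpoints changed: skip the reload
```

With a rule like this, a rolling deployment that only swaps Pod IPs behind existing services always takes the `"dynamic"` path, which is the case the optimization targets.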

Here’s a more granular look at the general flow when we instruct the controller to dynamically configure endpoints:

  1. A Kubernetes resource is modified, created or deleted.
  2. The ingress controller sees the changes and sends a POST request to /configuration/backends containing the up-to-date list of endpoints for every service.
  3. NGINX receives the POST request to /configuration/backends, which is served by our Lua configuration module.
  4. The module handles the request by reading the list of endpoints for all services and updating a shared dictionary that keeps track of the endpoints for all backends.
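Steps 3 and 4 can be sketched as follows. The real module is written in Lua on top of lua-nginx-module; this Python stand-in only mirrors the flow. The JSON payload shape (a list of objects with `name` and `endpoints`), the class and function names, and the status codes are all assumptions made for the illustration.

```python
import json
import threading


class BackendStore:
    """Stand-in for the Lua shared dictionary: backend name -> endpoints."""

    def __init__(self):
        self._lock = threading.Lock()
        self._backends = {}

    def replace_all(self, backends):
        # Build the new mapping first so a malformed payload raises
        # before the current state is touched.
        updated = {b["name"]: list(b["endpoints"]) for b in backends}
        with self._lock:
            self._backends = updated

    def endpoints_for(self, name):
        with self._lock:
            return list(self._backends.get(name, []))


def handle_request(store, method, path, body):
    """Return an HTTP status code for a configuration request."""
    if path != "/configuration/backends":
        return 404
    if method != "POST":
        return 405
    try:
        store.replace_all(json.loads(body))
    except (ValueError, KeyError, TypeError):
        return 400  # invalid JSON or unexpected payload shape
    return 201
```

Because the store is replaced wholesale on every POST, the controller never has to compute a diff; it simply sends the full, current list of endpoints.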

My team carried out tests to compare the latency of requests between glbc and ingress-nginx with dynamic configuration enabled. The test consisted of the following:

  1. Find a request rate for the load generator where the average request latency is under 100ms when using glbc to access an endpoint.
  2. Use the same rate to generate load on an endpoint behind ingress-nginx and compare latencies, standard deviation and throughput.
  3. Repeat step 1, but this time carry out application deploys while load is being generated to endpoints.

The latencies were distributed as follows:

Latency by percentile distribution glbc vs dynamic

Up until the 99.9th percentile of request latencies both ingresses are very similar, but from the 99.99th percentile onwards, ingress-nginx outperforms glbc by multiple orders of magnitude. It’s vital to minimize request latency as much as possible because it directly impacts merchants' success.
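For readers less familiar with tail percentiles, the sketch below shows how a handful of slow requests can be invisible at the 99.9th percentile and still dominate the 99.99th. It uses the nearest-rank definition of a percentile, and the latencies are synthetic values made up for the example, not measurements from our tests.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))
    return ordered[rank - 1]


# Synthetic latencies in milliseconds (NOT real test data):
# 9,998 fast requests and 2 very slow ones.
latencies_ms = [20] * 9998 + [5000] * 2

for p in (50, 99, 99.9, 99.99):
    print(f"p{p}: {percentile(latencies_ms, p)} ms")
```

In this made-up data set the two slow requests only show up at the 99.99th percentile, which is why we look that far into the tail when comparing ingresses.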

We also compared the request latencies when running the ingress controller with and without dynamic configuration. The results were the following:

Latency by percentile distribution - Dynamic configuration enabled vs disabled

From the graph, we can see that, outside of deploys, the 99th percentile of latencies with dynamic configuration is comparable to that of the vanilla ingress controller.

We also carried out the previous test, but this time during application deploys - here’s where we really get to see the impact of the dynamic configuration feature. The results are depicted below:

Latency by percentile distribution deploys - dynamic vs vanilla

The graph shows that, above the 80th percentile, ingress-nginx with dynamic configuration delivers much lower latencies during deploys than the vanilla controller.

When operating at Shopify’s scale, a whole new world of engineering challenges and opportunities arises. Together with my teammates, I have the opportunity to find creative ways to solve optimization problems involving both Kubernetes and NGINX. We contributed our NGINX expertise to the ingress-nginx project and will continue doing so. The contribution explained throughout this post wouldn’t have been possible without the support of the ingress-nginx community. Massive kudos to them 🎉! Keep an eye out for more ingress-nginx updates on its GitHub page!