What's inside the Nginx Ingress Controller?

by Indrek Ots

When working with applications in Kubernetes, there’s a high chance that you’ve come across the Nginx Ingress Controller. But what exactly makes it Nginx? From the application developer’s perspective, we usually don’t think about the implementation of the Ingress Controller. We use the Ingress API to define the virtual host and declare routes to services. In this post, we’ll explore the inner workings of the Nginx Ingress Controller and show you that it’s essentially an Nginx reverse proxy running in a container.

Where Can I Find Nginx?

To answer the question, let’s take out our trusty old kubectl and pry open an Nginx Ingress Controller.

$ kubectl get pods -n ingress-nginx

NAME                                        READY   STATUS      RESTARTS   AGE
ingress-nginx-controller-7dcdbcff84-bf6sn   1/1     Running     0          40s
ingress-nginx-controller-7dcdbcff84-mvkkj   1/1     Running     0          2m12s

Let’s exec into a pod and explore.

$ kubectl exec --stdin --tty -n ingress-nginx ingress-nginx-controller-7dcdbcff84-bf6sn -- /bin/bash

At /etc/nginx we can find a file called nginx.conf. The file name should be familiar if you’ve ever set up an Nginx reverse proxy. This is the main configuration file for the Nginx web server.
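
If you’d rather not open an interactive shell, you can also dump the file in one step from outside the pod (using the pod name from above; substitute your own):

$ kubectl exec -n ingress-nginx ingress-nginx-controller-7dcdbcff84-bf6sn -- cat /etc/nginx/nginx.conf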

The default configuration file is pretty dense. Feel free to look inside, but the details of the configuration aren’t relevant in the context of this post. The following is a rough outline.

http {
  
    # Server configuration
    server {
        listen 80;
        server_name localhost;
        location / {
        }
    }

    server {
        location / {
        }
    }
}
  • The http block configures the HTTP server
  • Each server block represents a virtual server that listens for HTTP requests on a specific host name and port
  • The location block is used to define how the server should handle different URLs within a given server block

In the grand scheme of things, the configuration’s structure is fairly similar to the ingress resource rules where we define hosts and paths. Let’s dig deeper and see what happens when we create a new ingress.

Create a New Ingress

To better understand how the Nginx Ingress Controller works, let’s create a new ingress. The following configuration creates a Kubernetes Deployment, a Service, and an Ingress for echo-server, a simple HTTP server that echoes information about the incoming HTTP request back to the client.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: echo-server
spec:
  replicas: 3
  selector:
    matchLabels:
      app: echo-server
  template:
    metadata:
      labels:
        app: echo-server
    spec:
      containers:
        - name: echo-server
          image: jmalloc/echo-server
          ports:
            - name: http-port
              containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: echo-service
spec:
  ports:
    - name: http-port
      port: 80
      targetPort: http-port
      protocol: TCP
  selector:
    app: echo-server
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: echo-ingress
spec:
  ingressClassName: nginx
  rules:
    - host: foo.com
      http:
        paths:
        - path: /
          pathType: Prefix
          backend:
            service:
              name: echo-service
              port:
                name: http-port
        - path: /bar
          pathType: Prefix
          backend:
            service:
              name: echo-service
              port:
                name: http-port

Have a look at the ingress configuration. Take note of the host foo.com and the two paths.

Assuming the entire configuration is in a file named echo-server.yml, let’s create the Kubernetes resources using kubectl.

$ kubectl apply -f echo-server.yml
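
To sanity-check the routing, we can send a request through the controller with a foo.com Host header. A sketch, assuming the ingress-nginx service is exposed via a LoadBalancer with an external IP (the pod name in the response will differ):

$ INGRESS_IP=$(kubectl get service -n ingress-nginx ingress-nginx-controller -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
$ curl -H "Host: foo.com" http://$INGRESS_IP/bar
Request served by echo-server-7d9f8c6b5d-x7k2p
...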

When we now look at the nginx.conf file, a new server block has appeared for foo.com.

## start server foo.com
server {
	server_name foo.com ;
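
A quick way to locate it is to grep for the marker comments that the controller generates (run inside the controller pod):

$ grep -n "start server foo.com" /etc/nginx/nginx.conf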

The foo.com server block has three location blocks defined.

location /bar/ {
  ...

location = /bar {
  ...

location / {
  ...

Based on this observation, we can conclude that whenever the relevant objects in the Kubernetes cluster change, the controller updates the nginx.conf file accordingly.
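
We can check the claim in reverse as well: delete the ingress, and the server block disappears from the configuration. A quick sketch (the grep inside the controller pod prints nothing once the block is gone; re-apply echo-server.yml afterwards to follow along with the rest of the post):

$ kubectl delete ingress echo-ingress
$ kubectl exec -n ingress-nginx ingress-nginx-controller-7dcdbcff84-bf6sn -- grep "start server foo.com" /etc/nginx/nginx.conf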

Proxy Upstream

In the context of a Kubernetes ingress, Nginx acts as a reverse proxy. It accepts HTTP connections and proxies them to upstream backends, which in our case are pods. Let’s look at the nginx.conf again and see where the upstream backends are defined.

In the foo.com server location blocks, we can see the following.

proxy_pass http://upstream_balancer;

And if we look at the upstream block, it’s defined as follows.

upstream upstream_balancer {
	server 0.0.0.1; # placeholder

	balancer_by_lua_block {
		balancer.balance()
	}
}

Usually, when defining Nginx upstream servers manually, we list all the servers inside the upstream block. The Nginx Ingress Controller doesn’t do that. If it did, the configuration would have to be reloaded every time a pod was added or removed. Instead, it uses the lua-nginx-module to keep the list of upstream endpoints up to date in memory, without rewriting nginx.conf. The ingress-nginx documentation explains why:

In a relatively big cluster with frequently deploying apps this feature [lua-nginx-module] saves significant number of Nginx reloads which can otherwise affect response latency, load balancing quality (after every reload Nginx resets the state of load balancing) and so on.
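
We can see this in action by scaling the deployment and confirming that the configuration file doesn’t change. A minimal sketch, comparing the checksum of nginx.conf before and after scaling (it stays the same, because only the in-memory endpoint list is updated):

$ kubectl exec -n ingress-nginx ingress-nginx-controller-7dcdbcff84-bf6sn -- md5sum /etc/nginx/nginx.conf
$ kubectl scale deployment echo-server --replicas=5
$ kubectl exec -n ingress-nginx ingress-nginx-controller-7dcdbcff84-bf6sn -- md5sum /etc/nginx/nginx.conf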

The Nginx Ingress Controller ships with a binary called dbg in the container’s root directory, which we can use to list the currently configured upstream backends.

$ /dbg backends all

The command outputs a JSON array containing all currently configured backends. In our case, the one we’re interested in is echo-service. The following output is considerably truncated; in reality, the array has more entries, and the objects have more attributes that aren’t significant in the context of this post.

[
  {
    "name": "default-echo-service-http-port",
    "endpoints": [
      {
        "address": "10.244.1.60",
        "port": "8080"
      },
      {
        "address": "10.244.0.129",
        "port": "8080"
      },
      {
        "address": "10.244.0.69",
        "port": "8080"
      }
    ]
  }
]

The list of endpoints in the dbg output matches the Kubernetes endpoints of echo-service.

$ kubectl get endpoints echo-service
NAME           ENDPOINTS                                             AGE
echo-service   10.244.0.129:8080,10.244.0.69:8080,10.244.1.60:8080   15m

Essentially, these are the IP addresses of the echo-server pods.
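
We can confirm this by listing the pods together with their IPs; the IP column shows the same three addresses:

$ kubectl get pods -l app=echo-server -o wide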

How nginx.conf Gets Updated

In addition to the Nginx binary, the Nginx Ingress Controller container has a process called nginx-ingress-controller.

$ ps -a
PID   USER     TIME  COMMAND
    1 www-data  0:00 /usr/bin/dumb-init -- /nginx-ingress-controller --publish-service=ingress-nginx/ingress-nginx-controller --election-id=ingress-nginx-leader --controller-class=k8s.io/ingress-nginx --ingress-cla
    7 www-data  0:00 /nginx-ingress-controller --publish-service=ingress-nginx/ingress-nginx-controller --election-id=ingress-nginx-leader --controller-class=k8s.io/ingress-nginx --ingress-class=nginx --configmap=i
   19 www-data  0:00 nginx: master process /usr/bin/nginx -c /etc/nginx/nginx.conf
   25 www-data  0:00 nginx: worker process
   26 www-data  0:00 nginx: cache manager process

It’s a Kubernetes controller that talks to the Kubernetes API server and watches for changes in the state of the cluster. When resources such as Ingresses, Services, Endpoints, Secrets or ConfigMaps are updated, nginx-ingress-controller regenerates the Nginx configuration file and reloads Nginx if needed. As we saw earlier, endpoint-only changes are handed to the Lua balancer and don’t require a reload at all.
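
We can observe the reloads in the controller logs; look for messages along the lines of “Backend successfully reloaded” (the exact wording varies between versions):

$ kubectl logs -n ingress-nginx ingress-nginx-controller-7dcdbcff84-bf6sn | grep -i reload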

Summary

In the book Fundamentals of Software Architecture, Mark Richards and Neal Ford define the first law of software architecture: “Everything in software architecture is a trade-off”. To assess the trade-offs properly, it helps to have a deeper understanding of the components we’re working with and not treat them as black boxes. Knowing how things work under the hood is a superpower.

When it comes to the Nginx Ingress Controller, we’ve learned that it’s essentially an Nginx reverse proxy running in a container with some extra functionality that keeps the configuration in sync with the state of the Kubernetes cluster.