Enabling Ingress on Airflow with GKE

12 Aug 2021

It is rather simple to create an Ingress resource for your Airflow webserver deployment and there are three ways that I know of to do it:

  • plain Kubernetes manifest
  • as a Helm chart value
  • as a provisioned Terraform resource

Airflow Timeout > GCP Interval == Value Error

There is a gotcha that needs to be addressed if you are using the Helm chart: either lower webserver.readinessProbe.timeoutSeconds from its default of 30 to 5, or raise the check interval of the GCP health check from 5 to 30.

The default checkIntervalSec value for a health check on GCP is only 5 seconds, while the default readiness probe timeout in the Airflow Helm chart is 30. The health check picks up that 30-second timeout, which is longer than its own interval, so GCP rejects it with a value error (see note 1). That error prevents your Ingress (and subsequently, the GCP load balancer) from being created.
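
As a minimal sketch of the first option, the override below lowers the probe timeout in a Helm values file. The key path is the setting named above; the value of 5 simply matches the default GCP check interval, so adjust it if your health check uses a different interval.

```yaml
# values.yaml override (sketch): keep the readiness probe timeout
# no higher than the 5-second GCP health check interval
webserver:
  readinessProbe:
    timeoutSeconds: 5
```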

ManagedCertificate for Ingress

GCP has ManagedCertificates to make setting up the GCP external load balancer (which will be automagically created from your Kubernetes Ingress resource) even easier.

apiVersion: networking.gke.io/v1beta1
kind: ManagedCertificate
metadata:
  name: airflow-certificate  # must match the networking.gke.io/managed-certificates annotation on the Ingress
  namespace: airflow
spec:
  domains:
    - airflow.hostyhost.com

Make sure that the ManagedCertificate is created before the Ingress or the Ingress will fail to be created.

Plain Kubernetes Manifest

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: airflow-ingress
  namespace: airflow
  annotations:
    kubernetes.io/ingress.global-static-ip-name: 'my-unique-global-loadbalancer-ip'
    networking.gke.io/managed-certificates: 'airflow-certificate'
    kubernetes.io/ingress.allow-http: 'false'
spec:
  rules:
  - host: airflow.hostyhost.com  # should be a domain listed in the ManagedCertificate
    http:
      paths:
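      - path: /*
        pathType: ImplementationSpecific  # required in networking.k8s.io/v1; GKE needs this type for wildcard paths
        backend:
          service:
            name: airflow-webserver
            port:
              name: airflow-ui  # based on an Airflow release deployed with Helm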

Helm Chart Value

There’s a handy parameter in the official Airflow Helm chart that creates an Ingress resource for you.

ingress:
  enabled: true
  web:
    annotations:
      kubernetes.io/ingress.global-static-ip-name: 'my-unique-global-loadbalancer-ip'
      networking.gke.io/managed-certificates: 'airflow-certificate'
      kubernetes.io/ingress.allow-http: 'false'
    host: 'airflow.hostyhost.com'
    path: '/*'

I don’t configure tls here since I am using a GCP ManagedCertificate for SSL.

Terraform

Make sure that the kubernetes provider is enabled in your Terraform project before defining the following resource.

resource "kubernetes_ingress" "airflow-ingress" {
  metadata {
    name      = "airflow-ingress"
    namespace = "airflow"

    annotations = {
      "kubernetes.io/ingress.global-static-ip-name" = google_compute_global_address.loadbalancer.name
      "networking.gke.io/managed-certificates"      = "airflow-certificate"
      "kubernetes.io/ingress.allow-http"            = "false"
    }
  }

  spec {
    rule {
      host = "airflow.hostyhost.com"

      http {
        path {
          backend {
            service_name = "airflow-webserver"
            service_port = 8080
          }

          path = "/*"
        }
      }
    }
  }
}

  1. The error looks like: Error during sync: error running backend syncing routine: googleapi: Error 400: Invalid value for field 'resource.timeoutSec': '30'. TimeoutSec should be less than checkIntervalSec., invalid