Deploying Airflow 2 on EKS using Terraform, Helm and ArgoCD — Part 2/2

airflow 2 deployment

This is part 2/2 of the article. If you missed the first part, please check it here.

In this article we are going to deploy Airflow on the EKS cluster we created before. Instead of deploying it manually, we will use the ArgoCD application we also installed on the Kubernetes cluster.

Project Structure

kubernetes folder

Let’s take a look at a folder we haven’t talked about in part 1 of this article.

  • terraform: the Terraform configuration project we talked about in part 1 of this article
  • kubernetes: here we have the Helm projects of the applications we want to deploy. In this case, apps/airflow
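Based on the paths used later in this article, the repository layout looks roughly like this (exact nesting may differ in your own repo):

```
infrastructure/
├── terraform/            # Terraform configuration from part 1
└── kubernetes/
    └── apps/
        └── airflow/      # Airflow Helm chart and values.yaml
```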

Airflow Deployment

As mentioned before, we are going to use the Airflow Helm chart. We can use the project directly from artifacthub.io or download the source and customize it as we want. We are going to deploy using the latter option.

  1. Go to the kubernetes/apps/airflow folder
  2. We will use this Helm chart: https://artifacthub.io/packages/helm/airflow-helm/airflow
  3. In order to download it, run the commands below (the --untar flag extracts the chart into a folder instead of leaving it as a .tgz archive):
helm repo add airflow-helm https://airflow-helm.github.io/charts
helm pull airflow-helm/airflow --version 8.5.2 --untar

4. Once you have downloaded the Helm chart, your folder will look like below:

airflow helm folder

This path in our repository will be used in the next steps to deploy Airflow in Kubernetes.

Helm Chart Configuration

There are several customizations we can make since we have the source Helm chart project in our repository.

For most scenarios, the default Helm chart is good enough. The only file we need to change is values.yaml.

It allows us to configure Apache Airflow as we want. Even though Airflow configuration is a topic for another entire article, let’s take a look at some of the configurations I’ve changed for this tutorial:

  • executor: we will use KubernetesExecutor;
  • fernetKey: this is a critical security value. In this example, the raw value is exposed. Don’t do that in a production environment. Please consider using AWS Secrets Manager, Kubernetes secrets or any other way you prefer to hide this value;
  • postgresql.enabled: false (we are setting this to false because we want to use the RDS instance we created before)
  • redis.enabled: false
  • externalDatabase: we need to set up the RDS instance we created before. Remember we set up a Kubernetes secret for this
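Putting the bullet points above together, the relevant part of values.yaml looks roughly like this. The host and secret names below are placeholders — substitute your own RDS endpoint and the Kubernetes secret created in part 1, and never commit a real fernetKey:

```yaml
# values.yaml (excerpt) — key names follow the airflow-helm community chart
airflow:
  executor: KubernetesExecutor
  fernetKey: "REPLACE_ME"     # placeholder; load from a secret in production

postgresql:
  enabled: false              # use the RDS instance from part 1 instead

redis:
  enabled: false              # not needed with KubernetesExecutor

externalDatabase:
  type: postgres
  host: your-rds-endpoint.rds.amazonaws.com   # placeholder
  port: 5432
  database: airflow
  user: airflow
  passwordSecret: airflow-db-secret           # placeholder secret name
  passwordSecretKey: postgresql-password      # placeholder key in that secret
```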

There is a lot more you can configure, like e-mail, SSL, gitSync and more. For more details about the configuration, please check this link.

ArgoCD Configuration

It’s finally time to configure ArgoCD and set up the Airflow Helm chart to be deployed on the Kubernetes cluster.

For the next steps, I will assume you have the cluster up and running with all the configuration made in part 1 of this article.

  1. Login into ArgoCD Web UI (http://localhost:8001/api/v1/namespaces/argocd/services/https:argocd-server:443/proxy/)
  2. On the left bar, click on “Manage your repositories, projects, settings”
  3. You will see a page like below. Click on “Repositories”
Argo CD Configuration

4. We are going to configure our repository using SSH

5. We need to generate SSH keys and configure our repository as well. Let’s follow the GitHub documentation (the commands below are for Linux):

Generating ssh-keygen
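For reference, the key generation step can look like the following. The key file name id_rsa_argo matches the paths used below, and -N "" creates a key without a passphrase (ArgoCD generally expects a passphrase-less key):

```shell
# create the .ssh directory if it does not exist yet
mkdir -p "$HOME/.ssh"
# generate a dedicated RSA key pair for ArgoCD, without a passphrase
ssh-keygen -t rsa -b 4096 -C "argocd" -f "$HOME/.ssh/id_rsa_argo" -N ""
```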

6. Get the public key with the following command:

cat /home/your-user/.ssh/id_rsa_argo.pub

7. Go to your GitHub settings, under “SSH and GPG Keys”

8. Click on “New SSH Key” and add the value you got in step 6

adding new key

9. Go back to ArgoCD and click on “CONNECT REPO USING SSH”

10. Add a name: Apache Airflow

11. Select a project: default

12. Repository URL: paste your SSH git path (mine is git@github.com:vitorcarra/airflow-kubernetes-iac.git)

13. Paste the private SSH key you have generated before

cat /home/your-user/.ssh/id_rsa_argo

14. It should look like below:

adding new git

15. Click on “Connect”

Repo already configured

16. On left menu click on “Manage your applications”

17. Click on new app

18. Fill in the configuration as below:

  • Application Name: airflow2
  • Project: default
  • Sync Policy: Automatic
  • Check Prune Resources
  • Prune Propagation Policy: foreground
  • Source: select the repository we have configured before
  • Revision: HEAD
  • Path: infrastructure/kubernetes/apps/airflow/
  • Destination: select the kubernetes cluster (you will see only one)
  • Namespace: airflow
adding new app
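If you prefer a declarative setup, the same application can be expressed as an ArgoCD Application manifest — a sketch equivalent to the form fields above (adjust repoURL to your own repository; the in-cluster destination server address is the ArgoCD default):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: airflow2
  namespace: argocd
spec:
  project: default
  source:
    repoURL: git@github.com:vitorcarra/airflow-kubernetes-iac.git
    targetRevision: HEAD
    path: infrastructure/kubernetes/apps/airflow/
  destination:
    server: https://kubernetes.default.svc   # the cluster ArgoCD runs in
    namespace: airflow
  syncPolicy:
    automated:
      prune: true                            # "Prune Resources" checkbox
    syncOptions:
      - PrunePropagationPolicy=foreground
```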

19. Click on “Create” button

The next screen you will see should look like below:

Applications home screen

20. Click on the “airflow2” application and you will see a screen like below:

airflow 2 deployment in progress

21. When the app health shows “Healthy”, you can follow the next steps and access the Airflow Web UI

Access Airflow Web UI

Now we have Airflow up and running in our AWS EKS cluster. It’s time to see the Airflow Web UI.

  1. Open a terminal and run the command below. It will map the Airflow instance running on Kubernetes to your localhost:8080
kubectl port-forward svc/airflow2-web 8080:8080 -n airflow

2. Open a web browser and access the URL below:

http://localhost:8080
airflow login page

3. Username: admin

4. Password: admin

5. Now you have Airflow 2.x running on EKS and ready to be used

airflow home screen

Summary

By the end of this article series, you have learned how to deploy a Kubernetes cluster on AWS EKS using Terraform, how to deploy ArgoCD on EKS using Helm and Terraform, and how to deploy Apache Airflow 2.x on EKS using ArgoCD.

There is much more configuration to be done on Airflow and ArgoCD, but I hope you now have a sense of how to integrate all these amazing tools.

Below you can find great links that can help you with next steps:

Attention

If you leave this infrastructure up and running, you will be billed for the time it is kept up. So please make sure you destroy everything when you are done studying.

terraform destroy -var-file terraform.tfvars

--

Vitor Carra

I’m a Data Engineer and guitar player.