Deploying Airflow 2 on EKS using Terraform, Helm and ArgoCD — Part 2/2
Learn how to deploy Apache Airflow 2.x on Kubernetes using ArgoCD, git-sync, Terraform and Helm charts.
This is part 2/2 of the article. If you missed the first part, please check it here.
In this article we are going to deploy Airflow on the EKS cluster we created before. Instead of deploying it manually, we will use the ArgoCD instance we also installed on the Kubernetes cluster.
Let’s take a look at a folder we didn’t talk about in part 1 of this article.
- terraform: it’s the Terraform configuration project we talked about in part 1 of this article
- kubernetes: here we have the Helm projects of the applications we want to deploy. In this case, apps/airflow
As mentioned before, we are going to use the Airflow Helm chart. We can use the chart directly from artifacthub.io or download the source and customize it as we want. We are going to deploy using the latter option.
1. Go to the kubernetes/apps/airflow folder
2. We will use this Helm chart: https://artifacthub.io/packages/helm/airflow-helm/airflow
3. To download it, run the commands below:
helm repo add airflow-helm https://airflow-helm.github.io/charts
helm pull airflow-helm/airflow --version 8.5.2
4. Once you have downloaded the Helm project your folder will look like below:
This path in our repository will be used in the next steps to deploy Airflow on Kubernetes.
Helm Chart Configuration
There are several customizations we can make, since we have the source of the Helm chart in our repository.
For most scenarios, the default Helm chart is good enough. The only file we need to change is values.yaml.
It allows us to configure Apache Airflow as we want. Even though Airflow configuration is a topic for another entire article, let’s take a look at some of the settings I’ve changed for this tutorial:
- executor: we will use KubernetesExecutor;
- fernetKey: it’s a critical security value. In this example, the raw value is exposed. Don’t do that in a production environment. Please consider using AWS Secrets Manager, Kubernetes secrets or any other way you prefer to hide this value;
- postgresql: enabled: false (we are setting it to false because we want to use the RDS instance we created before);
- redis: enabled: false;
- externalDatabase: we need to point it at the RDS instance we created before. Remember we have set up a Kubernetes secret for this.
There is a lot more you can configure, like e-mail, SSL, gitSync and more. For more details about the configuration, please check this link.
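To make the settings listed above more concrete, here is a minimal sketch of how they might look in values.yaml, written through a shell heredoc so it can be pasted into a terminal. The key names follow the community chart (airflow-helm/airflow 8.x) as far as I know them. The RDS endpoint, database name, user, the secret name `airflow-rds-secret`, its key `password` and the output file name are all placeholders assumed for illustration, so check them against the chart's own values.yaml and the secret created in part 1.

```shell
# Write an illustrative values fragment to a scratch file (hypothetical file name).
VALUES_SKETCH="${VALUES_SKETCH:-./values-sketch.yaml}"

cat > "$VALUES_SKETCH" <<'EOF'
airflow:
  executor: KubernetesExecutor
  # Placeholder only: never commit a real Fernet key to git
  fernetKey: "<your-fernet-key>"

# Celery workers and Flower are not used with KubernetesExecutor
# (assumed key names; verify them in the chart's values.yaml)
workers:
  enabled: false
flower:
  enabled: false

# Disable the bundled PostgreSQL; the RDS instance from part 1 is used instead
postgresql:
  enabled: false

# Redis is only needed by Celery-based executors
redis:
  enabled: false

externalDatabase:
  type: postgres
  host: "<your-rds-endpoint>"            # placeholder
  port: 5432
  database: "<your-database>"            # placeholder
  user: "<your-user>"                    # placeholder
  passwordSecret: "airflow-rds-secret"   # hypothetical secret name
  passwordSecretKey: "password"          # hypothetical key inside that secret
EOF

echo "Wrote $VALUES_SKETCH"
```

The fragment is only a starting point to compare against the values.yaml that ships with the chart; it is not meant to replace that file as-is.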
It’s finally time to configure ArgoCD and set up the Airflow Helm chart to be deployed on the Kubernetes cluster.
For the next steps, I will assume you have the cluster up and running with all the configuration made in part 1 of this article.
1. Log in to the ArgoCD Web UI (http://localhost:8001/api/v1/namespaces/argocd/services/https:argocd-server:443/proxy/)
2. On the left bar, click on “Manage your repositories, projects, settings”
3. You will see a page like below. Click on “Repositories”
4. We are going to configure our repository using SSH
5. We need to generate SSH keys and configure our repository as well. Let’s follow the GitHub documentation (commands below are related to Linux):
6. Get the public key with the following command:
7. Go to your GitHub settings on “SSH and GPG Keys”
8. Click on “New SSH Key” and add the value you got on step 6
9. Go back to ArgoCD and click on “CONNECT REPO USING SSH”
10. Add a name: Apache Airflow
11. Select a project: default
12. Repository URL: paste your SSH git path (mine is -> git@github.com:vitorcarra/airflow-kubernetes-iac.git)
13. Paste the private SSH key you have generated before
14. It should look like below:
15. Click on “Connect”
16. On the left menu click on “Manage your applications”
17. Click on “New App”
18. Fill the configuration as below:
- Application Name: airflow2
- Project: default
- Sync Policy: Automatic
- Check Prune Resources
- Prune Propagation Policy: foreground
- Source: select the repository we have configured before
- Revision: HEAD
- Path: infrastructure/kubernetes/apps/airflow/
- Destination: select the kubernetes cluster (you will see only one)
- Namespace: airflow
19. Click on “Create” button
The next screen you will see should look like below:
20. Click on the “airflow2” application and you will see a screen like below:
21. When you see app health as “Healthy” you can follow the next steps and access Airflow Web UI
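For readers who prefer a terminal, the steps above can be sketched roughly as follows. This is an approximation under stated assumptions, not the exact procedure used in this article: it assumes the `argocd` CLI is installed and already logged in to your ArgoCD server, the key path and the key comment are hypothetical, and `REPO_URL` is a placeholder for your own repository. The prune propagation policy is left at its default here. By default the script only prints the `argocd` commands; set `RUN=1` to actually execute them.

```shell
# Hypothetical key path; adjust to your own conventions.
KEY_FILE="${KEY_FILE:-$HOME/.ssh/argocd_airflow_ed25519}"
# Placeholder: replace with the SSH URL of your own repository.
REPO_URL="${REPO_URL:-git@github.com:<user>/<repo>.git}"

# Steps 5-6: generate a key pair without passphrase (only if it does not
# exist yet) and print the public key to paste into GitHub.
mkdir -p "$(dirname "$KEY_FILE")"
[ -f "$KEY_FILE" ] || ssh-keygen -t ed25519 -C "argocd-airflow" -f "$KEY_FILE" -N "" -q
cat "$KEY_FILE.pub"

# run: print a command, and execute it only when RUN=1 (dry run by default).
run() {
  echo "+ $*"
  if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

# Steps 9-15: register the repository in ArgoCD using the private key.
run argocd repo add "$REPO_URL" --ssh-private-key-path "$KEY_FILE"

# Steps 16-19: create the application with automatic sync and pruning.
run argocd app create airflow2 \
  --project default \
  --repo "$REPO_URL" \
  --revision HEAD \
  --path infrastructure/kubernetes/apps/airflow/ \
  --dest-server https://kubernetes.default.svc \
  --dest-namespace airflow \
  --sync-policy automated \
  --auto-prune

# Step 21: block until the application reports a healthy state.
run argocd app wait airflow2 --health
```

Keeping the dry run as the default makes it easy to review the exact commands before anything is changed on the cluster.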
Access Airflow Web UI
Now we have Airflow up and running in our AWS EKS cluster. It’s time to see Airflow web UI.
1. Open a terminal and run the command below. It will map the Airflow web server running on Kubernetes to your localhost:8080
kubectl port-forward svc/airflow2-web 8080:8080 -n airflow
2. Open a web browser and access the URL below: http://localhost:8080
3. Username: admin
4. Password: admin
5. Now you have an Airflow 2.x instance running on EKS and ready to be used
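As an optional sanity check while the port-forward is running, the Airflow webserver exposes a `/health` endpoint that reports the status of the metadatabase and the scheduler. The sketch below assumes `curl` is installed and that the port-forward from step 1 is still active; the function name `check_airflow_health` is hypothetical.

```shell
# check_airflow_health: hypothetical helper that queries the webserver
# health endpoint and returns non-zero when it cannot be reached.
check_airflow_health() {
  url="${1:-http://localhost:8080/health}"
  if curl --fail --silent --show-error --max-time 5 "$url"; then
    echo
    echo "Airflow webserver answered at $url"
  else
    echo "Airflow webserver not reachable at $url (is the port-forward running?)" >&2
    return 1
  fi
}
```

Calling `check_airflow_health` with no arguments uses the localhost:8080 mapping from this article; pass another URL as the first argument if you forwarded a different port.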
At the end of this article you have learned how to deploy a Kubernetes Cluster on AWS EKS using Terraform, how to deploy ArgoCD on EKS using Helm and Terraform and how to deploy Apache Airflow 2.x on EKS using ArgoCD.
There is much more to configure in Airflow and ArgoCD, but I hope you now have a sense of how to integrate all these amazing tools.
Below you can find great links that can help you with next steps:
If you leave this infrastructure up and running, you will be billed for the time it is kept up. So please make sure you destroy everything when you are done studying.
terraform destroy -var-file terraform.tfvars