aws-ecs-airflow

Open source MIT HCL

About aws-ecs-airflow

Run Airflow in AWS ECS (Elastic Container Service) using Fargate tasks

airflow-ecs

Setup to run Airflow in AWS ECS containers

Requirements

Local

  • Docker

AWS

  • AWS IAM User for the infrastructure deployment, with admin permissions
  • awscli, installed by running pip install awscli
  • terraform >= 0.13
  • set up your IAM User credentials in ~/.aws/credentials
  • set these env variables in your .zshrc or .bashrc, or in the terminal session that you are going to use
    export AWS_ACCOUNT=your_account_id
    export AWS_DEFAULT_REGION=us-east-1 # the default region; it must also be set in infrastructure/config.tf
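
A small guard can catch a missing variable before Terraform or the scripts run. The following is a minimal sketch, not part of the repository: `check_aws_env` is a hypothetical helper name, and it only looks at the two variables listed above.

```shell
# Hypothetical helper (not in the repository): verify that the variables
# from the Requirements list are set and non-empty.
check_aws_env() {
    missing=""
    for name in AWS_ACCOUNT AWS_DEFAULT_REGION; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            missing="$missing $name"
        fi
    done
    if [ -n "$missing" ]; then
        echo "Missing environment variables:$missing" >&2
        return 1
    fi
    echo "AWS environment looks complete"
}
```

Calling `check_aws_env || exit 1` at the top of a wrapper script would stop early with a readable message instead of a later AWS error.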
    

Local Development

  • Generate a Fernet Key:

    pip install cryptography
    export AIRFLOW_FERNET_KEY=$(python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())")
    
    More about Fernet keys [here](https://cryptography.io/en/latest/fernet/)
  • Start Airflow locally by running:

    docker-compose up --build
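
Before starting the containers it can help to confirm that AIRFLOW_FERNET_KEY has the expected shape. A Fernet key is 32 random bytes in URL-safe base64, which always gives 44 characters. The sketch below checks only length and alphabet; `is_fernet_key` is a hypothetical helper, and passing this check does not prove the key matches the one used to encrypt existing data.

```shell
# Hypothetical helper (not in the repository): shape check for a Fernet key.
is_fernet_key() {
    key=$1
    # 32 bytes encoded as URL-safe base64 (with padding) are 44 characters.
    [ "${#key}" -eq 44 ] || return 1
    # Reject any character outside the URL-safe base64 alphabet.
    case $key in
        *[!A-Za-z0-9_=-]*) return 1 ;;
    esac
    return 0
}
```

For example, `is_fernet_key "$AIRFLOW_FERNET_KEY" || echo "AIRFLOW_FERNET_KEY looks wrong"` can be run in the same terminal session as the export above.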
    

If everything runs correctly you can reach Airflow by navigating to localhost:8080. The current setup is based on Celery workers. You can monitor how many workers are currently active using Flower, at localhost:5555.
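
The containers can take a while to come up, so a polling loop is one way to check both endpoints from a terminal. This is a sketch that assumes curl is installed; `wait_for_http` is a hypothetical helper, and the retry count and 2-second delay are arbitrary choices.

```shell
# Hypothetical helper (not in the repository): poll a URL until it answers
# with a non-error HTTP status, or give up after a number of attempts.
wait_for_http() {
    url=$1
    attempts=${2:-30}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # --fail makes curl return non-zero on HTTP errors (status >= 400).
        if curl --silent --fail --max-time 5 --output /dev/null "$url"; then
            echo "up: $url"
            return 0
        fi
        i=$((i + 1))
        sleep 2
    done
    echo "not reachable: $url" >&2
    return 1
}
```

With the ports mentioned above, the calls would be `wait_for_http http://localhost:8080` for the Airflow UI and `wait_for_http http://localhost:5555` for Flower.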

Deploy Airflow on AWS ECS

To run Airflow in AWS we will use ECS (Elastic Container Service).

Deploy Infrastructure using Terraform

Run the following commands:

make infra-init
make infra-plan
make infra-apply

or alternatively

cd infrastructure
terraform get
terraform init -upgrade
terraform plan
terraform apply

By default the infrastructure is deployed in us-east-1.

Once the infrastructure is provisioned (the RDS metadata DB will take a while), check that the ECR repository has been created, then run:

bash scripts/push_to_ecr.sh airflow-dev

By default the repo name created with terraform is `airflow-dev`. Without this command the ECS services will fail to fetch the `latest` image from ECR.
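
For orientation, images in ECR are addressed as `<account>.dkr.ecr.<region>.amazonaws.com/<repo>:<tag>`. The sketch below builds that name from the environment variables set in the Requirements section; `ecr_image_uri` is a hypothetical helper, and whether scripts/push_to_ecr.sh builds the name the same way is an assumption.

```shell
# Hypothetical helper (not in the repository): build the full ECR image name.
# Arguments: account id, region, repository name, optional tag (default: latest).
ecr_image_uri() {
    account=$1
    region=$2
    repo=$3
    tag=${4:-latest}
    echo "${account}.dkr.ecr.${region}.amazonaws.com/${repo}:${tag}"
}
```

`ecr_image_uri "$AWS_ACCOUNT" "$AWS_DEFAULT_REGION" airflow-dev` prints the name the ECS services would be expected to pull. One way to confirm the repository exists beforehand is `aws ecr describe-repositories --repository-names airflow-dev`.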

Deploy new Airflow application

To deploy an updated version of Airflow you need to push a new container image to ECR. You can do that by running:

./scripts/deploy.sh airflow-dev

The deployment script takes care of:

  • pushing a new image to your ECR repository
  • re-deploying the ECS services with the updated image
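
The two steps above can be sketched as a dry run that only prints the commands involved. This is not the content of scripts/deploy.sh, which is not shown here: `deploy_plan` is a hypothetical helper, the cluster and service names are placeholders, and the ECR login step is left out.

```shell
# Hypothetical helper (not in the repository): print, without running them,
# the commands for pushing an image and restarting the ECS services.
# Arguments: image URI, cluster name, then one or more service names.
deploy_plan() {
    image_uri=$1
    cluster=$2
    shift 2
    echo "docker build -t $image_uri ."
    echo "docker push $image_uri"
    for service in "$@"; do
        # --force-new-deployment makes ECS start new tasks, which pull the image again.
        echo "aws ecs update-service --cluster $cluster --service $service --force-new-deployment"
    done
}
```

Printing the plan first makes it easy to review which services would be restarted before running anything against the AWS account.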

TODO

  • Create Private Subnets
  • Move ECS containers to Private Subnets
  • Use ECS private Links for Private Subnets
  • Improve ECS Task and Service Role