Trapping SIG inside Fargate and ECS containers

Wrap any container in which you want to trap signals. In my example, I use a custom signal handler to prevent the Consul agent sidecar from shutting down too early. The delay can be set through the EXIT_TIMEOUT environment variable.

Dockerfile

FROM consul:1.6.2

COPY entrypoint.sh ./
RUN chmod +x entrypoint.sh

ENTRYPOINT ["./entrypoint.sh"]

entrypoint.sh

#!/bin/sh
set -x

pid=0
timeout=0

if [ -n "$EXIT_TIMEOUT" ]; then
    timeout=$EXIT_TIMEOUT
fi

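# Signal handler: $1 = signal name, $2 = delay in seconds, $3 = signal number
sig_handler() {
  if [ "$pid" -ne 0 ]; then
    echo "$1: Waiting $2 sec..."
    sleep "$2"
    # Forward the trapped signal to the wrapped process and wait for it to exit
    kill -"$1" "$pid"
    wait "$pid"
  fi
  # Conventional exit status for "terminated by signal N" is 128+N
  exit $((128 + $3))
}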

# Set up handlers for the SIGTERM and SIGINT signals.
# When a signal arrives, kill the most recent background process
# (the `tail -f /dev/null` started in the loop below), then run the handler.
# The handler forwards the trapped signal to the process whose ID is in $pid.
trap 'kill ${!}; sig_handler SIGTERM ${timeout} 15' SIGTERM
trap 'kill ${!}; sig_handler SIGINT ${timeout} 2' SIGINT

# Run the original entrypoint script with the received parameters
docker-entrypoint.sh "$@" &
pid="$!"

# Wait forever
while true
do
  tail -f /dev/null & wait ${!}
done
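
To see the trap-and-forward pattern from the wrapper in isolation, here is a minimal sketch that needs neither Docker nor Consul. `sleep 30` stands in for the real entrypoint; the `demo` function name and the one-second pause are illustrative assumptions, not part of the original script.

```shell
#!/bin/sh
# Hypothetical stand-alone demo of the wrapper pattern:
# a parent shell traps TERM, forwards it to its child, and exits with 128+15.
demo() {
  sh -c '
    sleep 30 &            # stand-in for docker-entrypoint.sh
    child=$!
    trap "echo forwarding TERM to child; kill -TERM \$child; wait \$child; exit 143" TERM
    wait "$child"         # interrupted when TERM arrives, then the trap runs
  ' &
  wrapper=$!
  sleep 1                 # give the inner shell time to install its trap
  kill -TERM "$wrapper"
  wait "$wrapper"
  echo "wrapper exit code: $?"
}
```

Calling `demo` should print the forwarding message followed by `wrapper exit code: 143`, the same 128+15 status the real wrapper returns for SIGTERM.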

Pay attention!

  • Fargate and ECS on EC2 define stop_timeout differently. Fargate sets it with the stopTimeout parameter in the JSON task definition, while ECS on EC2 takes it from the ECS_CONTAINER_STOP_TIMEOUT environment variable of the ECS container agent.
  • Also, keep these parameters in step with the wrapper's trap timeout. EXIT_TIMEOUT has to be less than or equal to stop_timeout minus the time the container really needs for a graceful shutdown.
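
The timeout rule above is simple arithmetic, sketched here as a small helper. The function name `max_exit_timeout` and the figures in the example (a 120 s stop timeout and 15 s for Consul to leave the cluster) are assumptions for illustration only.

```shell
#!/bin/sh
# Hypothetical helper: largest safe EXIT_TIMEOUT for a given stop timeout.
# $1 = stop_timeout in seconds, $2 = seconds the container needs to shut down.
max_exit_timeout() {
  budget=$(( $1 - $2 ))
  # A negative budget means the stop timeout is already too short: delay nothing.
  if [ "$budget" -lt 0 ]; then
    budget=0
  fi
  echo "$budget"
}

# Assumed example values: stop_timeout of 120 s, about 15 s for a graceful shutdown.
max_exit_timeout 120 15   # prints 105
```

Anything above the printed value risks the container being killed before the wrapped process has finished its own shutdown.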

Time-based autoscaling on Fargate

Sometimes we want to turn off our staging environments at night to save some money. If your infrastructure uses a lot of Fargate containers, you can set up a cron schedule for every task you have.
Just remember the API call limits. If you set up many timers to trigger at the same time, they can be throttled. AWS won't give exact numbers, but after many checks and failures we found that the limit is about 100. And AWS didn't want to increase it when we asked the support team... Maybe we are not big enough for them. :)
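
One way to stay under such a limit might be to spread the timers over consecutive minutes instead of firing them all at once. This is only a sketch: the helper name `staggered_cron`, the service names, and the batch size are hypothetical, and whether staggering fully avoids throttling is an assumption based on the observation above.

```shell
#!/bin/sh
# Hypothetical helper: spread scheduled actions over consecutive minutes.
# $1 = index of the service (0, 1, 2, ...), $2 = hour (UTC),
# $3 = how many services may share the same minute.
staggered_cron() {
  minute=$(( ($1 / $3) % 60 ))
  echo "cron(${minute} $2 * * ? *)"
}

i=0
for svc in api worker billing; do   # placeholder service names
  echo "${svc}: $(staggered_cron "$i" 21 2)"
  i=$(( i + 1 ))
done
```

With a batch size of 2, the first two services get `cron(0 21 * * ? *)` and the third gets `cron(1 21 * * ? *)`. The result can be passed to the `--schedule` option or to the Terraform `schedule` argument.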

Set parameters

$ export ECS_CLUSTER_NAME={YOUR_ECS_CLUSTER_NAME}
$ export ECS_SERVICE_NAME={YOUR_ECS_SERVICE_NAME}

RegisterScalableTarget

$ aws application-autoscaling register-scalable-target --service-namespace ecs \
    --scalable-dimension ecs:service:DesiredCount \
    --resource-id service/${ECS_CLUSTER_NAME}/${ECS_SERVICE_NAME} \
    --min-capacity 1 \
    --max-capacity 3

PutScheduledAction

$ export SCALE_OUT_ACTION_NAME=fargate-time-based-scale-out

# configure scaling out: every day at 8:50am JST (23:50 UTC)
$ aws application-autoscaling put-scheduled-action --service-namespace ecs \
    --scalable-dimension ecs:service:DesiredCount \
    --resource-id service/${ECS_CLUSTER_NAME}/${ECS_SERVICE_NAME} \
    --scheduled-action-name ${SCALE_OUT_ACTION_NAME} \
    --schedule "cron(50 23 * * ? *)" \
    --scalable-target-action MinCapacity=3,MaxCapacity=10

$ export SCALE_IN_ACTION_NAME=fargate-time-based-scale-in

# configure scaling in: every day at 6:10pm JST (09:10 UTC)
$ aws application-autoscaling put-scheduled-action --service-namespace ecs \
    --scalable-dimension ecs:service:DesiredCount \
    --resource-id service/${ECS_CLUSTER_NAME}/${ECS_SERVICE_NAME} \
    --scheduled-action-name ${SCALE_IN_ACTION_NAME} \
    --schedule "cron(10 9 * * ? *)" \
    --scalable-target-action MinCapacity=1,MaxCapacity=1
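
The cron expressions in these commands are written in UTC, while the intended times are in JST. The conversion can be sketched as a small helper; the function name `jst_to_utc_cron` is hypothetical, and it assumes the fixed UTC+9 offset of JST.

```shell
#!/bin/sh
# Hypothetical helper: turn a JST wall-clock time into a UTC cron expression.
# $1 = hour in JST (0-23), $2 = minute (0-59). JST is UTC+9 with no DST.
jst_to_utc_cron() {
  hour=${1#0};   hour=${hour:-0}       # strip a leading zero so 08 is not read as octal
  minute=${2#0}; minute=${minute:-0}
  utc_hour=$(( (hour + 24 - 9) % 24 ))
  echo "cron(${minute} ${utc_hour} * * ? *)"
}

jst_to_utc_cron 8 50    # prints cron(50 23 * * ? *)
jst_to_utc_cron 18 10   # prints cron(10 9 * * ? *)
```

Note that JST times before 9:00 fall on the previous UTC day, which matters as soon as the day-of-week or day-of-month field is restricted.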

DeleteScheduledAction

$ aws application-autoscaling delete-scheduled-action --service-namespace ecs \
    --scheduled-action-name ${SCALE_OUT_ACTION_NAME} \
    --resource-id service/${ECS_CLUSTER_NAME}/${ECS_SERVICE_NAME} \
    --scalable-dimension ecs:service:DesiredCount
$ aws application-autoscaling delete-scheduled-action --service-namespace ecs \
    --scheduled-action-name ${SCALE_IN_ACTION_NAME} \
    --resource-id service/${ECS_CLUSTER_NAME}/${ECS_SERVICE_NAME} \
    --scalable-dimension ecs:service:DesiredCount

DescribeScheduledActions

$ aws application-autoscaling describe-scheduled-actions --service-namespace ecs \
    --scheduled-action-names ${SCALE_IN_ACTION_NAME} ${SCALE_OUT_ACTION_NAME}

See also

https://docs.aws.amazon.com/autoscaling/application/userguide/application-auto-scaling-scheduled-scaling.html

Terraform example

resource "aws_appautoscaling_target" "tgt" {
  service_namespace  = "ecs"
  resource_id        = "service/${var.cluster}/${var.service}"
  scalable_dimension = "ecs:service:DesiredCount"
  role_arn           = "${var.role_arn}"
  min_capacity       = 1
  max_capacity       = 1

  lifecycle {
    create_before_destroy = true
  }
}

// Night OFF (capacity 0) scheduler at 21:00 UTC
resource "aws_appautoscaling_scheduled_action" "night_off" {

  name               = "${var.service}-night-off-timer"
  service_namespace  = "ecs"
  resource_id        = "service/${var.cluster}/${var.service}"
  scalable_dimension = "ecs:service:DesiredCount"
  schedule           = "cron(0 21 * * ? *)"

  scalable_target_action {
    min_capacity = 0
    max_capacity = 0
  }

  depends_on = [aws_appautoscaling_target.tgt]
}

// Day ON scheduler (capacity 1) at 5:00 UTC
resource "aws_appautoscaling_scheduled_action" "day_on" {
  name               = "${var.service}-day-on-timer"
  service_namespace  = "ecs"
  resource_id        = "service/${var.cluster}/${var.service}"
  scalable_dimension = "ecs:service:DesiredCount"
  schedule           = "cron(0 5 * * ? *)"

  scalable_target_action {
    min_capacity = 1
    max_capacity = 1
  }

  depends_on = [aws_appautoscaling_target.tgt]
}

ECR repo access from another account

Set Repo Permissions Policy in main Account A for Account B:

{
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Action": [
          "ecr:GetAuthorizationToken",
          "ecr:BatchCheckLayerAvailability",
          "ecr:GetDownloadUrlForLayer",
          "ecr:GetRepositoryPolicy",
          "ecr:DescribeRepositories",
          "ecr:ListImages",
          "ecr:DescribeImages",
          "ecr:BatchGetImage",
          "ecr:InitiateLayerUpload",
          "ecr:UploadLayerPart",
          "ecr:CompleteLayerUpload",
          "ecr:PutImage"
        ],
        "Condition": {
          "StringLike": {
            "aws:ResourceTag/Team": "Payments"
          }
        },
        "Principal": {
          "AWS": [
            "arn:aws:iam::YOUR-ACCOUNT-B-ID:user/devusr2"
          ]
        },
        "Sid": "AllowCrossAccountPushAndPull"
      }
    ]
  }

To grant access to the whole of account B, change the principal to "arn:aws:iam::YOUR-ACCOUNT-B-ID:root".
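
Once the policy is in place, account B still has to authenticate against account A's registry before pushing or pulling. The following is a rough sketch of that step; the helper names, the placeholder account ID, and the region are assumptions, and `ecr_login` needs a recent AWS CLI that provides `aws ecr get-login-password`, plus Docker.

```shell
#!/bin/sh
# Hypothetical helpers for using account A's registry from account B.

# Print the registry host for an account ID ($1) and region ($2).
ecr_registry_host() {
  echo "$1.dkr.ecr.$2.amazonaws.com"
}

# Log Docker in to account A's registry using account B's credentials.
# Defined here but not called, because it needs real AWS credentials.
ecr_login() {
  aws ecr get-login-password --region "$2" |
    docker login --username AWS --password-stdin "$(ecr_registry_host "$1" "$2")"
}

# Placeholder account ID and region, for illustration only.
ecr_registry_host 111111111111 eu-west-1   # prints 111111111111.dkr.ecr.eu-west-1.amazonaws.com
```

After `ecr_login ACCOUNT-A-ID REGION` succeeds, images are addressed as the printed host followed by `/` and the repository name.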
