Autoscaling your Fargate Services

When you run your services in a production environment, you typically run multiple instances of each service for high availability.

Flightcontrol lets you set the minimum and maximum number of instances for each service (web service or worker). You can also configure the instance size that each service uses.

Autoscaling Rules

Flightcontrol sets up autoscaling rules on Fargate for each type of service (web service and worker).

The default autoscaling rules used by both types are:

  • 70% CPU utilization
  • 70% memory utilization

Web Services

For Fargate web services, Flightcontrol also sets a threshold of 500 requests over the course of 60 seconds, measured twice in a three-minute period. If this threshold is exceeded, another Fargate instance is added, up to the maximum number of instances.

Configuring Autoscaling in the Dashboard

The autoscaling configuration is available in the dashboard. For an existing service, you will find the options under the Config tab.

Number of Instances

The minimum and maximum number of instances are set in the Instance configuration section of the service configuration. If the two numbers differ, Flightcontrol tells ECS Fargate to use autoscaling. If they are the same, the service runs a fixed number of instances and autoscaling is not needed.

Figure: The minimum and maximum instance configuration in the dashboard

You can specify both the minimum and the maximum number of instances for each service. These values can also vary between environments, because each environment is configured separately.

Thresholds

The autoscaling thresholds are configured in the Autoscaling section of the service configuration.

Figure: The configuration options for autoscaling in the dashboard

You can set different thresholds for different environments. For example, you may want to set a higher threshold for your staging environment, and a lower threshold for your production environment.

Configuring Autoscaling in Code

If you configure your project with flightcontrol.json, you can set the minInstances and maxInstances attributes for each individual service.

The following example shows a Flask web application that runs a minimum of 2 and a maximum of 5 instances at any one time.

The example also sets the following autoscaling parameters:

  • CPU Threshold of 60%
  • Memory Threshold of 60%
  • Cooldown Timer of 300 seconds
  • Requests per Target of 1000

flightcontrol.json
{
  "$schema": "https://app.flightcontrol.dev/schema.json",
  "environments": [
    {
      "id": "production",
      "name": "Production",
      "region": "us-west-2",
      "source": {
        "branch": "main"
      },
      "services": [
        {
          "id": "flask-web",
          "name": "Flask Web",
          "type": "fargate",
          "buildType": "nixpacks",
          "cpu": 0.5,
          "memory": 1,
          "minInstances": 2,
          "maxInstances": 5,
          "autoscaling": {
            "cpuThreshold": 60,
            "memoryThreshold": 60,
            "cooldownTimerSecs": 300,
            "requestsPerTarget": 1000
          },
          "envVariables": {},
          "healthCheckPath": "/healthcheck"
        }
      ]
    }
  ]
}
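
Because each environment is configured separately, the same approach can give staging and production different instance counts and thresholds. The following is a sketch only: it reuses the attributes from the example above and adds a second environment. The staging id, name, branch, service id, and all staging values are hypothetical placeholders chosen for illustration, not recommended settings.

```json
{
  "$schema": "https://app.flightcontrol.dev/schema.json",
  "environments": [
    {
      "id": "production",
      "name": "Production",
      "region": "us-west-2",
      "source": {
        "branch": "main"
      },
      "services": [
        {
          "id": "flask-web",
          "name": "Flask Web",
          "type": "fargate",
          "buildType": "nixpacks",
          "cpu": 0.5,
          "memory": 1,
          "minInstances": 2,
          "maxInstances": 5,
          "autoscaling": {
            "cpuThreshold": 60,
            "memoryThreshold": 60,
            "cooldownTimerSecs": 300,
            "requestsPerTarget": 1000
          },
          "envVariables": {},
          "healthCheckPath": "/healthcheck"
        }
      ]
    },
    {
      "id": "staging",
      "name": "Staging",
      "region": "us-west-2",
      "source": {
        "branch": "staging"
      },
      "services": [
        {
          "id": "flask-web-staging",
          "name": "Flask Web Staging",
          "type": "fargate",
          "buildType": "nixpacks",
          "cpu": 0.5,
          "memory": 1,
          "minInstances": 1,
          "maxInstances": 2,
          "autoscaling": {
            "cpuThreshold": 80,
            "memoryThreshold": 80,
            "cooldownTimerSecs": 300,
            "requestsPerTarget": 1000
          },
          "envVariables": {},
          "healthCheckPath": "/healthcheck"
        }
      ]
    }
  ]
}
```

In this sketch, staging has a lower instance ceiling and higher thresholds, so it adds instances less readily than production.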

Conclusion

We encourage everyone who uses cloud environments to monitor their costs, and to consider how instance counts and sizes affect cost and performance.

You may also need to adjust the autoscaling parameters to suit your application's performance needs. For example, if your application is CPU-intensive, you may want to increase the CPU threshold to 80%.
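
As a rough illustration of that adjustment, the fragment below shows only the autoscaling-related attributes of a service, with cpuThreshold raised to 80. It is a fragment of a service entry rather than a complete flightcontrol.json file. The other values are carried over from the earlier Flask example, and 80 is the illustrative figure from the paragraph above, not a general recommendation.

```json
{
  "minInstances": 2,
  "maxInstances": 5,
  "autoscaling": {
    "cpuThreshold": 80,
    "memoryThreshold": 60,
    "cooldownTimerSecs": 300,
    "requestsPerTarget": 1000
  }
}
```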