Autoscaling your Fargate Services

When you run your services in a production environment, you typically run multiple instances of each service for high availability.

Flightcontrol lets you set the minimum and maximum number of instances for each service (web service or worker). You can also configure the instance size for each service.

Autoscaling Rules

Flightcontrol sets up autoscaling rules on Fargate for each type of service (web service and worker).

The default autoscaling rules used by both types are:

  • 70% CPU utilization
  • 70% memory utilization

Web Services

For Fargate web services, Flightcontrol also sets a threshold of 500 requests over 60 seconds, measured twice in a three-minute period. If this threshold is exceeded, another Fargate instance is added, up to the maximum number of instances.

Configuring Autoscaling in the Dashboard

Within the dashboard, autoscaling configuration for services is in the Advanced section of the service configuration.

Click the Advanced header to expand the configuration options.

[Image: The Advanced button at the bottom of the Service Configuration screen]

After expanding the Advanced section, you'll see the autoscaling configuration options.

[Image: Expanded Advanced service configuration options]

You can specify both the minimum and maximum number of instances for each service. These values can differ between environments, because each environment is configured separately.

Configuring Autoscaling in Code

If you configure your project with flightcontrol.json, you can set the minInstances and maxInstances attributes for each individual service.

The following example shows a Flask web application that runs a minimum of 2 and a maximum of 5 instances at any one time.

The example also sets the following autoscaling parameters:

  • CPU Threshold of 60%
  • Memory Threshold of 60%
  • Cooldown Timer of 300 seconds
  • Requests per Target of 1000

{
  "$schema": "",
  "environments": [
    {
      "id": "production",
      "name": "Production",
      "region": "us-west-2",
      "source": {
        "branch": "main"
      },
      "services": [
        {
          "id": "flask-web",
          "name": "Flask Web",
          "type": "fargate",
          "buildType": "nixpacks",
          "cpu": 0.5,
          "memory": 1,
          "minInstances": 2,
          "maxInstances": 5,
          "autoscaling": {
            "cpuThreshold": 60,
            "memoryThreshold": 60,
            "cooldownTimerSecs": 300,
            "requestsPerTarget": 1000
          },
          "envVariables": {},
          "healthCheckPath": "/healthcheck"
        }
      ]
    }
  ]
}
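
Because each environment is configured separately, the same service can use different instance limits in different environments. The following is a minimal sketch rather than a recommended setup: the "staging" environment, its branch name, and its instance counts are hypothetical and chosen only for illustration. The autoscaling object and other optional attributes are left out to keep the focus on minInstances and maxInstances.

```json
{
  "$schema": "",
  "environments": [
    {
      "id": "production",
      "name": "Production",
      "region": "us-west-2",
      "source": {
        "branch": "main"
      },
      "services": [
        {
          "id": "flask-web",
          "name": "Flask Web",
          "type": "fargate",
          "buildType": "nixpacks",
          "cpu": 0.5,
          "memory": 1,
          "minInstances": 2,
          "maxInstances": 5
        }
      ]
    },
    {
      "id": "staging",
      "name": "Staging",
      "region": "us-west-2",
      "source": {
        "branch": "staging"
      },
      "services": [
        {
          "id": "flask-web",
          "name": "Flask Web",
          "type": "fargate",
          "buildType": "nixpacks",
          "cpu": 0.5,
          "memory": 1,
          "minInstances": 1,
          "maxInstances": 2
        }
      ]
    }
  ]
}
```

Keeping the lower limits in a non-production environment is one way to reduce cost there while production keeps enough instances for high availability.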


We encourage everyone who uses cloud environments to monitor their costs and to consider how instance counts and sizes affect both cost and performance.

You may also need to adjust the autoscaling parameters to suit your application's performance needs. For example, if your application is CPU-intensive, you may want to raise the CPU threshold to 80%.
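
As a sketch of that adjustment, the fragment below shows a single entry from the services array with cpuThreshold set to 80. It reuses the attribute names from the Flask example in this section. The remaining values are carried over unchanged for illustration, and attributes unrelated to scaling are omitted, so this is not a complete flightcontrol.json file.

```json
{
  "id": "flask-web",
  "name": "Flask Web",
  "type": "fargate",
  "minInstances": 2,
  "maxInstances": 5,
  "autoscaling": {
    "cpuThreshold": 80,
    "memoryThreshold": 60,
    "cooldownTimerSecs": 300,
    "requestsPerTarget": 1000
  }
}
```

With a higher CPU threshold, each instance runs closer to full utilization before another one is added, which can lower cost but leaves less headroom for sudden spikes.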