Autoscaling your Fargate Services
When you run your services in a production environment, you typically have multiple instances of each server for high availability.
Flightcontrol lets you set the minimum and maximum number of instances for each service (web server or worker). In addition, you can configure the instance size to use with each service.
Autoscaling Rules
Flightcontrol sets up autoscaling rules on Fargate for each type of service (web service and worker).
The default autoscaling rules used by both types are:
70% CPU utilization
70% Memory utilization
Web Services
For Fargate web services, Flightcontrol also sets a threshold of 500 requests over the course of 60 seconds, measured twice in a three-minute period. If this threshold is exceeded, another Fargate instance is added to your set, up to the maximum number of instances.
Configuring Autoscaling in the Dashboard
The autoscaling configuration is available in the dashboard. For an existing service, you will find the options under the Config tab.
Number of Instances
The minimum and maximum number of instances are options in the Instance configuration section of the service configuration. If these numbers are different, Flightcontrol will tell ECS Fargate to use autoscaling. If the numbers are the same, the instance count is fixed, so autoscaling is not needed.
Both minimum and maximum instances may be specified for each service. This can also vary between environments, as each environment is configured separately.
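The same rule can be illustrated with the flightcontrol.json attributes described under Configuring Autoscaling in Code. The following is a minimal sketch of a single service entry, assuming the same service fields as the Flask example later on this page. Setting minInstances and maxInstances to the same value pins the service at a fixed instance count; whether any other fields are required for your service is not covered here.

```json
{
  "id": "flask-web",
  "name": "Flask Web",
  "type": "fargate",
  "buildType": "nixpacks",
  "cpu": 0.5,
  "memory": 1,
  "minInstances": 2,
  "maxInstances": 2
}
```

Raising maxInstances above minInstances in this entry is what makes the two numbers differ, and therefore what turns autoscaling on.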
Thresholds
The autoscaling thresholds are configured in the Autoscaling section of the service configuration.
You can set different thresholds for different environments. For example, you may want to set a higher threshold for your staging environment, and a lower threshold for your production environment.
Configuring Autoscaling in Code
With flightcontrol.json as your configuration option, the minInstances and maxInstances attributes can be set for each individual service.
The following example shows a Flask web application that has a minimum of 2 instances, and a maximum of 5 instances to run at any one time.
We have also configured the autoscaling parameters with the following:
- CPU Threshold of 60%
- Memory Threshold of 60%
- Cooldown Timer of 300 seconds
- Requests per Target of 1000
{
  "$schema": "https://app.flightcontrol.dev/schema.json",
  "environments": [
    {
      "id": "production",
      "name": "Production",
      "region": "us-west-2",
      "source": {
        "branch": "main"
      },
      "services": [
        {
          "id": "flask-web",
          "name": "Flask Web",
          "type": "fargate",
          "buildType": "nixpacks",
          "cpu": 0.5,
          "memory": 1,
          "minInstances": 2,
          "maxInstances": 5,
          "autoscaling": {
            "cpuThreshold": 60,
            "memoryThreshold": 60,
            "cooldownTimerSecs": 300,
            "requestsPerTarget": 1000
          },
          "envVariables": {},
          "healthCheckPath": "/healthcheck"
        }
      ]
    }
  ]
}
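Because each environment is configured separately, instance counts and thresholds can differ between environments. The sketch below adds a hypothetical staging environment next to production. The staging id, name, branch, service id, and the specific staging values are illustrative assumptions, not recommendations; the field names are the ones used in the example above. Staging gets higher thresholds and fewer instances, and production keeps the values from the example above.

```json
{
  "$schema": "https://app.flightcontrol.dev/schema.json",
  "environments": [
    {
      "id": "staging",
      "name": "Staging",
      "region": "us-west-2",
      "source": {
        "branch": "staging"
      },
      "services": [
        {
          "id": "flask-web-staging",
          "name": "Flask Web",
          "type": "fargate",
          "buildType": "nixpacks",
          "cpu": 0.5,
          "memory": 1,
          "minInstances": 1,
          "maxInstances": 2,
          "autoscaling": {
            "cpuThreshold": 80,
            "memoryThreshold": 80,
            "cooldownTimerSecs": 300,
            "requestsPerTarget": 1000
          },
          "envVariables": {},
          "healthCheckPath": "/healthcheck"
        }
      ]
    },
    {
      "id": "production",
      "name": "Production",
      "region": "us-west-2",
      "source": {
        "branch": "main"
      },
      "services": [
        {
          "id": "flask-web",
          "name": "Flask Web",
          "type": "fargate",
          "buildType": "nixpacks",
          "cpu": 0.5,
          "memory": 1,
          "minInstances": 2,
          "maxInstances": 5,
          "autoscaling": {
            "cpuThreshold": 60,
            "memoryThreshold": 60,
            "cooldownTimerSecs": 300,
            "requestsPerTarget": 1000
          },
          "envVariables": {},
          "healthCheckPath": "/healthcheck"
        }
      ]
    }
  ]
}
```

With higher thresholds, staging adds instances later and stays smaller under the same load, which usually suits an environment where cost matters more than headroom.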
Conclusion
We encourage everyone who uses cloud environments to monitor their costs and to consider how instance counts and sizes affect cost and performance.
You may also need to adjust the autoscaling parameters to suit your application's performance needs. For example, if your application is CPU-intensive, you may want to increase the CPU threshold to 80%.
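As a sketch of that adjustment, the fragment below shows only the autoscaling portion of a service entry with the CPU threshold raised to 80. The other values are carried over from the Flask example purely for illustration; the right numbers depend on how your application behaves under load.

```json
{
  "autoscaling": {
    "cpuThreshold": 80,
    "memoryThreshold": 60,
    "cooldownTimerSecs": 300,
    "requestsPerTarget": 1000
  }
}
```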