Runway Scalability
Runway supports both horizontal and vertical scaling of a service.
Horizontal
By default, Runway will scale up instances to handle all incoming requests. When a service is not receiving any traffic, instances are scaled down to zero.
Minimum instances
The minimum number of instances of a service. To override the default configuration:
Recommendation: Use this setting if you need to reduce cold start latency for a service.
To learn more, refer to documentation.
Maximum instances
The maximum number of instances of a service. To override the default configuration:
Recommendation: Use this setting if you need to limit the number of connections to a backing service, such as a database.
To learn more, refer to documentation.
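To make the recommendation above concrete, the following is a minimal sketch of how a maximum instance count could be derived from a database connection budget. The function name, the per-instance pool size, and the reserved-connection count are all hypothetical inputs chosen for illustration; none of them are Runway settings.

```python
def max_instances_for_connection_budget(
    db_max_connections: int,
    pool_size_per_instance: int,
    reserved_connections: int = 0,
) -> int:
    """Return the largest instance count whose combined connection pools
    fit within the backing service's connection limit.

    All parameters are illustrative assumptions, not Runway configuration.
    """
    if pool_size_per_instance <= 0:
        raise ValueError("pool_size_per_instance must be positive")
    # Connections left over after setting aside headroom for other clients
    # (migrations, admin sessions, monitoring, and so on).
    usable = db_max_connections - reserved_connections
    if usable < pool_size_per_instance:
        raise ValueError("connection budget cannot fit even one instance")
    return usable // pool_size_per_instance


if __name__ == "__main__":
    # 100 connections, 10 reserved, each instance opens up to 10.
    print(max_instances_for_connection_budget(100, 10, reserved_connections=10))
```

The result is an upper bound to consider for the maximum instances setting; it assumes every instance can open its full pool at the same time.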
Maximum instance concurrent requests
The maximum number of concurrent requests per instance of the service. To override the default configuration:
Recommendation: Use this setting if you need to optimize cost efficiency or limit the concurrency of a service.
To learn more, refer to documentation.
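The three horizontal settings interact in a way that is easy to reason about with a simplified model. The sketch below is not Runway's autoscaler: it only estimates how many instances would be needed for a given number of in-flight requests, clamped between the minimum and maximum. The function name and the default values are assumptions made for illustration.

```python
import math


def estimate_instances(
    concurrent_requests: int,
    max_concurrency_per_instance: int,
    min_instances: int = 0,
    max_instances: int = 100,
) -> int:
    """Estimate the instance count for a given load.

    Simplified model only: real autoscalers also consider CPU utilization,
    startup time, and smoothing windows. Defaults here are illustrative.
    """
    if max_concurrency_per_instance <= 0:
        raise ValueError("max_concurrency_per_instance must be positive")
    if min_instances > max_instances:
        raise ValueError("min_instances cannot exceed max_instances")
    # Instances needed if each one serves up to its concurrency limit.
    needed = math.ceil(concurrent_requests / max_concurrency_per_instance)
    # Clamp to the configured floor and ceiling.
    return max(min_instances, min(needed, max_instances))


if __name__ == "__main__":
    print(estimate_instances(0, 80))                    # no traffic, no floor
    print(estimate_instances(0, 80, min_instances=1))   # floor keeps one warm
    print(estimate_instances(250, 80))                  # load spread over instances
    print(estimate_instances(250, 80, max_instances=3)) # ceiling caps the count
```

Under this model, lowering the per-instance concurrency increases the number of instances for the same load, while raising the minimum keeps instances warm at the cost of paying for idle capacity.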
Vertical
By default, Runway will provision CPU and memory resource limits of 1000m and 512Mi, respectively. When a resource limit is exceeded, the instance is terminated.
Note: CPU resources can be defined in millicores. If your container needs two full cores to run, you would put the value 2000m. If your container only needs ¼ of a core, you would put a value of 250m.
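The millicore arithmetic in the note can be sketched as a pair of small conversion helpers. The function names are hypothetical and exist only to illustrate the unit: one core equals 1000 millicores.

```python
def cores_to_millicores(cores: float) -> str:
    """Format a core count as a millicore string, e.g. 2 -> "2000m"."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    # One full core is 1000 millicores.
    return f"{round(cores * 1000)}m"


def millicores_to_cores(value: str) -> float:
    """Parse a millicore string such as "250m" back into a core count."""
    if not value.endswith("m"):
        raise ValueError("expected a value with an 'm' suffix, e.g. '250m'")
    return int(value[:-1]) / 1000


if __name__ == "__main__":
    print(cores_to_millicores(2))        # two full cores
    print(cores_to_millicores(0.25))     # a quarter of a core
    print(millicores_to_cores("1000m"))  # the default CPU limit stated above
```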
Memory
The memory limit of an instance. To override the default configuration:
Recommendation: For right-sizing a service, refer to capacity planning.
To learn more, refer to documentation.
CPU
The CPU limit of an instance. To override the default configuration:
Recommendation: For right-sizing a service, refer to capacity planning.
To learn more, refer to documentation.
CPU Boost
Provides additional CPU during instance startup. To enable this setting:
Recommendation: Use this setting if you need to reduce cold start latency for a service.
To learn more, refer to documentation.