Given a minimum, maximum, and desired number of instances, an auto-scaling group automatically launches and terminates cloud servers based on rules around time, load, or queued work. When the number of launched servers is lower than the number you want or need, the auto-scaling group is in a “sick” state.
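As a rough illustration of that definition, the check below compares the launched count against the desired count. It is a minimal sketch: the `ScalingGroup` class and its field names are hypothetical, not taken from any cloud provider's API, and it assumes the desired count is clamped to the configured minimum and maximum.

```python
from dataclasses import dataclass


@dataclass
class ScalingGroup:
    """Hypothetical snapshot of an auto-scaling group's settings and state."""
    min_size: int
    max_size: int
    desired: int
    launched: int  # servers currently launched in the group


def missing_servers(group: ScalingGroup) -> int:
    """Return how many servers the group is short of its target."""
    # Assumption: the effective target is the desired count clamped
    # into the [min_size, max_size] range.
    target = max(group.min_size, min(group.desired, group.max_size))
    return max(0, target - group.launched)


def is_sick(group: ScalingGroup) -> bool:
    """A group is "sick" when fewer servers are launched than wanted."""
    return missing_servers(group) > 0
```

For example, a group configured with a minimum of 2, a maximum of 10, and a desired count of 6 that has only 4 launched servers is short by 2 and would be reported as sick.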
Here is a list of common causes, in decreasing order of likelihood:
Disable all auto-scaling actions so the problem doesn’t get worse. Manually launch more instances and associate them with the auto-scaling group.
If launching new servers is rejected by your cloud provider, re-purpose existing, low-load servers.
Watch your cloud provider’s status page for any reported errors. Re-enable all auto-scaling actions once the problem has subsided.
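The immediate steps above could be scripted roughly as follows. This is only a sketch under stated assumptions: `CloudProvider`, its four methods, and `LaunchRejected` are hypothetical names standing in for whatever your provider's SDK offers, and re-enabling auto-scaling is deliberately left as a manual step for when the incident is over.

```python
from typing import Protocol


class LaunchRejected(Exception):
    """Hypothetical error raised when the provider refuses a launch request."""


class CloudProvider(Protocol):
    """Hypothetical provider interface; map these onto your real SDK calls."""

    def disable_scaling(self, group: str) -> None: ...
    def launch_into(self, group: str, count: int) -> int: ...
    def low_load_servers(self) -> list[str]: ...
    def move_to_group(self, server_id: str, group: str) -> None: ...


def stabilize(provider: CloudProvider, group: str, missing: int) -> int:
    """Pause auto-scaling, then add capacity by hand.

    Returns the number of servers still missing afterwards.
    """
    # Step 1: stop automated actions so they cannot make things worse.
    provider.disable_scaling(group)

    # Step 2: try to launch the missing servers manually.
    try:
        missing -= provider.launch_into(group, missing)
    except LaunchRejected:
        pass  # fall through to re-purposing existing servers

    # Step 3: cover any remaining shortfall with existing low-load servers.
    for server_id in provider.low_load_servers()[: max(missing, 0)]:
        provider.move_to_group(server_id, group)
        missing -= 1

    return max(missing, 0)
```

A non-zero return value means the group is still short of capacity and needs further manual attention.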
Test your auto-scaling configuration and machine images to make sure instances come into rotation appropriately. Automate your cloud configuration changes to avoid errors in the future.
Adjust your auto-scaling cooldown to match how long new servers take to pass health checks.
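One way to pick such a cooldown is to measure how long recent servers took to pass health checks and cover the slowest one. The helper below is a minimal sketch; the function name, the input format, and the 30-second safety margin are assumptions for illustration, not recommended values.

```python
import math


def suggest_cooldown(seconds_to_healthy: list[float], margin_seconds: int = 30) -> int:
    """Suggest a cooldown (in seconds) that covers the slowest observed startup.

    seconds_to_healthy: measured times from launch to first passing health check.
    margin_seconds: assumed safety buffer added on top of the slowest startup.
    """
    if not seconds_to_healthy:
        raise ValueError("need at least one observed startup time")
    return math.ceil(max(seconds_to_healthy) + margin_seconds)
```

With observed startup times of 180, 240, and 210 seconds, this suggests a cooldown of 270 seconds.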
If applicable, rethink your spot instance pricing strategy. Many organizations bid far more than the on-demand price to ensure their spot instances are never terminated abruptly.