AWS AutoScaling | Blue Matador

Causes

Here is a list of common causes, in decreasing order of likelihood:

Bad configuration. If newly launched servers don’t initialize properly, they’ll shut themselves down, lock up, or immediately fail health checks.
Disabled auto-scaling actions. If a server is terminated while auto-scaling actions are disabled, the auto-scaling group will not replace that server.
Failed application health checks. If a new release or a configuration change causes servers to fail health checks, the group will be stuck in a constant cycle of launching replacement servers.
Cloud error during relaunch. If your cloud provider is unable to launch additional servers because of its own errors or a server shortage, auto-scaling actions will have no effect.
Grace period is too short. If your servers take longer to come into rotation than the grace period allows, they will be shut down before they are ready, and the auto-scaling group will never be able to recover.
Spot instance pricing too low. If your bid is too low, the auto-scaling group will not launch more servers.
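
A rough way to tell these causes apart is to look at the group's suspended processes and its recent failed scaling activities. The sketch below assumes a boto3 Auto Scaling client is passed in by the caller. The helper names (`triage`, `triage_live`) and the substring hints are hypothetical and for illustration only; they are not an official list of AWS error messages.

```python
from typing import Any, Dict, List

# Substrings matched against activity status messages. These are illustrative
# guesses, not an official or complete list of AWS error strings.
SPOT_HINTS = ("spot", "max price")
CAPACITY_HINTS = ("insufficient", "capacity")


def suspended_processes(group: Dict[str, Any]) -> List[str]:
    """Return the names of suspended scaling processes for one group."""
    return [p["ProcessName"] for p in group.get("SuspendedProcesses", [])]


def triage(group: Dict[str, Any], activities: List[Dict[str, Any]]) -> List[str]:
    """Map a group description and its recent activities to likely causes."""
    findings = []
    suspended = suspended_processes(group)
    if suspended:
        findings.append("suspended processes: " + ", ".join(sorted(suspended)))
    for activity in activities:
        # Only failed activities say anything about why launches are stuck.
        if activity.get("StatusCode") != "Failed":
            continue
        message = activity.get("StatusMessage", "").lower()
        if any(hint in message for hint in SPOT_HINTS):
            findings.append("spot price or availability: " + message)
        elif any(hint in message for hint in CAPACITY_HINTS):
            findings.append("provider capacity: " + message)
        else:
            findings.append("launch failure: " + message)
    return findings


def triage_live(client: Any, group_name: str) -> List[str]:
    """Fetch data with a boto3 Auto Scaling client and run triage on it."""
    groups = client.describe_auto_scaling_groups(AutoScalingGroupNames=[group_name])
    activities = client.describe_scaling_activities(AutoScalingGroupName=group_name)
    return triage(groups["AutoScalingGroups"][0], activities["Activities"])
```

With boto3 installed, `triage_live(boto3.client("autoscaling"), "my-group")` would run the same checks against a real group, where `my-group` is a placeholder name. Bad configuration and failed health checks usually need instance logs as well, so treat the output as a starting point.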

Quick Fix

Disable all auto-scaling actions so the problem doesn’t get worse. Then manually launch more instances and attach them to the auto-scaling group.
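
The pause-and-attach step can be sketched with the Auto Scaling API as follows. This is a minimal sketch that assumes a boto3 Auto Scaling client is passed in. The function names and the default process list are hypothetical choices. The default list leaves the Launch process running as a cautious assumption, because suspending it may interfere with attaching instances. Check the AWS documentation before suspending every process.

```python
from typing import Any, List, Sequence

# Processes paused during an incident. Which ones to suspend is a judgment
# call; this default is an assumption, not a recommendation from AWS.
DEFAULT_PROCESSES = ("Terminate", "ReplaceUnhealthy", "AlarmNotification", "ScheduledActions")


def pause_scaling(client: Any, group_name: str,
                  processes: Sequence[str] = DEFAULT_PROCESSES) -> None:
    """Suspend the given scaling processes on one group."""
    client.suspend_processes(AutoScalingGroupName=group_name,
                             ScalingProcesses=list(processes))


def attach_manual_instances(client: Any, group_name: str,
                            instance_ids: Sequence[str],
                            batch_size: int = 20) -> List[List[str]]:
    """Attach manually launched instances in batches (the API caps each call at 20)."""
    batches = []
    for start in range(0, len(instance_ids), batch_size):
        batch = list(instance_ids[start:start + batch_size])
        client.attach_instances(InstanceIds=batch, AutoScalingGroupName=group_name)
        batches.append(batch)
    return batches


def resume_scaling(client: Any, group_name: str,
                   processes: Sequence[str] = DEFAULT_PROCESSES) -> None:
    """Resume the processes that pause_scaling suspended."""
    client.resume_processes(AutoScalingGroupName=group_name,
                            ScalingProcesses=list(processes))
```

Attaching instances raises the group's desired capacity, so the call fails if that would exceed the group's maximum size. Call `resume_scaling` only once the underlying problem has subsided.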

If your cloud provider rejects new launches, repurpose existing, lightly loaded servers.

Watch your cloud status page for any reported errors. Re-enable all auto-scaling actions once the problem has subsided.

Thorough Fix

Test your auto-scaling configuration and machine images to make sure instances come into rotation appropriately. Automate your cloud configuration changes to avoid errors in the future.

Adjust your auto-scaling health check grace period to cover how long new servers take to pass health checks.
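
One way to pick a value for the grace period listed among the causes is to measure how long recent instances took to pass health checks and add a safety margin. The sketch below assumes you already have those measurements in seconds. The function names, the 1.5x margin, and the 60-second floor are hypothetical defaults, not AWS recommendations. The client is assumed to be a boto3 Auto Scaling client.

```python
import math
from typing import Any, Sequence


def recommend_grace_period(boot_seconds: Sequence[float],
                           margin: float = 1.5,
                           floor: int = 60) -> int:
    """Return the slowest observed boot time scaled by a margin, in whole seconds."""
    if not boot_seconds:
        raise ValueError("need at least one measured boot time")
    return max(floor, math.ceil(max(boot_seconds) * margin))


def apply_grace_period(client: Any, group_name: str, seconds: int) -> None:
    """Set the health check grace period on one Auto Scaling group."""
    client.update_auto_scaling_group(AutoScalingGroupName=group_name,
                                     HealthCheckGracePeriod=seconds)
```

For example, boot times of 120, 200, and 180 seconds give a grace period of 300 seconds with the default margin. Basing the value on the slowest observed boot rather than the average is a deliberate choice here, since a single slow instance is enough to trigger a premature termination.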

If applicable, rethink your spot instance pricing strategy. Many organizations bid far more than the on-demand price to reduce the chance that their spot instances are terminated abruptly.

Resources

AWS Status Page (Amazon)
AWS Autoscaling (Amazon)