What is planned autoscaling?

Google Cloud Platform (GCP) joins AWS and Microsoft Azure on Planned Autoscaling.

One of the many FinOps good practices is autoscaling. GCP has just joined AWS and Azure on this subject. If you are running behind on the topic, or are just getting started with FinOps, here is a short summary.


What is autoscaling?

Autoscaling is a cloud computing method that automatically adjusts the amount of computing resources allocated to a workload, usually measured by the number of active servers. The amount is increased when load rises and decreased when it falls.
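To make the idea concrete, here is a minimal sketch of a target-utilization rule, one common way an autoscaler can decide how many servers to run. The function name, its parameters and the formula are illustrative assumptions, not the exact algorithm of any provider.

```python
import math


def desired_instances(current: int, observed_util: float, target_util: float,
                      min_size: int, max_size: int) -> int:
    """Return the instance count that should bring average utilization
    back to the target, kept within [min_size, max_size].

    Hypothetical helper: real autoscalers add stabilization windows,
    cooldowns and other safeguards that are left out here.
    """
    if current <= 0 or target_util <= 0:
        raise ValueError("current and target_util must be positive")
    # If 4 servers run at 75% and the target is 50%, the same load
    # spread at 50% needs 4 * 0.75 / 0.5 = 6 servers.
    raw = math.ceil(current * observed_util / target_util)
    return max(min_size, min(max_size, raw))


# Scale out: load is above target.
print(desired_instances(4, 0.75, 0.5, min_size=1, max_size=10))   # 6
# Scale in: load is below target.
print(desired_instances(10, 0.25, 0.5, min_size=1, max_size=10))  # 5
```

The clamp at the end is what keeps the group from shrinking to zero or growing without limit.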

Why use autoscaling?

Autoscaling offers advantages such as the following:


For companies that operate their own web server infrastructure, autoscaling typically means letting some servers go to sleep during periods of low load, which saves on running costs.

For businesses using cloud-based infrastructure, autoscaling can result in lower bills, as most cloud providers charge based on total usage rather than maximum capacity.

Even for companies that cannot reduce the total computing capacity they run or pay for at any given time, autoscaling can help by freeing up machines during periods of low traffic, which the company can then use for less urgent workloads.

Autoscaling solutions, such as the one offered by AWS, can also replace faulty instances and thus provide some protection against hardware, network and application failures.

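Autoscaling can offer greater availability in cases where production workloads are variable and unpredictable. For example, because traffic is typically lower at midnight, a static scaling solution might schedule some servers to go to sleep at night. That risks downtime on a night when traffic happens to be unusually high, whereas autoscaling reacts to the actual load.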

Finally, autoscaling can better handle unexpected traffic peaks.
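The billing advantage mentioned above comes down to simple arithmetic. The sketch below compares a group sized for its peak all day with one that follows the load. The traffic profile and the hourly rate are made-up numbers for illustration only, and prices are kept in cents to avoid rounding issues.

```python
def static_cost_cents(peak_instances: int, hours: int, rate_cents: int) -> int:
    """Cost of running the peak number of instances for every hour."""
    return peak_instances * hours * rate_cents


def autoscaled_cost_cents(instances_per_hour: list[int], rate_cents: int) -> int:
    """Cost when the instance count follows the load hour by hour."""
    return sum(instances_per_hour) * rate_cents


# Hypothetical day: 8 busy hours at 10 instances, 16 quiet hours at 3.
profile = [10] * 8 + [3] * 16
rate = 10  # assumed price: 10 cents per instance-hour

print(static_cost_cents(max(profile), len(profile), rate))  # 2400
print(autoscaled_cost_cents(profile, rate))                 # 1280
```

With these assumed figures the autoscaled group costs a little over half as much, because the provider bills for instance-hours actually used rather than for peak capacity.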

GCP's autoscaling

Following in the footsteps of Microsoft and AWS, Google Cloud now offers a planned (scheduled) autoscaling option, configured in the Compute Engine product.

Applicable to managed instance groups (MIGs) on Compute Engine, it complements three other forms of autoscaling, based on:

  • CPU usage
  • The serving capacity of an external HTTP(S) load balancer
  • Cloud Monitoring metrics (CPU usage, memory capacity)

The scheduling option involves defining the minimum number of VM instances required, the start time, the duration and an optional recurrence. Beforehand, make sure that the MIGs concerned include at least one other type of scaling rule.
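The four settings above (minimum instances, start time, duration, recurrence) can be modelled in a few lines. This is only a sketch of the concept: the class, its field names and the rule of taking the larger of the schedule's minimum and the other scaling rules' recommendation are assumptions made for illustration, and recurrence is simplified to a set of weekdays with windows of at most 24 hours.

```python
from dataclasses import dataclass
from datetime import datetime, time, timedelta
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class ScalingSchedule:
    """Hypothetical model of one scaling schedule."""
    min_required: int          # minimum number of VM instances while active
    start: time                # start time of each occurrence
    duration: timedelta        # how long each occurrence lasts (<= 24 h here)
    weekdays: FrozenSet[int]   # recurrence: 0 = Monday ... 6 = Sunday

    def is_active(self, now: datetime) -> bool:
        # Check the window that started today and the one that started
        # yesterday, so that windows crossing midnight are handled.
        for days_back in (0, 1):
            day = now.date() - timedelta(days=days_back)
            if day.weekday() not in self.weekdays:
                continue
            window_start = datetime.combine(day, self.start)
            if window_start <= now < window_start + self.duration:
                return True
        return False


def effective_size(signal_recommendation: int,
                   schedules: Iterable[ScalingSchedule],
                   now: datetime) -> int:
    """Largest of the metric-based recommendation and the minimums
    required by the schedules active at `now`."""
    floor = max((s.min_required for s in schedules if s.is_active(now)),
                default=0)
    return max(signal_recommendation, floor)


# Weekday mornings, 08:00 to 10:00, keep at least 10 instances.
morning = ScalingSchedule(10, time(8, 0), timedelta(hours=2),
                          frozenset({0, 1, 2, 3, 4}))
monday_9am = datetime(2024, 1, 8, 9, 0)  # 8 January 2024 is a Monday
print(effective_size(3, [morning], monday_9am))  # 10
```

Outside the window the schedule has no effect and the group falls back to whatever the other scaling rules recommend, which is why the schedule works as a floor and not as a fixed size.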

The approach is comparable to that of AWS on EC2. On AWS, however, a maximum number and a desired number of VMs can be specified in addition to the minimum. Amazon also offers an option that Google Cloud doesn't have: triggering autoscaling once the number of queued messages (on SQS) passes a threshold.
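Queue-based scaling can be sketched the same way: the deeper the queue, the more instances are needed to drain it. The function below is a hypothetical illustration of that idea and does not call any AWS API; the parameter `backlog_per_instance` (how many waiting messages one instance is allowed to have) is an assumed tuning value.

```python
def instances_for_backlog(queued_messages: int, backlog_per_instance: int,
                          min_size: int, max_size: int) -> int:
    """Instance count needed so that each instance has at most
    `backlog_per_instance` messages waiting, within [min_size, max_size]."""
    if backlog_per_instance <= 0:
        raise ValueError("backlog_per_instance must be positive")
    if queued_messages < 0:
        raise ValueError("queued_messages cannot be negative")
    # Integer ceiling division, avoiding floating-point rounding.
    needed = -(-queued_messages // backlog_per_instance)
    return max(min_size, min(max_size, needed))


# 1,500 messages waiting, each instance may have up to 100 queued.
print(instances_for_backlog(1500, 100, min_size=1, max_size=20))  # 15
```

An empty queue brings the group back to its minimum size, and a very deep queue is capped at the maximum, so the bill stays bounded even during a backlog spike.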

Microsoft also offers this capability on Azure, in addition to scheduled scaling and scaling based on metrics collected at runtime.

We are also taking advantage of this new Google Cloud Platform feature to announce the arrival of the GCP connector at Lota.cloud.

If you would like more information on the subject, we advise you to try our platform. It's free and without obligation so there's really no reason not to do so. Go to this link to start your free trial! We will welcome you with a smile in the FinOps arena...



