Auto-scaling on Cloud
Auto-scaling: An Introduction
Auto-scaling is a key feature of the Cloud, whether you are an app or web-based company, a budding startup, or a fully grown enterprise looking to improve your service and the ROI on your existing infrastructure.
It gives you a level of scale that no physical infrastructure can match. The scale ranges from a set minimum number of resources to a flexible maximum that supports your internet-facing requests and demand, limited only by the Cloud's global infrastructure.
Need for Auto-scaling
These days, small businesses can attract a huge load of web requests through SEO and well-targeted ad campaigns. Startups take the Internet by storm with better, more innovative ways of going about our day-to-day lives and become the next big thing in a matter of hours or days. In the traditional IT scenario, demand like this grows without limit and quickly becomes unmanageable, until it reaches the point of no return: a bottleneck in the infrastructure serving this genuine, thought-provoking, fun idea < End-of-Story >.
Auto-scaling is the Cloud's key solution to this growth bubble, without hurting your pocket or robbing the bank. Startups can now build prototypes and refine their product or speciality without needing a procurement strategy for setting up infrastructure. They can simply spin up some instances, i.e. virtualized servers, on any of the Global Cloud Providers (GCP) available in the market.
Why auto-scaling is the Cloud way for applications and huge workloads
With the boom in data generated by all the services we consume as millennials of the digital age, Artificial Intelligence and Big Data create huge data-processing workloads, and this effort consumes the most resources to date. In the traditional IT scenario, such workloads go through a long and painstaking process: analyze small blocks of data, then stitch the outputs together to arrive at a conclusion for a specific attribute of the given dataset. For further analysis, the workload needs to be re-processed in the same manner, on a similar timeline.
Why auto-scaling by metrics and not any other way
All is not roses in Auto-scaling land either. Auto-scaling requires some controls to be successful; otherwise, it could turn out no different from physical on-premises infrastructure. Auto-scaling is a supervised process and can be managed automatically, either with simple functions or with the built-in processes provided by the GCP. Generally, Cloud services share enough metrics with the user on the usage and status of all their features. Simple code functions can use these shared metrics to scale the server infrastructure in and out, or you can let the GCP manage it for you.
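To make the "simple code functions" idea concrete, here is a minimal sketch of a metric-driven scaling decision. The function name, the CPU thresholds and the size limits are all assumptions picked for illustration; a real provider would supply the metric and its own policy settings.

```python
def desired_capacity(current, cpu_percent, *, scale_out_at=70.0,
                     scale_in_at=30.0, min_size=2, max_size=10):
    """Return the instance count to aim for, given average CPU utilization.

    All thresholds and limits here are illustrative assumptions.
    """
    if cpu_percent > scale_out_at:
        target = current + 1   # scale out: add one instance
    elif cpu_percent < scale_in_at:
        target = current - 1   # scale in: remove one instance
    else:
        target = current       # inside the comfortable band, do nothing
    # Never leave the configured minimum/maximum range.
    return max(min_size, min(max_size, target))


print(desired_capacity(4, 85.0))   # busy fleet of 4
print(desired_capacity(2, 10.0))   # idle fleet already at the minimum
```

The gap between the two thresholds is deliberate: if scale-in and scale-out fired at the same value, the fleet would flap back and forth around it.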
Steps to work out before auto-scaling
With increased power comes increased responsibility. All is well when you set up auto-scaling and run your IT workloads on the new infrastructure. The real workout comes only when the load increases and auto-scaling actually kicks in. Global corporations simulate such events very often, as often as daily, to check that their infrastructure is running as desired. Simulating such events is key to ensuring quality of service (QoS) in a digital or media business. When utilization increases, the resources need to scale out horizontally and spin up new instances that can start serving requests as soon as they are ready out-of-the-box (not literally; I mean as soon as they are provisioned).
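A simulation does not have to start with real servers. The sketch below replays a made-up load trace against a toy scale-out policy and reports how many requests go unserved while new instances are still booting. The function name, the tick-based timing and the numbers are assumptions for illustration only, and scale-in is left out to keep it short.

```python
import math


def simulate(load_per_tick, per_instance_capacity, boot_ticks,
             start_instances=1):
    """Replay a load trace; return (tick, ready_instances, unserved) tuples."""
    ready = start_instances
    pending = []   # ticks at which ordered instances become ready
    report = []
    for tick, load in enumerate(load_per_tick):
        # Instances whose boot time has elapsed start serving now.
        ready += sum(1 for t in pending if t <= tick)
        pending = [t for t in pending if t > tick]
        # Scale out horizontally: order whatever is still missing.
        needed = math.ceil(load / per_instance_capacity)
        shortfall = needed - ready - len(pending)
        pending.extend([tick + boot_ticks] * max(0, shortfall))
        # Requests beyond what the ready instances can take are unserved.
        unserved = max(0, load - ready * per_instance_capacity)
        report.append((tick, ready, unserved))
    return report


# A spike from 100 to 300 requests per tick, with a 2-tick boot time.
for row in simulate([100, 300, 300, 300], per_instance_capacity=100,
                    boot_ticks=2):
    print(row)
```

In this trace the two extra instances are ordered at tick 1 but only start serving at tick 3, so ticks 1 and 2 each leave 200 requests unserved. That gap is exactly what a regular simulation is meant to expose.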
The following list is a pretty decent one to start with -
So… What should I auto-scale?
In my opinion, there is no harm in setting up an auto-scaling mechanism around everything in the infrastructure that can be scaled, if you don't want to worry about an outage. If something cannot be auto-scaled, then it is time to drop it from the ecosystem and look for a replacement that auto-scales. Most SaaS (Software as a Service) based support services are scalable, run in your own Cloud or close to it, and are encouraging options in comparison with traditional software.
Here's a simple list but as I said it is "everything" -
How to avoid auto-scaling everything
Yes, auto-scale everything that can be and change anything that cannot, but let's not overdo it to the extent that we drain our finances without proper utilization. This is where an Architect decides the threshold limits for each and every auto-scaled service and arrives at a desired level for each of them.
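One way to keep those per-service limits honest is to write them down and price the worst case. Here is a minimal sketch, assuming hypothetical service names and invented hourly prices; none of these figures come from a real provider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingLimits:
    min_instances: int
    desired_instances: int
    max_instances: int
    hourly_cost_per_instance: float   # invented price, for illustration


def limits_are_consistent(limits):
    """Check that minimum <= desired <= maximum and nothing is negative."""
    return (0 <= limits.min_instances
            <= limits.desired_instances
            <= limits.max_instances)


def worst_case_hourly_cost(limits_by_service):
    """Cost per hour if every service sits at its maximum at once."""
    return sum(l.max_instances * l.hourly_cost_per_instance
               for l in limits_by_service.values())


# Hypothetical services and numbers.
limits = {
    "web": ScalingLimits(2, 3, 10, 0.5),
    "workers": ScalingLimits(1, 2, 4, 0.25),
}
assert all(limits_are_consistent(l) for l in limits.values())
print(worst_case_hourly_cost(limits))
```

If the worst-case figure is more than the budget can take, the maximums are the first place to look.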
How to discern these set thresholds for auto-scaling
Frankly, it differs for every case of auto-scaling, but the ground rule is to set a calculated threshold that does not bring any downtime to the system. Thresholds need to be forgiving of the time a new server takes to set up and start serving requests. Specific cooldown periods need to be enforced so that the mechanism for scaling in and out does not under- or over-provision resources and impact the service.
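Both ideas fit in a few lines. The sketch below is only an illustration: it assumes utilization grows roughly linearly while a new server boots, and the names `scale_out_threshold` and `CooldownGate` are made up for this post rather than taken from any provider's API.

```python
def scale_out_threshold(saturation_percent, growth_percent_per_second,
                        boot_seconds):
    """Utilization at which to trigger scale-out so that a new server is
    ready before saturation, assuming roughly linear growth (an assumption).
    """
    return saturation_percent - growth_percent_per_second * boot_seconds


class CooldownGate:
    """Allow a scaling action only if the cooldown has elapsed."""

    def __init__(self, cooldown_seconds):
        self.cooldown_seconds = cooldown_seconds
        self._last_action_at = None   # no scaling action taken yet

    def try_scale(self, now):
        """Return True and record the action if scaling is allowed at `now`."""
        if (self._last_action_at is not None
                and now - self._last_action_at < self.cooldown_seconds):
            return False   # still cooling down, skip this trigger
        self._last_action_at = now
        return True


# Saturation at 90%, load rising 0.25% per second, 120 s to boot a server.
print(scale_out_threshold(90, 0.25, 120))

gate = CooldownGate(cooldown_seconds=300)
print(gate.try_scale(now=0))     # first action is allowed
print(gate.try_scale(now=100))   # inside the cooldown window
```

Time is passed in as a plain number so the behaviour is easy to test; in practice you would feed it a clock reading such as `time.monotonic()`.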