From the course: Cloud-Based AI Solution Design Patterns

AI-workload autoscaling

- Perhaps the most compelling reason to deploy an AI system in a cloud is the ability to scale its underlying infrastructure so that it can respond effectively to workload fluctuations. This is what the AI-workload autoscaling design pattern is for. It essentially applies the dynamic scalability of cloud infrastructure to optimize resource allocation for AI workloads. Although not explained in this video, it's worth mentioning that orchestration platforms can introduce their own scaling capabilities and can often interact with cloud infrastructure scaling on their own. There are four common ways this pattern can be applied: horizontal autoscaling, vertical autoscaling, scheduled autoscaling, and predictive autoscaling. With horizontal autoscaling, a specialized horizontal scaling mechanism automatically provides the number of virtual server instances needed to handle a given workload. These server instances are created and destroyed…
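
To make the horizontal autoscaling idea concrete, here is a minimal sketch of the kind of decision such a mechanism might make. It is not taken from the course: the function name `desired_instances`, the per-instance capacity figure, and the minimum and maximum bounds are all hypothetical, and a real cloud autoscaler would also handle cooldown periods, health checks, and metric smoothing.

```python
import math


def desired_instances(current_load, capacity_per_instance,
                      min_instances=1, max_instances=10):
    """Return a target instance count for a given workload.

    All parameters are illustrative assumptions:
    - current_load: e.g. inference requests per second right now
    - capacity_per_instance: requests per second one instance can serve
    - min_instances / max_instances: bounds on the size of the fleet
    """
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")

    # Round up so the fleet can always absorb the current load.
    needed = math.ceil(current_load / capacity_per_instance)

    # Clamp to the configured bounds: never scale to zero, never exceed the cap.
    return max(min_instances, min(max_instances, needed))


if __name__ == "__main__":
    # Simulate a fluctuating workload and show the fleet size following it.
    for load in (50, 450, 1200, 80):
        print(load, "req/s ->", desired_instances(load, 100), "instances")
```

In this sketch, instances would be created when the target rises above the current count and destroyed when it falls below it, which is the scale-out and scale-in behavior the transcript describes.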
