GPU Virtualization

GPU virtualization allows a single physical GPU to be partitioned into multiple smaller instances, each acting as an independent virtual GPU (vGPU). Multiple users or workloads can then share one physical GPU while maintaining isolation and performance, which makes more efficient use of GPU resources.

In the Sundara network, vGPUs are powered by Vistara. When an inference request is sent to Sundara, Vistara uses its resource matching algorithm to find the best-matching resource on the GPU network. Vistara's Hypervisor then runs the task in an isolated, secure, virtualized environment using either MIG vGPUs or GPU passthrough.
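The matching step can be illustrated with a small sketch. Vistara's actual algorithm is not described here, so everything below is an assumption made for illustration: the `GpuResource` and `InferenceRequest` types, the `mode` values, and the "smallest resource that fits" rule are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class GpuResource:
    """Hypothetical description of one schedulable resource on the GPU network."""
    resource_id: str
    mode: str            # "mig" for a MIG vGPU, "passthrough" for a whole GPU
    memory_gb: int
    available: bool = True


@dataclass(frozen=True)
class InferenceRequest:
    """Hypothetical requirements attached to an inference request."""
    min_memory_gb: int
    needs_full_gpu: bool = False


def match_resource(request: InferenceRequest,
                   resources: Iterable[GpuResource]) -> Optional[GpuResource]:
    """Return the smallest available resource that satisfies the request.

    This best-fit rule is an assumption, not Vistara's documented behavior.
    """
    candidates = [
        r for r in resources
        if r.available
        and r.memory_gb >= request.min_memory_gb
        and (r.mode == "passthrough" or not request.needs_full_gpu)
    ]
    # Prefer the smallest fit so larger resources stay free for larger requests;
    # the resource_id tie-breaker keeps the result deterministic.
    return min(candidates, key=lambda r: (r.memory_gb, r.resource_id), default=None)


pool = [
    GpuResource("gpu0-slice0", "mig", 5),
    GpuResource("gpu0-slice1", "mig", 10),
    GpuResource("gpu1", "passthrough", 40),
]
chosen = match_resource(InferenceRequest(min_memory_gb=8), pool)
print(chosen.resource_id if chosen else "no match")
```

A best-fit rule is only one possible policy. A real matcher would likely also weigh factors such as location, price, or current load, none of which are modeled in this sketch.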

Benefits of GPU Virtualization:

  • Improved Resource Utilization: MIG vGPUs divide a single physical GPU into multiple smaller instances, so users can allocate GPU resources more precisely based on workload requirements. This improves resource utilization across the Sundara and Vistara network.

  • Scalability and Flexibility: With MIG vGPUs, Vistara can scale GPU resources dynamically to meet changing demand, so users have access to the GPU resources they need when they need them. This also simplifies GPU management.

  • Cost Efficiency: By sharing a single physical GPU among multiple vGPU instances, Vistara reduces hardware costs and improves cost efficiency for users.

  • Isolation and Security: MIG vGPUs provide hardware-level isolation between vGPU instances, ensuring that each user's workload remains secure and isolated from others on the same physical GPU.
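To make the finer-grained allocation concrete, the sketch below picks the smallest MIG profile that fits a workload. It assumes an NVIDIA A100 40GB and its standard MIG profile names. The `smallest_profile` helper is hypothetical, and the sketch ignores NVIDIA's placement rules for combining profiles on one GPU.

```python
from typing import NamedTuple, Optional


class MigProfile(NamedTuple):
    name: str
    compute_slices: int
    memory_gb: int


# MIG profiles for an A100 40GB, ordered smallest first (assumed hardware).
A100_40GB_PROFILES = [
    MigProfile("1g.5gb", 1, 5),
    MigProfile("2g.10gb", 2, 10),
    MigProfile("3g.20gb", 3, 20),
    MigProfile("4g.20gb", 4, 20),
    MigProfile("7g.40gb", 7, 40),
]


def smallest_profile(memory_gb: float, compute_slices: int = 1) -> Optional[MigProfile]:
    """Return the smallest profile meeting both requirements, or None if none fits."""
    for profile in A100_40GB_PROFILES:
        if profile.memory_gb >= memory_gb and profile.compute_slices >= compute_slices:
            return profile
    return None


for need_gb in (4, 12, 48):
    profile = smallest_profile(need_gb)
    print(need_gb, "GB ->", profile.name if profile else "needs a larger GPU")
```

A workload that needs 4 GB fits in a 1g.5gb instance rather than occupying the whole 40 GB GPU, which is where the utilization and cost benefits listed above come from.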

Benefits of using GPU passthrough:

  1. Native Performance: GPU passthrough provides near-native performance by assigning a physical GPU directly to a MicroVM, so the guest accesses the GPU hardware directly instead of going through a virtualized device layer in the hypervisor.

  2. Compatibility: GPU passthrough supports a wide range of GPU hardware and drivers, making it compatible with various GPU vendors and models.

  3. Low Latency: Because the guest talks to the GPU directly, GPU passthrough minimizes latency and virtualization overhead, making it suitable for latency-sensitive applications such as AI inference and real-time rendering.

  4. Full GPU Features: GPU passthrough gives a MicroVM access to the full capabilities and features of the physical GPU, including hardware acceleration and specialized GPU functions.
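On Linux hosts, direct device assignment is commonly done through VFIO, where the GPU is bound to the `vfio-pci` driver before being handed to a guest. Whether Vistara's Hypervisor uses VFIO is an assumption, not something stated here. Under that assumption, the sketch below checks which driver a PCI device is bound to by reading the standard sysfs `driver` symlink. The function names and the example PCI address are hypothetical.

```python
import os
from pathlib import Path
from typing import Optional


def bound_driver(pci_address: str, sysfs_root: str = "/sys") -> Optional[str]:
    """Return the name of the kernel driver bound to a PCI device, or None.

    Reads <sysfs_root>/bus/pci/devices/<address>/driver, which Linux exposes
    as a symlink to the bound driver's directory.
    """
    link = Path(sysfs_root) / "bus" / "pci" / "devices" / pci_address / "driver"
    if not link.is_symlink():
        return None  # device missing, or no driver currently bound
    return os.path.basename(os.readlink(link))


def is_ready_for_passthrough(pci_address: str, sysfs_root: str = "/sys") -> bool:
    """True if the device is bound to vfio-pci (assumes VFIO-based passthrough)."""
    return bound_driver(pci_address, sysfs_root) == "vfio-pci"


if __name__ == "__main__":
    address = "0000:01:00.0"  # hypothetical address; list real ones with `lspci -D`
    print(address, "bound to", bound_driver(address))
```

The `sysfs_root` parameter exists only so the check can be exercised against a fake directory tree without real hardware.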
