NVIDIA and Google Cloud Deliver Powerful New Generative AI Platform, Built on the New L4 GPU and Vertex AI
NVIDIA Inference Platform for Generative AI to Be Integrated Into Google Cloud Vertex AI; Google Cloud First CSP to Make NVIDIA L4 GPU Instances Available
GTC—NVIDIA today announced Google Cloud is integrating the newly launched L4 GPU and Vertex AI to accelerate the work of companies building a rapidly expanding number of generative AI applications.
Google Cloud, with its announcement of G2 virtual machines available in private preview today, is the first cloud services provider to offer NVIDIA’s L4 Tensor Core GPU. Additionally, L4 GPUs will be available with optimized support on Vertex AI, which now supports building, tuning and deploying large generative AI models.
Developers gain access to state-of-the-art technology that helps them get new applications up and running quickly and cost-efficiently. The NVIDIA L4 GPU is a universal GPU for every workload, with enhanced AI video capabilities that can deliver 120x more AI-powered video performance than CPUs, combined with 99% better energy efficiency.
“Surging interest in generative AI is inspiring a wave of companies to turn to cloud-based computing to support their business models,” said Jensen Huang, founder and CEO of NVIDIA. “We are working with Google Cloud to help ensure that the capabilities they require are easily available and able to help fuel the incredible new tools and applications they will create.”
“Generative AI represents a new era of computing — one that demands the speed, scalability and reliability we provide on Google Cloud,” said Amin Vahdat, vice president of Systems & Services Infrastructure at Google Cloud. “As our customers begin to explore the possibilities of generative AI, we’re proud to offer them NVIDIA’s latest L4 GPU innovation as part of our workload-optimized Compute Engine portfolio.”
Helping New Generative AI Applications Come to Life
Google Cloud provides the infrastructure for a wide variety of organizations offering generative AI applications, many of which are designed to help professionals do their work better and faster. Rapid inference is key to successfully running their applications.
Generative AI is also driving a number of new apps that help people connect and have fun. WOMBO, which offers an AI-powered text-to-digital-art app called Dream, has had early access to NVIDIA’s L4 inference platform on Google Cloud.
“WOMBO relies upon the latest AI technology for people to create immersive digital artwork from users’ prompts, letting them create high-quality, realistic art in any style with just an idea,” said Ben-Zion Benkhin, CEO at WOMBO. “NVIDIA’s L4 inference platform will enable us to offer a better, more efficient image-generation experience for users seeking to create and share unique artwork.”
Descript offers AI-powered editing features that let creators remove filler words, add captions and make social-media clips in a few clicks. Creators can also use Descript’s generative-AI voice cloning to fix audio mistakes, or even create entire voiceover tracks, just by typing.
“Descript uses NVIDIA TensorRT to optimize models to accelerate AI inferencing,” said Andrew Mason, CEO of Descript. “It allows users to replace their video backgrounds and enhance their speech to produce studio-quality content, without the studio.”
NVIDIA L4 GPUs are available in private preview on Google Cloud. Apply for access.
Watch Huang discuss the integration of NVIDIA’s inference platform for generative AI into Google Cloud in his GTC keynote.