Building on its recent GTC announcements, which introduced its Base Command software to control AI workloads, Nvidia said it had teamed up with NetApp to launch its cloud-based Base Command Platform, which will provide access to SuperPods from $90,000 a month, starting later in the northern hemisphere summer.
NetApp is providing the flash storage and managing the customers, with Nvidia owning the equipment that is housed in Equinix data centres.
“The intent here is that customers can have access to this powerful supercomputer, the superpod just on a rental basis, they can experience it, they can do their work, and from there they can graduate to either acquiring their own superpod or to go and do AI at scale, for example in the public cloud,” Nvidia head of enterprise computing Manuvir Das said.
“What this does is creates a true hybrid model, where for the customer, it’s the same single interface for submitting their jobs and doing all their AI work. That interface can be used for their own superpod equipment that is on-premises or for infrastructure from instances with GPUs in the cloud, but it’s the same experience either way.”
Das added the smallest footprint to be offered to customers will be three or four DGX A100 machines clustered together, instead of the full 20.
Nvidia said customers using Base Command can deploy AI workloads onto AWS SageMaker, with support for Google Cloud arriving soon.
For customers that cannot afford a SuperPod, Nvidia is opening up the technology inside and allowing system manufacturers to “pull out all the individual pieces that were engineered inside the DGX”.
At Computex on Tuesday, Asus, Dell Technologies, Gigabyte, QCT, and Supermicro announced new systems using BlueField-2 data processing units (DPUs).
“BlueField DPUs shift infrastructure tasks from the CPU to the DPU, making more server CPU cores available to run applications, which increases server and data centre efficiency,” the company said.
“The DPU places a ‘computer in front of the computer’ for each server, delivering separate, secure infrastructure provisioning that is isolated from the server’s application domain. This allows agentless workload isolation, security isolation, storage virtualization, remote management and telemetry on both virtualised and bare-metal servers.”
It is expected that servers using BlueField-2 will appear later this year, with “several” to be Nvidia-certified once the specifications are formalised.
Nvidia added it was expanding its certification to include Arm-based CPUs, with servers set to arrive in 2022.
“More and more now you have a server with a CPU, a GPU, and a DPU. And a lot of the computational work of the workload is done on the GPU, and while the DPU is managing the network, and securing the network and providing security capabilities like firewalls,” Das said.
“For example with VMware’s project Monterey, even the functionality of the hypervisor is moving to the DPU, and so what this means is that the host CPU can be thought of more as an orchestrator that is managing the lifecycle of what is happening on the server in the workloads, rather than as the computer engine.”
The company said it is working with Gigabyte on a developer kit that contains an Arm Neoverse-based Ampere Altra processor, a pair of A100 GPUs and BlueField-2 DPUs, and the Nvidia HPC SDK.
“The idea is that application developers across the planet … we have more than 2.5 million developers who use CUDA to program against GPUs, all of them and in fact system manufacturers can use the dev kit as a blueprint to prepare their applications for Arm-based systems,” Das said.