NVIDIA GPU Cloud Supports Microsoft Azure for AI Developers
NVIDIA now makes GPU-optimised deep learning and HPC software available to data scientists, researchers and developers when using NVIDIA GPU instances on Microsoft Azure, to reduce the complexity of AI software integration and testing.
Building and testing reliable software stacks to run deep learning applications - such as TensorFlow, Microsoft Cognitive Toolkit, PyTorch and NVIDIA TensorRT - is generally challenging and time-consuming. NVIDIA GPU Cloud (NGC) now supports the Azure platform through ready-to-run deep learning containers that give developers access to on-demand GPU computing that scales with their requirements. In effect, these prepared containers give developers, data scientists and researchers a head start on their GPU computing projects.
For example, NVIDIA TensorRT is a combined high-performance inference optimiser and runtime engine. It can take in neural networks trained in various frameworks, optimise the network computation, generate a lightweight runtime engine to deploy to the production environment, and maximise throughput while minimising latency across several different GPU platforms.
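As an illustration of that workflow, the sketch below uses trtexec, the command-line tool bundled with TensorRT, to convert a trained network into an optimised runtime engine. The model filename and the precision flag are assumptions for the example, not details taken from the article.

```shell
# Build an optimised TensorRT engine from a trained network
# (model.onnx is a hypothetical exported model file).
trtexec --onnx=model.onnx \
        --saveEngine=model.plan \
        --fp16   # use reduced precision where the GPU supports it

# The generated model.plan can then be loaded by the lightweight
# TensorRT runtime in the production environment.
```

The point of the optimisation step is that it is done once, offline, against a specific GPU target; the resulting engine is what gets deployed.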
Useful as these capabilities are, they involve dependencies at the operating system level and require specific drivers, libraries and runtimes. Many applications also require specific - and often different - versions of these supporting components.
Furthermore, whenever a framework or application is updated to a new version, the setup work has to be redone and the new version tested to confirm that its performance is the same as, or better than, before. All of this work must be done before a project can even start.
For HPC, the difficulty lies in deploying the latest software to clusters of systems. Developers must find and install the correct dependencies, test performance and so on, in a multi-tenant environment - one in which the software serves multiple customers - and across many systems.
The pre-configured, GPU-accelerated containers available on NGC were developed to make building software stacks simpler. NVIDIA maintains the containers to make sure they can take advantage of up-to-date GPU functionality, and tests, tunes and optimises the complete software stack in the deep learning containers with updates each month to keep performance levels high.
NVIDIA also works with the community and framework developers, and contributes back to open source projects – it made over 800 contributions in 2017. It also works with the developers of the other containers available on NGC to optimise their applications and to test for performance and compatibility.
NGC with Microsoft Azure
Developers can access 35 GPU-accelerated containers for deep learning software, HPC applications, HPC visualisation tools and various partner applications from the NGC container registry, and run them on the following Microsoft Azure instance types with NVIDIA GPUs:
NCv3 (1, 2 or 4 NVIDIA Tesla V100 GPUs)
NCv2 (1, 2 or 4 NVIDIA Tesla P100 GPUs)
ND (1, 2 or 4 NVIDIA Tesla P40 GPUs)
The same NGC containers work across Azure instance types, even with different types or quantities of GPUs.
The NVIDIA GPU Cloud Image for Deep Learning and HPC - a pre-configured Azure virtual machine image with everything needed to run NGC containers - is available on the Microsoft Azure Marketplace. Once a compatible NVIDIA GPU instance has been launched on Azure, the user pulls the desired containers from the NGC registry into the running instance. More information is available in the Using NGC with Microsoft Azure documentation.
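Concretely, once the virtual machine is running, the pull-and-run flow looks something like the sketch below. The container name and tag are illustrative assumptions, and the API key placeholder must be replaced with the user's own NGC key.

```shell
# Authenticate against the NGC container registry
# ('$oauthtoken' is the literal username; the API key comes from the NGC account).
docker login nvcr.io --username '$oauthtoken' --password <your-NGC-API-key>

# Pull a GPU-accelerated framework container (name and tag are illustrative)
docker pull nvcr.io/nvidia/tensorflow:18.04-py3

# Run it with GPU access via the NVIDIA container runtime
docker run --runtime=nvidia -it --rm nvcr.io/nvidia/tensorflow:18.04-py3
```

Because the containers bundle the frameworks with their matched drivers, libraries and runtimes, the same commands work unchanged across the supported instance types.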
As well as using the NVIDIA-published image on the Azure Marketplace to run NGC containers, Azure Batch AI can be used to download and run the containers on Azure NCv2, NCv3 and ND virtual machines. www.nvidia.com