James Ding
Feb 26, 2025 15:38
Microsoft unveils new Phi SLMs, including the multimodal Phi-4, trained on NVIDIA GPUs, enhancing AI capabilities with efficient resource utilization.
Microsoft has announced the latest additions to its Phi family of small language models (SLMs), featuring the new Phi-4-multimodal and Phi-4-mini models, both trained using NVIDIA GPUs. This development marks a significant step in the evolution of language models, focusing on efficiency and versatility, according to NVIDIA.
Advancements in Small Language Models
SLMs have emerged as a practical solution to the challenges posed by large language models (LLMs), which, despite their capabilities, require substantial computational resources. SLMs are designed to operate efficiently within constrained environments, making them suitable for deployment on devices with limited memory and computational power.
Microsoft's new Phi-4-multimodal model is particularly noteworthy for its ability to process multiple types of data, including text, audio, and images. This capability opens up new possibilities for applications such as automated speech recognition, translation, and visual reasoning. The model's training involved 512 NVIDIA A100-80GB GPUs over 21 days, underscoring the extensive computational effort required to achieve its capabilities.
Phi-4-multimodal and Phi-4-mini
The Phi-4-multimodal model has 5.6 billion parameters and has demonstrated strong performance in automated speech recognition, ranking first on the Hugging Face OpenASR leaderboard with a word error rate of 6.14%. This achievement highlights the model's potential for enhancing speech recognition technologies.
Alongside Phi-4-multimodal, Microsoft also released Phi-4-mini, a text-only model optimized for chat applications. With 3.8 billion parameters, Phi-4-mini is designed to handle long-form content efficiently, offering a context window of 128K tokens. Its training involved 1024 NVIDIA A100 80GB GPUs over 14 days, reflecting the model's focus on high-quality educational data and code.
Deployment and Accessibility
Both models are available on Microsoft's Azure AI Foundry, a platform for designing, customizing, and managing AI applications. Users can also explore the models through the NVIDIA API Catalog, which offers a sandbox environment for testing and integrating them into various applications.
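As a rough illustration of the kind of integration the API Catalog enables, the sketch below sends a chat request to an OpenAI-compatible endpoint using the openai Python client. The base URL, model identifier, and environment variable name are assumptions for illustration only; the exact values are listed on each model's catalog page.

```python
# Minimal sketch: querying a Phi model through an OpenAI-compatible endpoint.
# The endpoint URL, model ID, and env var below are assumed, not confirmed.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",   # assumed API Catalog endpoint
    api_key=os.environ["NVIDIA_API_KEY"],             # assumed env var holding your API key
)

response = client.chat.completions.create(
    model="microsoft/phi-4-mini-instruct",            # assumed catalog model identifier
    messages=[
        {"role": "user", "content": "Summarize the benefits of small language models."}
    ],
    max_tokens=256,
    temperature=0.2,
)

print(response.choices[0].message.content)
```

Because the endpoint follows the OpenAI chat-completions convention, the same snippet can typically be pointed at a self-hosted deployment by changing only the base URL and model name.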
NVIDIA's collaboration with Microsoft extends beyond training these models. The partnership includes optimizing software and models like Phi to promote AI transparency and support open-source projects. This collaboration aims to advance AI technology across industries, from healthcare to life sciences.
For more detailed information, visit the NVIDIA blog.
Image source: Shutterstock