Microsoft announces Phi-3-vision small AI model that can read images – The Times of India


Phi-3, Microsoft’s latest small language model, is now available with a new capability. At Microsoft Build 2024, CEO Satya Nadella announced three new AI models in the Phi-3 family that can run conversational AI experiences on devices like smartphones. One of them, Phi-3-vision, can take images as input and return answers.



What is Phi-3-vision AI model

Phi-3-vision is a multimodal model, which means it can take both images and text as input and return text responses. For example, users can ask questions about a pie chart, or pose open-ended questions about specific images, such as the breed of a dog or the name of a flower.
“Phi-3 models are powerful, cost-effective and optimised for personal devices,” the company said. Phi-3-vision has 4.2 billion parameters.
The Phi-3 family of small language models (SLMs) was developed by Microsoft and is now available in Azure, which means developers can experiment with these models in the Azure AI Playground and start building and customising with them in Azure AI Studio.
Along with Phi-3-vision, Nadella also announced the Phi-3-small and Phi-3-medium AI models, which have 7 billion and 14 billion parameters, respectively.
“With Phi models, you can build apps for Android, the web, iOS, Windows and the Edge. They can take advantage of local hardware available and fall back on the cloud when not, simplifying what developers have to do using one AI model,” Nadella added.

OpenAI’s GPT-4o in Azure AI

The development comes a few days after OpenAI, the company in which Microsoft has invested billions, announced GPT-4o, its newest flagship multimodal model that integrates text, image and audio processing. Nadella said the model is now available in Azure AI Studio and as an API.
