Multimodal artificial intelligence

  • Category
    Science & Technology
  • Published
    14th Oct, 2023


A report by The Information revealed that Google's yet-to-be-released multimodal large language model, 'Gemini', was already being tested at a number of companies.

OpenAI is also reportedly working on a new project called 'Gobi', which is expected to be built as a multimodal AI system from scratch, unlike the GPT models.


About multimodal AI:

  • Multimodal AI combines different types of information like text, images, and audio to perform various tasks, such as detecting hateful memes or predicting dialogue lines in videos.
  • Models like OpenAI's DALL·E use this approach to generate images from text prompts, by learning patterns that connect visual data with image descriptions.
  • In the case of audio, OpenAI's Whisper, a speech recognition and translation model, recognizes speech in audio and converts it into plain text.
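
The idea of "finding patterns that connect visual data with image descriptions" can be illustrated with a toy example. The sketch below assumes that images and captions have already been mapped into one shared vector space, and it picks the caption whose vector lies closest to a given image. The file names, captions, and numbers are made up for illustration; real systems learn such vectors from very large datasets, and this is not how any particular model is implemented.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Hypothetical, hand-made vectors standing in for learned embeddings.
image_embeddings = {
    "photo_of_dog.jpg": [0.9, 0.1, 0.0],
    "photo_of_car.jpg": [0.1, 0.9, 0.1],
}
text_embeddings = {
    "a dog playing in a park": [0.8, 0.2, 0.1],
    "a red car on a road": [0.0, 1.0, 0.2],
}

def best_caption(image_name):
    """Return the caption whose vector is most similar to the image's vector."""
    image_vec = image_embeddings[image_name]
    return max(text_embeddings, key=lambda t: cosine(image_vec, text_embeddings[t]))

print(best_caption("photo_of_dog.jpg"))
print(best_caption("photo_of_car.jpg"))
```

Because both kinds of data live in the same space, the same comparison works in either direction: a caption can be matched to images just as an image is matched to captions here.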

Applications of Multimodal AI:

  • Meta introduced an open-source AI system called ImageBind, which combines text, visual data, audio, depth, thermal (temperature), and movement readings.
    • This system hints at the possibility of future AI including more sensory data like touch, smell, and brain signals.
  • Industries like medicine and autonomous driving benefit from multimodal AI.
    • It helps analyze complex datasets in areas like identifying rare genetic variations and processing CT scans.
    • Additionally, speech translation models like Google Translate use multiple modes for efficient translation across different languages.
