Gemini Embedding 2 is natively “multimodal.”
Being natively multimodal means Gemini Embedding 2 can process different types of media in one unified embedding space and capture how they relate to each other across 100+ languages. For example, it doesn’t just see "a video of a dog"; it understands how that video relates to the word "puppy" and the sound of a bark, all at the same time. 🐕 Start building today via the Gemini API and Vertex AI.
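To make the "one unified embedding space" idea concrete, here is a minimal sketch of how items from different media can be compared once they are all vectors in the same space. It does not call the Gemini API: the three-dimensional vectors, the item labels, and the helper names `cosine_similarity` and `rank_by_similarity` are hypothetical, made up purely for illustration. Real embeddings have far more dimensions and would come from the model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings (NOT real model output). In a shared space,
# related items from different media should land close together.
embeddings = {
    "video: dog playing": [0.9, 0.1, 0.2],
    "text: puppy": [0.8, 0.2, 0.1],
    "audio: bark": [0.85, 0.05, 0.3],
    "text: spreadsheet": [0.1, 0.9, 0.1],
}

def rank_by_similarity(query_key, items):
    """Rank every other item by cosine similarity to the query item."""
    query = items[query_key]
    scored = [
        (key, cosine_similarity(query, vec))
        for key, vec in items.items()
        if key != query_key
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

if __name__ == "__main__":
    for key, score in rank_by_similarity("text: puppy", embeddings):
        print(f"{score:.3f}  {key}")
```

With these toy values, the dog video and the bark audio rank well above the unrelated spreadsheet text when queried with "puppy", which is the behavior the shared space is meant to provide.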