Google Launches Gemini Embedding 2, Enabling Unified Multimodal Vector Representation
Author: Site Editor

On March 10, 2026, Google DeepMind announced Gemini Embedding 2, its first natively multimodal embedding model. The model maps text, images, video, audio, and documents into a single shared embedding space, enabling cross-modal semantic understanding across 100 languages. Compared with its predecessor, it expands the text context window to 8,192 tokens, processes up to 6 images at once, handles videos up to 120 seconds long, accepts audio input directly without transcription, and supports PDFs of up to 6 pages. The model is available in preview through the Gemini API and Vertex AI, targeting applications such as Retrieval-Augmented Generation (RAG), semantic search, and sentiment analysis.
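A unified embedding space means a text query and, say, an image or audio clip can be compared directly by vector similarity. The sketch below illustrates that retrieval step with cosine similarity over toy vectors; the hard-coded 4-dimensional vectors and item names are placeholders standing in for real embeddings, which in practice would be returned by the Gemini API.

```python
import math

def cosine_similarity(a, b):
    # Similarity of two embedding vectors: dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_match(query_vec, corpus):
    # corpus: list of (item_id, embedding) pairs.
    # Returns the id of the item whose embedding is closest to the query.
    return max(corpus, key=lambda item: cosine_similarity(query_vec, item[1]))[0]

# Toy 4-dimensional vectors standing in for real model output; in a unified
# space, embeddings from different modalities are directly comparable.
query = [0.9, 0.1, 0.0, 0.1]                  # e.g. a text query embedding
corpus = [
    ("image_cat", [0.88, 0.12, 0.05, 0.1]),   # embedding of an image
    ("audio_rain", [0.1, 0.9, 0.3, 0.0]),     # embedding of an audio clip
    ("pdf_report", [0.0, 0.2, 0.9, 0.4]),     # embedding of a document
]

print(top_match(query, corpus))  # nearest cross-modal item to the query
```

The same nearest-neighbor comparison underlies RAG and semantic search: documents, images, or clips are embedded once, stored in a vector index, and retrieved by similarity to the query embedding at request time.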