AI startup Inception Labs has officially launched Mercury 2, a large reasoning model built on a diffusion architecture. Rather than generating text one token at a time, the model processes multiple text segments in parallel, enabling efficient reasoning. Running on NVIDIA Blackwell GPUs, Mercury 2 achieves an end-to-end latency of just 1.7 seconds, outpacing both Gemini 3 Flash and Claude Haiku 4.5 while matching the generation quality of leading high-speed models.

Mercury 2 is priced at $0.25 per million input tokens and $0.75 per million output tokens, and supports a 128K context window, tool invocation, and JSON output formatting, making it well suited to low-latency applications such as voice assistants and coding tools. Early access is now available.
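
To illustrate how the JSON output formatting might be exercised in practice, here is a minimal sketch using the OpenAI Python SDK against an OpenAI-compatible endpoint. The base URL, the model identifier `mercury-2`, and support for the `response_format` parameter are all assumptions for illustration, not details confirmed in the announcement:

```python
# Minimal sketch of requesting structured JSON output from Mercury 2.
# Assumptions (not confirmed by the announcement): an OpenAI-compatible
# chat-completions endpoint, the model id "mercury-2", and the base URL below.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.inceptionlabs.ai/v1",  # assumed endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="mercury-2",  # assumed model identifier
    response_format={"type": "json_object"},  # JSON output formatting
    messages=[
        {"role": "system", "content": "Reply in JSON with keys 'answer' and 'reason'."},
        {"role": "user", "content": "Is 7919 a prime number?"},
    ],
)

print(response.choices[0].message.content)
```

Under the same assumptions, tool invocation would follow the standard chat-completions pattern via a `tools` parameter, with the model returning structured tool calls for the client to execute.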
