NVIDIA and the University of Maryland Jointly Release Audio Flamingo Next, an Open-Source Long-Audio Understanding Model


1 week ago

Author: Editor

According to Marktechpost, a research team from NVIDIA and the University of Maryland has released Audio Flamingo Next (AF-Next), described as the most capable open-source large audio language model in the Audio Flamingo series and designed for long-audio understanding and complex reasoning. Built on Qwen-2.5-7B, AF-Next accepts audio inputs of up to 30 minutes and has a 128k context window. A technique called 'Temporal Audio Chain-of-Thought' is reported to improve the model's evidence aggregation and its accuracy on long-audio tasks. The open-source release includes three variants: AF-Next-Instruct for general question answering, AF-Next-Think for multi-step reasoning, and AF-Next-Captioner for audio captioning. According to the reported experimental results, the model substantially outperforms open-source models of the same class across 20 benchmarks and surpasses Gemini 2.5 Pro on challenging benchmarks such as MMAU-Pro, results the report presents as evidence of strong generalization and practical value.
