Google Unveils African Speech Dataset WAXAL, Strengthening Africa’s Voice in AI Conversations

1 day ago

Author: Editor

Google has recently released WAXAL, a large speech dataset covering 21 African languages, including Acholi, Hausa, Luganda, and Yoruba. Together, these languages have more than 100 million speakers. Developed in collaboration with African academic and community institutions, the dataset comprises roughly 1,250 hours of automatic speech recognition (ASR) data, along with 180 hours of high-quality text-to-speech (TTS) data. Notably, ownership of the dataset remains with African institutions. The project aims to address the shortage of resources for African languages in mainstream AI systems, support the development of inclusive voice technology, and help preserve digital language heritage. The dataset is now freely available to researchers worldwide under the CC-BY-4.0 license.
