Mozilla, the maker of the Firefox browser, has released the largest dataset of human voices, recorded entirely by volunteers.
The “Common Voice” project aims to create the world’s most diverse voice dataset, optimized for the development of speech technologies.
The San Francisco company wants to enable smaller manufacturers and crowdfunding projects to build their own speech recognition systems without paying license fees.
So far, Internet companies such as Google, Microsoft, IBM, Amazon, and Apple have dominated the market for speech recognition. Another important player is Nuance, whose technology is behind the speech recognition in Apple’s Siri.
According to the company, Mozilla’s dataset covers 18 different languages, including English, French, German, and Mandarin Chinese (Traditional), as well as, for example, Welsh and Kabyle, a Berber language spoken in Algeria. The dataset adds up to almost 1,400 hours of recorded voice data from more than 42,000 contributors.
The data collected by Mozilla is released under the “CC0” license. This is the most permissive variant of the Creative Commons licenses (“No rights reserved”). Project participants can also voluntarily provide metadata such as age, gender, and accent.
According to Mozilla’s blog post, this additional information is stored along with the recordings and can be used to train speech engines even better. Mozilla says it wants to contribute to “a diverse and innovative ecosystem of technologies”. The goal is to bring its own voice-controlled products to market, but also to support researchers and smaller players.