Meta is launching a new program in partnership with UNESCO to collect speech recordings and transcriptions that the company says will support the development of future openly available AI.
The program, the Language Technology Partner Program, is seeking collaborators who can contribute more than 10 hours of speech recordings with transcriptions, large amounts of written text, and sets of translated sentences in “various languages.” According to Meta, partners will work with the company’s AI teams to integrate these languages into AI speech recognition and translation models, which, when finalized, will be open-sourced.
Partners so far include the government of Nunavut, a sparsely populated territory in northern Canada. Some residents of Nunavut speak Inuit languages collectively known as Inuktut.
“Our efforts are especially focused on underserved languages, in support of UNESCO’s work,” Meta wrote in a blog post provided to TechCrunch. “Ultimately, our goal is to create intelligent systems that can understand and respond to complex human needs, regardless of language or cultural background.”
Complementary to the new program, Meta said that it’s releasing an open source machine translation benchmark to evaluate the performance of language translation models. The benchmark, composed of sentences crafted by linguists, supports seven languages, and can be accessed, and contributed to, from the AI development platform Hugging Face.
Meta is framing both initiatives as philanthropic. But the company stands to benefit from improved speech recognition and translation models.
Meta continues to expand the number of languages its AI-powered assistant, Meta AI, supports, and to pilot features such as automatic translation for creators. Last September, Meta announced that it would begin testing a tool to translate voices in Instagram Reels, allowing creators to dub their speech and auto-lip-sync it.
Meta’s treatment of content in languages other than English across its platforms has been the target of much criticism. According to one report, Facebook left almost 70% of Italian- and Spanish-language COVID misinformation unflagged, compared to just 29% of similar English-language misinformation. And leaked documents from the company reveal that Arabic-language posts are regularly flagged erroneously as hate speech.
Meta has said that it’s taking steps to improve its translation and moderation technologies.