"Roblox Breaks Down Language Barriers with a Multilingual Translation Model!" [24/02/06]

Oyaji 2024/2/6 3:36 Last updated: 2024/02/06 3:36

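Roblox has introduced a new AI chat translation feature that lets people who speak different languages communicate with one another.
Roblox's custom multilingual model enables direct translation between any combination of 16 languages.
It makes possible in Roblox's 3D experiences something that is not possible in the physical world: people who speak different languages can communicate smoothly in a virtual world.
The new feature is available in Roblox's in-experience text chat.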

Breaking Down Language Barriers with a Multilingual Translation Model

The post Breaking Down Language Barriers with a Multilingual Translation Model appeared first on Roblox Blog.

February 5, 2024, by Daniel Sturman, Chief Technology Officer
Product & Tech

Imagine discovering that your new Roblox friend, a person you’ve been chatting and joking with in a new experience, is actually in Korea and has been typing in Korean the entire time, while you’ve been typing in English, without either of you noticing. Thanks to our new real-time AI chat translations, we’ve made possible on Roblox something that isn’t even possible in the physical world: enabling people who speak different languages to communicate seamlessly with one another in our immersive 3D experiences. This is possible because of our custom multilingual model, which now enables direct translation between any combination of the 16 languages we currently support (English and 15 others).

In any experience that has enabled our in-experience text chat service, people from different countries can now be understood by people who don’t speak their language. The chat window will automatically show Korean translated into English, or Turkish translated into German, and vice versa, so that each person sees the conversation in their own tongue. These translations are displayed in real time, with latency of 100 milliseconds or less, so the translation happening behind the scenes is nearly invisible. Using AI to automate real-time translations in text chat removes language barriers and brings more people together, no matter where they live in the world.

Building a Unified Translation Model

AI translation is not new; the majority of our in-experience content is already automatically translated. We wanted to go beyond translating static content in experiences. We wanted to automatically translate interactions, and we wanted to do that for all 16 languages we support on the platform.
This was an audacious goal for two reasons. First, we weren’t just translating from one primary language (i.e., English) to another; we wanted a system capable of translating between any combination of the 16 languages we support. Second, it had to be fast: fast enough to support real chat conversations, which to us meant getting latency down to 100 milliseconds or less.

Roblox is home to more than 70 million daily active users all over the world and growing. People are communicating and creating on our platform, each in their native language, 24 hours a day. Manually translating every conversation happening across more than 15 million active experiences, all in real time, is obviously not feasible. Scaling these live translations to millions of people, all having different conversations in different experiences simultaneously, requires an LLM with tremendous speed and accuracy. We need a context-aware model that recognizes Roblox-specific language, including slang and abbreviations (think obby, afk, or lol). Beyond all of that, our model needs to support any combination of the 16 languages Roblox currently supports.

To achieve this, we could have built out a unique model for each language pair (e.g., Japanese and Spanish), but that would have required 16×16, or 256 different models. Instead, we built a unified, transformer-based translation LLM to handle all language pairs in a single model. This is like having multiple translation apps, each specializing in a group of similar languages, all available with a single interface. Given a source sentence and target language, we can activate the relevant “expert” to generate the translations. This architecture allows for better utilization of resources, since each expert has a different specialty, which leads to more efficient training and inference without sacrificing translation quality.

Illustration of the inference process.
Source messages, along with the source language and target languages, are passed through RCC. Before hitting the back end, we first check the cache to see if we already have translations for this request. If not, the request is passed to the back end and to the model server with dynamic batching. We added an embedding cache layer between the encoders and decoders to further improve efficiency when translating into multiple target languages.

This architecture makes it far more efficient to train and maintain our model for a few reasons. First, our model is able to leverage linguistic similarities between languages. When all languages are trained together, languages that are similar, like Spanish and Portuguese, benefit from each other’s input during training, which helps improve the translation quality for both languages. We can also far more easily test and integrate new research and advances in LLMs into our system as they’re released, to benefit from the latest and greatest techniques available.

We see another benefit of this unified model in cases where the source language is not set or is set incorrectly: the model is accurate enough to detect the correct source language and translate into the target language. In fact, even if the input has a mix of languages, the system is still able to detect and translate into the target language. In these cases, the accuracy may not be quite as high, but the final message will be reasonably understandable.

To train this unified model, we began by pretraining on available open source data, as well as our own in-experience translation data, human-labeled chat translation results, and common chat sentences and phrases.

We also built our own translation evaluation metric and model to measure translation quality. Most off-the-shelf translation quality metrics compare the AI translation result to some ground truth or reference translation and focus primarily on the understandability of the translation.
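Returning to the inference flow described above (a translation cache checked first, then an embedding cache between the encoders and decoders), the two cache layers can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Roblox's implementation: `CachedTranslator`, `toy_encode`, and `toy_decode` are hypothetical names, the caches are plain unbounded dictionaries, and dynamic batching is left out entirely.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-ins for the real encoder and decoder networks.
Embedding = Tuple[float, ...]
Encoder = Callable[[str, str], Embedding]  # (text, source_lang) -> embedding
Decoder = Callable[[Embedding, str], str]  # (embedding, target_lang) -> text


class CachedTranslator:
    """Two cache layers: finished translations and encoder outputs."""

    def __init__(self, encode: Encoder, decode: Decoder) -> None:
        self._encode = encode
        self._decode = decode
        # Assumption: unbounded dicts; a real service would bound and evict.
        self._translations: Dict[Tuple[str, str, str], str] = {}
        self._embeddings: Dict[Tuple[str, str], Embedding] = {}
        self.encoder_calls = 0
        self.decoder_calls = 0

    def _embedding_for(self, text: str, source: str) -> Embedding:
        """Encode once per (text, source) and reuse it for every target."""
        key = (text, source)
        if key not in self._embeddings:
            self.encoder_calls += 1
            self._embeddings[key] = self._encode(text, source)
        return self._embeddings[key]

    def translate(self, text: str, source: str, targets: List[str]) -> Dict[str, str]:
        """Translate one message into several target languages."""
        results: Dict[str, str] = {}
        for target in targets:
            key = (text, source, target)
            if key not in self._translations:  # translation-cache miss
                embedding = self._embedding_for(text, source)
                self.decoder_calls += 1
                self._translations[key] = self._decode(embedding, target)
            results[target] = self._translations[key]
        return results


def toy_encode(text: str, source: str) -> Embedding:
    """Toy encoder: character codes stand in for a learned embedding."""
    return tuple(float(ord(ch)) for ch in text)


def toy_decode(embedding: Embedding, target: str) -> str:
    """Toy decoder: tags the text with the target language, no real translation."""
    return f"[{target}] " + "".join(chr(int(x)) for x in embedding)
```

With these toy functions, translating one message into three target languages runs the encoder once and the decoder three times, and repeating the same request touches neither, which is the saving the embedding cache is meant to provide.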
We wanted to assess the quality of the translation without a ground truth translation. We look at this from multiple aspects, including accuracy (whether there are any additions, omissions, or mistranslations), fluency (punctuation, spelling, and grammar), and incorrect references (discrepancies with the rest of the text). We classify these errors into severity levels: Is it a critical, major, or minor error?

In order to assess quality, we built an ML model and trained it on human-labeled error types and scores. We then fine-tuned a multilingual language model to predict word-level errors and types and calculate a score using our multidimensional criteria. This gives us a comprehensive understanding of the quality and types of errors occurring. In this way we can estimate translation quality and detect errors by using source text and machine translations, without requiring a ground truth translation. Using the results of this quality measure, we can further improve the quality of our translation model.

With source text and the machine translation result, we can estimate the quality of the machine translation without a reference translation, using our in-house translation quality estimation model. This model estimates the quality from different aspects and categorizes errors into critical, major, and minor errors.

Less common translation pairs (say, French to Thai) are challenging due to a lack of high-quality data. To address this gap, we applied back translation, where content is translated back into the original language, then compared to the source text for accuracy. During the training process, we used iterative back translation, where we use a strategic mix of this back-translated data and supervised (labeled) data to expand the amount of translation data for the model to learn on.

Illustration of the model training pipeline. Both parallel data and back translation data are used during the model training.
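One round of the back translation step described above might look something like the sketch below. It follows the article's description (translate, translate back, compare with the source) and then mixes the surviving pairs with labeled data. Everything specific here is an assumption made for illustration: the function names, the use of `difflib.SequenceMatcher` as the similarity measure, the 0.8 threshold, and the 50% cap on synthetic data. The iterative part (retraining and repeating) is not shown.

```python
import random
from difflib import SequenceMatcher
from typing import Callable, List, Sequence, Tuple

Translate = Callable[[str], str]
Pair = Tuple[str, str]  # (source sentence, target sentence)


def round_trip_pairs(
    sources: Sequence[str],
    forward: Translate,
    backward: Translate,
    min_similarity: float = 0.8,  # assumed threshold
) -> List[Pair]:
    """Translate source -> target -> source; keep pairs whose round trip stays close."""
    kept: List[Pair] = []
    for src in sources:
        tgt = forward(src)
        restored = backward(tgt)
        # Character-level similarity is a crude stand-in for a real quality check.
        similarity = SequenceMatcher(None, src, restored).ratio()
        if similarity >= min_similarity:
            kept.append((src, tgt))
    return kept


def mix_training_data(
    supervised: Sequence[Pair],
    synthetic: Sequence[Pair],
    synthetic_fraction: float = 0.5,  # assumed share of synthetic pairs in the mix
    seed: int = 0,
) -> List[Pair]:
    """Combine labeled pairs with at most `synthetic_fraction` synthetic pairs."""
    if not 0.0 <= synthetic_fraction < 1.0:
        raise ValueError("synthetic_fraction must be in [0, 1)")
    # Largest synthetic count that keeps synthetic / total <= synthetic_fraction.
    budget = int(len(supervised) * synthetic_fraction / (1.0 - synthetic_fraction))
    rng = random.Random(seed)
    chosen = rng.sample(list(synthetic), min(budget, len(synthetic)))
    mixed = list(supervised) + chosen
    rng.shuffle(mixed)
    return mixed
```

In practice `forward` and `backward` would be calls into the current translation model, and the string comparison would likely be replaced by a learned quality estimate.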
After the teacher model is trained, we apply distillation and other serving optimization techniques to reduce the model size and improve the serving efficiency.

To help the model understand modern slang, we asked human evaluators to translate popular and trending terms for each language, and included those translations in our training data. We will continue to repeat this process regularly to keep the system up to date on the latest slang.

The resulting chat translation model has roughly 1 billion parameters. Running a translation through a model this large is prohibitively resource-intensive to serve at scale and would take much too long for a real-time conversation, where low latency is critical to support more than 5,000 chats per second. So we used this large translation model in a student-teacher approach to build a smaller, lighter-weight model. We applied distillation, quantization, model compilation, and other serving optimizations to reduce the size of the model to fewer than 650 million parameters and improve the serving efficiency.

In addition, we modified the API behind in-experience text chat to send both the original and the translated messages to the person’s device. This enables the recipient to see the message in their native language or quickly switch to see the sender’s original, non-translated message.

Once the final LLM was ready, we implemented a back end to connect with the model servers. This back end is where we apply additional chat translation logic and integrate the system with our usual trust and safety systems. This ensures translated text gets the same level of scrutiny as other text, in order to detect and block words or phrases that violate our policies. Safety and civility are at the forefront of everything we do at Roblox, so this was a very important piece of the puzzle.
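The last two points (a payload that carries both the original and the translated message, and a safety check applied to translated text as well) can be illustrated with a small sketch. This is not the actual Roblox chat API. `ChatPayload`, `build_payload`, and the `translate` and `is_allowed` callables are hypothetical, and dropping the whole message when either text fails the check is an assumption about behavior the article does not spell out.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class ChatPayload:
    """What one recipient's device receives for one chat message (hypothetical shape)."""

    original_text: str
    source_lang: str
    translated_text: Optional[str]  # None when no translation was needed
    target_lang: str

    def display_text(self, show_original: bool = False) -> str:
        """Show the translation by default; let the recipient switch to the original."""
        if show_original or self.translated_text is None:
            return self.original_text
        return self.translated_text


def build_payload(
    text: str,
    source_lang: str,
    target_lang: str,
    translate: Callable[[str, str, str], str],  # hypothetical model-server call
    is_allowed: Callable[[str], bool],          # hypothetical safety check
) -> Optional[ChatPayload]:
    """Return the payload for one recipient, or None if the message is blocked."""
    if not is_allowed(text):
        return None
    if source_lang == target_lang:
        return ChatPayload(text, source_lang, None, target_lang)
    translated = translate(text, source_lang, target_lang)
    # Translated text gets the same scrutiny as the original.
    if not is_allowed(translated):
        return None  # assumption: block rather than fall back to the original
    return ChatPayload(text, source_lang, translated, target_lang)
```

Because the payload holds both strings, switching between the translated and original views is a purely client-side choice (`display_text(show_original=True)`) and needs no extra request.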
Continuously Improving Accuracy

In testing, we’ve seen that this new translation system drives stronger engagement and session quality for the people on our platform. Based on our own metric, our model outperforms commercial translation APIs on Roblox content, indicating that we’ve successfully optimized for how people communicate on Roblox. We’re excited to see how this improves the experience for people on the platform, making it possible for them to play games, shop, collaborate, or just catch up with friends who speak a different language. The ability for people to have seamless, natural conversations in their native languages brings us closer to our goal of connecting a billion people with optimism and civility.

To further improve the accuracy of our translations and to provide our model with better training data, we plan to roll out a tool to allow people on the platform to provide feedback on their translations and help the system improve even faster. This would enable someone to tell us when they see something that’s been mistranslated and even suggest a better translation we can add into the training data to further improve the model.

These translations are available today for all 16 languages we support, but we are far from done. We plan to continue to update our models with the latest translation examples from within our experiences as well as popular chat phrases and the latest slang phrases in every language we support. In addition, this architecture will make it possible to train the model on new languages with relatively low effort, as sufficient training data becomes available for those languages.

Further out, we’re exploring ways to automatically translate everything in multiple dimensions: text on images, textures, 3D models, etc. And we are already exploring exciting new frontiers, including automatic voice chat translations. Imagine a French speaker on Roblox being able to voice chat with someone who only speaks Russian.
Both could speak to and understand one another, right down to the tone, rhythm, and emotion of their voice, in their own language, and at low latency. While this may sound like science fiction today, and it will take some time to achieve, we will continue to push forward on translation. In the not-too-distant future, Roblox will be a place where people from all around the world can seamlessly and effortlessly communicate not just via text chat, but in every possible modality!


Source: https://blog.roblox.com/2024/02/breaking-down-language-barriers-with-a-multilingual-translation-model/


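Check the sales ranking trend

February 5 (previous day)	Game sales: 69th / Overall sales: unranked / Free ranking: 35th
Tuesday, February 6, 2024 (article publication date) *Highest rank of the day	Game sales: 83rd / Overall sales: unranked / Free ranking: 37th
High in the sales ranking *Worth checking for veteran players too!	Sales are strong. A discount sale period may be under way. If the rank is higher than usual, it may be a good time to spend.
High in the free ranking *Recommended for beginners!	The game is drawing attention and has many new or active users. There is also a good chance to reroll, and many such games have just launched or are running a campaign.
Service launch date	December 11, 2012
Time in service	4,406 days (12 years)
Next anniversary	December 11, 2025 (13th anniversary)
Until the anniversary	342 days left
Predicted half anniversary	June 11, 2025 (12.5 years), 159 days left
Operator	Roblox Corporation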

ROBLOX Information


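Roblox is a virtual-space platform for creating and socializing that is free and easy to download. Have fun hanging out with friends in the huge number of virtual spaces built and published by creators around the world.

Millions of virtual spaces to choose from
Roblox's virtual spaces offer millions of ways to have fun. For example:
・Explore official worlds for popular movies and TV shows
・Watch virtual concerts
・Try on apparel from top fashion brands
・Test your courage in horror spaces
・Compete with everyone in esports, or take on fighting games and obstacle courses
・Go sightseeing in cities around the world
・Interact with musicians and celebrities who have become avatars
...and much more.

Connect with everyone
・Runs on most devices, including PC, mobile, Xbox One, and VR headsets.
・Play together with friends even if you are not on the same device.
・Includes text and voice chat and private messaging.
・Use groups to connect with worldwide communities that share your hobbies and favorites.
・Includes Roblox Connect, a communication feature that lets you make calls as your avatar.

Be whoever you want to be
・Customize your avatar and dress it in your own style.
・You can also buy virtual items inside the virtual spaces of brands and companies.
・Build your skills in virtual spaces dedicated to language learning and education.
・Packed with tools for learning programming and design.
・Create items and virtual spaces and make your debut as a creator.

Create your own virtual space: https://www.roblox.com/develop
Support: https://help.roblox.com
Contact: https://corp.roblox.com/contact/
Privacy policy: https://en.help.roblox.com/hc/ja/articles/115004630823
Parents' guide: https://corp.roblox.com/parents/
Terms of use: https://www.roblox.com/info/terms

Note: A network connection is required to join. Roblox works best over Wi-Fi. Some features and content on Roblox may not be available in your country or region or in your language.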


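It's still fun, but I don't like that there are trolls and, occasionally, cheaters in Fling Things and People. Even so, I really, really love Roblox! Please add cute faces to the free category for players who don't pay, like in 2023! (★5)(24/12/29)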

Hate the hackers I mind my business but hackers suddenly disturb me I also hate scammers but love the games and events (★4)(24/12/29)

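Simply the best. I could play this game until I die. There are tons of famous games, and I've never seen anything else with hundreds of millions of fun games. Spending money is almost never a waste. But lately so many people are paying that free players often get bullied, which is a problem. I'm a paying player myself, but I see it happen from time to time. Games like Fling Things and People have terrible discrimination against free players. And Japanese players probably have the worst behavior in the world. I'm not telling you not to go, but the behavior really is bad, so be careful. Still, if you land on a good server, it's a fantastic game. The community is rough, but the game is fun, I guess. Anyway, Roblox is the best. If you're thinking of trying it, definitely do. Here are the games I especially recommend (personal opinion):
・Blox Fruits
・Brookhaven
・Slap Battles
・PLS DONATE (a game where you donate Robux, Roblox's in-game currency)
・DOORS
・ひみつのおるすばん
・The Strongest Battlegrounds
・RIVALS
・Murder Mystery 2
・むありあ放送局
That's about it. By the way, you should visit むありあ放送局. There's a YouTuber called むありあ who often runs Robux giveaways, so join the group and watch the announcements as you play. They also give away Robux on live streams, so keep an eye on the group announcements and wait! (★3)(24/12/29)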

Sometimes I need to use real money to buy something special but I still love Roblox because I can meet new friends and there are many different games (★5)(24/12/29)

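1. Honestly pay-to-win. It's so pay-to-win that it isn't funny. Roblox is fun, but personally I don't find it very fun without paying. I have paid, but I started out as a free player too. Everyone does. So I'd like it to be a bit kinder to free players.
2. People looking for dates. There are far too many. In particular, there are people in Fling Things and People and 海鼠の湯 saying things like "I want a boyfriend." It's unpleasant to see.
3. Indecent behavior. As mentioned in 2, there are men and women behaving indecently in the hot spring. This is also unpleasant to see.
4. "□-times" scams. Roblox has all kinds of games; I'll use 「おねがい寄付して」 (a donation game) as an example. It's a game where you donate and receive donations, and it is full of scams along the lines of "Give me 〇 Robux (Roblox's in-game money) and I'll pay you back □ times as much" 😅. Here is how it works: say person A falls for the scam and donates. Call the scammer who promised to return □ times as much person B. As soon as B receives the donation, B leaves the world. Honestly, isn't that a crime?
5. The moderators are too strict. Far too strict. You get banned for a day over a minor remark. (★4)(24/12/29)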

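There was a Roblox update recently, right? I heard about it from a friend too. It took away chat. One of my friends can no longer chat. The biggest drawback is that private chat no longer works. The reason is that I sometimes join public servers, and without private chat, other people butt into the conversation, so please fix just that.
Good points:
・The games are fun
・There are free skins
That's all. This is my personal rating, so please refrain from questions.
Bad points:
・Kids who hurl abuse
・Wrongful bans
・Getting sent to the server your friend is in when you join (this applies when the server is full)
・Bad community behavior
Things I'd like fixed:
・When a server is full, send me to a different server.
・Permanently ban cheaters when they cheat.
⚠️ Note: these are just things I personally want fixed.
・Games often have glitch exploits (except wall clipping in 揺るぎない魂).
That's all.
Afterword: often, when I try to report someone, they get away. Personally, I think it's pretty good. Just... only that one thing, though... If anything else changes, I'll write another review. Well then, good-bye! (★5)(24/12/29)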


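One thought on "Roblox Breaks Down Language Barriers with a Multilingual Translation Model!"

ROBLOX says:

February 6, 2024, 3:40 AM

This lets people who speak different languages communicate seamlessly within Roblox's 3D experiences. I think it is a highly innovative and wonderful initiative. Removing language barriers should allow more people to take part and interact.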
