AI Technology: Natural Language Processing
This project aims to develop Tower-v3, a suite of general-purpose multilingual language models with context lengths extended up to 128K tokens.
Building on Unbabel's successful Tower model family, we will employ innovative model merging techniques and reinforcement learning approaches to create multilingual models ranging from 2B to 70B parameters.
The project focuses on two key innovations: lightweight context extension through efficient model merging, and enhanced post-training techniques for improved general-purpose capabilities.
This research builds on Unbabel's expertise in model adaptation, demonstrated by Tower-v2's state-of-the-art performance in machine translation, to create models that broaden the range of enterprise AI applications and use cases.
André Martins, Unbabel, Portugal