AI Technology: Natural Language Processing, Generative Language Modeling
The project team proposes a suite of multilingual, continually pretrained dense and Mixture-of-Experts (MoE) models at different size tiers, targeting workload types with different inference compute constraints.
The project comprises two main phases: a continual pretraining phase for dense models, followed by one for MoE models.
The first phase is motivated by:
- (i) the fact that adapting existing pretrained models is considerably less expensive than pretraining a model from scratch; and
- (ii) the opportunity to leverage Unbabel's extensive experience in developing adapted LLMs.
The second phase is motivated by the fact that MoE models rival larger dense models at a much lower training and inference cost.
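To make the cost argument concrete: in an MoE layer only a small number of experts is activated per token, so the total parameter count can grow while the per-token compute stays close to that of a dense layer of the same width. The sketch below illustrates this with a generic top-k routed feed-forward block in PyTorch; the layer sizes, expert count, and top-k value are illustrative assumptions, not details of the models proposed here.

```python
# Minimal sketch of sparse MoE routing (illustrative only; hyperparameters are hypothetical).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoELayer(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, num_experts)  # learned token-to-expert routing
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        ])

    def forward(self, x):  # x: (num_tokens, d_model)
        scores = F.softmax(self.router(x), dim=-1)             # routing probabilities per token
        weights, idx = scores.topk(self.top_k, dim=-1)         # keep only the top-k experts per token
        weights = weights / weights.sum(dim=-1, keepdim=True)  # renormalize the kept weights
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            token_ids, slot = (idx == e).nonzero(as_tuple=True)  # tokens routed to expert e
            if token_ids.numel() == 0:
                continue
            # only these tokens pay the compute cost of expert e
            out[token_ids] += weights[token_ids, slot].unsqueeze(-1) * expert(x[token_ids])
        return out
```

With top_k=2 and 8 experts, each token passes through only a quarter of the feed-forward parameters, which is the source of the training- and inference-cost advantage claimed above.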
The outcomes of this project would boost Unbabel's position in the language technologies space, with a tiered model suite that could be used to deliver high-quality services across a wide range of domains, applications, and languages.
The project will also benefit the research community working on multilingual language applications through the release of a new suite of advanced language models, together with details on how they are produced.
André Martins, Unbabel Lda - Portugal