Beyond Human: Exploring Reinforcement Learning with AI-Only Feedback

Awarded Resources (in node hours): 34,650

System Partition: Meluxina GPU

Allocation Period: 13 November 2023 - 12 November 2024

Large Language Models (LLMs) have changed the way people work. They can perform tasks they were never explicitly trained on, which makes them remarkably versatile.

Yet the key to a great AI assistant is finetuning with Reinforcement Learning from Human Feedback (RLHF). GPT-3 was powering 300 applications one year after its commercial release. That number pales in comparison to the success of ChatGPT, a model finetuned with human feedback for chat applications, which gained one million users within five days of its launch in November 2022.

There is currently no viable open competitor to ChatGPT. RLHF relies on human annotation to teach the model to generate text that matches human preferences, which makes it a slow, finicky, and expensive process. Recently, the research community has started to investigate finetuning with AI-generated data to reduce the dependence on human annotation.

Representative experiments require state-of-the-art HPC infrastructure, and this requirement has limited such studies. This project proposes an extensive investigation into reinforcement learning that uses only AI-generated feedback, harnessing EuroHPC resources.

It will focus on safety and performance in business use cases. Making reinforcement learning work with AI-generated feedback alone would put powerful AI assistant capabilities within everyone's reach and help prevent the concentration of power among a few major players.