
CogniSparseLLM: Cognitively enhanced LLMs with sparse connectivity

Awarded Resources: 50,000 node hours
System Partition: Leonardo BOOSTER
Allocation Period: January 2025 - January 2026

AI Technology: Machine Learning, Natural Language Processing, and Deep Learning.

During brain development, an excess of synapses is initially created and then progressively eliminated through a process known as synaptic pruning.

This process is activity-dependent, shaped by the brain's experiences. While creating an overabundance of synaptic connections only to later remove many of them might appear inefficient, research suggests that the networks formed in this way are remarkably efficient and robust.

Inspired by this biological process, the project team has proposed a neural network architecture that uses long connections instead of traditional short residual connections. When long-connection neural networks (LCNs) are trained with gradient descent, information is naturally "pushed" down to the first few layers, leading to a sparse network.
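As a rough illustration, here is a minimal PyTorch sketch of the idea; the class name, layer widths, and MLP backbone are illustrative assumptions rather than the project's code. Every layer's output travels along a long connection to the end of the network, where the contributions are summed before the final layer, instead of each layer adding a short residual skip over itself.

```python
import torch
import torch.nn as nn


class LongConnectionNet(nn.Module):
    """Every layer sends a long connection to the end of the network, where the
    contributions are summed before the head (in contrast to a residual network,
    where each layer adds a short skip connection over itself)."""

    def __init__(self, dim: int, depth: int, num_classes: int):
        super().__init__()
        # depth d simple feed-forward blocks; any layer type could be substituted
        self.layers = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.ReLU()) for _ in range(depth)
        )
        self.head = nn.Linear(dim, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = x
        total = torch.zeros_like(x)
        for layer in self.layers:
            h = layer(h)
            total = total + h  # long connection: this layer's output reaches the end
        return self.head(total)  # the summed signal enters the final layer


model = LongConnectionNet(dim=64, depth=12, num_classes=10)
logits = model(torch.randn(8, 64))  # class scores for a batch of 8 inputs
```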

Even more surprisingly, this simple architectural modification leads to networks that exhibit behaviors similar to biological brain networks: early over-connectivity followed by later sparsity, enhanced robustness to noise, efficiency in low-data settings, and longer training times.

Specifically, starting from a traditional neural network architecture with initial depth d and k connections, long connections are added from all layers to the last layer, where they are summed. During LCN training, 30-80% of the top layers become effective identity mappings, as all relevant information is concentrated in the bottom layers. Pruning these top layers yields a refined network with reduced depth d′ and final connection count k′, achieving significant efficiency gains without any loss in performance compared to residual baselines.
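A hedged sketch of what that pruning step might look like for the LongConnectionNet layout above: the calibration-based criterion (dropping trailing layers whose long-connection contribution to the summed signal is negligible) and the tolerance value are assumptions standing in for the project's actual effective-identity test.

```python
import torch
import torch.nn as nn


@torch.no_grad()
def contribution_norms(model: nn.Module, calib_x: torch.Tensor) -> list[float]:
    """Mean norm of each layer's long-connection term on a calibration batch.
    Assumes the model exposes its blocks as ``model.layers`` (an nn.ModuleList),
    as in the LongConnectionNet sketch above."""
    h = calib_x
    norms = []
    for layer in model.layers:
        h = layer(h)
        norms.append(h.norm(dim=-1).mean().item())
    return norms


def prune_top_layers(model: nn.Module, calib_x: torch.Tensor, tol: float = 1e-3) -> int:
    """Drop trailing layers whose contribution is negligible; return the new depth d'."""
    norms = contribution_norms(model, calib_x)
    cutoff = tol * max(norms)
    keep = [i for i, n in enumerate(norms) if n > cutoff]
    d_prime = max(keep) + 1 if keep else 0
    model.layers = model.layers[:d_prime]  # depth d -> d'; the top layers are removed
    return d_prime
```

Slicing the nn.ModuleList keeps the bottom d′ blocks and the head unchanged, mirroring the depth reduction from d to d′ described above.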

The project has applied this architecture to a range of classification tasks and shown that, in all experiments, the network converges to using only a subset of the connections defined before training, with the amount of compression depending on task complexity.

This project will explore applications of LCNs to language models, with the aim of training and open-sourcing an LC-enhanced LLM.