2900 - Towards Self-Adapting Language Models
Objectives
This internship aims to reimplement the core components of the SEAL (Self-Adapting Language Models) framework, introduced by Zweiger et al. (2025) in [1], for a medium-sized open-source language model (1–7B parameters). The goal is to enable the model to generate, apply, and learn from its own synthetic finetuning data in a self-directed and reinforcement-learning-driven loop.
The successful implementation will allow the model to incorporate new knowledge or adapt to novel tasks without relying on external supervision or manual dataset curation.
[1] Zweiger et al., "Self-Adapting Language Models", 2025. https://arxiv.org/abs/2506.10943
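The loop to be reimplemented can be summarised with the structural sketch below. This is only a plain-Python outline under assumptions drawn from a reading of [1] (where the outer RL step is a ReST-EM-style filtered behaviour cloning): every function (generate_self_edit, apply_self_edit, evaluate, rl_update), the SelfEdit fields, and the toy model object are placeholders assumed for illustration, not the paper's actual API. A real implementation would plug in a 1–7B open-source model, a lightweight finetuning method such as LoRA, and task-specific evaluation.

```python
import random
from dataclasses import dataclass, field


@dataclass
class SelfEdit:
    """A self-edit: synthetic finetuning data plus optional update directives."""
    synthetic_examples: list[str]                     # data the model writes for itself
    directives: dict = field(default_factory=dict)    # e.g. {"lr": 1e-4, "epochs": 3}


def generate_self_edit(model, context) -> SelfEdit:
    # Placeholder: in practice, prompt the LM to write finetuning data for `context`.
    return SelfEdit(synthetic_examples=[f"statement derived from: {context}"],
                    directives={"lr": 1e-4, "epochs": 3})


def apply_self_edit(model, edit: SelfEdit):
    # Placeholder: in practice, a lightweight SFT / LoRA update on edit.synthetic_examples.
    return dict(model, inner_updates=model.get("inner_updates", 0) + 1)


def evaluate(model, task) -> float:
    # Placeholder: in practice, accuracy on held-out questions about the new knowledge/task.
    return random.random()


def rl_update(model, accepted):
    # Placeholder: in practice, supervised finetuning of the edit-generation policy on the
    # (context -> self-edit) pairs that earned a positive reward (filtered behaviour cloning).
    return dict(model, policy_steps=model.get("policy_steps", 0) + len(accepted))


def seal_outer_loop(model, contexts, tasks, n_iters=2, n_samples=4):
    for _ in range(n_iters):
        accepted = []
        for context, task in zip(contexts, tasks):
            baseline = evaluate(model, task)
            for _ in range(n_samples):                      # sample candidate self-edits
                edit = generate_self_edit(model, context)
                adapted = apply_self_edit(model, edit)      # temporary inner-loop update
                reward = evaluate(adapted, task) - baseline
                if reward > 0:                              # keep only edits that helped
                    accepted.append((context, edit))
        model = rl_update(model, accepted)                  # outer-loop RL step
    return model


if __name__ == "__main__":
    toy_model = {"name": "toy-1B"}
    print(seal_outer_loop(toy_model, contexts=["a new passage"], tasks=["held-out QA"]))
```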
Large Language Models (LLMs) typically require significant manual intervention and curated data to adapt after pretraining. SEAL proposes a promising alternative: using the model’s own generative capabilities to define its learning process through self-edits, i.e. synthetic finetuning data and update directives that the model produces for itself.
This approach has implications for continual learning, low-resource adaptation, and autonomous agents. However, no open-source implementation currently exists. Reproducing SEAL for a 1–7B model would validate its practicality and pave the way for more adaptable language models.
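To make the notion of a self-edit more concrete, the snippet below shows one way the edit-generation prompt could be phrased in the knowledge-incorporation setting; the wording and names are illustrative assumptions, not the prompt used in [1].

```python
# Hypothetical prompt for eliciting a self-edit; the exact wording is an
# assumption for illustration, not taken from the SEAL paper.
SELF_EDIT_PROMPT = """You will be given a new passage.
1. Rewrite its content as standalone statements (implications) that can serve
   as finetuning data.
2. Propose update directives (e.g. learning rate, number of epochs).

Passage:
{passage}
"""


def build_self_edit_request(passage: str) -> str:
    """Fill the template; the model's reply would then be parsed into a self-edit."""
    return SELF_EDIT_PROMPT.format(passage=passage)
```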
Note: this internship primarily targets excellent candidates at the Master 2 level, but all strong applications will be considered (M1 and PhD students).