Artificial intelligence (AI) has advanced significantly, with models like ChatGPT impressing us with their ability to generate coherent text and perform complex tasks. However, a fundamental challenge remains: once these AI models are trained, they cannot easily adapt to new information. To incorporate new data, they typically have to be retrained from scratch, a process that is both costly and resource-intensive.
This limitation arises from the way AI models, particularly those based on neural networks, are structured. Neural networks loosely mimic the way human brains function, using connected computational units called artificial neurons. During training, the strengths of the connections between these neurons are adjusted until the network recognizes certain patterns. After this training phase, the model can process new inputs, such as responding to prompts in ChatGPT. However, the connection strengths are frozen once training is complete, so the model cannot learn from new data without a full retraining.
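This train-then-freeze life cycle can be pictured with a short sketch. The following Python (PyTorch) snippet uses a tiny toy network and random data as illustrative stand-ins, not the large models discussed in this article:

```python
# A minimal sketch of the train-then-freeze life cycle described above.
# The tiny network and random data are illustrative assumptions.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

# Training phase: connection weights are adjusted to fit the data.
x, y = torch.randn(64, 4), torch.randint(0, 2, (64,))
for _ in range(100):
    optimizer.zero_grad()
    loss_fn(model(x), y).backward()
    optimizer.step()

# Deployment phase: the weights are frozen; the model answers new
# inputs but no longer updates itself.
model.eval()
with torch.no_grad():
    prediction = model(torch.randn(1, 4)).argmax(dim=1)
```

Everything after `model.eval()` happens with learning switched off, which is exactly the limitation the researchers below set out to overcome.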
The question of whether AI models can be designed to learn continuously has significant implications. If these models could update themselves with new information without starting from scratch each time, it could save substantial time and money.
Researchers, including Shibhansh Dohare from the University of Alberta, have been investigating this issue. Their findings reveal that after a certain number of retraining cycles, many neurons in AI models become permanently inactive, a state often referred to as “dead”: such a neuron no longer responds to any input, so it stops contributing to learning. As a result, the model loses its ability to learn from new data, which is a significant hurdle for developing AI systems that can adapt over time.
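For a concrete picture of what “dead” means, one common diagnostic is to probe a network with a batch of inputs and count the hidden units that never activate. The sketch below checks a ReLU layer this way; it is an illustration of the concept, not the paper’s exact measurement procedure:

```python
# A hedged sketch of detecting "dead" ReLU neurons: units whose output
# is zero for every probe input, so gradients no longer flow through them.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))

activations = {}
def record(name):
    def hook(module, inputs, output):
        activations[name] = output.detach()
    return hook

model[1].register_forward_hook(record("relu"))

with torch.no_grad():
    model(torch.randn(256, 4))  # probe with a batch of random inputs

# A unit is "dead" if it never activates on any probe input.
dead = (activations["relu"] == 0).all(dim=0)
print(f"{int(dead.sum())} of {dead.numel()} hidden units are dead")
```

In a freshly trained network this count is usually near zero; the finding described above is that it climbs as retraining cycles accumulate.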
In their experiments, Dohare’s team trained AI systems on a series of images. Instead of a one-time training process, they retrained the model after each new pair of images to assess its ability to keep learning. After several thousand of these cycles, however, the model’s performance declined sharply, with many of its neurons becoming inactive.
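The shape of this protocol can be expressed as a short loop: the same network is retrained on each new task in a stream, and its accuracy is logged to see whether the ability to learn degrades. This sketch uses synthetic stand-in tasks rather than the team’s image data, and the hyperparameters are assumptions for illustration:

```python
# A sketch of the continual-retraining protocol described above,
# with synthetic binary tasks standing in for pairs of image classes.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)
loss_fn = nn.CrossEntropyLoss()

def make_task(seed):
    # Each "task" is a fresh random binary classification problem.
    g = torch.Generator().manual_seed(seed)
    x = torch.randn(128, 16, generator=g)
    y = (x @ torch.randn(16, generator=g) > 0).long()
    return x, y

accuracy_log = []
for task in range(1000):               # thousands of retraining cycles
    x, y = make_task(task)
    for _ in range(50):                # retrain on the new task
        optimizer.zero_grad()
        loss_fn(model(x), y).backward()
        optimizer.step()
    with torch.no_grad():
        acc = (model(x).argmax(dim=1) == y).float().mean()
    accuracy_log.append(acc.item())
# In experiments like Dohare's, this logged accuracy eventually
# drops sharply as units in the network go dead.
```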
To address this, Dohare and his team proposed an algorithm that reactivates inactive neurons during the learning process. With this fix, the AI models regained some of their ability to learn from new data. While the approach is promising, it still needs to be tested on larger, more complex models to determine its effectiveness.
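The core idea of reactivation can be sketched in a few lines: find units that have gone dead and reinitialize their weights so they can respond again. The helper below, `revive_dead_units`, is a hypothetical simplification for illustration, not the team’s published algorithm:

```python
# A simplified sketch of the reactivation idea, not the team's exact method:
# reset hidden units that never activate so they can learn again.
import torch
import torch.nn as nn

def revive_dead_units(linear_in, linear_out, probe_x):
    """Reset dead units in a Linear -> ReLU -> Linear block.

    linear_in / linear_out are the layers before and after the ReLU;
    probe_x is a batch of inputs used to test which units fire.
    """
    with torch.no_grad():
        hidden = torch.relu(linear_in(probe_x))
        dead = (hidden == 0).all(dim=0)      # never fires on any probe
        if dead.any():
            # Fresh random incoming weights let the unit respond again;
            # zeroed outgoing weights keep the reset from disturbing the
            # network's current outputs.
            fresh = torch.empty_like(linear_in.weight)
            nn.init.kaiming_uniform_(fresh)
            linear_in.weight[dead] = fresh[dead]
            linear_in.bias[dead] = 0.0
            linear_out.weight[:, dead] = 0.0

# Usage: call periodically during the continual-training loop, e.g.
# revive_dead_units(model[0], model[2], x) after each task.
```

Zeroing the outgoing weights is one way to reset a unit without disrupting what the network has already learned; the revived unit then re-enters training gradually.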
Mark van der Wilk from the University of Oxford emphasized the importance of finding a solution to this challenge, noting that it could significantly reduce the costs of training AI models. If successful, this approach could lead to more flexible and cost-effective AI systems that can continuously update themselves as new information becomes available.
The research is still in its early stages, but it represents an important step toward making AI systems more adaptable and efficient. As the AI field continues to evolve, addressing the challenge of continuous learning will be crucial for the development of smarter, more responsive technologies.