Publication Overview
This doctoral dissertation, completed at the University of Tennessee, Knoxville in August 2013, presents a comprehensive investigation into one of the most challenging problems in neural network research: catastrophic interference (also known as catastrophic forgetting). The work introduces the Fixed Expansion Layer (FEL), a novel architectural approach that significantly mitigates the loss of previously learned information when neural networks are trained on new data.
The Challenge: Catastrophic Interference
Catastrophic interference refers to the tendency of neural networks to abruptly lose previously learned mappings when trained on new information. This phenomenon stands in stark contrast to biological learning systems, which demonstrate remarkable ability to retain long-term memories while continuously acquiring new knowledge. Traditional neural networks, particularly multilayer perceptrons (MLPs), are highly susceptible to this interference, severely limiting their applicability to real-world scenarios involving:
- Nonstationary data streams
- Incremental learning tasks
- Lifelong learning applications
- Sequential pattern learning
The Solution: Fixed Expansion Layer Architecture
The dissertation introduces the Fixed Expansion Layer (FEL) neural network, which inserts a sparsely activating hidden layer between the final hidden layer and the output layer of a standard MLP. Key innovations include:
- Sparse Representation: The FEL uses fixed (non-trainable) weights to transform dense hidden-layer activations into sparse representations, creating a form of implicit regularization that protects prior learning.
- Theoretical Foundation: The work provides rigorous mathematical analysis of how sparse coding principles can be leveraged to reduce interference between learned representations.
- Ensemble Learning Framework: Beyond individual networks, the dissertation explores how ensembles of FEL networks can be trained using information-theoretic diversity measures (Jensen-Shannon divergence) to further enhance robustness against forgetting.
- Extension to Recurrent Networks: The research extends FEL principles to recurrent neural networks (RNNs), demonstrating effectiveness in sequential and temporal learning tasks.
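The core mechanism can be sketched in a few lines: a fixed (never-trained) weight matrix expands the dense hidden activation into a much wider space, and only the strongest units are kept active, so different inputs tend to drive mostly disjoint sets of trainable output weights. This is a minimal illustrative sketch, not the dissertation's implementation; the layer sizes, sparsity level, and random initialization below are assumptions chosen for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions, chosen for illustration only.
HIDDEN_DIM = 64      # width of the MLP's final trainable hidden layer
EXPANSION_DIM = 512  # width of the fixed expansion layer (much wider)
K_ACTIVE = 16        # number of expansion units allowed to stay active

# Fixed, non-trainable expansion weights: drawn once, never updated by backprop.
W_fixed = rng.standard_normal((HIDDEN_DIM, EXPANSION_DIM))

def fixed_expansion_layer(h):
    """Map a dense hidden activation h to a sparse code.

    Only the K_ACTIVE largest expansion activations survive; the rest
    are zeroed, so each input touches only a small, largely input-specific
    subset of the trainable output weights that follow this layer.
    """
    z = h @ W_fixed
    # Threshold at the K-th largest value and zero everything below it.
    threshold = np.partition(z, -K_ACTIVE)[-K_ACTIVE]
    return np.where(z >= threshold, z, 0.0)

h = rng.standard_normal(HIDDEN_DIM)   # stand-in for a dense hidden activation
code = fixed_expansion_layer(h)
print(np.count_nonzero(code))         # → 16 (exactly K_ACTIVE units active)
```

Because the sparse codes for dissimilar inputs overlap in few positions, a gradient update for a new pattern perturbs few of the weights that encode old patterns, which is the intuition behind the interference reduction.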
Research Contributions
This dissertation makes several significant contributions to the field of neural network research:
- Novel Architecture: Introduction of the FEL as a practical solution to catastrophic forgetting
- Theoretical Analysis: Mathematical framework for understanding why sparse representations mitigate interference
- Empirical Validation: Comprehensive experiments on benchmark tasks including MNIST classification under nonstationary conditions
- Ensemble Methods: New approaches to training diverse ensembles that resist forgetting
- Recurrent Extensions: Application of FEL principles to sequential learning in RNNs
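The ensemble contribution relies on the Jensen-Shannon divergence (JSD) to quantify how differently two ensemble members distribute their predictions. The sketch below is a standard base-2 JSD implementation for discrete distributions, shown only to make the diversity measure concrete; how the dissertation incorporates it into ensemble training is not reproduced here.

```python
import numpy as np

def jensen_shannon_divergence(p, q):
    """Base-2 Jensen-Shannon divergence between two discrete
    probability distributions; symmetric and bounded in [0, 1]."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    m = 0.5 * (p + q)  # mixture distribution

    def kl(a, b):
        # KL divergence in bits, skipping zero-probability terms.
        mask = a > 0
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Identical predictive distributions → zero divergence (no diversity).
uniform = np.array([0.25, 0.25, 0.25, 0.25])
print(jensen_shannon_divergence(uniform, uniform))  # → 0.0

# Disjoint distributions → maximal divergence of 1 bit.
p = np.array([1.0, 0.0, 0.0, 0.0])
q = np.array([0.0, 1.0, 0.0, 0.0])
print(jensen_shannon_divergence(p, q))  # → 1.0
```

A training procedure that rewards higher pairwise JSD pushes ensemble members toward complementary predictive behavior, so a new task that disrupts one member's mapping is less likely to disrupt all of them at once.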
Impact and Applications
The research presented in this dissertation has direct implications for:
- Continual Learning Systems: Enabling AI systems that can learn throughout their operational lifetime without catastrophic loss of prior knowledge
- Adaptive Pattern Recognition: Building classifiers that can adapt to changing data distributions
- Temporal Sequence Learning: Improving RNN performance on tasks requiring long-term memory retention
- Real-World Machine Learning: Addressing the gap between idealized batch learning scenarios and practical incremental learning requirements
Academic Context
- Degree: Doctor of Philosophy in Computer Engineering
- Institution: University of Tennessee, Knoxville
- Date: August 2013
- Major Professor: Dr. Itamar Arel
- Committee: Dr. James Plank, Dr. Michael Langston, Dr. Kwai Wong
This research was supported by the Intelligence Advanced Research Projects Activity (IARPA) via the Army Research Office and the National Science Foundation.
Citation: R. A. Coop, “Mitigation of Catastrophic Interference in Neural Networks and Ensembles using a Fixed Expansion Layer,” Ph.D. Dissertation, University of Tennessee, Knoxville, 2013.