Exploiting the Availability-Continuity Trade-off in Imperfect Retraining of Machine Learning Systems
| Published in: | Proceedings - International Symposium on Software Reliability Engineering pp. 406 - 417 |
|---|---|
| Main Authors: | , |
| Format: | Conference Proceeding |
| Language: | English |
| Published: | IEEE 21.10.2025 |
| Subjects: | |
| ISSN: | 2332-6549 |
| Summary: | Machine Learning Systems (MLSs) often combine diverse models to achieve complex objectives but face performance degradation due to dataset shifts. Regular performance monitoring and model retraining are essential to mitigate this risk. However, model retraining may not always fully restore the system's performance, which is known as the imperfect retraining problem. This study examines model retraining policies to maintain MLS performance in the face of imperfect retraining. First, we demonstrate real-world applications that encounter imperfect retraining in computer vision and natural language processing tasks. Next, we theoretically analyze two retraining policies, progressive and conservative, to counteract performance degradation. We formulate the dynamics of model degradation and retraining using semi-Markov processes and quantitatively evaluate service availability and continuity, which measures how long the service can maintain its performance. The numerical analysis results demystify the notable trade-off between service availability and continuity, guiding a proposed retraining strategy to better sustain MLS performance. |
|---|---|
| DOI: | 10.1109/ISSRE66568.2025.00048 |