Advancing Production Systems with Online Reinforcement Learning: Real-time Monitoring, Control, and Optimization
Bakhtiyar Doskenov*
Oregon State University, Oregon, USA.
Olanrewaju Okuyelu
California State University, Northridge, California, USA.
*Author to whom correspondence should be addressed.
Abstract
Modern production systems are increasingly complex and variable, requiring adaptive and intelligent solutions for real-time monitoring and control. Traditional methods, such as linear models and static optimization, often fall short of the dynamic, high-dimensional demands of industrial environments. Online reinforcement learning (RL) offers a compelling alternative: it lets a system keep learning and improving its decisions through real-time interaction with its environment. This review surveys current advances in online RL, focusing on its applications in predictive maintenance, dynamic scheduling, and process optimization. Key methodologies, including deep RL, policy-based approaches, and hybrid frameworks, are examined for their ability to improve scalability, adaptability, and efficiency in Industry 4.0 ecosystems. While online RL holds great promise, computational demands, algorithmic instability, and limited real-world validation remain significant barriers to its widespread adoption. The lack of standardized benchmarks further hinders the evaluation and comparison of RL solutions across industrial contexts. The reviewed work indicates that RL can reduce operational costs by optimizing resource utilization, minimizing downtime through predictive interventions, and streamlining production workflows. Together, these improvements raise productivity and support more agile and efficient industrial processes. To address existing gaps, this paper synthesizes recent advances, identifies unresolved challenges, and outlines critical research pathways, including the development of more efficient algorithms, the integration of domain knowledge for improved stability, and the deployment of multi-agent RL systems for distributed manufacturing networks. By providing actionable insights, this review highlights the potential of online RL to create intelligent, autonomous, and resilient production systems that deliver tangible cost and efficiency benefits in real-world applications.
Keywords: Online reinforcement learning, real-time monitoring, production systems, dynamic scheduling, predictive maintenance, process optimization, Industry 4.0