
Deep reinforcement learning-based approach for control of Two Input–Two Output process control system

Open Access | Jul 2025

Figures & Tables

Figure 1:

Overall structure of the MIMO control system. MIMO, multiple input–multiple output.

Figure 2:

TITO system with controller. TITO, two input–two output.

Figure 3:

Simple flow chart of TITO control system using DDPG. DDPG, deep deterministic policy gradient; TITO, two input–two output.

Figure 4:

Critic network design for DDPG for TITO system. DDPG, deep deterministic policy gradient; TITO, two input–two output.

Figure 5:

Actor network design for DDPG for TITO system. DDPG, deep deterministic policy gradient; TITO, two input–two output.

Figure 6:

Simulink model for TITO system. TITO, two input–two output.

Figure 7:

Reward function representation using DDPG for TITO system. DDPG, deep deterministic policy gradient; TITO, two input–two output.

Figure 8:

Training performance of the DDPG agent for TITO for set point tracking. DDPG, deep deterministic policy gradient; TITO, two input–two output.

Figure 9:

Reward function progression of the DDPG agent for TITO system for set point tracking. DDPG, deep deterministic policy gradient; TITO, two input–two output.

Figure 10:

Simulation of transfer function on Loop 1.

Figure 11:

MV values for Loop 1. MV, manipulated variable.

Figure 12:

Simulation of transfer function on Loop 2.

Figure 13:

MV values for Loop 2. MV, manipulated variable.

Figure 14:

Comparison of proposed method on Loop 1 with traditional methods. DDPG, deep deterministic policy gradient.

Figure 15:

Comparison of proposed method on Loop 2 with traditional methods. DDPG, deep deterministic policy gradient.

Figure 16:

Response to disturbance on Loop 1. DDPG, deep deterministic policy gradient.

Figure 17:

Response to disturbance on Loop 2. DDPG, deep deterministic policy gradient.

Parameters for configuration of DDPG agent

Parameter | Description | Value
Discount factor (γ) | Future reward discounting | 0.99
Target smooth factor (τ) | Target network update rate | 0.001
Actor learning rate | Learning rate for actor updates | 0.0001
Critic learning rate | Learning rate for critic updates | 0.001
Mini-batch size | Sample size for experience replay | 64
Experience buffer length | Total memory for experience replay | 1,000,000
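
These settings correspond to the standard DDPG update rules. As an illustration only, the Python sketch below (hypothetical names, not the MATLAB/Simulink implementation used in the study) shows where each hyperparameter enters: γ discounts the bootstrapped critic target, τ sets the Polyak-averaging rate of the target networks, and mini-batches are drawn from the experience replay buffer.

```python
import random
from collections import deque

# Hypothetical constants, matching the table above.
GAMMA = 0.99            # discount factor for future rewards
TAU = 0.001             # target-network smoothing factor
ACTOR_LR = 1e-4         # would be passed to the actor's optimizer
CRITIC_LR = 1e-3        # would be passed to the critic's optimizer
BATCH_SIZE = 64         # mini-batch size for experience replay
BUFFER_LEN = 1_000_000  # experience buffer length

replay_buffer = deque(maxlen=BUFFER_LEN)  # experience replay memory

def td_target(reward, q_next, done):
    """Bootstrapped critic target: r + gamma * Q'(s', a') for non-terminal steps."""
    return reward + GAMMA * q_next * (1.0 - done)

def soft_update(target_params, online_params):
    """Polyak averaging of target networks: theta_target <- tau*theta + (1 - tau)*theta_target."""
    return [TAU * p + (1.0 - TAU) * tp for p, tp in zip(online_params, target_params)]

def sample_minibatch():
    """Uniformly sample a mini-batch of stored transitions from the replay buffer."""
    return random.sample(replay_buffer, k=min(BATCH_SIZE, len(replay_buffer)))
```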

Analogy of the traditional system with DRL principles

DRL component | Traditional control equivalent | Description
Agent | Controller | Decides the actions to control the system.
Environment | Plant/process | The system being controlled.
State | System measurements | Information about the system's current status.
Action | Control input | Adjustments made to influence the process.
Reward | Error feedback | Guides the agent to improve performance.
Policy | Control law | Strategy linking states to optimal actions.
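
To make the mapping concrete, the sketch below casts one sampling instant of the closed loop in DRL terms. It is a minimal Python illustration with hypothetical names (plant, policy, control_step), not code from the paper: the plant plays the environment, measured outputs and tracking errors form the state, the two manipulated variables are the action, and a negative squared tracking error is used as an example reward.

```python
import numpy as np

def reward(errors):
    """Error feedback as reward: penalise squared tracking error on both loops (illustrative choice)."""
    return -float(np.sum(np.square(errors)))

def policy(state, weights):
    """Stand-in for the actor/control law: maps the measured state to the two manipulated variables."""
    return np.tanh(weights @ state)  # bounded control inputs, like a saturated controller

def control_step(plant, state, setpoints, weights):
    """One agent-environment interaction, i.e. one sampling instant of the closed loop."""
    action = policy(state, weights)      # controller output (MV1, MV2)
    outputs = plant(action)              # plant responds to the control inputs
    errors = setpoints - outputs         # deviation from the set points
    next_state = np.concatenate([outputs, errors])
    return next_state, reward(errors), action

# Example with a trivial static "plant" whose outputs scale the control inputs.
state = np.zeros(4)
weights = np.zeros((2, 4))
print(control_step(lambda u: 0.5 * u, state, np.array([1.0, 0.5]), weights))
```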

Performance indices of Loop 2

Method | ISE | IAE | ITSE | ITAE | Overshoot (%) | Settling time | Steady-state error
DDPG | 137.7 | 79.13 | 3.217e+04 | 1.707e+04 | 0 | 42 | 0
NDT [PI] | 122.1 | 82.69 | 4.434e+04 | 2.515e+04 | 60 | 110 | 0
Mvall [PI] | 510.3 | 275.3 | 2.228e+05 | 1.305e+04 | 0 | 380 | 0
Wang et al. [PID] | 81.82 | 61.27 | 2.947e+04 | 1.856e+04 | 30 | 85 | 0

Performance indices of Loop 1

Method | ISE | IAE | ITSE | ITAE | Overshoot (%) | Settling time | Steady-state error
DDPG | 18.31 | 29.92 | 722.93 | 3253 | 5 | 48 | 0
NDT [PI] | 26.82 | 39.96 | 631 | 1.032e+04 | 25 | 100 | 0
Mvall [PI] | 34.61 | 47.25 | 488.3 | 1880 | 0 | 150 | 0
Wang et al. [PID] | 16.26 | 24.82 | 3206 | 6517 | 20 | 53 | 0
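
The indices reported in the two tables above follow the standard integral error criteria. A minimal Python sketch of their discrete-time approximation from a sampled error signal (hypothetical variable names, illustrative sample data) is given below.

```python
import numpy as np

def performance_indices(error, dt):
    """Discrete-time approximations of ISE, IAE, ITSE and ITAE for a sampled error signal."""
    t = np.arange(len(error)) * dt
    ise  = np.sum(error**2) * dt            # integral of squared error
    iae  = np.sum(np.abs(error)) * dt       # integral of absolute error
    itse = np.sum(t * error**2) * dt        # integral of time-weighted squared error
    itae = np.sum(t * np.abs(error)) * dt   # integral of time-weighted absolute error
    return ise, iae, itse, itae

# Example: indices for a decaying error signal sampled every 0.1 s.
err = np.exp(-0.05 * np.arange(1000))
print(performance_indices(err, dt=0.1))
```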
Language: English
Submitted on: Mar 1, 2025
Published on: Jul 1, 2025
Published by: Professor Subhas Chandra Mukhopadhyay
In partnership with: Paradigm Publishing Services
Publication frequency: once per year

© 2025 Anil Kadu, Aniket Khandekar, published by Professor Subhas Chandra Mukhopadhyay
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.