Reinforcement learning approach for optimal control of multiple electric locomotives in a heavy-haul freight train: A Double-Switch Q-network architecture
Abstract
Electric locomotives provide high tractive power for the fast acceleration of heavy-haul freight trains and significantly reduce energy consumption through regenerative braking. This paper proposes a reinforcement learning (RL) approach for the optimal control of multiple electric locomotives in a heavy-haul freight train, without requiring prior knowledge of train dynamics or a pre-designed velocity profile. The optimization takes velocity, energy consumption and coupler force as objectives, subject to constraints on locomotive notches and their change rates, speed restrictions, traction and regenerative braking. Moreover, since the problem has a continuous state space and a large action space, and adjacent actions have similar influences on states, we propose a Double-Switch Q-network (DSQ-network) architecture for fast approximation of the action-value function, which enhances parameter sharing across states and actions and denoises the action-value function. In the numerical experiments, we test DSQ-network in 28 cases using data from the China Railways HXD3B electric locomotive. The results indicate that, compared with table-lookup Q-learning, DSQ-network converges much faster and uses less storage space in the optimal control of electric locomotives. We further analyze (1) the influences of ramps and speed restrictions on the optimal policy, and (2) the interdependent relationships between the multiple optimization objectives. Finally, the factors that influence the convergence rate and solution accuracy of DSQ-network are discussed based on visualization of the high-dimensional value functions.
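For context on the baseline mentioned above, the sketch below shows generic table-lookup Q-learning on a toy deterministic chain MDP. It is purely illustrative: the environment, reward, and all hyperparameters are placeholder assumptions, not the train-control problem or the DSQ-network from the paper. It does, however, make concrete why a tabular method scales poorly: the table `Q` grows with the product of state and action counts, which the paper's network-based approximation avoids.

```python
import numpy as np

def q_learning(n_states, n_actions, step, episodes=500,
               alpha=0.1, gamma=0.99, eps=0.1, seed=0):
    """Standard tabular Q-learning with epsilon-greedy exploration."""
    rng = np.random.default_rng(seed)
    Q = np.zeros((n_states, n_actions))  # storage grows as |S| x |A|
    for _ in range(episodes):
        s = 0
        for _ in range(100):  # cap episode length
            # epsilon-greedy action selection
            a = int(rng.integers(n_actions)) if rng.random() < eps \
                else int(np.argmax(Q[s]))
            s2, r, done = step(s, a)
            # one-step temporal-difference update
            target = r + (0.0 if done else gamma * Q[s2].max())
            Q[s, a] += alpha * (target - Q[s, a])
            s = s2
            if done:
                break
    return Q

def chain_step(s, a, n=5):
    """Toy chain MDP: action 1 moves right (reward 1 at the end), action 0 stays."""
    if a == 1:
        s2 = s + 1
        return (s2, 1.0, True) if s2 == n - 1 else (s2, 0.0, False)
    return s, 0.0, False

Q = q_learning(n_states=5, n_actions=2, step=chain_step)
policy = [int(np.argmax(Q[s])) for s in range(4)]  # greedy action per state
```

After training, the greedy policy moves right in every non-terminal state. Replacing the `Q` table with a parameterized network (as DSQ-network does) trades exact per-entry storage for shared parameters across similar states and actions.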
Keywords: Reinforcement learning, Double-Switch Q-network, Optimal control, Electric locomotive, Heavy-haul freight train
Article history: Received 19 December 2018, Revised 24 October 2019, Accepted 29 October 2019, Available online 6 November 2019, Version of Record 7 February 2020.
Article link: https://doi.org/10.1016/j.knosys.2019.105173