Learning Network Design Solution Strategies Using Reinforcement Learning

Master's thesis defense by Sangmin Lee

Title: Learning Network Design Solution Strategies Using Reinforcement Learning

Abstract: This paper proposes a reinforcement learning (RL) framework for combinatorial optimization problems, in particular network design problems, which play a crucial role in the freight transportation industry. The focus is on applying RL methods to unsplittable multicommodity capacitated fixed-charge network design (UMCFND) problems. We construct Markov game models with two different reward schemes that capture the objective function and all constraints of the UMCFND problem. The main objective of the Markov game we formulate is to find a set of stationary policies that form a Nash equilibrium while minimizing the total cost of the UMCFND problem. To this end, we employ Independent Deep Q-Network (IDQN), under the assumption that no agent in the Markov game perceives the existence of the other agents. We compare the two reward schemes and show that the agents perform better when each of them receives a reward that depends almost exclusively on its own action. Furthermore, we compare our IDQN algorithm with the commercial solver Gurobi. The results show that our algorithm performs well on small instances but falls short of Gurobi on large instances in both solution quality and running time.
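As a rough illustration of the objective and constraints the abstract refers to, the sketch below evaluates one candidate UMCFND solution. It is not taken from the thesis: the function name `evaluate_umcfnd`, the data layout, the toy instance, and the simplification that the per-unit cost on an arc is the same for every commodity are all assumptions made for this example. Each commodity is routed on a single path (unsplittable), every arc that carries flow incurs its fixed charge once, and the solution is feasible only if no arc load exceeds its capacity.

```python
from typing import Dict, List, Tuple

Arc = Tuple[str, str]


def evaluate_umcfnd(
    paths: Dict[str, List[str]],
    demand: Dict[str, float],
    fixed_cost: Dict[Arc, float],
    unit_cost: Dict[Arc, float],
    capacity: Dict[Arc, float],
) -> Tuple[float, bool]:
    """Return (total_cost, feasible) when each commodity uses exactly one path."""
    # Accumulate the load that the chosen paths place on each arc.
    load: Dict[Arc, float] = {}
    for commodity, path in paths.items():
        for arc in zip(path, path[1:]):
            load[arc] = load.get(arc, 0.0) + demand[commodity]

    # Capacity constraint: the load on every used arc must fit its capacity.
    feasible = all(load[arc] <= capacity[arc] for arc in load)

    # Objective: fixed charge for each opened arc plus flow-dependent cost.
    total_cost = sum(fixed_cost[arc] + unit_cost[arc] * load[arc] for arc in load)
    return total_cost, feasible


# Hypothetical toy instance: two commodities from "s" to "t", three arcs.
fixed_cost = {("s", "a"): 10.0, ("a", "t"): 10.0, ("s", "t"): 30.0}
unit_cost = {("s", "a"): 1.0, ("a", "t"): 1.0, ("s", "t"): 2.0}
capacity = {("s", "a"): 5.0, ("a", "t"): 5.0, ("s", "t"): 10.0}
demand = {"k1": 3.0, "k2": 3.0}

# Both commodities on the cheap route overload its arcs (load 6 > capacity 5).
shared = evaluate_umcfnd(
    {"k1": ["s", "a", "t"], "k2": ["s", "a", "t"]},
    demand, fixed_cost, unit_cost, capacity,
)

# Moving one commodity to the direct arc restores feasibility at higher cost.
split = evaluate_umcfnd(
    {"k1": ["s", "a", "t"], "k2": ["s", "t"]},
    demand, fixed_cost, unit_cost, capacity,
)
```

In a multi-agent setting of the kind the abstract describes, one natural reading is that each agent chooses the route for one commodity and a function like this supplies the cost signal, but how the thesis actually maps agents, actions, and rewards is not specified here.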

Supervisors: Trine Krogh Boomsma
             Klaus Kähler Holst, Mærsk
External examiner: Pierre Pinson, DTU