A utility tunnel is a public tunnel for the centralized laying of municipal pipelines, such as electric power, telecommunication, water supply, drainage, heat, and gas, in urban underground space. It is important infrastructure for guaranteeing the operation of a city and is regarded as a "lifeline" project. In recent years, driven by favorable national policies, the construction of utility tunnels has been actively promoted in China. However, the maintenance of prefabricated assembled utility tunnels is challenging because of the complex dynamic actions, such as soil loads, ground traffic loads, and groundwater pressure, that they experience throughout their life cycle. Traditional methods for optimizing maintenance strategies struggle to keep pace with dynamic changes in the service state of the infrastructure. Traditional multi-objective optimization methods, which are primarily based on Genetic Algorithms (GA), encounter difficulties in dynamic environments and with complex objectives, whereas the Deep Q-Learning Network (DQN) method provides a dynamic and adaptive solution. To address this, this paper proposes a method for optimizing maintenance strategies that uses DQN-based reinforcement learning to improve the efficiency and effectiveness of utility tunnel maintenance planning. First, the principle of the DQN algorithm and its application within the reinforcement learning framework are presented. Next, the maintenance process is modeled as a Markov decision process (MDP) that incorporates Life-Cycle Assessment (LCA) and reliability considerations. The DQN-based method enables real-time learning and feedback, which facilitates continuous adaptation to changing conditions and more effective balancing of multiple objectives; GA, in contrast, typically depends on static evaluations and extensive iterations.
Finally, an illustrative example demonstrates that the DQN-based maintenance strategy outperforms the traditional GA-based maintenance strategy. The potential for practical application in urban infrastructure management highlights the need for further research.