Research on the Optimization of Humanitarian Emergency Material Allocation Based on Reinforcement Learning
ZHANG Jianjun, YANG Yundan, ZHOU Yizhuo
(School of Economics and Management, Tongji University, Shanghai 200092, China)
Abstract: The efficient allocation of limited humanitarian aid supplies after a major emergency is a critical research topic: the goal is to meet the material needs of affected areas while reducing the suffering of disaster victims. This paper models the problem as a Mixed Integer Nonlinear Programming (MINLP) problem whose solution is a multi-period dynamic allocation strategy. Reinforcement Learning (RL), one of the two mainstream methods for policy exploration, is particularly suitable for dynamic resource allocation because it scales well and adapts to external dynamics through interaction with the environment and feedback signals. We employ the Dueling DQN algorithm to solve for the optimal policy, overcoming the overestimation of Q-values that has been a drawback of previous RL applications to humanitarian aid distribution. This approach estimates the action-value function for affected regions more accurately. Additionally, the paper introduces a novel stochastic demand assumption, which makes the model more realistic and valid by better reflecting the actual conditions of disaster scenarios. The effectiveness of the proposed method is demonstrated with a numerical example based on the Ya'an earthquake, making this the first study to substantiate RL-based optimization of emergency resource allocation with real data sources. Comparative analysis shows that the Dueling DQN algorithm reduces the total cost by approximately 5% compared with the traditional DQN method, indicating a more effective reduction in the suffering of affected populations. This aligns with China's "people-oriented" rescue principle and has significant theoretical and practical implications for humanitarian emergency response.
Key words: deep reinforcement learning; humanitarian; emergency supplies distribution
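As a rough illustration of the Dueling DQN idea named in the abstract, the sketch below shows the commonly used dueling aggregation, in which a state value V(s) and per-action advantages A(s, a) are combined as Q(s, a) = V(s) + A(s, a) - mean(A(s, ·)). The function names `dueling_q` and `greedy_action` and the example numbers are hypothetical, and the paper's actual network, state encoding, and cost function are not shown in this excerpt, so this is only a minimal sketch under those assumptions.

```python
from typing import List


def dueling_q(state_value: float, advantages: List[float]) -> List[float]:
    """Combine V(s) and A(s, a) into Q(s, a) with the mean-subtracted dueling rule.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    if not advantages:
        raise ValueError("at least one action is required")
    mean_advantage = sum(advantages) / len(advantages)
    # Subtracting the mean advantage keeps V and A identifiable.
    return [state_value + a - mean_advantage for a in advantages]


def greedy_action(q_values: List[float]) -> int:
    """Return the index of the action with the largest Q-value."""
    return max(range(len(q_values)), key=lambda i: q_values[i])


# Assumed example: three candidate allocation actions for one decision period.
q_values = dueling_q(2.0, [1.0, 3.0, 5.0])
best = greedy_action(q_values)
```

With these assumed inputs the mean advantage is 3.0, so `q_values` is `[0.0, 2.0, 4.0]` and `best` is `2`. In a cost-minimization setting such as the one described in the abstract, the reward would typically be defined as the negative cost so that the greedy choice still maximizes Q.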
0 Introduction
In the aftermath of a major emergency, saving lives and alleviating the suffering of affected people are the primary goals of disaster relief.