Emergency Responder Stationing
Published: ICML 2024
Key Contributions
- We introduce a novel solution approach using Deep Reinforcement Learning and Combinatorial Optimization techniques to enable real-time decision-making.
- We use DDPG to train agents that perform city-scale redistribution actions and region-scale reallocation actions.
- We utilize a Transformer-based actor to handle variable numbers of responders and depots during region-level reallocation.
- We map continuous actor outputs to feasible discrete actions using combinatorial optimization (min-cost flow and max-weight matching), preserving gradient flow while ensuring feasibility (see the matching sketch after this list).
- We use low-level critics to provide the learning signal for the quality of high-level actions.
- Our trained DRL agents achieve 1000x faster decision-making than the state-of-the-art approach while reducing response times by 5 to 13 seconds on real-world datasets.
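As a rough illustration of the continuous-to-discrete mapping mentioned above, the sketch below uses maximum-weight bipartite matching (via `scipy.optimize.linear_sum_assignment`) to turn a hypothetical matrix of continuous actor scores into a feasible one-to-one assignment of responders to depots. The function name, shapes, and scores are illustrative assumptions, not the paper's implementation; in such schemes, gradients typically flow through the continuous scores during training while the matched discrete action is what gets executed.

```python
# Illustrative sketch (not the paper's exact implementation): mapping a
# continuous actor output to a feasible discrete assignment of responders
# to depots via maximum-weight bipartite matching.
import numpy as np
from scipy.optimize import linear_sum_assignment


def continuous_to_discrete(scores: np.ndarray) -> np.ndarray:
    """Map a continuous score matrix (responders x depots) to a one-hot
    assignment matrix using max-weight matching."""
    # linear_sum_assignment minimizes cost, so negate scores to maximize.
    rows, cols = linear_sum_assignment(-scores)
    assignment = np.zeros_like(scores)
    assignment[rows, cols] = 1.0
    return assignment


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical actor output: preferences of 4 responders over 6 depots.
    actor_scores = rng.random((4, 6))
    print(continuous_to_discrete(actor_scores))
```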
High-Level Overview of the SOTA Approach with Hierarchical Coordination
This diagram illustrates our state-of-the-art hierarchical coordination framework, which combines queuing-based city-scale redistributions and MCTS-based region-level reallocations of responders.
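The sketch below is a highly simplified, hypothetical rendering of this two-level structure: a city-scale step decides how many responders each region receives, and a region-scale step places those responders at depots. The region names, proportional split, and placeholder policies are all assumptions made for illustration, not the framework's actual queuing or MCTS logic.

```python
# Schematic sketch (hypothetical names, simplified logic) of the two-level
# coordination loop: city-scale redistribution across regions, followed by
# region-scale reallocation of responders to depots within each region.
from dataclasses import dataclass


@dataclass
class Region:
    name: str
    depots: list            # candidate waiting locations in this region
    expected_demand: float   # e.g., forecast incident rate


def city_scale_redistribution(regions, total_responders):
    """Split responders across regions proportionally to expected demand
    (a stand-in for the queuing-model-based high-level policy)."""
    total_demand = sum(r.expected_demand for r in regions)
    return {r.name: round(total_responders * r.expected_demand / total_demand)
            for r in regions}


def region_scale_reallocation(region, n_responders):
    """Assign responders to the region's depots (a stand-in for the MCTS-
    or DDPG-based low-level policy); here, simply the first n depots."""
    return region.depots[:n_responders]


regions = [Region("downtown", ["d1", "d2", "d3"], 0.6),
           Region("suburbs", ["d4", "d5"], 0.3)]
allocation = city_scale_redistribution(regions, total_responders=4)
for region in regions:
    print(region.name, region_scale_reallocation(region, allocation[region.name]))
```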
Region-Level Reallocation via DDPG Training
We leverage DDPG to train agents that perform region-level reallocation of responders, enabling efficient adaptation to changing demand at a broader geographic scale.
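Below is a minimal sketch of how a Transformer-based actor can score a variable number of responder and depot tokens using padding masks, in the spirit of the Transformer actor mentioned in the contributions. The class name, feature dimensions, and output head are assumptions for illustration rather than the paper's architecture.

```python
# Minimal sketch of a Transformer-based actor over a variable-size set of
# entities (responders and depots); padding masks let one network handle
# varying numbers of entities across states.
import torch
import torch.nn as nn


class SetActor(nn.Module):
    def __init__(self, feat_dim: int = 8, d_model: int = 64, n_heads: int = 4):
        super().__init__()
        self.embed = nn.Linear(feat_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.score = nn.Linear(d_model, 1)

    def forward(self, tokens: torch.Tensor, pad_mask: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, n_entities, feat_dim) -- per-entity features
        # pad_mask: (batch, n_entities), True where an entity slot is padding
        h = self.encoder(self.embed(tokens), src_key_padding_mask=pad_mask)
        return self.score(h).squeeze(-1)  # one continuous score per entity


if __name__ == "__main__":
    actor = SetActor()
    tokens = torch.randn(2, 5, 8)              # batch of 2, up to 5 entities
    pad_mask = torch.tensor([[False] * 5,       # first sample uses all 5 slots
                             [False, False, False, True, True]])
    print(actor(tokens, pad_mask).shape)        # torch.Size([2, 5])
```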
City-Level Redistribution via DDPG Training
At the city scale, DDPG is used to train agents for fine-grained redistribution of responders, allowing precise real-time response in dense urban environments.
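Since agents at both levels are trained with DDPG, the following is a compact, generic DDPG update sketch: a plain MLP actor and critic, and a toy transition batch standing in for replay-buffer samples. It shows the standard critic regression, actor gradient, and Polyak target updates, not the paper's exact networks or hyperparameters.

```python
# Generic DDPG update sketch (toy networks and data, illustrative only).
import torch
import torch.nn as nn

state_dim, action_dim, gamma, tau = 10, 4, 0.99, 0.005

def mlp(inp, out):
    return nn.Sequential(nn.Linear(inp, 64), nn.ReLU(), nn.Linear(64, out))

actor, critic = mlp(state_dim, action_dim), mlp(state_dim + action_dim, 1)
actor_tgt, critic_tgt = mlp(state_dim, action_dim), mlp(state_dim + action_dim, 1)
actor_tgt.load_state_dict(actor.state_dict())
critic_tgt.load_state_dict(critic.state_dict())
actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-4)
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)

# Toy batch of transitions (s, a, r, s') standing in for replay-buffer samples.
s, a = torch.randn(32, state_dim), torch.randn(32, action_dim)
r, s2 = torch.randn(32, 1), torch.randn(32, state_dim)

# Critic update: regress Q(s, a) toward the bootstrapped TD target.
with torch.no_grad():
    target_q = r + gamma * critic_tgt(torch.cat([s2, actor_tgt(s2)], dim=1))
critic_loss = nn.functional.mse_loss(critic(torch.cat([s, a], dim=1)), target_q)
critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()

# Actor update: ascend the critic's value of the actor's own actions.
actor_loss = -critic(torch.cat([s, actor(s)], dim=1)).mean()
actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()

# Polyak-average the target networks toward the online networks.
for net, tgt in ((actor, actor_tgt), (critic, critic_tgt)):
    for p, p_tgt in zip(net.parameters(), tgt.parameters()):
        p_tgt.data.mul_(1 - tau).add_(tau * p.data)
```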
Publication
Published as a full paper at ICML 2024: "Multi-Agent Reinforcement Learning with Hierarchical Coordination for Emergency Responder Stationing." [OpenReview]
Code & Data
Reproducible code, training scripts, and Nashville & Seattle datasets: [Code & Data]
3-Minute Overview
Summarising the challenges, solution approach, and results: [Short Video]