Parallel server systems are stochastic processing networks with applications in manufacturing, supply chains, ride-hailing, and call centers. Heterogeneous customers arrive at the system, and each customer type can be served only by a subset of servers, as specified by the flexibility graph. The system operator's goal is to minimize delay, which depends on both the scheduling policy and the flexibility graph. A long line of literature focuses on designing near-optimal scheduling policies for a given flexibility graph. In contrast, we fix the scheduling policy to be the so-called MaxWeight scheduling, chosen for its good delay performance, and focus on designing a near-optimal, sparse flexibility graph. Our contributions are threefold.
First, we analyze the expected delay in the heavy-traffic asymptotic regime in terms of the properties of the flexibility graph, and use this result to translate the design problem into one about the transportation polytope, the deterministic equivalent of parallel server queues. Second, we design the sparsest flexibility graph that achieves a specified delay performance and show that the design is robust to demand uncertainty. Third, when the budget for adding edges arrives sequentially over time, we present the optimal schedule for adding them to the flexibility graph. These results are obtained by proving new results for transportation polytopes, which are of independent interest. In particular, translating the difficulties into a simpler model, namely the transportation polytope, allows us to develop a unified framework that answers several design questions.