📚 Publications
* Equal contribution
Under Review
Resolving Complex Social Dilemmas by Aligning Preferences with Counterfactual Regret
Shuqing Shi, Yudi Zhang, Joel Z. Leibo, Yali Du
We propose a novel approach to resolve complex social dilemmas by aligning agent preferences using counterfactual regret minimization. Our method enables agents to learn cooperative strategies in mixed-motive scenarios where traditional approaches fail.
ICLR 2026
BRIDGE: Bi-level Reinforcement Learning for Dynamic Group Structure in Coalition Formation Games
Shuqing Shi, Nam Phuong Tran, Hao Liang, Debmalya Mandal, Long Tran-Thanh, Yali Du
In The Fourteenth International Conference on Learning Representations (ICLR), 2026
Coalition formation is fundamental to multi-agent cooperation, yet existing approaches typically treat it as a static problem. We propose BRIDGE, a bi-level reinforcement learning framework that jointly learns when to form coalitions (high-level) and how to coordinate within them (low-level).
@inproceedings{shi2026bridge,
  title={{BRIDGE}: Bi-level Reinforcement Learning for Dynamic Group Structure in Coalition Formation Games},
  author={Shi, Shuqing and Tran, Nam Phuong and Liang, Hao and Mandal, Debmalya and Tran-Thanh, Long and Du, Yali},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026}
}
ICLR 2026
SocialJax: An Evaluation Suite for Multi-agent Reinforcement Learning in Sequential Social Dilemmas
Zihao Guo*, Shuqing Shi*, Richard Willis, Tristan Tomilin, Joel Z. Leibo, Yali Du
In The Fourteenth International Conference on Learning Representations (ICLR), 2026
We present SocialJax, a JAX-based evaluation suite for multi-agent reinforcement learning in sequential social dilemmas. Built for speed, SocialJax runs thousands of episodes in seconds on a single GPU while providing standard implementations of classic social dilemmas.
@inproceedings{guo2026socialjax,
  title={{SocialJax}: An Evaluation Suite for Multi-agent Reinforcement Learning in Sequential Social Dilemmas},
  author={Guo, Zihao and Shi, Shuqing and Willis, Richard and Tomilin, Tristan and Leibo, Joel Z. and Du, Yali},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026}
}
NeurIPS 2025
🌟 Spotlight (Top 3%)
Causality Meets Locality: Provably Generalizable and Scalable Policy Learning for Networked Systems
Hao Liang*, Shuqing Shi*, Yudi Zhang, Biwei Huang, Yali Du
In The Thirty-Ninth Annual Conference on Neural Information Processing Systems (NeurIPS), 2025
We propose a principled approach that combines causal inference with locality principles for provably generalizable and scalable policy learning in networked systems. Our method achieves state-of-the-art performance while providing theoretical guarantees.
@inproceedings{liang2025causality,
  title={Causality Meets Locality: Provably Generalizable and Scalable Policy Learning for Networked Systems},
  author={Liang, Hao and Shi, Shuqing and Zhang, Yudi and Huang, Biwei and Du, Yali},
  booktitle={The Thirty-Ninth Annual Conference on Neural Information Processing Systems},
  year={2025}
}
NeurIPS D&B 2025
Evaluating Generalization Capabilities of LLM-Based Agents in Mixed-Motive Scenarios Using Concordia
Chandler Smith, ..., Shuqing Shi, et al.
In NeurIPS Datasets and Benchmarks Track, 2025
We evaluate the generalization capabilities of LLM-based agents in mixed-motive scenarios using the Concordia framework, providing insights into how large language models perform in multi-agent social interactions.
@inproceedings{smith2025evaluating,
  title={Evaluating Generalization Capabilities of {LLM}-Based Agents in Mixed-Motive Scenarios Using {Concordia}},
  author={Smith, Chandler and others},
  booktitle={NeurIPS Datasets and Benchmarks Track},
  year={2025}
}
NeurIPS 2024
Learning the Expected Core of Strictly Convex Stochastic Cooperative Games
Nam P. Tran, Shuqing Shi, Debmalya Mandal, Yali Du, Long Tran-Thanh
In The Thirty-Eighth Annual Conference on Neural Information Processing Systems (NeurIPS), 2024
We study the problem of learning the expected core in strictly convex stochastic cooperative games, providing theoretical analysis and practical algorithms for stable coalition value distribution under uncertainty.
@inproceedings{tran2024learning,
  title={Learning the Expected Core of Strictly Convex Stochastic Cooperative Games},
  author={Tran, Nam P. and Shi, Shuqing and Mandal, Debmalya and Du, Yali and Tran-Thanh, Long},
  booktitle={The Thirty-Eighth Annual Conference on Neural Information Processing Systems},
  year={2024}
}
IJCNN 2022
Solving Poker Games Efficiently: Adaptive Memory Based Deep Counterfactual Regret Minimization
Shuqing Shi, Xiaobin Wang, Dong Hao, Zhiyou Yang, Hong Qu
In International Joint Conference on Neural Networks (IJCNN), 2022
We propose an adaptive memory-based approach to deep counterfactual regret minimization for solving large-scale poker games more efficiently, reducing memory requirements while maintaining solution quality.
@inproceedings{shi2022solving,
  title={Solving Poker Games Efficiently: Adaptive Memory Based Deep Counterfactual Regret Minimization},
  author={Shi, Shuqing and Wang, Xiaobin and Hao, Dong and Yang, Zhiyou and Qu, Hong},
  booktitle={International Joint Conference on Neural Networks},
  year={2022}
}