About
I am a PhD student in the Department of Computer Science at the University of Manchester. My research focuses on reinforcement learning and distributed computation: I am interested in how decentralized algorithms can make RL training scale reliably and efficiently.
Currently, I am investigating natural policy gradient methods in decentralized settings, aiming to develop algorithms that are both theoretically grounded and practically scalable. Previously, I worked on parameter-efficient multi-agent policy learning, introducing low-rank agent-specific adaptation to cooperative multi-agent reinforcement learning (MARL).
Research Interests
- Reinforcement Learning: Scalable RL training, natural policy gradient, and policy optimization theory
- Decentralized Optimization: Consensus-based and gossip-based methods for distributed learning
Publications
Low-Rank Agent-Specific Adaptation (LoRASA) for Multi-Agent Policy Learning
arXiv preprint · February 2025
Introduces LoRASA, which appends small low-rank adaptation matrices to each layer of a shared policy, enabling agent-specific specialization in cooperative MARL while reducing memory and compute costs. Achieves competitive or superior performance on StarCraft and MuJoCo benchmarks.
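The idea of a shared policy layer with a small low-rank correction per agent can be sketched roughly as follows. This is an illustrative NumPy sketch in the style of standard LoRA, not the paper's implementation: the class name, the initialization scheme, and the layer sizes are all assumptions made for the example.

```python
import numpy as np


class SharedLinearWithAgentAdapters:
    """Hypothetical sketch: one shared weight matrix plus a low-rank,
    per-agent correction (LoRA-style). Not the paper's actual code."""

    def __init__(self, n_agents, d_in, d_out, rank, seed=0):
        rng = np.random.default_rng(seed)
        # Shared parameters, used by every agent.
        self.W = rng.normal(0.0, 0.1, size=(d_out, d_in))
        self.b = np.zeros(d_out)
        # Per-agent low-rank factors. B starts at zero (an assumed
        # initialization), so every agent begins at the shared policy.
        self.A = rng.normal(0.0, 0.1, size=(n_agents, rank, d_in))
        self.B = np.zeros((n_agents, d_out, rank))

    def forward(self, x, agent_id):
        # Shared path: x W^T + b
        shared = x @ self.W.T + self.b
        # Agent-specific path: x A_i^T B_i^T, never forming the full
        # d_out x d_in update matrix explicitly.
        delta = (x @ self.A[agent_id].T) @ self.B[agent_id].T
        return shared + delta

    def adapter_params_per_agent(self):
        # Number of extra parameters each agent adds to the shared layer.
        return self.A[0].size + self.B[0].size


layer = SharedLinearWithAgentAdapters(n_agents=3, d_in=64, d_out=64, rank=4)
obs = np.ones(64)
out0 = layer.forward(obs, agent_id=0)
out1 = layer.forward(obs, agent_id=1)
```

In this toy configuration each agent adds 512 adapter parameters to a shared layer of 4096 weights, which is where the memory saving over a full per-agent copy would come from.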
arXiv →
Contact
I am happy to discuss research, collaborations, or questions about my work.
| Email | beining.zhang@postgrad.manchester.ac.uk |
| Google Scholar | Profile |
| GitHub | ZBN111 |