Simin Li (李思民)

I am a Ph.D. student (2021.09-) at the State Key Laboratory of Software Development Environment (SKLSDE) and the School of Computer Science and Engineering (SCSE), Beihang University, Beijing, China, supervised by Prof. Xianglong Liu. I also receive supervision from Prof. Yaodong Yang at Peking University (2023.03-). I was a visiting scholar (2024.06-2025.05) at Nanyang Technological University (NTU), under the supervision of Prof. Bo An. Before that, I obtained my BSc degree from Beihang University in 2020 (Summa Cum Laude).

[New] I expect to graduate in November 2025. I will join the Chinese University of Hong Kong as an Honorary Research Assistant from 2025.07 to 2025.09. After my PhD graduation, I will join the Chinese University of Hong Kong as a postdoc, supervised by Prof. Qi Dou.

[Prospective students] Our group has positions for PhD students, Master's students, and visiting students. If you are interested, please send me an email with your CV and publications (if any).

[Companies] I am open to collaboration. I am particularly interested in research on single/multi-agent control and their robustness. I am also interested in AI alignment and deception. Please contact me via email.



Email: lisiminsimon@buaa.edu.cn

Google Scholar / CV

Research

During my PhD, I have worked on trustworthy AI for multi-agent reinforcement learning (MARL). My research goal is to make reinforcement learning safe and robust, including practical adversarial attacks on RL/MARL, robustness evaluation of MARL, and adversarial defenses.

[New] I am moving toward the research directions of Vision-Language-Action Models and AI Alignment. New works are coming soon.

Now my research mainly includes:
  • Robust Multi-Agent System
  • AI Alignment
  • Vision-Language-Action Models

I previously worked on trustworthy AI for computer vision, including digital-world attacks for privacy protection and evaluating the naturalness of physical-world attacks. Apart from trustworthy AI, I am lucky to have worked with prominent researchers in various fields, including complex networks, human-computer interaction, robotics, time series forecasting, smart transportation, and microelectronics. They have greatly broadened my view and allowed me to think in a multidisciplinary way.

News

[2025.06] One first-authored paper on MARL attacks accepted by Neural Networks

[2025.06] One first-authored paper on robust MARL accepted by IEEE TNNLS

[2025.05] One co-first-authored paper on robust financial trading to be resubmitted to KDD

[2025.05] Five papers (two first-authored, one corresponding) submitted to NeurIPS 2025

[2024.06] One first-authored paper submitted to IEEE TPAMI

[2024.02] One co-authored paper on collision avoidance accepted by IEEE RAL

[2024.01] One first-authored paper on defending against Byzantine adversaries in MARL accepted by ICLR 2024

[2024.01] One co-authored paper on partial symmetry for MARL accepted by AAAI 2024

[2022.11] One first-authored paper on naturalness of physical world adversarial attacks accepted by CVPR 2023

[2022.07] One first-authored paper on privacy protection of fingerprints accepted by IEEE TIP

[2022.04] One co-authored paper on robustness testing of MARL accepted by CVPR 2022 workshop

Selected Publications

Empirical Study on Robustness and Resilience in Cooperative Multi-Agent Reinforcement Learning
Simin Li, Zihao Mao, Hanxiao Li, Zonglei Jing, Zhuohang Bian, Jun Guo, Li Wang, Zhuoran Han, Ruixiao Xu, Xin Yu, Chengdong Ma, Yuqing Ma, Bo An, Yaodong Yang, Weifeng Lv, Xianglong Liu.

Submitted to NeurIPS 2025.

We provide a comprehensive evaluation to thoroughly study robustness and resilience in MARL.

Vulnerable Agent Identification in Large-Scale Multi-Agent Reinforcement Learning
Simin Li, Yuwei Zheng, Zihao Mao, Linhao Wang, Ruixiao Xu, Chengdong Ma, Xin Yu, Yuqing Ma, Xin Wang, Jie Luo, Bo An, Yaodong Yang, Weifeng Lv, Xianglong Liu.

Submitted to NeurIPS 2025.

We design a principled method to identify vulnerable agents in large-scale multi-agent systems.

Adversarial Policy Transfer in Mixed Cooperative-Competitive Games
Ruixiao Xu, Zhiqian Liu, Zhixia Zhang, Simin Li (corresponding author), Yaodong Yang, Xianglong Liu.

Submitted to NeurIPS 2025.

We propose a transferable adversarial policy framework for mixed cooperative-competitive games.

Robust Multi-Agent Control via Maximum Entropy Heterogeneous-Agent Reinforcement Learning
Simin Li* (co-first), Yifan Zhong*, Jiarong Liu*, Jianing Guo, Siuyuan Qi, Ruixiao Xu, Xin Yu, Siyi Hu, Haobo Fu, Qiang Fu, Xiaojun Chang, Yujing Hu, Bo An, Xianglong Liu, Yaodong Yang.

Submitted to IEEE TPAMI (Under Review).

We develop theories on Maximum Entropy Heterogeneous-Agent RL, which is, in principle, the optimal way of MARL learning. We prove it is robust to attacks on state, action, reward, and environment transitions. Our algorithm outperforms strong baselines in 34 out of 38 tasks, and is robust to perturbations of different modalities across 14 magnitudes.
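As a rough sketch of the underlying objective (standard maximum-entropy RL notation with a temperature \alpha and per-agent policies \pi^i; this is a generic form, not the paper's exact statement):

    \max_{\pi^1, \dots, \pi^n} \; \mathbb{E}\Big[ \sum_t \gamma^t \big( r(s_t, \mathbf{a}_t) + \alpha \sum_{i=1}^n \mathcal{H}(\pi^i(\cdot \mid s_t)) \big) \Big]

Each agent trades off team reward against the entropy of its own policy, which encourages exploration and smooths best responses.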

Bayesian Robust Financial Trading with Adversarial Synthetic Market Data
Haochong Xia*, Simin Li* (co-first), Ruixiao Xu*, Zhixia Zhang, Hongxiang Wang, Zhiqian Liu, Teng Yao Long, Molei Qin, Chuqiao Zong, Bo An.

Submitted to KDD 2025 (to be resubmitted).

We use Bayesian robust RL techniques for robust financial trading.

Byzantine Robust Cooperative Multi-Agent Reinforcement Learning as a Bayesian Game
Simin Li, Jun Guo, Jingqiao Xiu, Xin Yu, Jiakai Wang, Aishan Liu, Yaodong Yang, Xianglong Liu.

Accepted by ICLR, 2024
pdf / Project page

We study the robustness of MARL against Byzantine action perturbations by formulating it as a Bayesian game. We provide a rigorous formulation of this problem and an algorithm with strong empirical performance.
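For intuition, the textbook Bayesian-game solution concept this builds on (generic notation, not the paper's exact formulation): each agent i with type \theta_i best-responds in expectation over the other agents' unknown types,

    \sigma_i(\theta_i) \in \arg\max_{a_i} \; \mathbb{E}_{\theta_{-i} \sim p(\cdot \mid \theta_i)} \big[ u_i(a_i, \sigma_{-i}(\theta_{-i}); \theta) \big]

so that uncertainty about which agents are Byzantine is handled through the type distribution rather than assumed away.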

Robust Multi-Agent Reinforcement Learning by Mutual Information Regularization
Simin Li, Ruixiao Xu, Jingqiao Xiu, Yuwei Zheng, Pu Feng, Yuqing Ma, Bo An, Yaodong Yang, Xianglong Liu.

Accepted by IEEE TNNLS.
pdf

We prove that minimizing mutual information as a regularization term maximizes a lower bound of robustness in MARL under all potential threat scenarios.
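Schematically, the regularized objective has the form (illustrative only; the exact arguments of the mutual-information term are defined in the paper, and the pairing of s_t and a_t here is just for concreteness):

    \max_{\pi} \; J(\pi) - \lambda \, I(s_t; a_t), \qquad \lambda > 0

where J(\pi) is the expected return and the penalty discourages policies whose behavior depends too strongly on potentially perturbed quantities.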

Attacking Cooperative Multi-Agent Reinforcement Learning by Adversarial Minority Influence
Simin Li, Jun Guo, Jingqiao Xiu, Yuwei Zheng, Pu Feng, Xin Yu, Jiakai Wang, Aishan Liu, Yaodong Yang, Bo An, Wenjun Wu, Xianglong Liu.

Accepted by Neural Networks.
pdf / Project page

We propose the first adversarial policy attack for c-MARL, which is both strong and practical. Our attack provides the first demonstration that adversarial policies are effective against real-world robot swarms.

Towards Benchmarking and Assessing Visual Naturalness of Physical World Adversarial Attacks
Simin Li, Shuning Zhang, Gujun Chen, Dong Wang, Pu Feng, Jiakai Wang, Aishan Liu, Xin Yi, Xianglong Liu.
Accepted by CVPR, 2023
pdf / Project page

We take the first step to evaluate the naturalness of physical-world adversarial examples via a human-oriented approach. We collect the first dataset with human naturalness ratings and human gaze, unveil insights into how contextual and behavioral features affect attack naturalness, and propose an algorithm to automatically evaluate naturalness by aligning human behavior and algorithm predictions.

Hierarchical Perceptual Noise Injection for Social Media Fingerprint Privacy Protection
Simin Li, Huangxinxin Xu, Jiakai Wang, Aishan Liu, Fazhi He, Xianglong Liu, Dacheng Tao.

Accepted by IEEE TIP.
pdf / Project page

While billions of people share their daily-life images on social media every day, hackers can easily steal fingerprints from the shared images. We leverage adversarial attacks to prevent such privacy leakage, so that hackers cannot extract fingerprints even if they obtain the shared images on social media. Our method, FingerSafe, is strong for protection and natural for daily use.
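Conceptually, the protection follows the generic adversarial-perturbation recipe (a sketch, not FingerSafe's exact objective; f stands for a fingerprint extractor and \epsilon for a perceptibility budget, both introduced here for illustration):

    \max_{\|\delta\| \le \epsilon} \; \mathcal{L}\big( f(x + \delta), \, f(x) \big)

i.e., find a visually imperceptible perturbation \delta that maximally disrupts what an extractor can recover from the shared image x.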

Towards Comprehensive Testing on the Robustness of Cooperative Multi-agent Reinforcement Learning
Jun Guo, Yonghong Chen, Yihang Hao, Zixin Yin, Yin Yu, Simin Li (corresponding author).
Accepted by CVPR Workshop, 2022
pdf

We propose a testing framework to evaluate the robustness of multi-agent reinforcement learning (MARL) algorithms from the aspects of observation, action, and reward. Our work is the first to point out that state-of-the-art MARL algorithms, including QMIX and MAPPO, are non-robust in multiple aspects, highlighting the urgent need to test and enhance the robustness of MARL algorithms.

Leveraging Partial Symmetry for Multi-Agent Reinforcement Learning
Xin Yu, Rongye Shi, Pu Feng, Yongkai Tian, Simin Li, Shuhao Liao, Wenjun Wu.

Accepted by AAAI, 2024

Symmetry has been used in MARL as a prior to incorporate domain knowledge about the environment, which enhances sample efficiency and performance. In this paper, we extend symmetry to partial symmetry, which considers uncertainties in environments with non-uniform fields, including uneven terrain, wind, etc.
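For reference, the standard (full) symmetry condition from the MDP symmetry literature, which partial symmetry relaxes (generic notation, not the paper's definition): for a symmetry transformation g acting on states and actions, an equivariant policy satisfies

    \pi(g \cdot a \mid g \cdot s) = \pi(a \mid s)

and partial symmetry relaxes this equality to hold only approximately or locally.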

Exploiting Spatio-Temporal Symmetry for Multi-Agent Reinforcement Learning
Xin Yu, Rongye Shi, Yongkai Tian, Li Wang, Tianhao Peng, Simin Li, Pu Feng, Wenjun Wu.

Submitted to IJCAI, 2024

Symmetries are everywhere in the real world, yet current MARL algorithms are agnostic to such symmetry by design. We extend the idea of symmetry to the temporal domain, proposing a spatio-temporal symmetry network, which adds a stronger inductive bias during network training.

Lyapunov-Informed Multi-Agent Reinforcement Learning
Pu Feng, Rongye Shi, Size Wang, Xin Yu, Junkang Liang, Jiakai Wang, Simin Li, Wenjun Wu.

Submitted to IJCAI, 2024

Many MARL tasks specify certain goal states where special rewards are granted. The optimal policy in such tasks can be characterized by Lyapunov stability: the policy asymptotically converges to the goal states from any initial state, making the goal states stable equilibria. We formulate this process as a Lyapunov Markov game, and prove that it facilitates training: a stable suboptimal policy is found more easily first, and the policy then converges to an optimal policy more efficiently.
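As a reminder of the stability notion involved (the classical Lyapunov decrease condition, stated generically rather than as the paper's exact definition): a function V certifies that a policy \pi drives the system to a goal set G if

    V(s) > 0 \;\; \text{for } s \notin G, \qquad V(s) = 0 \;\; \text{for } s \in G, \qquad \mathbb{E}_{s' \sim P(\cdot \mid s, \pi)}[V(s')] - V(s) \le -\delta \;\; \text{for } s \notin G

for some \delta > 0, so that V decreases in expectation along trajectories until the goal is reached.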

SPF-RL: Multi-Robot Collision Avoidance with Soft Potential Field Informed Reinforcement Learning
Pu Feng, Xin Yu, Wenjun Wu, Yongkai Tian, Junkang Liang, Simin Li.
Accepted by IEEE RAL, 2024.

Motivated by soft potential field theory, we propose an algorithm to avoid collisions in robot swarms.
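For background, the classical artificial-potential-field repulsive term (Khatib's formulation, given for illustration; the soft potential field used in the paper differs in its details): each robot is repelled by obstacles within an influence radius \rho_0,

    U_{\mathrm{rep}}(\rho) = \begin{cases} \tfrac{1}{2}\,\eta\,\big(\tfrac{1}{\rho} - \tfrac{1}{\rho_0}\big)^2, & \rho \le \rho_0 \\ 0, & \rho > \rho_0 \end{cases}

where \rho is the distance to the nearest obstacle and \eta a gain; informing RL with such a field shapes the policy away from collisions.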

A Survey on Adversarial Attacks and Defenses for Deep Reinforcement Learning (in Chinese)
Aishan Liu, Jun Guo, Simin Li, Yisong Xiao, Xianglong Liu, Dacheng Tao.
Accepted by Chinese Journal of Computers (计算机学报, top journal in China, CCF-A), 2023.

We provide a comprehensive survey of attacks and defenses for deep reinforcement learning. We first analyze adversarial attacks from the perspectives of state-based, reward-based, and action-based attacks. Then, we illustrate adversarial defenses from adversarial training, adversarial detection, certified robustness, and robust learning. Finally, we investigate interesting topics, including adversaries for good and model robustness understanding for DRL, and highlight open issues and future challenges in this field.

Simulation Platform and Verification for Adversarial Multi-Agent Reinforcement Learning in Unmanned Aerial Vehicle Swarms (in Chinese)
Shuangcheng Liu, Simin Li (corresponding author), Hainan Li, Jingqiao Xiu, Aishan Liu, Xianglong Liu.
Accepted by Journal of Cybersecurity (网络空间安全科学学报, Chinese journal on AI security), 2023.

We provide an AirSim-based unmanned aerial vehicle (UAV) simulator. Based on this simulator, we identify several critical adversarial attacks in multi-UAV combat.

Behavioral Dynamics and Safety Monitoring Methods for Intelligent Systems (in Chinese)
Simin Li, Jiakai Wang, Aishan Liu, Xianglong Liu.
Accepted by Journal of Cybersecurity (网络空间安全科学学报, Chinese journal on AI security), 2023.

We advocate research on behavioral dynamics, which provides both microscopic and macroscopic understanding of adversarial vulnerability. We argue that combining research in network science and game theory with AI safety could benefit the understanding of micro-level information transmission and macro-level agent-wise interaction.

Theories and methods for full life cycle intelligent systems security testing
Jiakai Wang, Aishan Liu, Simin Li, Xianglong Liu, Wenjun Wu.
Accepted by Artificial Intelligence Security (智能安全, Chinese journal on AI security), 2023.

We present our recent insights on testing the security of intelligent systems across the full life cycle, including vulnerabilities in model training, testing, and deployment, together with their testing techniques. We offer insights on safety standards and safety testing platforms, and sketch our method for the security evaluation of autonomous driving.

Awards

Youth Talent Support Program of the China Association for Science and Technology for Doctoral Students (中国科协青年人才托举工程博士生专项计划), 2024.

National Scholarship, 2024.

State-Sponsored Scholarship for joint PhD students, 2024 (120K RMB).

Doctoral Research Excellence Academic Fund, 2024 (40K RMB).

Academic Services

[Workshop@CVPR] I served on the Program Committee of the workshop "The Art of Robustness: Devil and Angel in Adversarial Machine Learning" at CVPR 2023.

[Reviewer] I am a reviewer for ICLR, ICML, NeurIPS, etc., and an area chair of DAI.