Yingwei Ma (马迎伟) is a Member of Technical Staff on the RL team at Moonshot AI. His research lies at the intersection of foundation models and software engineering. He had planned to pursue a PhD at HKUST under the supervision of Prof. S.C. CHEUNG, but decided to postpone this plan after getting AGI-pilled. Currently, he focuses on using agent techniques (agentic LLMs) to solve end-to-end SE and research problems.
[arxiv'25] Yihong Dong, Xue Jiang, Yongding Tao, Huanyu Liu, Kechi Zhang, Lili Mou, Rongyu Cao, Yingwei Ma, Jue Chen, Binhua Li, Zhi Jin, Fei Huang, Yongbin Li, Ge Li, RL-PLUS: Countering Capability Boundary Collapse of LLMs in Reinforcement Learning. arXiv preprint arXiv:2508.00222. [paper]
[arxiv'24] Yalan Lin, Yingwei Ma, Rongyu Cao, Binhua Li, Fei Huang, Xiaodong Gu, Yongbin Li, LLMs as Continuous Learners: Improving the Reproduction of Defective Code in Software Issues. arXiv preprint arXiv:2411.13941. [paper]
[arxiv'24] Zhenyu Pan, Rongyu Cao, Yongchang Cao, Yingwei Ma, Binhua Li, Fei Huang, Han Liu, Yongbin Li, Codev-Bench: How Do LLMs Understand Developer-Centric Code Completion?. arXiv preprint arXiv:2410.01353. [paper]
[Technical Report] Yingwei Ma (co-author; contributed to the agentic coding capabilities of Kimi K2), Kimi K2: Open Agentic Intelligence. arXiv preprint arXiv:2507.20534. [paper]
[ASE'25] Yingwei Ma, Binhua Li, Yihong Dong, Xue Jiang, Rongyu Cao, Fei Huang, Yongbin Li, Thinking Longer, Not Larger: Enhancing Software Engineering Agents via Scaling Test-Time Compute. arXiv preprint arXiv:2503.23803. ASE’25, CCF-A. Accepted as an Industry Full Paper. [paper]
[ISSTA'25] Yingwei Ma, Rongyu Cao, Yongchang Cao, Yue Zhang, Jue Chen, Yibo Liu, Yuchen Liu, Binhua Li, Fei Huang, Yongbin Li, Lingma SWE-GPT: An Open Development-Process-Centric Language Model for Automated Software Improvement. The ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA’25), CCF-A. (ACM SIGSOFT Distinguished Paper Award, The Best Award At The Conference) [paper] [link]
[FSE'25] Yingwei Ma, Qingping Yang, Rongyu Cao, Binhua Li, Fei Huang, Yongbin Li, Alibaba LingmaAgent: Improving Automated Issue Resolution via Comprehensive Repository Exploration. arXiv preprint arXiv:2406.01422. FSE’25, CCF-A. Accepted as an Industry Full Paper. [paper]
[ICLR'24] Yingwei Ma, Yue Liu, Yue Yu, Yuanliang Zhang, Yu Jiang, Changjian Wang, Shanshan Li, At Which Training Stage Does Code Data Help LLMs Reasoning?. The 12th International Conference on Learning Representations (ICLR-24), Vienna, Austria, May 7th-11th, 2024. (Spotlight, Top 5%) [paper]
[SANER'23] Yingwei Ma, Yue Yu, Shanshan Li, Zhouyang Jia, Jun Ma, Rulin Xu, Wei Dong and Xiangke Liao, MulCS: Towards a Unified Code Representation for Multilingual Code Search. 30th IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), Macao SAR, China, March 21st-24th, 2023. (IEEE TCSE Distinguished Paper Award, The Best Award At The Conference) [paper]
[AAAI'26] Zhenhao Zhu, Yue Liu, Yingwei Ma, Hongcheng Gao, Nuo Chen, Yanpei Guo, Wenjie Qu, Huiying Xu, Xinzhong Zhu, Jiaheng Zhang, ExtendAttack: Attacking Servers of LRMs via Extending Reasoning. The 40th Annual AAAI Conference on Artificial Intelligence (AAAI-26) [paper]
[AAAI'26] Xue Jiang, Yihong Dong, Zheng Fang, Yingwei Ma, Tangxinyu Wang, Rongyu Cao, Binhua Li, Zhi Jin, Wenpin Jiao, Yongbin Li, Ge Li, Large Language Model Unlearning for Source Code. The 40th Annual AAAI Conference on Artificial Intelligence (AAAI-26) [paper]
[ICLR'25] Jie Cheng, Ruixi Qiao, Yingwei Ma, Gang Xiong, Qinghai Miao, Binhua Li, Yongbin Li, Yisheng Lv, Scaling Offline Model-Based RL via Jointly-Optimized World-Action Model Pretraining. The Thirteenth International Conference on Learning Representations (ICLR-25) [paper]
[ICML'25] Yue Liu, Xiaoxin He, Miao Xiong, Yingwei Ma, Jiaheng Zhang, Bryan Hooi, FlipAttack: Jailbreak LLMs via Flipping. (ICML-25) [paper]
[EMNLP'25 Findings] Bo Yang, Qingping Yang, Yingwei Ma, Runtao Liu, UTMath: Math Evaluation with Unit Test via Reasoning-to-Coding Thoughts. arXiv preprint arXiv:2411.07240. [paper]
[NeurIPS'24] Yue Liu, Shihao Zhu, Jun Xia, Yingwei Ma, Jian Ma, Wenliang Zhong, Xinwang Liu, Guannan Zhang, Kejun Zhang, End-to-end learnable clustering for intent learning in recommendation. The 38th Annual Conference on Neural Information Processing Systems (NeurIPS-24) [paper]
[2025.05-]
Member of Technical Staff on the RL team, Moonshot AI.
[2024.02-2025.05]
Researcher in TONGYI Lab, Alibaba Cloud.
[2021.9-2023.12]
M.E. in National University of Defense Technology (NUDT).
Supervisor: Prof. Shanshan Li
National Scholarship (国家奖学金).
[2017.9-2021.6]
B.E. in Yanshan University (YSU), Qinhuangdao, Hebei Province.
Supervisor: Prof. Fengda Zhao
National Scholarship (国家奖学金).
[2014.9-2017.6]
High school in Zhuozhou Middle School, Hebei Province.
Provincial First Prize in the Mathematics Olympiad (数学奥林匹克竞赛省一等奖).