About Me

Yingwei Ma (马迎伟) is a Member of Technical Staff on the RL team @ Moonshot AI. His research lies at the intersection of foundation models and software engineering. He had planned to pursue his PhD at HKUST under the supervision of Prof. S.C. CHEUNG, but decided to postpone this plan as he got AGI-pilled. Currently, he focuses on using agent techniques (or agentic LLMs) to solve end-to-end SE/research problems.

News

[2026.01] Introducing Kimi-K2.5: It seamlessly integrates vision and language understanding with advanced agentic capabilities, instant and thinking modes, as well as conversational and agentic paradigms. [Github] [Huggingface]
[2025.11] Two papers have been accepted by AAAI'26.
[2025.11] Introducing Kimi-K2-Thinking: Kimi K2 Thinking is the latest, most capable version of the open-source thinking model. Starting with Kimi K2, we built it as a thinking agent that reasons step-by-step while dynamically invoking tools. [Github] [Huggingface]
[2025.09] One paper has been accepted by ASE'25.
[2025.09] Introducing Kimi-K2-0905: Kimi K2-Instruct-0905 demonstrates significant improvements in performance on public benchmarks and real-world coding agent tasks. [Github] [Huggingface]
[2025.08] Two papers have been accepted by EMNLP'25 Findings.
[2025.07] Introducing Kimi-K2: An Open-source LLM for Agentic Intelligence. [Github] [Huggingface] [Paper]
[2025.06] Introducing Kimi-Dev: A Strong and Open-source Coding LLM for Issue Resolution. [Github] [Huggingface]
[2025.06] I won the ACM SIGSOFT Distinguished Paper Award.
[2025.03] One patent on code processing method has been published.
[2025.03] Alibaba LingmaAgent has been accepted by FSE'25.
[2025.01] One paper has been accepted by ICLR'25.
[2025.01] Two patents on code processing and intelligent software development methods have been published.
[2024.12] One paper has been accepted by ISSTA'25.
[2024.09] One paper has been accepted by NeurIPS'24.
[2024.06] Alibaba LingmaAgent obtained SOTA on SWE-Bench Lite.
[2024.01] One paper has been accepted by ICLR'24 (Spotlight, top 5%).
[2023.05] One paper has been accepted by Internetware'23.
[2023.03] I won the IEEE TCSE Distinguished Paper Award.
[2023.03] Two papers have been accepted by SANER'23.
[2022.09] I was named an Outstanding Student of the School of Computer Science, National University of Defense Technology.

Publications

Preprints:

[arxiv'26] Jiaran Zhang, Luck Ma, Yanhao Li, Fanqi Wan, Di Qi, Xu Zhao, Jieyi Hou, Zhe Xie, Mengqiang Ren, Xin Wu, Zhewei Huang, Liangyu Chen, Yingwei Ma, Qi Han, Xiangyu Zhang, DOCKSMITH: Scaling Reliable Coding Environments via an Agentic Docker Builder. arXiv preprint arXiv:2602.00592. [paper] [dataset]
[arxiv'25] Kelin Fu, Tianyu Liu, Zeyu Shang, Yingwei Ma, Jian Yang, Jiaheng Liu, Kaigui Bian, Multi-Docker-Eval: A ‘Shovel of the Gold Rush’ Benchmark on Automatic Environment Building for Software Engineering. arXiv preprint arXiv:2512.06915v2. [paper] [benchmark]
[arxiv'25] Yihong Dong, Xue Jiang, Yongding Tao, Huanyu Liu, Kechi Zhang, Lili Mou, Rongyu Cao, Yingwei Ma, Jue Chen, Binhua Li, Zhi Jin, Fei Huang, Yongbin Li, Ge Li, RL-PLUS: Countering Capability Boundary Collapse of LLMs in Reinforcement Learning. arXiv preprint arXiv:2508.00222. [paper]
[arxiv'24] Yalan Lin, Yingwei Ma, Rongyu Cao, Binhua Li, Fei Huang, Xiaodong Gu, Yongbin Li, LLMs as Continuous Learners: Improving the Reproduction of Defective Code in Software Issues. arXiv preprint arXiv:2411.13941. [paper]
[arxiv'24] Zhenyu Pan, Rongyu Cao, Yongchang Cao, Yingwei Ma, Binhua Li, Fei Huang, Han Liu, Yongbin Li, Codev-Bench: How Do LLMs Understand Developer-Centric Code Completion?. arXiv preprint arXiv:2410.01353. [paper]

Peer-Reviewed Papers and Technical Reports:

[Technical Report] Yingwei Ma (Co-author), I did my best to enhance K2.5’s agentic capabilities across SWE, terminal usage, research scenarios, and security, Kimi K2.5: Visual Agentic Intelligence. arXiv preprint arXiv:2602.02276. [paper]
[Technical Report] Yingwei Ma (Co-author), contributed to Coding Agentic Capabilities for Kimi K2, Kimi K2: Open Agentic Intelligence. arXiv preprint arXiv:2507.20534. [paper]
[Technical Report] Yingwei Ma (Co-author), introducing Kimi-Dev: A Strong and Open-source Coding LLM for Issue Resolution, Kimi-Dev: Agentless Training as Skill Prior for SWE-Agents. arXiv preprint arXiv:2509.23045. [paper]
[ASE'25] Yingwei Ma, Binhua Li, Yihong Dong, Xue Jiang, Rongyu Cao, Fei Huang, Yongbin Li, Thinking Longer, Not Larger: Enhancing Software Engineering Agents via Scaling Test-Time Compute. arXiv preprint arXiv:2503.23803. ASE’25, CCF-A. Accepted as an Industry Full Paper. [paper]
[ISSTA'25 🏆] Yingwei Ma, Rongyu Cao, Yongchang Cao, Yue Zhang, Jue Chen, Yibo Liu, Yuchen Liu, Binhua Li, Fei Huang, Yongbin Li, Lingma SWE-GPT: An Open Development-Process-Centric Language Model for Automated Software Improvement. The ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA’25), CCF-A. (ACM SIGSOFT Distinguished Paper Award, The Best Paper Award at The Conference) [paper] [link]
[FSE'25] Yingwei Ma, Qingping Yang, Rongyu Cao, Binhua Li, Fei Huang, Yongbin Li, Alibaba LingmaAgent: Improving Automated Issue Resolution via Comprehensive Repository Exploration. arXiv preprint arXiv:2406.01422. FSE’25, CCF-A. Accepted as an Industry Full Paper. [paper]
[ICLR'24] Yingwei Ma, Yue Liu, Yue Yu, Yuanliang Zhang, Yu Jiang, Changjian Wang, Shanshan Li, At Which Training Stage Does Code Data Help LLMs Reasoning?. The 12th International Conference on Learning Representations (ICLR-24), Vienna, Austria, May 7th-11th, 2024. (Spotlight, Top 5%) [paper]
[SANER'23 🏆] Yingwei Ma, Yue Yu, Shanshan Li, Zhouyang Jia, Jun Ma, Rulin Xu, Wei Dong and Xiangke Liao, MulCS: Towards a Unified Code Representation for Multilingual Code Search. 30th IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), Macao SAR, China, March 21st-24th, 2023. (IEEE TCSE Distinguished Paper Award, The Best Paper Award at The Conference) [paper]
[AAAI'26] Zhenhao Zhu, Yue Liu, Yingwei Ma, Hongcheng Gao, Nuo Chen, Yanpei Guo, Wenjie Qu, Huiying Xu, Xinzhong Zhu, Jiaheng Zhang, ExtendAttack: Attacking Servers of LRMs via Extending Reasoning. The 40th Annual AAAI Conference on Artificial Intelligence (AAAI-26) [paper]
[AAAI'26] Xue Jiang, Yihong Dong, Zheng Fang, Yingwei Ma, Tangxinyu Wang, Rongyu Cao, Binhua Li, Zhi Jin, Wenpin Jiao, Yongbin Li, Ge Li, Large Language Model Unlearning for Source Code. The 40th Annual AAAI Conference on Artificial Intelligence (AAAI-26) [paper]
[ICLR'25] Jie Cheng, Ruixi Qiao, Yingwei Ma, Gang Xiong, Qinghai Miao, Binhua Li, Yongbin Li, Yisheng Lv, Scaling Offline Model-Based RL via Jointly-Optimized World-Action Model Pretraining. The Thirteenth International Conference on Learning Representations (ICLR-25) [paper]
[ICML'25] Yue Liu, Xiaoxin He, Miao Xiong, Yingwei Ma, Jiaheng Zhang, Bryan Hooi, FlipAttack: Jailbreak LLMs via Flipping. (ICML-25) [paper]
[EMNLP'25 Findings] Bo Yang, Qingping Yang, Yingwei Ma, Runtao Liu, UTMath: Math Evaluation with Unit Test via Reasoning-to-Coding Thoughts. arXiv preprint arXiv:2411.07240. [paper]
[NeurIPS'24] Yue Liu, Shihao Zhu, Jun Xia, Yingwei Ma, Jian Ma, Wenliang Zhong, Xinwang Liu, Guannan Zhang, Kejun Zhang, End-to-end learnable clustering for intent learning in recommendation. The 38th Annual Conference on Neural Information Processing Systems (NeurIPS-24) [paper]

Patents

[2025] Yingwei Ma. 异常数据复现方法以及异常代码复现方法 (Abnormal data reproduction method and abnormal code reproduction method). Patent Application No. CN119166410B, Alibaba (China) Co., Ltd. (Application Date: 2025.02.14, Publication Date: 2025.02.14)
[2024] Yingwei Ma. 代码处理方法以及代码修复测试方法 (Code Processing Method and Code Repair Testing Method). Patent Application No. CN118760612A, Alibaba (China) Co., Ltd. (Application Date: 2024.09.04, Publication Date: 2024.10.11)
[2024] Yingwei Ma. 代码处理模型训练、代码任务处理以及代码开发方法 (Code Processing Model Training, Code Task Processing and Code Development Method). Patent Application No. CN118860900A, Alibaba (China) Co., Ltd. (Application Date: 2024.09.19, Publication Date: 2024.10.29)

Experience

[2025.05-]

Member of Technical Staff on the RL team, Moonshot AI.
[2024.02-2025.05]

Researcher at TONGYI Lab, Alibaba Cloud.

Education

[2021.9-2023.12]

M.E. at National University of Defense Technology (NUDT).

Supervisors: Prof. Xiangke Liao and Shanshan Li

National Scholarship (国家奖学金).
[2017.9-2021.6]

B.E. at Yanshan University (YSU), Qinhuangdao, Hebei Province.

Supervisor: Prof. Fengda Zhao

National Scholarship (国家奖学金).
[2014.9-2017.6]

High school at Zhuozhou Middle School, Hebei Province.

Provincial First Prize in the Mathematics Olympiad (数学奥林匹克竞赛省一等奖).

Academic Service

Program committee member for AAAI’25
Program committee member for EXPRESS@ISSTA’25
Reviewer for ICML’25, ICLR’25, NeurIPS’25
Reviewer for NeurIPS’24, ICLR’24
Reviewer for EMNLP’23