About me

I am currently a second-year M.S. student in the CCIIP Lab at HUST, advised by Prof. Wei Wei. I received my B.S. in Computer Science and Technology from HUST.
My current research interests lie in RL algorithms for LLMs/MLLMs, especially for alignment and reasoning.

🔥 News

  • 2025/01 One paper was accepted to ICLR 2025
  • 2024/02 One paper was accepted to Findings of NAACL 2024
  • 2023/10 I received the National Scholarship (Top 1% in HUST)
  • 2023/05 One paper was accepted to the ACL 2023 main conference

📝 Selected Publications

(See the full list in the publications section or on [Google Scholar])
  • Process Reinforcement through Implicit Rewards [pdf] [code]
    Ganqu Cui*, Lifan Yuan*, Zefan Wang*, Hanbin Wang*, Wendi Li*, Bingxiang He*, Yuchen Fan*, Tianyu Yu*, Qixin Xu*, Weize Chen, Jiarui Yuan, Huayu Chen, Kaiyan Zhang, Xingtai Lv, Shuo Wang, Yuan Yao, Xu Han, Hao Peng, Yu Cheng, Zhiyuan Liu, Maosong Sun, Bowen Zhou, Ning Ding


  • Free Process Rewards without Process Labels [pdf] [code]
    Lifan Yuan*, Wendi Li*, Huayu Chen, Ganqu Cui, Ning Ding, Kaiyan Zhang, Bowen Zhou, Zhiyuan Liu, Hao Peng


  • Process Reward Model with Q-value Rankings [pdf] [code] (ICLR2025)
    Wendi Li, Yixuan Li

  • Reinforcement Learning with Token-level Feedback for Controllable Text Generation [pdf] [code] (Findings of NAACL2024)
    Wendi Li, Wei Wei, Kaihe Xu, Wenfeng Xie, Dangyang Chen, Yu Cheng


  • TREA: Tree-Structure Reasoning Schema for Conversational Recommendation [pdf] [code] (ACL2023)
    Wendi Li, Wei Wei, Xiaoye Qu, Xian-Ling Mao, Ye Yuan, Wenfeng Xie, Dangyang Chen


Misc

I enjoy listening to all kinds of music. I’m a big fan of Billie Eilish and Lana Del Rey.

I also love literature. Some of my favorite novelists are José Saramago, Albert Camus, Murakami Haruki, Yoko Tawada, and Xuetao Shuang.