Ruoxuan Zhang

PhD Candidate at Jilin University

GitHub | Hugging Face

About Me


I am a first-year Ph.D. student at the College of Computer Science, Jilin University, advised by Prof. Hongxia Xie. My research focuses on Recipe Generation, Sequential Image Synthesis, and Embodied Agents. I received my Master’s degree under the supervision of Prof. Dantong Ouyang. I welcome collaborations and discussions!

Email: zhangrx25@mails.jlu.edu.cn

Publications

Beyond Success: Refining Elegant Robot Manipulation from Mixed-Quality Data via Just-in-Time Intervention
CVPR, 2026
Yanbo Mao, Jianlong Fu, Ruoxuan Zhang, Hongxia Xie, Meibao Yao
[arXiv]
MindPower: Enabling Theory-of-Mind Reasoning in VLM-based Embodied Agents
CVPR, 2026
Ruoxuan Zhang, Qiyun Zheng, Zhiyu Zhou, Ziqi Liao, Siyu Wu, Jian-Yu Jiang-Lin, Bin Wen, Hongxia Xie, Jianlong Fu, Wen-Huang Cheng
[arXiv] [Homepage]
CookAnything: A Framework for Flexible and Consistent Multi-Step Recipe Image Generation
ACM Multimedia (MM), 2025
Ruoxuan Zhang, Bin Wen, Hongxia Xie, Yi Yao, Songhan Zuo, Jian-Yu Jiang-Lin, Hong-Han Shuai, Wen-Huang Cheng
[arXiv] [Homepage]
RecipeGen: A Step-Aligned Multimodal Benchmark for Real-World Recipe Generation
ACM Multimedia Dataset Track (MM Dataset), 2025
Ruoxuan Zhang, Jidong Gao, Bin Wen, Hongxia Xie, Chenming Zhang, Hong-Han Shuai, Wen-Huang Cheng
[arXiv] [Homepage] [Hugging Face]
EmoArt: A Multidimensional Dataset for Emotion-Aware Artistic Generation
ACM Multimedia Dataset Track (MM Dataset), 2025
Cheng Zhang, Hongxia Xie, Bin Wen, Songhan Zuo, Ruoxuan Zhang, Wen-Huang Cheng
[arXiv] [Homepage] [Hugging Face]

Scholarships

Education

Interests

My personal interests include ancient Chinese architecture and Chinese grotto art, and I am a BIG FAN of JJ Lin and David Tao!