About Me
I am an undergraduate majoring in Intelligent Science and Technology at China University of Geosciences (Wuhan), with a GPA of 4.03 / 5.00 (ranked 3rd of 51). I conducted research under Prof. Chang Tang (May – Oct 2024) and Assoc. Prof. Bo Zhao (Dec 2024 – May 2025).
My interests center on 3D vision and multimodal large language models (MLLMs), focusing on spatial-temporal perception, 3D scene understanding & reconstruction, and cross-modal information fusion.
Rather than incremental benchmark tuning, I strive to deliver research with lasting impact. My goals are to (1) define novel tasks and establish strong baselines; (2) tackle challenges once considered intractable through innovative approaches; and (3) develop broadly generalizable frameworks that unify related tasks.
Publications
We introduce STI-Bench, a comprehensive benchmark that evaluates MLLMs' spatial-temporal understanding through challenging tasks including object appearance estimation, pose prediction, displacement measurement, and motion analysis. The benchmark comprises 300+ videos and 2,000+ QA pairs across desktop, indoor, and outdoor scenarios, and shows that current state-of-the-art models still struggle with precise distance estimation and motion analysis.
Honors & Awards
- National First Prize, China Undergraduate Mathematical Contest in Modeling (2024)
- National Third Prize, China Engineering Robot Contest (2023)
- National Endeavor Scholarship (2023, 2024)