I am a Ph.D. student in the Institute of Artificial Intelligence and Robotics (IAIR) at Xi'an Jiaotong University, supervised by Prof. Ping Wei. I received my M.S. degree from Huazhong University of Science and Technology, supervised by Assoc. Prof. Gang Peng.
My research interests focus on robotics, embodied intelligence, multimodal learning, and video understanding. I aspire to develop AI technologies that genuinely benefit society.
Feel free to contact me for discussion or collaboration.
I have completed my Ph.D. dissertation defense.
One paper accepted by ICRA 2026. [WaveComm]
One paper accepted by IEEE Transactions on Multimedia. [Video Temporal Grounding]
One paper accepted by Pattern Recognition. [Video Temporal Grounding]
Invited speaker at ICCIR 2025, delivering an oral presentation on video multimodal learning. [ICCIR 2025]
Winner of the First Prize in the CVPR 2025 Robotwin Dual-Arm Collaboration Challenge (Real-World Track). [Robotwin Challenge]
Winner of the Second Prize in the CVPR 2025 Robotwin Dual-Arm Collaboration Challenge (Simulation Round 1). [Robotwin Challenge]
One paper accepted at CVPR 2024. [Video Temporal Grounding]
One paper accepted by IEEE Transactions on Multimedia. [Video Temporal Action Localization]
One paper accepted by IEEE Transactions on Circuits and Systems for Video Technology. [Video Temporal Action Localization]
I explore unmanned systems and embodied agents that perceive scenes, understand multimodal signals, and plan reliable actions under challenging real-world conditions.
Reliable multimodal signal acquisition for RGB, event, depth, LiDAR, and robot state.
Discriminative representations for temporal grounding, manipulation, and scene understanding.
Reasoning with multimodal large models, task-driven feedback, and embodied knowledge.
Robotic action planning, trajectory optimization, and real-world challenge systems.
Selected Publications. This list may not be up to date; please see Google Scholar for my latest publications.
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2026)
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2026) Project leader
IEEE Transactions on Circuits and Systems for Video Technology (TCSVT 2024)
2023
IEEE Transactions on Multimedia (TMM 2023)
CN Patent 魏平, 杨进. 基于多模态统一表征的视频语言时序定位方法及系统. No. 2025102048215
Method and system for video-language temporal grounding based on unified multimodal representations
CN Patent 彭刚, 杨进. 一种基于深度强化学习的机械臂运动规划方法和系统. No. 2022105019028
Robotic arm motion planning method and system based on deep reinforcement learning
CN Patent 彭刚, 杨进, 黎莉, 尹智. 一种智能清洗机器人路径规划方法及系统. No. 2021104000462
Path planning method and system for an intelligent cleaning robot
Real-World Track
Simulation Round 1
Oral presentation on video multimodal learning
Hyperbolic Multiview Pretraining for Generalizable Robotic Manipulation
Multiview pretraining for robotic manipulation with geometry-aware representation learning.
Multi-Task Robotic Manipulation
Multi-task robotic manipulation system for robust perception and action under embodied settings.
Moment Retrieval and Highlight Detection System
A web interface for video-language temporal grounding, moment retrieval, and highlight detection demos.
Dense reward and stage incentive mechanisms
Reinforcement learning systems for robotic trajectory planning and intelligent cleaning robot path planning.