Biography

Hi, I am Jin Yang. I received my Ph.D. degree in June 2026 from the Institute of Artificial Intelligence and Robotics (IAIR) at Xi'an Jiaotong University. I was supervised by Prof. Ping Wei.

My research mainly focuses on Multimodal Learning. I am interested in effective representation, fusion, and reasoning methods for multimodal information. My current goal is to build intelligent models and robotic systems that can perceive multimodal signals and make reliable decisions.

Feel free to contact me for discussion or collaboration.

Multimodal Learning, Robotics & Embodied Intelligence, Reinforcement Learning

News

  • Two papers accepted by IROS 2026. [RVAF][D6SR]

  • I successfully defended my Ph.D. dissertation.

  • One paper accepted by ICRA 2026. [WaveComm]

  • One paper accepted by IEEE Transactions on Multimedia. [VABooster]

  • One paper accepted by Pattern Recognition. [UPM]

  • Invited speaker at ICCIR 2025, delivering an oral presentation on video multimodal learning. [ICCIR 2025]

  • Winner of the First Prize in the CVPR 2025 Robotwin Dual-Arm Collaboration Challenge (Real-World Track). [Robotwin Challenge]

  • Winner of the Second Prize in the CVPR 2025 Robotwin Dual-Arm Collaboration Challenge (Simulation Round 1). [Robotwin Challenge]

  • One paper accepted at CVPR 2024. [TaskWeave]

  • One paper accepted by IEEE Transactions on Multimedia. [TransGMC]

  • One paper accepted by IEEE Transactions on Circuits and Systems for Video Technology. [TFFormer]

Publications

Selected publications. This list may not be up to date; please see Google Scholar for my latest publications.

Differential Amplifier-Inspired AmpAttention for Multi-View Robotic Manipulation

Jin Yang, Ping Wei, Nanning Zheng

IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2026)

A Mechanically Decoupled Six-Axis Spherical Rolling Robot For Stable Propeller-Driven Rolling

Xijian Deng, Leqi Ding, Jiayi Chen, Ping Wei, Jin Yang

IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2026) | Project leader

Lightweight Communication for Collaborative Perception via Wavelet Feature Distillation

Jin Yang, Erdemt Bao

IEEE International Conference on Robotics & Automation (ICRA 2026) | Co-first author & Project leader

Hyperbolic Multiview Pretraining for Robotic Manipulation

Jin Yang, Ping Wei, Yixin Chen

2026

Learning Visual-Audio Dissonance for Moment Retrieval and Highlight Detection

Jin Yang, Ping Wei, Nanning Zheng

IEEE Transactions on Multimedia (TMM 2025)

Learning Unified Patterns of Multimodalities for Joint Moment Retrieval and Highlight Detection

Jin Yang, Ping Wei

Pattern Recognition (PR 2025)

Task-Driven Exploration: Decoupling and Inter-Task Feedback for Joint Moment Retrieval and Highlight Detection

Jin Yang, Ping Wei, Huan Li, Ziyang Ren

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2024)

Cross Time-Frequency Transformer for Temporal Action Localization

Jin Yang, Ping Wei, Nanning Zheng

IEEE Transactions on Circuits and Systems for Video Technology (TCSVT 2024)

Perceptual Consistency-Driven Abstraction via Minimax Optimization for Joint Moment Retrieval and Highlight Detection

Jin Yang, Ping Wei, Huan Li, Nanning Zheng

2023

Gated Multi-Scale Transformer for Temporal Action Localization

Jin Yang, Ping Wei, Ziyang Ren, Nanning Zheng

IEEE Transactions on Multimedia (TMM 2023)

Patents

CN Patent: 魏平 (Ping Wei), 杨进 (Jin Yang). 基于多模态统一表征的视频语言时序定位方法及系统. No. 2025102048215

Method and system for video-language temporal grounding based on unified multimodal representations

CN Patent: 彭刚 (Gang Peng), 杨进 (Jin Yang). 一种基于深度强化学习的机械臂运动规划方法和系统. No. 2022105019028

Robotic arm motion planning method and system based on deep reinforcement learning

CN Patent: 彭刚 (Gang Peng), 杨进 (Jin Yang), 黎莉 (Li Li), 尹智 (Zhi Yin). 一种智能清洗机器人路径规划方法及系统. No. 2021104000462

Path planning method and system for an intelligent cleaning robot

Awards

  • 2025 National Scholarship for Doctoral Students
  • 2025 "Hu Baosheng" Outstanding Student Scholarship, Rank 1, Speech by the Student Representative
  • 2023 Honor of Excellent Postgraduate at University Level
  • 2021 Huawei Scholarship
  • 2020 Honor of Merit Postgraduate at University Level
  • 2020/2017/2016 First Prize of Scholarship at University Level
  • 2020 Shenzhen Stock Exchange Scholarship
  • 2019 Third Prize of World Robot Contest (Trico-Robot Challenge), Rank 2
  • 2018 Third Prize of National College Student Robot Contest (Robocon), Rank 1, Team Leader
  • 2018 Second Prize of USA Undergraduate Mathematical Modeling Contest, Rank 1
  • 2017 First Prize of National Undergraduate IoT Design Contest, Rank 1
  • 2017 First Prize of Tianjin Undergraduate Robot Contest (Robot Creative Design), Rank 1
  • 2017 Tianjin Undergraduate Innovation and Entrepreneurship Project, Project Leader

Projects

WebUI for Joint Video Moment Retrieval and Highlight Detection

A WebUI for video moment retrieval and highlight detection. It retrieves moments of interest from long untrimmed videos and supports zero-shot adaptation to open-set content.