Yuhang Zheng (郑宇航)

I am currently a research intern at Lightillusions and EncoSmart, focusing on 3D perception and robotic manipulation. I am fortunate to work closely with Dr. Xiaoxiao Long, Dr. Chao Yang, and Prof. Ping Tan. Previously, I was a research intern at Tsinghua AIR, where I began studying robotic perception.

I am an active member of AnySyn3D, a non-profit research interest group of individuals with a strong interest in exploring research problems and cutting-edge technologies in any topic related to 3D.

I am actively looking for PhD opportunities starting in Fall 2025.

Email  /  Google Scholar  /  GitHub

Research

My long-term research goal is to build agents that efficiently understand the physical world and apply that knowledge to interact with it, fostering adaptability and enabling continuous skill acquisition. I am open to any related research and am currently focusing on 3D vision and robotic manipulation for Embodied AI:

  • 👀3D Vision: 3D scene understanding; Scene representations; 3D Generation.
  • 🤖Robotics: Robotic manipulation; Autonomous driving.
Publications
    (: corresponding author; *: equal contribution)
    GaussianGrasper: 3D Language Gaussian Splatting for Open-vocabulary Robotic Grasping
    Yuhang Zheng, Xiangyu Chen, Yupeng Zheng, Songen Gu, Runyi Yang, Bu Jin, Pengfei Li, Chengliang Zhong, Zengmao Wang, Lina Liu, Chao Yang, Dawei Wang, Zhen Chen, Xiaoxiao Long, Meiqing Wang.
    RA-L, 2024
    [Homepage] [arXiv] [Code]

    We introduced GaussianGrasper, a robotic grasping system built on a 3D Gaussian field endowed with open-vocabulary semantics and accurate geometry. The field supports rapid updates, enabling language-guided robotic grasping in the open world.

    Adaptive Surface Normal Constraint for Geometric Estimation from Monocular Images
    Xiaoxiao Long*, Yuhang Zheng*, Yupeng Zheng, Beiwen Tian, Cheng Lin, Lingjie Liu, Hao Zhao, Guyue Zhou, Wenping Wang.
    TPAMI, 2024
    [Homepage] [arXiv] [Code]

    We presented a simple but effective Adaptive Surface Normal (ASN) constraint that captures reliable geometric context and is used to jointly estimate high-quality depth and surface normals.

    Lift System Optimization for Hover-capable Flapping Wing Micro Air Vehicle
    Shengjie Xiao, Yongqi Shi, Zemin Wang, Zhe Ni, Yuhang Zheng, Huichao Deng, Xilun Ding
    Frontiers of Mechanical Engineering, 2024
    [Paper]

    We presented a new lift system with high lift and aerodynamic efficiency that effectively exploits the high-lift mechanism of hummingbirds to improve aerodynamic performance.

    ECT: Fine-grained Edge Detection with Learned Cause Tokens
    Shaocong Xu, Xiaoxue Chen, Yuhang Zheng, Guyue Zhou, Yurong Chen, Hongbin Zha, Hao Zhao
    Image and Vision Computing, 2024
    [arXiv] [Code]

    We designed (1) a two-stage transformer-based network that bridges fine-grained edges (reflectance, illumination, depth, and normal) and generic edge detection tasks, and (2) a cause-aware decoder that models edge causes as four learnable tokens.

    TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes
    Bu Jin, Yupeng Zheng, Pengfei Li, Weize Li, Yuhang Zheng, Sujie Hu, Xinyu Liu, Jinwei Zhu, Zhijie Yan, Haiyang Sun, Kun Zhan, Peng Jia, Xiaoxiao Long, Yilun Chen, Hao Zhao
    ECCV, 2024
    [Homepage] [arXiv] [Code]

    We introduced the new task of outdoor 3D dense captioning along with the TOD3Cap dataset. We also proposed the TOD3Cap network, which leverages the BEV representation to encode sparse outdoor scenes and combines a Relation Q-Former with LLaMA-Adapter to perform dense captioning in the open world.

    MonoOcc: Digging into Monocular Semantic Occupancy Prediction
    Yupeng Zheng*, Xiang Li*, Pengfei Li, Yuhang Zheng, Bu Jin, Chengliang Zhong, Xiaoxiao Long, Hao Zhao, Qichao Zhang
    ICRA, 2024
    [arXiv] [Code]

    We presented MonoOcc, a high-performance and efficient framework for monocular semantic occupancy prediction. We (1) proposed an auxiliary semantic loss as supervision and an image-conditioned cross-attention module to refine voxel features, and (2) employed a distillation module to transfer richer knowledge.

    3D Implicit Transporter for Temporally Consistent Keypoint Discovery
    Chengliang Zhong, Yuhang Zheng, Yupeng Zheng, Hao Zhao, Li Yi, Xiaodong Mu, Ling Wang, Pengfei Li, Guyue Zhou, Chao Yang, Xinliang Zhang, Jian Zhao
    ICCV, 2023, Oral presentation
    [arXiv] [Code]

    We presented 3D Implicit Transporter, a self-supervised method to discover temporally correspondent 3D keypoints from point cloud sequences. Extensive evaluations show that our keypoints are temporally consistent for both rigid and nonrigid object categories.

    INT2: Interactive Trajectory Prediction at Intersections
    Zhijie Yan, Pengfei Li, Zheng Fu, Shaocong Xu, Yongliang Shi, Xiaoxue Chen, Yuhang Zheng, Yang Li, Tianyu Liu, Chuxuan Li, Nairui Luo, Xu Gao, Yilun Chen, Zuoxu Wang, Yifeng Shi, Pengfei Huang, Zhengxiao Han, Jirui Yuan, Jiangtao Gong, Guyue Zhou, Hang Zhao, Hao Zhao
    ICCV, 2023
    [Homepage] [Dataset] [Code]

    We presented INT2 (short for INTeractive trajectory prediction at INTersections), a new interactive trajectory prediction dataset featuring high quality, large scale, and rich information.

    Enhancing Daily Life Through an Interactive Desktop Robotics System
    Yuhang Zheng*, Qiyao Wang*, Chengliang Zhong, He Liang, Zhengxiao Han, Yupeng Zheng
    CICAI, 2023 🏆Best demo award
    [Paper]

    We developed an intelligent desktop operating robot designed to assist humans in their daily lives by comprehending natural language with large language models and performing a variety of desktop-related tasks.

    DPF: Learning Dense Prediction Fields with Weak Supervision
    Xiaoxue Chen, Yuhang Zheng, Yupeng Zheng, Qiang Zhou, Hao Zhao, Guyue Zhou, Ya-Qin Zhang
    CVPR, 2023
    [arXiv] [Code]

    We proposed dense prediction fields (DPFs), a new paradigm that makes dense value predictions for point coordinate queries. An implicit neural function is used to model the DPFs, which are compatible with point-level supervision.

    Adapt: Action-aware Driving Caption Transformer
    Bu Jin, Xinyu Liu, Yupeng Zheng, Pengfei Li, Hao Zhao, Tong Zhang, Yuhang Zheng, Guyue Zhou, Jingjing Liu
    ICRA, 2023
    [arXiv] [Code]

    We presented Adapt (Action-aware Driving cAPtion Transformer), a new end-to-end transformer-based framework for generating action narration and reasoning for self-driving vehicles.

    STEPS: Joint Self-supervised Nighttime Image Enhancement and Depth Estimation
    Yupeng Zheng, Chengliang Zhong, Pengfei Li, Huan-ang Gao, Yuhang Zheng, Bu Jin, Ling Wang, Hao Zhao, Guyue Zhou, Qichao Zhang, Dongbin Zhao
    ICRA, 2023
    [arXiv] [Code]

    We presented STEPS, the first method that jointly learns a nighttime image enhancer and a depth estimator in a self-supervised manner. A newly proposed uncertain pixel masking strategy tightly entangles the two tasks.

Patents
  • Automatic Recognition Method for Groove Form and Joint Type of Welding Groove Schematic Diagram
    Meiqing Wang, Yuhang Zheng, Zijian Wu, Chenhao Ye, Hao Luo
    Chinese Invention Patent, Substantive Examination. CN117102738A, 2023
  • Automatic Groove Size Information Analysis Method for Welding Groove Schematic Diagram
    Meiqing Wang, Yuhang Zheng, Zijian Wu, Hao Luo, Chenhao Ye
    Chinese Invention Patent, Substantive Examination. CN117133010A, 2023
  • Process Knowledge Element Extraction Method for Welding Process Text
    Meiqing Wang, Yuhang Zheng, Jinjian Duan
    Chinese Invention Patent, Substantive Examination. CN115577709A, 2023
  • Flapping Wing Elastic Energy Storage Mechanism of Miniature Bionic Flapping Wing Aircraft
    Huichao Deng, Zhe Ni, Zemin Wang, Yuhang Zheng, Yongqi Shi, Shutong Zhang
    Chinese Invention Patent, Patent Grant. CN113148145B, 2022
  • Bionic Flapping Mechanism Applied to Hovering Type Micro Flapping Wing Aircraft
    Huichao Deng, Yongqi Shi, Yuhang Zheng, Zemin Wang, Zhe Ni, Shutong Zhang
    Chinese Invention Patent, Patent Grant. CN113148146B, 2022
Awards

  • Outstanding Graduate Student Award, 2024.
  • Alumni Scholarship, 2023.
  • Outstanding Undergraduate Award, 2022.
  • [First Prize] Chinese Mathematics Competitions, Chinese Mathematical Society, 2021.
Misc.

    Outside of research, I enjoy playing basketball 🏀, swimming 🏊, and traveling 🚢.


    Website template from Jon Barron.


    © Yuhang Zheng | Last update: July 9, 2024