Yuhang Zheng (郑宇航)
I am currently a research intern at Lightillusions and EncoSmart, focusing on 3D perception and robotic manipulation, where I am fortunate to work closely with Dr. Xiaoxiao Long, Dr. Chao Yang, and Prof. Ping Tan. Previously, I was a research intern at Tsinghua AIR, where I began studying robotic perception.
I am an active member of AnySyn3D, a non-profit research interest group of people with a strong interest in exploring research problems and cutting-edge technologies across all topics in 3D.
I am actively looking for PhD opportunities starting in Fall 2025.
Email /
Google Scholar /
GitHub /
|
|
Research
My long-term research goal is to build agents that efficiently understand the physical world and apply that understanding to interact with it, adapting to new environments and continuously acquiring new skills.
I am therefore open to any related research, and I am currently focusing on
{3D Vision, Robotic Manipulation} for Embodied AI:
👀3D Vision: 3D scene understanding; Scene representations; 3D Generation.
🤖Robotics: Robotic manipulation; Autonomous driving.
|
(†: corresponding author; *: equal contribution)
|
GaussianGrasper: 3D Language Gaussian Splatting for Open-vocabulary Robotic Grasping
Yuhang Zheng,
Xiangyu Chen,
Yupeng Zheng,
Songen Gu,
Runyi Yang,
Bu Jin,
Pengfei Li,
Chengliang Zhong,
Zengmao Wang,
Lina Liu,
Chao Yang,
Dawei Wang,
Zhen Chen,
Xiaoxiao Long†,
Meiqing Wang†.
RA-L, 2024
[Homepage]
[arXiv]
[Code]
We introduced GaussianGrasper, a robotic grasping system built on a 3D Gaussian field endowed with open-vocabulary semantics and accurate geometry, which supports rapid updates and enables language-guided, open-world robotic grasping.
|
|
Adaptive Surface Normal Constraint for Geometric Estimation from Monocular Images
Xiaoxiao Long*,
Yuhang Zheng*,
Yupeng Zheng,
Beiwen Tian,
Cheng Lin,
Lingjie Liu,
Hao Zhao†,
Guyue Zhou,
Wenping Wang†.
TPAMI, 2024
[Homepage]
[arXiv]
[Code]
We presented a simple yet effective Adaptive Surface Normal (ASN) constraint that captures reliable geometric context and is used to jointly estimate high-quality depth and surface normals.
|
|
Lift System Optimization for Hover-capable Flapping Wing Micro Air Vehicle
Shengjie Xiao,
Yongqi Shi,
Zemin Wang,
Zhe Ni,
Yuhang Zheng,
Huichao Deng†,
Xilun Ding
Frontiers of Mechanical Engineering, 2024
[Paper]
We presented a new lift system with high lift and aerodynamic efficiency, which effectively exploits the high-lift mechanism of hummingbirds to improve aerodynamic performance.
|
|
ECT: Fine-grained Edge Detection with Learned Cause Tokens
Shaocong Xu,
Xiaoxue Chen,
Yuhang Zheng,
Guyue Zhou,
Yurong Chen,
Hongbin Zha,
Hao Zhao†
Image and Vision Computing, 2024
[arXiv]
[Code]
We designed (1) a two-stage transformer-based network that bridges fine-grained edge detection (reflectance, illumination, depth, and normal edges) with generic edge detection, and (2) a cause-aware decoder that models edge causes as four learnable tokens.
|
|
TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes
Bu Jin,
Yupeng Zheng†,
Pengfei Li,
Weize Li,
Yuhang Zheng,
Sujie Hu,
Xinyu Liu,
Jinwei Zhu,
Zhijie Yan,
Haiyang Sun,
Kun Zhan,
Peng Jia,
Xiaoxiao Long,
Yilun Chen,
Hao Zhao
ECCV, 2024
[Homepage]
[arXiv]
[Code]
We introduced the new task of outdoor 3D dense captioning together with the TOD3Cap dataset, and proposed the TOD3Cap network, which leverages BEV representations to encode sparse outdoor scenes and combines a Relation Q-Former with LLaMA-Adapter for open-world dense captioning.
|
|
MonoOcc: Digging into Monocular Semantic Occupancy Prediction
Yupeng Zheng*,
Xiang Li*,
Pengfei Li,
Yuhang Zheng,
Bu Jin,
Chengliang Zhong,
Xiaoxiao Long,
Hao Zhao,
Qichao Zhang†
ICRA, 2024
[arXiv]
[Code]
We presented MonoOcc, a high-performance and efficient framework for monocular semantic occupancy prediction. We (1) proposed an auxiliary semantic loss as supervision and an image-conditioned cross-attention module to refine voxel features, and (2) employed a distillation module to transfer richer knowledge.
|
|
3D Implicit Transporter for Temporally Consistent Keypoint Discovery
Chengliang Zhong,
Yuhang Zheng,
Yupeng Zheng,
Hao Zhao†,
Li Yi,
Xiaodong Mu,
Ling Wang,
Pengfei Li,
Guyue Zhou,
Chao Yang,
Xinliang Zhang,
Jian Zhao
ICCV, 2023, Oral presentation
[arXiv]
[Code]
We presented the 3D Implicit Transporter, a self-supervised method for discovering temporally corresponding 3D keypoints from point cloud sequences. Extensive evaluations show that our keypoints are temporally consistent for both rigid and non-rigid object categories.
|
|
INT2: Interactive Trajectory Prediction at Intersections
Zhijie Yan,
Pengfei Li,
Zheng Fu,
Shaocong Xu,
Yongliang Shi,
Xiaoxue Chen,
Yuhang Zheng,
Yang Li,
Tianyu Liu,
Chuxuan Li,
Nairui Luo,
Xu Gao,
Yilun Chen,
Zuoxu Wang,
Yifeng Shi,
Pengfei Huang,
Zhengxiao Han,
Jirui Yuan,
Jiangtao Gong,
Guyue Zhou,
Hang Zhao,
Hao Zhao†
ICCV, 2023
[Homepage]
[Dataset]
[Code]
We presented INT2, a new interactive trajectory prediction dataset (short for INTeractive trajectory prediction at INTersections) featuring high quality, large scale, and rich information.
|
|
Enhancing Daily Life Through an Interactive Desktop Robotics System
Yuhang Zheng*,
Qiyao Wang*,
Chengliang Zhong†,
He Liang,
Zhengxiao Han,
Yupeng Zheng†
CICAI, 2023 🏆Best demo award
[Paper]
We developed an intelligent desktop robot designed to assist humans in daily life by understanding natural language with large language models and performing a variety of desktop-related tasks.
|
|
DPF: Learning Dense Prediction Fields with Weak Supervision
Xiaoxue Chen,
Yuhang Zheng,
Yupeng Zheng,
Qiang Zhou,
Hao Zhao†,
Guyue Zhou,
Ya-Qin Zhang
CVPR, 2023
[arXiv]
[Code]
We proposed dense prediction fields (DPFs), a new paradigm that makes dense value predictions for point coordinate queries. DPFs are modeled by an implicit neural function and are naturally compatible with point-level weak supervision.
|
|
ADAPT: Action-aware Driving Caption Transformer
Bu Jin,
Xinyu Liu,
Yupeng Zheng,
Pengfei Li,
Hao Zhao†,
Tong Zhang,
Yuhang Zheng,
Guyue Zhou,
Jingjing Liu
ICRA, 2023
[arXiv]
[Code]
We presented ADAPT (Action-aware Driving cAPtion Transformer), a new end-to-end transformer-based framework for generating action narration and reasoning for self-driving vehicles.
|
|
STEPS: Joint Self-supervised Nighttime Image Enhancement and Depth Estimation
Yupeng Zheng,
Chengliang Zhong,
Pengfei Li,
Huan-ang Gao,
Yuhang Zheng,
Bu Jin,
Ling Wang,
Hao Zhao,
Guyue Zhou,
Qichao Zhang,
Dongbin Zhao†
ICRA, 2023
[arXiv]
[Code]
We presented STEPS, the first method that jointly learns a nighttime image enhancer and a depth estimator in a self-supervised manner. A newly proposed uncertain-pixel masking strategy tightly entangles the two tasks.
|
Patents
Automatic Recognition Method for Groove Form and Joint Type of Welding Groove Schematic Diagram
Meiqing Wang, Yuhang Zheng, Zijian Wu, Chenhao Ye, Hao Luo
Chinese Invention Patent, Substantive Examination. CN117102738A, 2023
Automatic Groove Size Information Analysis Method for Welding Groove Schematic Diagram
Meiqing Wang, Yuhang Zheng, Zijian Wu, Hao Luo, Chenhao Ye
Chinese Invention Patent, Substantive Examination. CN117133010A, 2023
Process Knowledge Element Extraction Method for Welding Process Text
Meiqing Wang, Yuhang Zheng, Jinjian Duan
Chinese Invention Patent, Substantive Examination. CN115577709A, 2023
Flapping Wing Elastic Energy Storage Mechanism of Miniature Bionic Flapping Wing Aircraft
Huichao Deng, Zhe Ni, Zemin Wang, Yuhang Zheng, Yongqi Shi, Shutong Zhang
Chinese Invention Patent, Patent Grant. CN113148145B, 2022
Bionic Flapping Mechanism Applied to Hovering Type Micro Flapping Wing Aircraft
Huichao Deng, Yongqi Shi, Yuhang Zheng, Zemin Wang, Zhe Ni, Shutong Zhang
Chinese Invention Patent, Patent Grant. CN113148146B, 2022
|
Awards
Outstanding Graduate Student Award, 2024.
Alumni Scholarship, 2023.
Outstanding Undergraduate Award, 2022.
[First Prize] Chinese Mathematics Competitions, Chinese Mathematical Society, 2021.
|
Misc.
Outside of research, I enjoy playing basketball🏀, swimming🏊 and traveling🚢.