Mingyuan Zhang 张明远

I am a Ph.D. student at MMLab@NTU, Nanyang Technological University, supervised by Prof. Ziwei Liu. My research focuses on 3D human modeling. I am particularly interested in the perception and synthesis of human motion and human-environment interactions.

Previously, I was a full-time algorithm researcher at X-Lab@SenseTime Research, where I was advised by Prof. Hongsheng Li. I obtained my B.Eng. in Computer Science and Engineering from Beihang University, where I was advised by Prof. Xianglong Liu. I have also been fortunate to work with Haiyu Zhao and Lei Yang at SenseTime.

Email: mingyuan001@e.ntu.edu.sg / zhangmy718@gmail.com

Google Scholar / GitHub / LinkedIn


[2022-07]   One paper (HuMMan) accepted to ECCV 2022 for Oral presentation.

[2022-05]   One paper (AvatarCLIP) accepted to SIGGRAPH 2022 (journal track).

[2022-03]   Two papers, Balanced MSE (Oral) and GE-ViTs, accepted to CVPR 2022.

[2022-01]   One paper (BiBERT) accepted to ICLR 2022.

[2021-08]   Started my journey at MMLab@NTU!

Selected Publications  [Full list]

* indicates equal contribution, ✉ indicates corresponding / co-corresponding author

MotionDiffuse: Text-Driven Human Motion Generation with Diffusion Model

Mingyuan Zhang*, Zhongang Cai*, Liang Pan, Fangzhou Hong, Xinying Guo, Lei Yang, Ziwei Liu
arXiv, 2022
[Paper]  [Project Page]  [Video] [Code] [Colab Demo] [Hugging Face Demo]

The first text-driven motion generation pipeline based on diffusion models with probabilistic mapping, realistic synthesis and multi-level manipulation ability.

AvatarCLIP: Zero-Shot Text-Driven Generation and Animation of 3D Avatars

Fangzhou Hong*, Mingyuan Zhang*, Liang Pan, Zhongang Cai, Lei Yang, Ziwei Liu
ACM Transactions on Graphics (SIGGRAPH), 2022
[Paper]  [Project Page]  [Video] [Code] [Colab Demo]

AvatarCLIP is the first zero-shot text-driven pipeline that empowers lay users to generate and animate 3D avatars from natural language descriptions.

Balanced MSE for Imbalanced Visual Regression

Jiawei Ren, Mingyuan Zhang, Cunjun Yu, Ziwei Liu
Conference on Computer Vision and Pattern Recognition (CVPR), 2022 (Oral Presentation)
[Paper]  [Project Page]  [Talk] [Code] [Hugging Face Demo]

A statistically principled loss function that addresses the train/test mismatch in imbalanced regression and coincides with the supervised contrastive loss.

Delving Deep into the Generalization of Vision Transformers under Distribution Shifts

Chongzhi Zhang*, Mingyuan Zhang*, Shanghang Zhang*, Daisheng Jin, Qiang Zhou, Zhongang Cai, Haiyu Zhao, Shuai Yi, Xianglong Liu, Ziwei Liu
Conference on Computer Vision and Pattern Recognition (CVPR), 2022
[Paper]  [Code]

A systematic comparison of the generalization ability of CNNs and ViTs. Three representative generalization-enhancement techniques are applied to ViTs to further explore their inner properties.

Playing for 3D Human Recovery

Zhongang Cai*, Mingyuan Zhang*, Jiawei Ren*, Chen Wei, Daxuan Ren, Zhengyu Lin, Haiyu Zhao, Lei Yang, Chen Change Loy, Ziwei Liu
arXiv, 2021
[Paper]  [Code]

A large-scale synthetic human dataset collected using the GTA-5 game engine, providing a stable performance boost to both frame-based and video-based HMR.

BiBERT: Accurate Fully Binarized BERT

Haotong Qin*, Yifu Ding*, Mingyuan Zhang*, Qinghua Yan, Aishan Liu, Qingqing Dang, Ziwei Liu, Xianglong Liu
International Conference on Learning Representations (ICLR), 2022
[Paper]  [Code]

BiBERT is the first fully binarized BERT. It introduces an efficient Bi-Attention structure and a DMD scheme, which yield impressive 59.2x and 31.2x savings on FLOPs and model size, respectively.

BiPointNet: Binary Neural Network for Point Clouds

Haotong Qin*, Zhongang Cai*, Mingyuan Zhang*, Yifu Ding, Haiyu Zhao, Shuai Yi, Xianglong Liu✉, Hao Su
International Conference on Learning Representations (ICLR), 2021
[Paper]  [Code]

BiPointNet is the first fully binarized network for point cloud learning. BiPointNet gives an impressive 14.7x speedup and 18.9x storage saving on real-world resource-constrained devices.

Efficient Attention: Attention with Linear Complexities

Zhuoran Shen*, Mingyuan Zhang*, Haiyu Zhao, Shuai Yi, Hongsheng Li
Winter Conference on Applications of Computer Vision (WACV), 2021
[Paper]  [Code]

Efficient Attention reduces the memory and computational complexities of the attention mechanism from quadratic to linear. It demonstrates significant improvement in performance-cost trade-offs on a variety of tasks including object detection, instance segmentation, stereo depth estimation, and temporal action localization.

Updated: 2022-10-17