About me
Di He (贺笛) is an Assistant Professor at Peking University. He was previously a senior researcher in the Machine Learning Group at Microsoft Research Asia. He obtained his bachelor's, master's, and Ph.D. degrees from Peking University, advised by Liwei Wang.
Di’s main research focuses on generative models (e.g., large language models) and AI4Science. The primary goal of his work is to develop efficient algorithms that capture accurate and robust features from data through deep neural networks. To achieve this goal, Di focuses on building a deeper understanding of different neural network architectures, the practical scenarios they suit, and their optimization processes. Di won the ICLR 2023 Outstanding Paper Award and an ICLR 2024 Outstanding Paper Honorable Mention. He has served on the program committees and senior program committees of top machine learning and artificial intelligence conferences, such as ICML, NeurIPS, and ICLR.
Publications (Full List)
Kai Yang, Jan Ackermann, Zhenyu He, Guhao Feng, Bohang Zhang, Yunzhen Feng, Qiwei Ye, Di He, Liwei Wang, “Do Efficient Transformers Really Save Computation?”, ICML 2024
Mingqing Xiao, Yixin Zhu, Di He, Zhouchen Lin, “Temporal Spiking Neural Networks with Synaptic Delay for Graph Reasoning”, ICML 2024
Tianlang Chen, Shengjie Luo, Di He, Shuxin Zheng, Tie-Yan Liu, Liwei Wang, “GeoMFormer: A General Architecture for Geometric Molecular Representation Learning”, ICML 2024
Zhenyu He, Guhao Feng, Shengjie Luo, Kai Yang, Liwei Wang, Jingjing Xu, Zhi Zhang, Hongxia Yang, Di He, “Two Stones Hit One Bird: Bilevel Positional Encoding for Better Length Extrapolation”, ICML 2024
Zhenyu He, Zexuan Zhong, Tianle Cai, Jason D. Lee, Di He, “REST: Retrieval-Based Speculative Decoding”, NAACL 2024
Krzysztof Marcin Choromanski, Shanda Li, Valerii Likhosherstov, Kumar Avinava Dubey, Shengjie Luo, Di He, Yiming Yang, Tamas Sarlos, Thomas Weingarten, Adrian Weller, “Learning a Fourier Transform for Linear Relative Positional Encodings in Transformers”, AISTATS 2024
Shuqi Lu, Lin Yao, Xi Chen, Hang Zheng, Di He, Guolin Ke, “3D Molecular Generation via Virtual Dynamics”, TMLR
Ruichen Li, Haotian Ye, Du Jiang, Xuelan Wen, Chuwei Wang, Zhe Li, Xiang Li, Di He, Ji Chen, Weiluo Ren, Liwei Wang, “A Computational Framework for Neural Network-based Variational Monte Carlo with Forward Laplacian”, Nature Machine Intelligence
Bohang Zhang, Jingchu Gai, Yiheng Du, Qiwei Ye, Di He, Liwei Wang, “Beyond Weisfeiler-Lehman: A Quantitative Framework for GNN Expressiveness”, ICLR 2024 Outstanding Paper Honorable Mention
Mingqing Xiao, Qingyan Meng, Zongpeng Zhang, Di He, Zhouchen Lin, “Hebbian Learning based Orthogonal Projection for Continual Learning of Spiking Neural Networks”, ICLR 2024
Guhao Feng, Bohang Zhang, Yuntian Gu, Haotian Ye, Di He, Liwei Wang, “Towards Revealing the Mystery behind Chain of Thought: a Theoretical Perspective”, NeurIPS 2023 Oral, top 0.7%
Haiyang Wang, Chen Shi, Shaoshuai Shi, Meng Lei, Sen Wang, Di He, Bernt Schiele, Liwei Wang, “DSVT: Dynamic Sparse Voxel Transformer with Rotated Sets”, CVPR 2023
Bohang Zhang, Guhao Feng, Yiheng Du, Di He, Liwei Wang, “A Complete Expressiveness Hierarchy for Subgraph GNNs via Subgraph Weisfeiler-Lehman Tests”, ICML 2023
Bohang Zhang, Shengjie Luo, Liwei Wang, Di He, “Rethinking the Expressive Power of GNNs via Graph Biconnectivity”, ICLR 2023 Outstanding Paper Award
Shengjie Luo, Tianlang Chen, Yixian Xu, Shuxin Zheng, Tie-Yan Liu, Liwei Wang, Di He, “One Transformer Can Understand Both 2D & 3D Molecular Data”, ICLR 2023
QuanLin Wu, Hang Ye, Yuntian Gu, Huishuai Zhang, Liwei Wang, Di He, “Denoising Masked Autoencoders are Certifiable Robust Vision Learners”, ICLR 2023
Di He, Shanda Li, Wenlei Shi, Xiaotian Gao, Jia Zhang, Jiang Bian, Liwei Wang, Tie-Yan Liu, “Learning Physics-Informed Neural Networks without Stacked Back-propagation”, AISTATS 2023
Huishuai Zhang, Da Yu, Yiping Lu, Di He, “Adversarial Noises Are Linearly Separable for (Nearly) Random Neural Networks”, AISTATS 2023
Shengjie Luo, Shanda Li, Shuxin Zheng, Tie-Yan Liu, Liwei Wang, Di He, “Your Transformer May Not be as Powerful as You Expect”, NeurIPS 2022
Chuwei Wang, Shanda Li, Di He, Liwei Wang, “Is $L^2$ Physics Informed Loss Always Suitable for Training Physics Informed Neural Network?”, NeurIPS 2022
Bohang Zhang, Du Jiang, Di He, Liwei Wang, “Rethinking Lipschitz Neural Networks for Certified L-infinity Robustness”, NeurIPS 2022 Oral, 1.7% acceptance rate
Mingqing Xiao, Qingyan Meng, Zongpeng Zhang, Di He, Zhouchen Lin, “Online Training Through Time for Spiking Neural Networks”, NeurIPS 2022
Rui Li, Jianan Zhao, Chaozhuo Li, Di He, Yiqi Wang, Yuming Liu, Hao Sun, Senzhang Wang, Weiwei Deng, Yanming Shen, Xing Xie, Qi Zhang, “HousE: Knowledge Graph Embedding with Householder Parameterization”, ICML 2022
Tianyu Pang, Huishuai Zhang, Di He, Yinpeng Dong, Hang Su, Wei Chen, Jun Zhu, Tie-Yan Liu, “Two Coupled Rejection Metrics Can Tell Adversarial Examples Apart”, CVPR 2022
Bohang Zhang, Du Jiang, Di He, Liwei Wang, “Boosting the Certified Robustness of L-infinity Distance Nets”, ICLR 2022
Shengjie Luo, Shanda Li, Tianle Cai, Di He, Dinglan Peng, Shuxin Zheng, Guolin Ke, Liwei Wang, Tie-Yan Liu, “Stable, Fast and Accurate: Kernelized Attention with Relative Positional Encoding”, NeurIPS 2021
Chengxuan Ying, Tianle Cai, Shengjie Luo, Shuxin Zheng, Guolin Ke, Di He, Yanming Shen, Tie-Yan Liu, “Do Transformers Really Perform Bad for Graph Representation?”, NeurIPS 2021, Winner of KDD Cup 2021 – Graph Prediction Track
Shuqi Lu, Di He, Chenyan Xiong, Guolin Ke, Waleed Malik, Zhicheng Dou, Paul Bennett, Tie-Yan Liu, Arnold Overwijk, “Less is More: Pre-train a Strong Text Encoder for Dense Retrieval Using a Weak Decoder”, EMNLP 2021
Dinglan Peng, Shuxin Zheng, Yatao Li, Guolin Ke, Di He, Tie-Yan Liu, “How could Neural Networks Understand Programs?”, ICML 2021
Bohang Zhang, Tianle Cai, Zhou Lu, Di He, Liwei Wang, “Towards Certifying ℓ∞ Robustness using Neural Networks with ℓ∞-dist Neurons”, ICML 2021
Tianle Cai, Shengjie Luo, Keyulu Xu, Di He, Tie-Yan Liu, Liwei Wang, “GraphNorm: A Principled Approach to Accelerating Graph Neural Network Training”, ICML 2021
Qiyu Wu, Chen Xing, Yatao Li, Guolin Ke, Di He, Tie-Yan Liu, “Taking Notes on the Fly Helps Language Pre-training”, ICLR 2021
Guolin Ke, Di He, Tie-Yan Liu, “Rethinking Positional Encoding in Language Pre-training”, ICLR 2021
Mingqing Xiao, Shuxin Zheng, Chang Liu, Yaolong Wang, Di He, Guolin Ke, Jiang Bian, Zhouchen Lin, Tie-Yan Liu, “Invertible Image Rescaling”, ECCV 2020 Oral
Ruibin Xiong, Yunchang Yang, Di He, Kai Zheng, Shuxin Zheng, Chen Xing, Huishuai Zhang, Yanyan Lan, Liwei Wang, Tie-Yan Liu, “On Layer Normalization in the Transformer Architecture”, ICML 2020
Runtian Zhai, Chen Dan, Di He, Huan Zhang, Boqing Gong, Pradeep Ravikumar, Cho-Jui Hsieh, Liwei Wang, “MACER: Attack-free and Scalable Robust Training via Maximizing Certified Radius”, ICLR 2020
Jinhua Zhu, Yingce Xia, Lijun Wu, Di He, Tao Qin, Wengang Zhou, Houqiang Li, Tie-Yan Liu, “Incorporating BERT into Neural Machine Translation”, ICLR 2020
Tianle Cai, Ruiqi Gao, Jikai Hou, Siyu Chen, Dong Wang, Di He, Zhihua Zhang, Liwei Wang, “Gram-Gauss-Newton Method: Learning Overparameterized Neural Networks for Regression Problems”, NeurIPS 2019 Beyond First Order Methods in ML Workshop
Yiping Lu, Zhuohan Li, Di He, Zhiqing Sun, Bin Dong, Tao Qin, Liwei Wang, Tie-Yan Liu, “Understanding and Improving Transformer From a Multi-Particle Dynamic System Point of View”, NeurIPS 2019 Machine Learning and the Physical Sciences Workshop
Zhiqing Sun, Zhuohan Li, Haoqing Wang, Di He, Zi Lin, Zhihong Deng, “Fast Structured Decoding for Sequence Models”, NeurIPS 2019
Runtian Zhai, Tianle Cai, Di He, Chen Dan, Kun He, John Hopcroft, Liwei Wang, “Adversarially Robust Generalization Just Requires More Unlabeled Data”, Preprint
Lijun Wu, Jinhua Zhu, Di He, Fei Gao, Tao Qin, Jianhuang Lai, Tie-Yan Liu, “Machine Translation With Weakly Paired Documents”, EMNLP 2019
Zhuohan Li, Di He, Fei Tian, Tao Qin, Liwei Wang, Tie-Yan Liu, “Hint-based Training for Non-autoregressive Translation”, EMNLP 2019
Linyuan Gong, Di He, Zhuohan Li, Tao Qin, Liwei Wang, Tie-Yan Liu, “Efficient Training of BERT by Progressively Stacking”, ICML 2019
Chaoyu Guan, Xiting Wang, Quanshi Zhang, Runjin Chen, Di He, Xing Xie, “Towards a Deep and Unified Understanding of Deep Neural Models in NLP”, ICML 2019
Jun Gao, Di He, Xu Tan, Tao Qin, Liwei Wang, Tie-Yan Liu, “Representation Degeneration Problem in Training Natural Language Generation Models”, ICLR 2019
Chengyue Gong, Di He, Xu Tan, Tao Qin, Liwei Wang, Tie-Yan Liu, “FRAGE: Frequency-Agnostic Word Representation”, NeurIPS 2018
Tianyu He, Xu Tan, Yingce Xia, Di He, Tao Qin, Zhibo Chen, Tie-Yan Liu, “Layer-wise Coordination Between Encoder and Decoder for Neural Machine Translation”, NeurIPS 2018
Zhuohan Li, Di He, Fei Tian, Wei Chen, Tao Qin, Liwei Wang, Tie-Yan Liu, “Towards Binary-valued Gates for Robust LSTM Training”, ICML 2018
Di He, Hanqing Lu, Yingce Xia, Tao Qin, Liwei Wang, Tie-Yan Liu, “Decoding with Value Networks for Neural Machine Translation”, NeurIPS 2017
Di He, Yingce Xia, Tao Qin, Liwei Wang, Nenghai Yu, Tie-Yan Liu, Wei-Ying Ma, “Dual Learning for Machine Translation”, NeurIPS 2016
Wei Chen, Di He, Tie-Yan Liu, Tao Qin, Yixin Tao, Liwei Wang, “Generalized Second Price Auction with Probabilistic Broad Match”, ACM Conference on Economics and Computation (EC) 2014
Di He, Wei Chen, Liwei Wang, Tie-Yan Liu, “A Game-Theoretic Machine Learning Approach for Revenue Maximization in Sponsored Search”, IJCAI 2013
Yining Wang, Liwei Wang, Yuanzhi Li, Di He, Wei Chen, Tie-Yan Liu, “A Theoretical Analysis of NDCG Type Ranking Measures”, Conference on Learning Theory (COLT) 2013
Past supervised undergraduates
Qizhe Xie (CMU, 2016)
Chengyue Gong (UT Austin, 2018)
Jun Gao (University of Toronto, 2018)
Zhuohan Li (UC Berkeley, 2019)
Zhiqing Sun (CMU, 2019)
Yiping Lu (Stanford, 2019)
Linyuan Gong (UC Berkeley, 2020)
Runtian Zhai (CMU, 2020)
Tianle Cai (Princeton, 2020)
Yunzhen Feng (NYU, 2021)
Chengxuan Ying (Citadel, 2022)
Shanda Li (CMU, 2022)
Chuwei Wang (Caltech, 2023)
Haotian Ye (Stanford, 2023)