Cheng Li (李诚)

Phone: (0551) 63603412

Email: chengli7@ustc.edu.cn

Homepage: https://mr-cheng-li.github.io/


Research interests: high-performance parallel computing and storage systems for large models



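Cheng Li holds a Ph.D. from the Max Planck Institute for Software Systems (MPI-SWS), Germany, and is a specially appointed professor and doctoral supervisor at the School of Computer Science, University of Science and Technology of China (USTC). Li has been recognized as a National High-Level Young Talent, an Anhui Province Distinguished Young Teacher, a Rising Star in Teaching, and an Outstanding Young Graduate Supervisor, and serves as a vice chair of the CCF Working Committee on Technical Committees. Li's main research areas include infrastructure system optimization for large models and parallel storage. Li has undertaken major research projects, including National Natural Science Foundation of China grants (Key, General, Special, and Young Scientists programs), a JKW high-technology special program, and a project under the New Generation Artificial Intelligence National Science and Technology Major Project, and maintains long-standing, close collaborations with companies such as ByteDance, Alibaba, and Huawei. Li has published more than 50 papers in top-tier conferences and journals in computer systems. This research has received honors including the World Artificial Intelligence Conference Youth Outstanding Paper Award and the Alibaba Outstanding Collaboration Project Award. Li's personal honors include the Outstanding Teacher Award Program for University Computer Science Majors, the CCF Technical Committee on High Performance Computing Young Scholar Incentive Program, and an ACM China Rising Star nomination. Students supervised by Li have repeatedly won honors such as the CCF Technical Committee on Databases Outstanding Doctoral Dissertation Incentive Program, the ACM China SIGHPC Outstanding Doctoral Dissertation Award, the ACM China SIGCSE Outstanding Doctoral Dissertation Award, the Chinese Academy of Sciences President's Special Award, and the China Association for Science and Technology Young Elite Scientists Sponsorship Program. In recent years, nearly ten graduates have joined world-leading large-model companies, where they work on training and inference optimization for models such as Qwen and Doubao, and have been selected for corporate talent programs including ByteDance TopSeed, Xiaohongshu RedStar, Huawei Genius Youth (天才少年), and the China Telecom Youcai (优才) program.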


Selected Publications

1. Libra: Flexible Request Partitioning and Scheduling for Serving Unbalanced and Dynamic LLM Workloads. Chaoyi Ruan*, Yinhe Chen*, Dongqi Tian, Yandong Shi, Yongji Wu, Jialin Li, Cheng Li. To appear in the Proceedings of the 23rd USENIX Symposium on Networked Systems Design and Implementation (NSDI ’26), 2026. (Co-first authors)

2. Mantle: Efficient Hierarchical Metadata Management for Cloud Object Storage Services. Jiahao Li, Biao Cao*, Jielong Jian, Cheng Li*, Sen Han, Yiduo Wang, Yufei Wu, Kang Chen, Zhihui Yin, Qiushi Chen, Jiwei Xiong, Jie Zhao, Fengyuan Liu, Yan Xing, Liguo Duan, Miao Yu, Ran Zheng, Feng Wu, and Xianjun Meng. In Proceedings of the ACM SIGOPS 31st Symposium on Operating Systems Principles (SOSP '25), 2025. (Co-corresponding authors)

3. DHeLlam: General-Purpose, Automatic Micro-batch Co-execution for Distributed LLM Training. Haiquan Wang, Chaoyi Ruan, Jia He, Jiaqi Ruan, Chengjie Tang, Xiaosong Ma, Cheng Li*. In the Proceedings of the 43rd IEEE International Conference on Computer Design (ICCD’25), 2025. (Best paper award)

4. BigMac: A Communication-Efficient Mixture-of-Experts Model Structure for Fast Training and Inference. Zewen Jin, Shengnan Wang, Jiaan Zhu, Hongrui Zhan, Youhui Bai, Lin Zhang, Zhenyu Ming, Cheng Li*. In the Proceedings of the AAAI Conference on Artificial Intelligence (AAAI’25), 2025.

5. AutoCCL: Automated Collective Communication Tuning for Accelerating Distributed and Parallel DNN Training. Guanbin Xu, Zhihao Le, Yinhe Chen, Zhiqi Lin, Zewen Jin, Youshan Miao, Cheng Li*. In the Proceedings of the 22nd USENIX Symposium on Networked Systems Design and Implementation (NSDI’25), 2025.

6. nnScaler: Constraint-Guided Parallelization Plan Generation for Deep Learning Training. Zhiqi Lin, Youshan Miao, Quanlu Zhang, Fan Yang, Yi Zhu, Cheng Li, Saeed Maleki, Xu Cao, Ning Shang, Yilei Yang, Weijiang Xu, Mao Yang, Lintao Zhang, Lidong Zhou. In the Proceedings of the 18th USENIX Symposium on Operating Systems Design and Implementation (OSDI’24), 2024.

7. SPFresh: Incremental In-Place Update for Billion-Scale Vector Search. Yuming Xu, Hengyu Liang, Jin Li, Shuotao Xu, Qi Chen*, Qianxi Zhang, Cheng Li*, Ziyue Yang, Fan Yang, Yuqing Yang, Peng Cheng, Mao Yang. In Proceedings of the 29th Symposium on Operating Systems Principles (SOSP’23), 2023. (Co-corresponding authors)

8. gSampler: General and Efficient GPU-based Graph Sampling for Graph Learning. Ping Gong, Renjie Liu, Zunyao Mao, Zhenkun Cai, Xiao Yan*, Cheng Li*, Minjie Wang, Zhuozhao Li. In Proceedings of the 29th Symposium on Operating Systems Principles (SOSP’23), 2023. (Co-corresponding authors)

9. Gradient Compression Supercharged High-Performance Data Parallel DNN Training. Youhui Bai, Cheng Li*, Quan Zhou, Jun Yi, Ping Gong, Feng Yan, Ruichuan Chen, Yinlong Xu. In Proceedings of the ACM SIGOPS 28th Symposium on Operating Systems Principles (SOSP’21), 2021.

10. Making Geo-Replicated Systems Fast as Possible, Consistent when Necessary. Cheng Li, Daniel Porto, Allen Clement, Johannes Gehrke, Nuno Preguiça, Rodrigo Rodrigues. In the Proceedings of the 10th USENIX Symposium on Operating Systems Design and Implementation (OSDI’12), 2012.