Conference
|
Paper Content |
HPCA'26
|
[pdf]
[code]
AUM: Unleashing the Efficiency Potential of Shared Processors with Accelerator Units for LLM Serving
Authors: Xinkai Wang, Chao Li, Yiming Zhuansun, Jinyang Guo, Xiaofeng Hou, Jing Wang, Luping Wang, Weigao Chen, Cheng Huang, Guodong Yang, Liping Zhang, Minyi Guo.
Conference: Proceedings of the 32nd IEEE International Symposium on High-Performance Computer Architecture (HPCA), Feb. 2026.
|
ASPLOS'26
|
[pdf]
[code]
MoE-APEX: An Efficient MoE Inference System with Adaptive Precision Expert Offloading
Authors: Peng Tang, Jiacheng Liu, Xiaofeng Hou, Yifei Pu, Jing Wang, Pheng-Ann Heng, Chao Li, Minyi Guo.
Conference: Proceedings of the 31st International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), March 2026.
|
NPC'25 (Best Student Paper Award) |
[pdf]
[code]
TriCooling-Sim: Efficient Thermal Simulation for High-Density Micro AI Data Centers
Authors: Jinyang Guo, Xinkai Wang, Jing Wang, Xiaofeng Hou, Chao Li, Minyi Guo.
Conference: Proceedings of the 21st IFIP International Conference on Network and Parallel Computing (NPC), Nov. 2025.
|
NPC'25
|
[pdf]
[code]
CGO: Cloud Game Orchestration via Resource Perception and CODEC Optimization
Authors: Taolei Wang, Chao Li, Jing Wang, Xiaofeng Hou, Minyi Guo.
Conference: Proceedings of the 21st IFIP International Conference on Network and Parallel Computing (NPC), Nov. 2025.
|
APPT'25
|
[pdf]
[code]
AsymServe: Demystifying and Optimizing LLM Serving Efficiency on CPU Acceleration Units
Authors: Xinkai Wang, Yiming Zhuansun, Chao Li, Jing Wang, Xiaofeng Hou, Lingyu Sun, Luping Wang, Minyi Guo.
Conference: Proceedings of the International Symposium on Advanced Parallel Processing Technology (APPT), July 2025.
|
APPT'25
|
[pdf]
[code]
Accelerating Large-Scale Out-of-GPU-Core GNN Training with Two-Level Historical Caching
Authors: Jing Wang, Taolei Wang, Juntao Huang, Yibo Liu, Xinkai Wang, Marius Kreutzer, Chao Li, Minyi Guo.
Conference: Proceedings of the International Symposium on Advanced Parallel Processing Technology (APPT), July 2025.
|
SC'24
|
[ACM link]
[slides]
[poster]
[talk]
[code]
Boosting Data Center Performance via Intelligently Managed Multi-backend Disaggregated Memory
Authors: Jing Wang, Hanzhang Yang, Chao Li, Yiming Zhuansun, Wang Yuan, Cheng Xu, Xiaofeng Hou, Minyi Guo, Yang Hu, Yaqian Zhao.
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2024.
|
VLDB'24
|
[slides]
[code]
FlowWalker: A Memory-efficient and High-performance GPU-based Dynamic Graph Random Walk Framework
Authors: Junyi Mei, Shixuan Sun, Chao Li, Cheng Xu, Cheng Chen, Yibo Liu, Jing Wang, Cheng Zhao, Xiaofeng Hou, Minyi Guo, Bingsheng He, Xiaoliang Cong.
Conference: Proceedings of the Very Large Data Bases Endowment (VLDB), 2024.
|
IPDPS'24
|
[slides]
[code]
CoCG: Fine-grained Cloud Game Co-location on Heterogeneous Platform
Authors: Taolei Wang, Jing Wang, Chao Li, Cheng Xu, Xiaofeng Hou, Minyi Guo.
Conference: IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2024.
|
CCGrid'24
|
[slides]
[code]
Improving the Efficiency of Serverless Computing via Core-Level Power Management
Authors: Du Liu, Jing Wang, Xinkai Wang, Chao Li, Lu Zhang, Xiaofeng Hou, Xiaoxiang Shi, Minyi Guo.
Conference: International Symposium on Cluster, Cloud and Internet Computing (CCGrid), 2024.
|
ICME'24
|
[slides]
[code]
$M^2$SN: Adaptive and Dynamic Multi-modal Shortcut Network Architecture for Latency-aware Applications
Authors: Yifei Pu, Chi Wang, Xiaofeng Hou, Cheng Xu, Jiacheng Liu, Jing Wang, Minyi Guo, Chao Li.
Conference: International Conference on Multimedia and Expo (ICME), 2024.
|
SoCC'23
|
[pdf]
[code]
Not All Resources are Visible: Exploiting Fragmented Shadow Resources in Shared-State Scheduler Architecture
Authors: Xinkai Wang, Hao He, Yuancheng Li, Chao Li, Xiaofeng Hou, Jing Wang, Quan Chen, Jingwen Leng, Minyi Guo, Leibo Wang.
Conference: Proceedings of the 14th ACM Symposium on Cloud Computing (SoCC), Nov. 2023.
|
IPDPS'22
|
[slides]
[code]
[video]
Excavating the Potential of Graph Workload on RDMA-based Far Memory Architecture
Authors: Jing Wang, Chao Li, Taolei Wang, Lu Zhang, Pengyu Wang, Junyi Mei, Minyi Guo.
Conference: International Parallel and Distributed Processing Symposium (IPDPS), 2022.
|
ICCD'22
|
[slides]
[code]
[video]
HyFarM: Task Orchestration on Hybrid Far Memory for High Performance Per Bit
Authors: Jing Wang, Chao Li, Junyi Mei, Hao He, Taolei Wang, Pengyu Wang, Lu Zhang, Minyi Guo, Hanqing Wu, Dongbai Chen, Xiangwen Liu.
Conference: International Conference on Computer Design (ICCD), 2022.
|
ICPE'22 (Best Paper Award) |
[pdf]
[code]
Oversubscribing GPU Unified Virtual Memory: Implications and Suggestions
Authors: Chuanming Shao, Jinyang Guo, Pengyu Wang, Jing Wang, Chao Li, Minyi Guo.
Conference: International Conference on Performance Engineering (ICPE), 2022.
|
PACT'21
|
[pdf]
[code]
Skywalker: Efficient Alias-method-based Graph Sampling and Random Walk on GPUs
Authors: Pengyu Wang, Chao Li, Jing Wang, Taolei Wang, Lu Zhang, Jingwen Leng, Quan Chen, Minyi Guo.
Conference: International Conference on Parallel Architectures and Compilation Techniques (PACT), 2021.
|