Weihao Cui
2020 – today
- 2024
- [j5]Cong Guo, Fengchen Xue, Jingwen Leng, Yuxian Qiu, Yue Guan, Weihao Cui, Quan Chen, Minyi Guo:
Accelerating Sparse DNNs Based on Tiled GEMM. IEEE Trans. Computers 73(5): 1275-1289 (2024)
- [i6]Cong Guo, Fengchen Xue, Jingwen Leng, Yuxian Qiu, Yue Guan, Weihao Cui, Quan Chen, Minyi Guo:
Accelerating Sparse DNNs Based on Tiled GEMM. CoRR abs/2402.10876 (2024)
- [i5]Chunyu Xue, Weihao Cui, Han Zhao, Quan Chen, Shulai Zhang, Pengyu Yang, Jing Yang, Shaobo Li, Minyi Guo:
A Codesign of Scheduling and Parallelization for Large Model Training in Heterogeneous Clusters. CoRR abs/2403.16125 (2024)
- [i4]Han Zhao, Weihao Cui, Quan Chen, Shulai Zhang, Zijun Li, Jingwen Leng, Chao Li, Deze Zeng, Minyi Guo:
Towards Fast Setup and High Throughput of GPU Serverless Computing. CoRR abs/2404.14691 (2024)
- [i3]Pai Zeng, Zhenyu Ning, Jieru Zhao, Weihao Cui, Mengwei Xu, Liwei Guo, Xusheng Chen, Yizhou Shan:
The CAP Principle for LLM Serving: A Survey of Long-Context Large Language Model Serving. CoRR abs/2405.11299 (2024)
- [i2]Yangjie Zhou, Honglin Zhu, Qian Qiu, Weihao Cui, Zihan Liu, Cong Guo, Siyuan Feng, Jintao Meng, Haidong Lan, Jingwen Leng, Wenxi Zhu, Minwen Deng:
Vortex: Efficient Sample-Free Dynamic Tensor Program Optimization via Hardware-aware Strategy Space Hierarchization. CoRR abs/2409.01075 (2024)
- 2023
- [j4]Han Zhao, Weihao Cui, Quan Chen, Minyi Guo:
ISPA: Exploiting Intra-SM Parallelism in GPUs via Fine-Grained Resource Management. IEEE Trans. Computers 72(5): 1473-1487 (2023)
- [j3]Han Zhao, Weihao Cui, Quan Chen, Jingwen Leng, Deze Zeng, Minyi Guo:
Improving Cluster Utilization Through Adaptive Resource Management for Deep Neural Network and CPU Jobs Colocation. IEEE Trans. Computers 72(12): 3458-3472 (2023)
- [c12]Yangjie Zhou, Yaoxu Song, Jingwen Leng, Zihan Liu, Weihao Cui, Zhendong Zhang, Cong Guo, Quan Chen, Li Li, Minyi Guo:
AdaptGear: Accelerating GNN Training via Adaptive Subgraph-Level Kernels on GPUs. CF 2023: 52-62
- [c11]Binghao Chen, Han Zhao, Weihao Cui, Yifu He, Shulai Zhang, Quan Chen, Zijun Li, Minyi Guo:
Maximizing the Utilization of GPUs Used by Cloud Gaming through Adaptive Co-location with Combo. SoCC 2023: 265-280
- [c10]Jiagan Cheng, Yilong Zhao, Zijun Li, Quan Chen, Weihao Cui, Minyi Guo:
Microless: Cost-Efficient Hybrid Deployment of Microservices on IaaS VMs and Serverless. ICPADS 2023: 2303-2310
- [c9]Weihao Cui, Zhenhua Han, Lingji Ouyang, Yichuan Wang, Ningxin Zheng, Lingxiao Ma, Yuqing Yang, Fan Yang, Jilong Xue, Lili Qiu, Lidong Zhou, Quan Chen, Haisheng Tan, Minyi Guo:
Optimizing Dynamic Neural Networks with Brainstorm. OSDI 2023: 797-815
- [i1]Yangjie Zhou, Yaoxu Song, Jingwen Leng, Zihan Liu, Weihao Cui, Zhendong Zhang, Cong Guo, Quan Chen, Li Li, Minyi Guo:
AdaptGear: Accelerating GNN Training via Adaptive Subgraph-Level Kernels on GPUs. CoRR abs/2305.17408 (2023)
- 2022
- [j2]Wei Zhang, Quan Chen, Ningxin Zheng, Weihao Cui, Kaihua Fu, Minyi Guo:
Toward QoS-Awareness and Improved Utilization of Spatial Multitasking GPUs. IEEE Trans. Computers 71(4): 866-879 (2022)
- [c8]Han Zhao, Weihao Cui, Quan Chen, Youtao Zhang, Yanchao Lu, Chao Li, Jingwen Leng, Minyi Guo:
Tacker: Tensor-CUDA Core Kernel Fusion for Improving the GPU Utilization while Ensuring QoS. HPCA 2022: 800-813
- [c7]Shulai Zhang, Weihao Cui, Quan Chen, Zhengnian Zhang, Yue Guan, Jingwen Leng, Chao Li, Minyi Guo:
PAME: precision-aware multi-exit DNN serving for reducing latencies of batched inferences. ICS 2022: 37:1-37:12
- [c6]Weihao Cui, Han Zhao, Quan Chen, Hao Wei, Zirui Li, Deze Zeng, Chao Li, Minyi Guo:
DVABatch: Diversity-aware Multi-Entry Multi-Exit Batching for Efficient Processing of DNN Services on GPUs. USENIX ATC 2022: 183-198
- 2021
- [j1]Weihao Cui, Quan Chen, Han Zhao, Mengze Wei, Xiaoxin Tang, Minyi Guo:
E2bird: Enhanced Elastic Batch for Improving Responsiveness and Throughput of Deep Learning Services. IEEE Trans. Parallel Distributed Syst. 32(6): 1307-1321 (2021)
- [c5]Han Zhao, Weihao Cui, Quan Chen, Jieru Zhao, Jingwen Leng, Minyi Guo:
Exploiting Intra-SM Parallelism in GPUs via Persistent and Elastic Blocks. ICCD 2021: 290-298
- [c4]Weihao Cui, Han Zhao, Quan Chen, Ningxin Zheng, Jingwen Leng, Jieru Zhao, Zhuo Song, Tao Ma, Yong Yang, Chao Li, Minyi Guo:
Enable simultaneous DNN services based on deterministic operator overlap and precise latency prediction. SC 2021: 15
- 2020
- [c3]Han Zhao, Weihao Cui, Quan Chen, Jingwen Leng, Kai Yu, Deze Zeng, Chao Li, Minyi Guo:
CODA: Improving Resource Utilization by Slimming and Co-locating DNN and CPU Jobs. ICDCS 2020: 853-863
2010 – 2019
- 2019
- [c2]Weihao Cui, Mengze Wei, Quan Chen, Xiaoxin Tang, Jingwen Leng, Li Li, Mingyi Guo:
Ebird: Elastic Batch for Improving Responsiveness and Throughput of Deep Learning Services. ICCD 2019: 497-505
- [c1]Wei Zhang, Weihao Cui, Kaihua Fu, Quan Chen, Daniel Edward Mawhirter, Bo Wu, Chao Li, Minyi Guo:
Laius: Towards latency awareness and improved utilization of spatial multitasking accelerators in datacenters. ICS 2019: 58-68
last updated on 2024-10-07 01:18 CEST by the dblp team
all metadata released as open data under CC0 1.0 license