APA (7th ed.) Citation

Zhao, Y., Wu, D., & Wang, J. (2024). ALISA: Accelerating Large Language Model Inference via Sparsity-Aware KV Caching. 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA), 1005–1017. https://doi.org/10.1109/ISCA59077.2024.00077

Chicago Style (17th ed.) Citation

Zhao, Youpeng, Di Wu, and Jun Wang. "ALISA: Accelerating Large Language Model Inference via Sparsity-Aware KV Caching." In 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA), 1005–1017. IEEE, 2024. https://doi.org/10.1109/ISCA59077.2024.00077.

MLA (9th ed.) Citation

Zhao, Youpeng, et al. "ALISA: Accelerating Large Language Model Inference via Sparsity-Aware KV Caching." 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA), 29 June 2024, pp. 1005–17, https://doi.org/10.1109/ISCA59077.2024.00077.

Note: These citations are generated automatically and may contain errors; verify them against the original source before use.