Wang, P., Li, J., Ma, M., & Fan, X. (2022, May 23). Distributed Audio-Visual Parsing Based On Multimodal Transformer and Deep Joint Source Channel Coding. Proceedings of the ... IEEE International Conference on Acoustics, Speech and Signal Processing (1998), 4623-4627. https://doi.org/10.1109/ICASSP43922.2022.9746660
Chicago Style (17th ed.) Citation: Wang, Penghong, Jiahui Li, Mengyao Ma, and Xiaopeng Fan. "Distributed Audio-Visual Parsing Based On Multimodal Transformer and Deep Joint Source Channel Coding." Proceedings of the ... IEEE International Conference on Acoustics, Speech and Signal Processing (1998) 23 May 2022: 4623-4627. https://doi.org/10.1109/ICASSP43922.2022.9746660.
MLA (9th ed.) Citation: Wang, Penghong, et al. "Distributed Audio-Visual Parsing Based On Multimodal Transformer and Deep Joint Source Channel Coding." Proceedings of the ... IEEE International Conference on Acoustics, Speech and Signal Processing (1998), 23 May 2022, pp. 4623-4627, https://doi.org/10.1109/ICASSP43922.2022.9746660.