Compression Helps Deep Learning in Image Classification

Detailed bibliography
Title: Compression Helps Deep Learning in Image Classification
Authors: En-Hui Yang, Hossam Amer, Yanbing Jiang
Source: Entropy, Vol 23, Iss 7, p 881 (2021)
Publisher: MDPI AG
Publication year: 2021
Collection: Directory of Open Access Journals: DOAJ Articles
Subjects: image compression, deep learning, inception network, residual network, JPEG, Science, Astrophysics, QB460-466, Physics, QC1-999
Description: The impact of JPEG compression on deep learning (DL) in image classification is revisited. Given an underlying deep neural network (DNN) pre-trained with pristine ImageNet images, it is demonstrated that if, for any original image, one can select, among its many JPEG compressed versions including its original version, a suitable version as an input to the underlying DNN, then the classification accuracy of the underlying DNN can be improved significantly while the size in bits of the selected input is, on average, reduced dramatically in comparison with the original image. This is in contrast to the conventional understanding that JPEG compression generally degrades the classification accuracy of DL. Specifically, for each original image, consider its 10 JPEG compressed versions with their quality factor (QF) values from {100, 90, 80, 70, 60, 50, 40, 30, 20, 10}. Under the assumption that the ground truth label of the original image is known at the time of selecting an input, but unknown to the underlying DNN, we present a selector called Highest Rank Selector (HRS). It is shown that HRS is optimal in the sense of achieving the highest Top k accuracy on any set of images for any k among all possible selectors. When the underlying DNN is Inception V3 or ResNet-50 V2, HRS improves, on average, the Top 1 classification accuracy and Top 5 classification accuracy on the whole ImageNet validation dataset by 5.6% and 1.9%, respectively, while reducing the input size in bits dramatically: the compression ratio (CR) between the size of the original images and the size of the selected input images by HRS is 8 for the whole ImageNet validation dataset. When the ground truth label of the original image is unknown at the time of selection, we further propose a new convolutional neural network (CNN) topology which is based on the underlying DNN and takes the original image and its 10 JPEG compressed versions as 11 parallel inputs. It is demonstrated that the proposed new CNN ...
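The selection rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that each candidate version (the original image or one of its JPEG compressed versions) has already been run through the underlying DNN to obtain per-class scores. The names `Candidate`, `label_rank`, `highest_rank_select` and `compression_ratio` are hypothetical. Breaking ties by smallest size, and computing CR as total original bits over total selected bits, are assumptions that the abstract does not confirm. The intuition for optimality is that a version counts as Top k correct exactly when the ground truth label's rank is at most k, so choosing the version with the best rank is correct for every k for which any version is correct.

```python
from typing import List, NamedTuple, Sequence


class Candidate(NamedTuple):
    name: str                # illustrative label, e.g. "original" or "QF50"
    size_bits: int           # size of this version in bits
    scores: Sequence[float]  # per-class scores from the underlying DNN


def label_rank(scores: Sequence[float], true_label: int) -> int:
    """1-based rank of the true label: 1 + number of classes scoring strictly higher."""
    target = scores[true_label]
    return 1 + sum(1 for s in scores if s > target)


def highest_rank_select(candidates: List[Candidate], true_label: int) -> Candidate:
    """Pick the version on which the true label ranks best.

    Assumption: ties are broken in favour of the smaller version.
    """
    return min(
        candidates,
        key=lambda c: (label_rank(c.scores, true_label), c.size_bits),
    )


def compression_ratio(original_bits: Sequence[int], selected_bits: Sequence[int]) -> float:
    """Assumed definition: total original size divided by total selected size."""
    return sum(original_bits) / sum(selected_bits)


# Toy data for one image with three classes; all values are made up.
versions = [
    Candidate("original", 800_000, [0.5, 0.3, 0.2]),
    Candidate("QF50", 100_000, [0.2, 0.6, 0.2]),
    Candidate("QF10", 40_000, [0.3, 0.4, 0.3]),
]
chosen = highest_rank_select(versions, true_label=1)
ratio = compression_ratio([versions[0].size_bits], [chosen.size_bits])
```

In this toy case the true label is ranked second on the original image but first on both compressed versions, so the selector prefers a compressed input, which mirrors the abstract's claim that accuracy and input size can improve together.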
Document type: article in journal/newspaper
Language: English
Relation: https://www.mdpi.com/1099-4300/23/7/881; https://doaj.org/toc/1099-4300; https://doaj.org/article/7de7ec1ee4b847ff9a21ac06293435ee
DOI: 10.3390/e23070881
Availability: https://doi.org/10.3390/e23070881
https://doaj.org/article/7de7ec1ee4b847ff9a21ac06293435ee
Accession number: edsbas.52F7AA1E
Database: BASE