Bibliographic details
| Title: |
New approaches for the integration of the discrete cosine transform in neural networks for fine-grained image classification |
| Authors: |
Tan, Kelvin Sim Zhen |
| Publication year: |
2025 |
| Collection: |
The University of Nottingham: Nottingham ePrints |
| Keywords: |
convolutional neural network (CNN), fine-grained visual classification (FGVC), discrete cosine transform (DCT), compressed domain image classification, pointwise convolution |
| Description: |
A convolutional neural network (CNN) is a popular neural network architecture that excels at capturing patterns in tasks with grid-structured inputs (e.g. visual recognition). Fine-grained visual classification (FGVC) uses CNNs to categorise images with high intra-class and low inter-class variance. According to the literature, the 2D Discrete Cosine Transform (DCT) is a well-known transform used in compression for its robustness and high data compaction. In compressed domain image classification, many works have focused on extracting features from the low-frequency DCT coefficients (L-DCTCs) through a fully pointwise vanilla CNN, while the abundant medium- to high-frequency DCTCs have typically been discarded. Although pointwise convolution is capable of complex transformations, its spatial context and representation are limited. Compressed domain FGVC remains a relatively inactive field. It is therefore essential to explore compressed domain FGVC under DCT conditions to investigate the relationship between fine-grained features and the full spectrum of DCTCs. More specifically, this thesis adopts and extends DCT techniques in compressed domain FGVC to address three topics: (1) the usability and inclusive learning of mid-band DCTCs; (2) the adaptive learning of DCT basis functions for composing the pointwise convolutional kernels; (3) the interaction between DCT channel groups in feature representations. The first contribution introduces the ‘Skipped Medium DCT CNN’. The medium-frequency DCTCs (M-DCTCs) were processed via a skipping branch with a shallow convolutional block, alongside the L-DCTCs, which were passed through the main branch of the CNN. This architecture achieved a reduction in classification error of up to 7% over the standard model without the skipping branch. It highlights the importance of combining higher-frequency DCTCs with lower ones for improved robustness.
The second contribution enhances the prior network by adaptively weighting the DCT basis functions to form a pointwise convolutional ... |
| Publication type: |
thesis |
| File description: |
application/pdf |
| Language: |
English |
| Relation: |
https://eprints.nottingham.ac.uk/80089/1/18023285-KelvinTanSimZhen_FinalPhdThesis_CORRECTION-2.pdf; Tan, Kelvin Sim Zhen (2025) New approaches for the integration of the discrete cosine transform in neural networks for fine-grained image classification. PhD thesis, University of Nottingham. |
| Availability: |
https://eprints.nottingham.ac.uk/80089/ |
| Rights: |
cc_by |
| Document code: |
edsbas.11B06E81 |
| Database: |
BASE |
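
Illustrative note on the description above: the split between low-, medium- and high-frequency DCT coefficients can be sketched roughly as follows. This is a minimal sketch and not the thesis's implementation. The 8x8 block size, the JPEG-style zigzag ordering, the function names (`dct_2d`, `zigzag_indices`, `split_bands`) and the band cut-offs (`n_low=10`, `n_mid=22`) are all assumptions chosen for illustration; the record does not state how the thesis defines its bands.

```python
import math


def dct_2d(block):
    """Orthonormal 2D DCT-II of a square block given as a list of lists."""
    n = len(block)

    def alpha(k):
        # Scaling that makes the transform orthonormal.
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    # 1D cosine basis: basis[k][i] = alpha(k) * cos(pi * (2i + 1) * k / (2n))
    basis = [
        [alpha(k) * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n)]
        for k in range(n)
    ]
    # The 2D transform is separable: apply the basis along rows, then columns.
    tmp = [
        [sum(basis[u][x] * block[x][y] for x in range(n)) for y in range(n)]
        for u in range(n)
    ]
    return [
        [sum(tmp[u][y] * basis[v][y] for y in range(n)) for v in range(n)]
        for u in range(n)
    ]


def zigzag_indices(n):
    """(row, col) pairs ordered from low to high frequency, JPEG-style zigzag."""
    return sorted(
        ((u, v) for u in range(n) for v in range(n)),
        key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else p[1]),
    )


def split_bands(coeffs, n_low=10, n_mid=22):
    """Split coefficients into low, medium and high bands (hypothetical cut-offs)."""
    ordered = [coeffs[u][v] for u, v in zigzag_indices(len(coeffs))]
    low = ordered[:n_low]
    mid = ordered[n_low:n_low + n_mid]
    high = ordered[n_low + n_mid:]
    return low, mid, high


if __name__ == "__main__":
    # A simple horizontal ramp as a stand-in for an 8x8 image block.
    block = [[float(y) for y in range(8)] for _ in range(8)]
    low, mid, high = split_bands(dct_2d(block))
    print(len(low), len(mid), len(high))
```

In an architecture like the one the description outlines, the `low` band would feed the main branch and the `mid` band a shallower side branch; how the thesis actually groups and routes coefficients is not specified in this record.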