Conventional Cervical Cytology Image Dataset with Cell Outline Annotations

Published in: 2023 International Symposium on Image and Signal Processing and Analysis (ISPA), pp. 1-6
Main authors: Pater, Antonina, Siemion, Krzysztof, Deptuch, Karol, Roszkowiak, Lukasz, Zak, Jakub, Jakubowska, Katarzyna, Sulkowski, Stanislaw, Baltaziak, Marek, Koda, Mariusz, Korzynska, Anna
Format: Conference paper
Language: English
Published: IEEE, 18 September 2023
ISSN:1849-2266
Description
Summary: Here we describe our new Bialystok dataset of 162 Papanicolaou cervical smear images containing 2419 cells, annotated in the form of cell cytoplasm maps. The images are fragments of whole slides and include artefacts and dense cell clusters. Cell classification annotations are created in accordance with the Bethesda system, so the dataset is consistent with the most widely used cytology reporting system and adaptable to different cell annotation methods. Additionally, we perform segmentation with three common neural networks (U-Net, FPN and PSP-Net) and classification with VGG16 in four different classification cases. We then compare our Bialystok dataset with another large outline-annotated dataset, Sipakmed, both qualitatively and as training data for deep learning. We train and test VGG16 on both datasets, including cross-testing. VGG16 reaches an F1 score of 99.09% on Sipakmed and 91.59% on the Bialystok dataset. In cross-testing, the network trained on Sipakmed and tested on Bialystok reaches an F1 score of 87.83%, while the network trained on Bialystok and tested on Sipakmed reaches 92.74%. The results indicate that Bialystok is the more challenging dataset for segmentation. The proposed dataset is useful for training deep learning methods to segment, detect or classify physiological and pathologic epithelial cells in whole slide smear images.
DOI:10.1109/ISPA58351.2023.10279274
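
The summary reports classification quality as F1 scores, including cross-testing between datasets. As a reminder of what that metric measures, here is a minimal sketch that computes per-class and macro-averaged F1 from paired true and predicted labels. The summary does not say how the authors average F1 or which class labels they use, so the macro averaging, the function names and the `normal`/`abnormal` labels below are illustrative assumptions, not the paper's method.

```python
from collections import Counter


def f1_per_class(y_true, y_pred):
    """Return {label: F1} computed from paired true/predicted labels."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1      # correct prediction for class t
        else:
            fp[p] += 1      # p was predicted but is wrong
            fn[t] += 1      # t was the truth but was missed
    scores = {}
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the class never occurs
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (assumed averaging)."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Hypothetical labels, purely for illustration.
y_true = ["normal", "normal", "abnormal", "abnormal"]
y_pred = ["normal", "abnormal", "abnormal", "abnormal"]
print(f1_per_class(y_true, y_pred))
print(macro_f1(y_true, y_pred))
```

In a cross-testing setup such as the one described, the same function would be applied to predictions from a model trained on one dataset and evaluated on the test split of the other.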