LapFormer: surgical tool detection in laparoscopic surgical video using transformer architecture
| Published in: | Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization Vol. 9; no. 3; pp. 302 - 307 |
|---|---|
| Main Author: | Kondo, Satoshi |
| Format: | Journal Article |
| Language: | English Japanese |
| Published: | Taylor & Francis, 04.05.2021; Informa UK Limited |
| Subjects: | Laparoscopy; surgical workflow analysis; transformer |
| ISSN: | 2168-1163, 2168-1171 |
| Abstract | Recognising which surgical tools are present is one of the most essential steps in surgical workflow analysis. We propose LapFormer, a method for detecting the presence of surgical tools in laparoscopic surgery videos. The novelty of LapFormer is that it analyses inter-frame correlation in videos with a Transformer architecture, a feed-forward neural network architecture with an attention mechanism that is growing in popularity for natural language processing, instead of with the recurrent neural network family. To the best of our knowledge, no method using a Transformer architecture to analyse laparoscopic surgery videos has previously been proposed. We evaluate our method on the Cholec80 dataset, which contains 80 videos of cholecystectomy surgeries. Our proposed method outperforms conventional methods, namely single-frame analysis with convolutional neural networks and multiple-frame analysis with recurrent neural networks, by 20.3 and 17.3 points in macro-F1 score, respectively. We also conduct an ablation study on how the hyper-parameters of the Transformer block in our method affect detection performance. |
|---|---|
| Author | Kondo, Satoshi |
| Author_xml | – sequence: 1; givenname: Satoshi; surname: Kondo; fullname: Kondo, Satoshi; email: satoshi.kondo@konicaminolta.com; organization: Konica Minolta, Inc |
| BackLink | https://cir.nii.ac.jp/crid/1871991017630640896$$DView record in CiNii |
| CitedBy_id | crossref_primary_10_1002_rcs_70089 crossref_primary_10_1109_TPAMI_2023_3243465 crossref_primary_10_1080_21681163_2022_2145238 crossref_primary_10_1038_s43856_024_00581_0 crossref_primary_10_1002_rcs_2445 crossref_primary_10_3390_bioengineering9120737 crossref_primary_10_1109_TMI_2023_3335406 crossref_primary_10_1109_TMI_2023_3279838 crossref_primary_10_1007_s11548_022_02691_3 crossref_primary_10_1093_bjsopen_zraf073 crossref_primary_10_1109_TIM_2023_3298396 crossref_primary_10_1007_s13042_023_01875_w crossref_primary_10_1080_21681163_2022_2152371 crossref_primary_10_1007_s00371_025_04161_8 crossref_primary_10_1049_htl2_12060 crossref_primary_10_1109_TMI_2022_3177077 |
| Cites_doi | 10.1007/s11263-015-0816-y 10.1109/CVPR.2015.7298594 10.1016/j.media.2019.101572 10.1109/CBMI.2015.7153616 10.3115/v1/D14-1179 10.1016/j.ipm.2009.03.002 10.1109/CVPR.2019.00033 10.1109/CVPR.2016.90 10.1162/neco.1997.9.8.1735 10.1109/TMI.2016.2593957 |
| ContentType | Journal Article |
| Copyright | 2020 Informa UK Limited, trading as Taylor & Francis Group |
| Copyright_xml | – notice: 2020 Informa UK Limited, trading as Taylor & Francis Group |
| DBID | RYH AAYXX CITATION |
| DOI | 10.1080/21681163.2020.1835550 |
| DatabaseName | CiNii Complete CrossRef |
| DatabaseTitle | CrossRef |
| DatabaseTitleList | |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Engineering |
| EISSN | 2168-1171 |
| EndPage | 307 |
| ExternalDocumentID | 10_1080_21681163_2020_1835550 1835550 |
| Genre | Research Article |
| ISICitedReferencesCount | 29 |
| ISSN | 2168-1163 |
| IngestDate | Sat Nov 29 06:34:10 EST 2025 Tue Nov 18 21:09:46 EST 2025 Mon Nov 10 09:14:55 EST 2025 Mon Oct 20 23:47:46 EDT 2025 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 3 |
| Language | English Japanese |
| LinkModel | DirectLink |
| ORCID | 0000-0002-4941-4920 |
| OpenAccessLink | https://cir.nii.ac.jp/crid/1871991017630640896 |
| PageCount | 6 |
| PublicationCentury | 2000 |
| PublicationDate | 2021-05-04 |
| PublicationDateYYYYMMDD | 2021-05-04 |
| PublicationDate_xml | – month: 05 year: 2021 text: 2021-05-04 day: 04 |
| PublicationDecade | 2020 |
| PublicationTitle | Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization |
| PublicationYear | 2021 |
| Publisher | Taylor & Francis; Informa UK Limited |
| Publisher_xml | – name: Taylor & Francis – name: Informa UK Limited |
| SourceID | crossref nii informaworld |
| SourceType | Enrichment Source Index Database Publisher |
| StartPage | 302 |
| SubjectTerms | Laparoscopy; surgical workflow analysis; transformer |
| Title | LapFormer: surgical tool detection in laparoscopic surgical video using transformer architecture |
| URI | https://www.tandfonline.com/doi/abs/10.1080/21681163.2020.1835550 https://cir.nii.ac.jp/crid/1871991017630640896 |
| Volume | 9 |
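
Note on the reported metric: the abstract gives its results in macro-F1, which is conventionally the unweighted mean of the per-class F1 scores. The sketch below illustrates that convention for multi-label tool-presence annotations. It is not taken from the article: the function name `macro_f1`, the per-frame 0/1 label layout, and the choice to score a class with no positives as 0 are all assumptions made here for illustration.

```python
from typing import List, Sequence


def macro_f1(y_true: Sequence[Sequence[int]],
             y_pred: Sequence[Sequence[int]]) -> float:
    """Unweighted mean of per-class F1 over multi-label 0/1 annotations.

    Each row is one frame; each column is one tool class (hypothetical layout).
    """
    n_classes = len(y_true[0])
    f1_scores: List[float] = []
    for c in range(n_classes):
        # Count true positives, false positives and false negatives for class c.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[c] == 1 and p[c] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[c] == 0 and p[c] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[c] == 1 and p[c] == 0)
        denom = 2 * tp + fp + fn
        # Assumption: a class with no positives and no predictions scores 0.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / n_classes


# Toy example with three frames and two tool classes (made-up labels).
truth = [[1, 0], [1, 1], [0, 1]]
preds = [[1, 0], [0, 1], [0, 1]]
score = macro_f1(truth, preds)
```

Because every class contributes equally to the average, rarely used tools weigh as much as frequently used ones, which is presumably why the metric suits imbalanced tool-presence data.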