LapFormer: surgical tool detection in laparoscopic surgical video using transformer architecture
| Published in: | Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization Vol. 9; no. 3; pp. 302 - 307 |
|---|---|
| Main Author: | Kondo, Satoshi |
| Format: | Journal Article |
| Language: | English Japanese |
| Published: | Taylor & Francis; Informa UK Limited; 4 May 2021 |
| Subjects: | Laparoscopy; surgical workflow analysis; transformer |
| ISSN: | 2168-1163, 2168-1171 |
| Abstract | Recognising surgical tool presence is one of the most essential steps in surgical workflow analysis. We propose LapFormer, a method for detecting the presence of surgical tools in laparoscopic surgery videos. The novelty of LapFormer is that it analyses inter-frame correlation in videos with a Transformer architecture, a feed-forward neural network architecture with an attention mechanism that is growing in popularity in natural language processing, instead of with the recurrent neural network family. To the best of our knowledge, no method using a Transformer architecture to analyse laparoscopic surgery videos has previously been proposed. We evaluate our method on the Cholec80 dataset, which contains 80 videos of cholecystectomy surgeries. Our proposed method outperforms conventional methods such as single-frame analysis with convolutional neural networks and multiple-frame analysis with recurrent neural networks by 20.3 and 17.3 points in macro-F1 score, respectively. We also conduct an ablation study on how the hyper-parameters of the Transformer block in our proposed method affect detection performance. |
|---|---|
| Author | Kondo, Satoshi |
| Author_xml | – sequence: 1 givenname: Satoshi surname: Kondo fullname: Kondo, Satoshi email: satoshi.kondo@konicaminolta.com organization: Konica Minolta, Inc |
| BackLink | https://cir.nii.ac.jp/crid/1871991017630640896 (View record in CiNii) |
| ContentType | Journal Article |
| Copyright | 2020 Informa UK Limited, trading as Taylor & Francis Group |
| Copyright_xml | – notice: 2020 Informa UK Limited, trading as Taylor & Francis Group 2020 |
| DOI | 10.1080/21681163.2020.1835550 |
| Discipline | Engineering |
| EISSN | 2168-1171 |
| EndPage | 307 |
| Genre | Research Article |
| ISICitedReferencesCount | 29 |
| ISSN | 2168-1163 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 3 |
| Language | English Japanese |
| ORCID | 0000-0002-4941-4920 |
| OpenAccessLink | https://cir.nii.ac.jp/crid/1871991017630640896 |
| PageCount | 6 |
| PublicationCentury | 2000 |
| PublicationDate | 2021-05-04 |
| PublicationDateYYYYMMDD | 2021-05-04 |
| PublicationDate_xml | – month: 05 year: 2021 text: 2021-05-04 day: 04 |
| PublicationDecade | 2020 |
| PublicationTitle | Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization |
| PublicationYear | 2021 |
| Publisher | Taylor & Francis; Informa UK Limited |
| Publisher_xml | – name: Taylor & Francis – name: Informa UK Limited |
| StartPage | 302 |
| SubjectTerms | Laparoscopy; surgical workflow analysis; transformer |
| Title | LapFormer: surgical tool detection in laparoscopic surgical video using transformer architecture |
| URI | https://www.tandfonline.com/doi/abs/10.1080/21681163.2020.1835550 https://cir.nii.ac.jp/crid/1871991017630640896 |
| Volume | 9 |
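
The abstract reports its gains in macro-F1 score. As a reading aid, here is a minimal sketch of how macro-F1 can be computed for multi-label tool presence: an F1 score is computed for each tool separately and the scores are averaged with equal weight, so rarely used tools count as much as common ones. The function names, the toy labels, and the choice to score a tool with no positives and no predictions as 0.0 are assumptions made for illustration; the record does not state how the paper aggregates frames or videos.

```python
from typing import Sequence


def f1_for_tool(y_true: Sequence[Sequence[int]],
                y_pred: Sequence[Sequence[int]],
                tool: int) -> float:
    """F1 score of one tool (one label column) over all frames."""
    tp = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred[tool] and truth[tool]:
            tp += 1          # tool present and detected
        elif pred[tool]:
            fp += 1          # tool detected but absent
        elif truth[tool]:
            fn += 1          # tool present but missed
    denom = 2 * tp + fp + fn
    # Assumed convention: an undefined F1 (no positives at all) counts as 0.0.
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true: Sequence[Sequence[int]],
             y_pred: Sequence[Sequence[int]]) -> float:
    """Unweighted mean of the per-tool F1 scores."""
    n_tools = len(y_true[0])
    scores = [f1_for_tool(y_true, y_pred, t) for t in range(n_tools)]
    return sum(scores) / n_tools


# Hypothetical toy data: 4 frames, 3 tools, 1 = tool visible in the frame.
y_true = [[1, 0, 0], [1, 1, 0], [0, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 0]]

print(round(macro_f1(y_true, y_pred), 4))  # 0.5556
```

In this toy case the per-tool scores are 1.0, 2/3, and 0.0, so the third tool, missed in its only frame, pulls the average down to 5/9 even though most individual labels are correct. That sensitivity to rare classes is the usual reason for reporting macro-F1 rather than plain accuracy.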