U-NetPlus: A Modified Encoder-Decoder U-Net Architecture for Semantic and Instance Segmentation of Surgical Instruments from Laparoscopic Images
| Published in: | Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Annual International Conference, Vol. 2019; pp. 7205-7211 |
|---|---|
| Main authors: | Kamrul Hasan, S. M.; Linte, Cristian A. |
| Format: | Conference Paper; Journal Article |
| Language: | English |
| Published: | United States: IEEE, 1 July 2019 |
| ISSN: | 2694-0604, 1557-170X, 1558-4615 |
| Abstract | With the advent of robot-assisted surgery, there has been a paradigm shift in medical technology for minimally invasive surgery. However, it is very challenging to track the position of the surgical instruments in a surgical scene, and accurate detection and identification of surgical tools is paramount. Deep learning-based semantic segmentation in frames of surgery videos has the potential to facilitate this task. In this work, we modify the U-Net architecture by introducing a pre-trained encoder and redesigning the decoder, replacing the transposed convolution operation with an upsampling operation based on nearest-neighbor (NN) interpolation. To further improve performance, we also employ a very fast and flexible data augmentation technique. We trained the framework on 8 × 225 frame sequences of robotic surgical videos available through the MICCAI 2017 EndoVis Challenge dataset and tested it on 8 × 75 frame and 2 × 300 frame videos. Using our U-NetPlus architecture, we report a 90.20% DICE for binary segmentation, 76.26% DICE for instrument part segmentation, and 46.07% for instrument type (i.e., all instruments) segmentation, outperforming the results of previous techniques implemented and tested on these data. |
|---|---|
| Author | Linte, Cristian A.; Kamrul Hasan, S. M. |
| Author_xml | 1. Kamrul Hasan, S. M. (Center for Imaging Science, Rochester Institute of Technology, Rochester, NY, USA); 2. Linte, Cristian A. (Biomedical Engineering and Center for Imaging Science, Rochester Institute of Technology, Rochester, NY, USA) |
| BackLink | https://www.ncbi.nlm.nih.gov/pubmed/31947497 (View this record in MEDLINE/PubMed) |
| ContentType | Conference Proceeding; Journal Article |
| DOI | 10.1109/EMBC.2019.8856791 |
| Discipline | Engineering |
| EISBN | 9781538613115; 1538613115 |
| EISSN | 1558-4615; 2694-0604 |
| EndPage | 7211 |
| ExternalDocumentID | PMC7372295; 31947497; 8856791 |
| Genre | orig-research; Research Support, U.S. Gov't, Non-P.H.S; Journal Article; Research Support, N.I.H., Extramural |
| GrantInformation_xml | Funder: NIGMS NIH HHS; grant ID: R35 GM128877 |
| ISICitedReferencesCount | 77 |
| ISSN | 2694-0604 1557-170X |
| IsDoiOpenAccess | false |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Language | English |
| OpenAccessLink | https://www.ncbi.nlm.nih.gov/pmc/articles/7372295 |
| PMID | 31947497 |
| PageCount | 7 |
| PublicationDate | 2019-07-01 |
| PublicationPlace | United States |
| PublicationTitle | Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Annual International Conference |
| PublicationTitleAbbrev | EMBC |
| PublicationTitleAlternate | Annu Int Conf IEEE Eng Med Biol Soc |
| PublicationYear | 2019 |
| Publisher | IEEE |
| StartPage | 7205 |
| SubjectTerms | Computer architecture; Convolution; Deep Learning; Image Interpretation, Computer-Assisted; Image segmentation; Instruments; Interpolation; Laparoscopy; Robotic Surgical Procedures; Semantics; Surgery; Surgical Instruments; Training |
| Title | U-NetPlus: A Modified Encoder-Decoder U-Net Architecture for Semantic and Instance Segmentation of Surgical Instruments from Laparoscopic Images |
| URI | https://ieeexplore.ieee.org/document/8856791 https://www.ncbi.nlm.nih.gov/pubmed/31947497 https://www.proquest.com/docview/2341618551 https://pubmed.ncbi.nlm.nih.gov/PMC7372295 |
| Volume | 2019 |
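
The abstract names two concrete operations: upsampling by nearest-neighbor (NN) interpolation in the decoder, and the DICE score used to report results. The sketch below illustrates both on plain Python lists. It is not the authors' implementation, which this record does not include; the function names `nn_upsample` and `dice`, the integer `scale` parameter, and the convention that two empty masks score 1.0 are all assumptions made for illustration.

```python
def nn_upsample(feature_map, scale=2):
    """Nearest-neighbor upsampling of a 2-D grid (list of rows).

    Each value is copied into a scale x scale block, so no new values
    are invented and there are no learned weights (unlike a transposed
    convolution). `scale` is a hypothetical integer factor.
    """
    out = []
    for row in feature_map:
        # Repeat every value `scale` times along the width.
        wide = [value for value in row for _ in range(scale)]
        # Repeat the widened row `scale` times along the height.
        out.extend(list(wide) for _ in range(scale))
    return out


def dice(pred, target):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for binary 0/1 masks.

    Returning 1.0 when both masks are empty is an assumed convention.
    """
    intersection = sum(
        p * t
        for pred_row, target_row in zip(pred, target)
        for p, t in zip(pred_row, target_row)
    )
    total = sum(sum(row) for row in pred) + sum(sum(row) for row in target)
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    coarse = [[1, 2], [3, 4]]
    print(nn_upsample(coarse))

    predicted = [[1, 1], [0, 0]]
    reference = [[1, 0], [0, 0]]
    print(dice(predicted, reference))
```

A Dice value is usually multiplied by 100 to match the percentage figures quoted in the abstract. In a full network the upsampling step would typically be followed by learned convolution layers, which this sketch omits.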