U-NetPlus: A Modified Encoder-Decoder U-Net Architecture for Semantic and Instance Segmentation of Surgical Instruments from Laparoscopic Images

With the advent of robot-assisted surgery, there has been a paradigm shift in medical technology for minimally invasive surgery. However, it is very challenging to track the position of the surgical instruments in a surgical scene, and accurate detection & identification of surgical tools is par...

Bibliographic details
Published in: Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Annual International Conference, Vol. 2019; pp. 7205-7211
Main authors: Kamrul Hasan, S. M.; Linte, Cristian A.
Format: Conference Paper, Journal Article
Language: English
Published: United States, IEEE, 1 July 2019
ISSN: 2694-0604, 1557-170X, 1558-4615
Abstract With the advent of robot-assisted surgery, there has been a paradigm shift in medical technology for minimally invasive surgery. However, it is very challenging to track the position of the surgical instruments in a surgical scene, and accurate detection & identification of surgical tools is paramount. Deep learning-based semantic segmentation in frames of surgery videos has the potential to facilitate this task. In this work, we modify the U-Net architecture by introducing a pre-trained encoder and re-design the decoder part, by replacing the transposed convolution operation with an upsampling operation based on nearest-neighbor (NN) interpolation. To further improve performance, we also employ a very fast and flexible data augmentation technique. We trained the framework on 8 × 225 frame sequences of robotic surgical videos available through the MICCAI 2017 EndoVis Challenge dataset and tested it on 8 × 75 frame and 2 × 300 frame videos. Using our U-NetPlus architecture, we report a 90.20% DICE for binary segmentation, 76.26% DICE for instrument part segmentation, and 46.07% for instrument type (i.e., all instruments) segmentation, outperforming the results of previous techniques implemented and tested on these data.
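As an illustration of two ideas named in the abstract, the sketch below shows nearest-neighbor (NN) upsampling, the operation the abstract says replaces transposed convolution in the decoder, and the Dice coefficient used to report results. It is a minimal pure-Python sketch and not the authors' implementation: the function names `upsample_nearest` and `dice` are hypothetical, the masks are toy 2D lists, and in a real network the upsampling would act on feature maps and would typically be followed by a learned convolution.

```python
def upsample_nearest(grid, factor=2):
    """Nearest-neighbor upsampling of a 2D list: repeat every value
    `factor` times along both axes (no learned parameters)."""
    out = []
    for row in grid:
        wide = [v for v in row for _ in range(factor)]   # repeat along width
        out.extend(list(wide) for _ in range(factor))    # repeat along height
    return out


def dice(pred, truth):
    """Dice coefficient 2*|A & B| / (|A| + |B|) for two binary (0/1) masks
    of identical shape. Two empty masks are treated as a perfect match."""
    inter = sum(p * t for pr, tr in zip(pred, truth) for p, t in zip(pr, tr))
    total = sum(map(sum, pred)) + sum(map(sum, truth))
    return 1.0 if total == 0 else 2.0 * inter / total


# Toy example (assumed data): a 2x2 coarse mask upsampled to 4x4,
# then compared with a hypothetical ground-truth mask.
coarse = [[1, 0],
          [0, 1]]
upsampled = upsample_nearest(coarse)
truth = [[1, 1, 0, 0],
         [1, 1, 0, 0],
         [0, 0, 1, 1],
         [0, 0, 0, 1]]
score = dice(upsampled, truth)
```

Here the upsampled mask has 8 foreground pixels, the toy ground truth has 7, and they overlap in 7, so `score` is 14/15 (about 0.933). A score such as the 90.20% reported for binary segmentation is this same ratio expressed as a percentage.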
Author Linte, Cristian A.
Kamrul Hasan, S. M.
Author_xml – sequence: 1
  givenname: S. M.
  surname: Kamrul Hasan
  fullname: Kamrul Hasan, S. M.
  organization: Center for Imaging Science, Rochester Institute of Technology, Rochester, NY, USA
– sequence: 2
  givenname: Cristian A.
  surname: Linte
  fullname: Linte, Cristian A.
  organization: Biomedical Engineering and Center for Imaging Science, Rochester Institute of Technology, Rochester, NY, USA
BackLink https://www.ncbi.nlm.nih.gov/pubmed/31947497$$D View this record in MEDLINE/PubMed
ContentType Conference Proceeding
Journal Article
DBID 6IE
6IH
CBEJK
RIE
RIO
CGR
CUY
CVF
ECM
EIF
NPM
7X8
5PM
DOI 10.1109/EMBC.2019.8856791
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan (POP) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Xplore
IEEE Proceedings Order Plans (POP) 1998-present
Medline
MEDLINE
MEDLINE (Ovid)
PubMed
MEDLINE - Academic
PubMed Central (Full Participant titles)
DatabaseTitle MEDLINE
Medline Complete
MEDLINE with Full Text
PubMed
MEDLINE (Ovid)
MEDLINE - Academic
DatabaseTitleList
MEDLINE
MEDLINE - Academic

Database_xml – sequence: 1
  dbid: NPM
  name: PubMed
  url: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed
  sourceTypes: Index Database
– sequence: 2
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
– sequence: 3
  dbid: 7X8
  name: MEDLINE - Academic
  url: https://search.proquest.com/medline
  sourceTypes: Aggregation Database
DeliveryMethod fulltext_linktorsrc
Discipline Engineering
EISBN 9781538613115
1538613115
EISSN 1558-4615
2694-0604
EndPage 7211
ExternalDocumentID PMC7372295
31947497
8856791
Genre orig-research
Research Support, U.S. Gov't, Non-P.H.S
Journal Article
Research Support, N.I.H., Extramural
GrantInformation_xml – fundername: NIGMS NIH HHS
  grantid: R35 GM128877
GroupedDBID 6IE
6IF
6IH
AAJGR
ACGFS
AFFNX
ALMA_UNASSIGNED_HOLDINGS
CBEJK
M43
RIE
RIO
RNS
6IL
6IN
ABLEC
ADZIZ
BEFXN
BFFAM
BGNUA
BKEBE
BPEOZ
CGR
CHZPO
CUY
CVF
ECM
EIF
IEGSK
IJVOP
NPM
OCL
RIL
7X8
29G
5PM
6IK
6IM
IPLJI
ID FETCH-LOGICAL-i357t-cadba9b17566c4fdf61d663fcc6cc02ba1e28719e6d90b8de4b8316cb7922ec13
IEDL.DBID RIE
ISICitedReferencesCount 77
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000557295307149&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
ISSN 2694-0604
1557-170X
IngestDate Thu Aug 21 13:24:53 EDT 2025
Thu Oct 02 15:48:25 EDT 2025
Thu Apr 03 06:58:46 EDT 2025
Wed Aug 27 02:41:08 EDT 2025
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Language English
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-i357t-cadba9b17566c4fdf61d663fcc6cc02ba1e28719e6d90b8de4b8316cb7922ec13
Notes ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 23
OpenAccessLink https://www.ncbi.nlm.nih.gov/pmc/articles/7372295
PMID 31947497
PQID 2341618551
PQPubID 23479
PageCount 7
ParticipantIDs pubmedcentral_primary_oai_pubmedcentral_nih_gov_7372295
pubmed_primary_31947497
ieee_primary_8856791
proquest_miscellaneous_2341618551
PublicationCentury 2000
PublicationDate 2019-07-01
PublicationDateYYYYMMDD 2019-07-01
PublicationDate_xml – month: 07
  year: 2019
  text: 2019-07-01
  day: 01
PublicationDecade 2010
PublicationPlace United States
PublicationPlace_xml – name: United States
PublicationTitle Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Annual International Conference
PublicationTitleAbbrev EMBC
PublicationTitleAlternate Annu Int Conf IEEE Eng Med Biol Soc
PublicationYear 2019
Publisher IEEE
Publisher_xml – name: IEEE
SSID ssj0020051
ssib053545923
ssib042469959
ssib061542107
ssj0061641
Score 2.5114582
SourceID pubmedcentral
proquest
pubmed
ieee
SourceType Open Access Repository
Aggregation Database
Index Database
Publisher
StartPage 7205
SubjectTerms Computer architecture
Convolution
Deep Learning
Image Interpretation, Computer-Assisted
Image segmentation
Instruments
Interpolation
Laparoscopy
Robotic Surgical Procedures
Semantics
Surgery
Surgical Instruments
Training
Title U-NetPlus: A Modified Encoder-Decoder U-Net Architecture for Semantic and Instance Segmentation of Surgical Instruments from Laparoscopic Images
URI https://ieeexplore.ieee.org/document/8856791
https://www.ncbi.nlm.nih.gov/pubmed/31947497
https://www.proquest.com/docview/2341618551
https://pubmed.ncbi.nlm.nih.gov/PMC7372295
Volume 2019
WOSCitedRecordID wos000557295307149&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
hasFullText 1
inHoldings 1
isFullTextHit
isPrint