GPU-Accelerated Feature Extraction for Real-Time Vision AI and LLM Systems Efficiency: Autonomous Image Segmentation, Unsupervised Clustering, and Smart Pattern Recognition for Scalable AI Processing with 6.6× Faster Performance, 2.5× Higher Accuracy, and UX-Centric UI Boosting Human-in-the-Loop Productivity

Bibliographic Details
Published in: ASMC proceedings, pp. 1 - 8
Main Authors: Ahi, Kiarash; Wu, Stewart; Sriram, Satya; Fenger, Germain
Format: Conference Proceeding
Language: English
Published: IEEE, 05.05.2025
Subjects:
ISSN: 2376-6697
Abstract
The high computational cost of digital image processing, which requires high-performance hardware and extensive resources, severely limits real-time applications. While advancements in algorithm design and GPU acceleration have significantly improved efficiency, modern AI-driven applications such as large language models (LLMs), Generative AI (GenAI), medical imaging, autonomous vehicle perception, photography, advanced nano-scale semiconductor metrology, satellite image analysis, high-precision manufacturing, robotics, and real-time anomaly detection still demand further optimization to reduce computational overhead and improve scalability.

In this paper, we introduce GPU-Accelerated Feature Extraction to enhance runtime and efficiency in edge-based simulations. Our approach leverages AI-driven clustering, grouping images with similar visual and pattern characteristics to enable adaptive tuning on a small subset before generalizing across the full dataset. This method achieves a 3.78× reduction in runtime. Furthermore, rather than processing an entire image, we recognize and extract a single representative pattern or region of interest (ROI) per image, removing redundant data and background noise. This refinement yields an additional 1.74× runtime improvement, culminating in an overall 6.6× speedup and enhancing scalable real-time AI processing. We also demonstrate that, at a comparable runtime, the accuracy achieved is 2.5× higher.

This solution, integrated into Calibre SEMSuite™, supports multicloud and real-time deployment for enhanced scalability, usability, and performance, providing users with a powerful tool for fully automated, AI-driven image classification and making high-throughput image review feasible even at the scale required for cutting-edge applications. Beyond performance gains, this approach introduces autonomous data cleaning, anomaly detection, and defect identification mechanisms, allowing failed patterns and defective images to be identified without human intervention and boosting reviewer productivity.

As GenAI and LLM systems gain popularity, the computational demands on modern systems have reached unprecedented levels. As we demonstrate, feature extraction and ROI selection allow only a fraction of the data to be processed instead of the entire dataset, which is crucial for reducing the computational overhead of LLM systems. We demonstrate that our method enables high-precision, real-time AI inference with applications in computer vision, LLMs, autonomous systems, healthcare, and scalable AI computing.
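The abstract describes a two-stage flow: embed each image with a GPU-accelerated feature extractor, cluster the embeddings so that adaptive tuning runs on a few representative images per cluster before generalizing to the full dataset, and keep only a single region of interest (ROI) per image. The sketch below illustrates that flow; it is not the authors' implementation, and the ResNet-18 backbone, the cluster count, and the variance-based ROI heuristic are illustrative assumptions rather than details from the paper or from Calibre SEMSuite™.

```python
# Minimal sketch of the workflow described in the abstract (illustrative only):
# 1) GPU feature extraction, 2) clustering to pick representative images for tuning,
# 3) a simple ROI crop that keeps one high-contrast patch per image.
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.models as models
import torchvision.transforms as T
from sklearn.cluster import KMeans

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Assumption: a generic pretrained CNN embedding stands in for the paper's feature
# extractor. Inputs are assumed to be RGB PIL images (torchvision >= 0.13 weights API).
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
extractor = nn.Sequential(*list(backbone.children())[:-1]).to(device).eval()

preprocess = T.Compose([
    T.Resize((224, 224)),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def embed(images):
    """Map a list of PIL images to L2-normalized feature vectors computed on the GPU."""
    batch = torch.stack([preprocess(img) for img in images]).to(device)
    feats = extractor(batch).flatten(1)  # (N, 512) embeddings
    return F.normalize(feats, dim=1).cpu().numpy()

def pick_representatives(features, n_clusters=8):
    """Cluster the embeddings and return one index per cluster (closest to its centroid),
    so adaptive tuning can run on this small subset instead of the full dataset."""
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(features)
    reps = []
    for c in range(n_clusters):
        members = np.where(km.labels_ == c)[0]
        dists = np.linalg.norm(features[members] - km.cluster_centers_[c], axis=1)
        reps.append(int(members[dists.argmin()]))
    return reps

def crop_roi(gray, win=64, stride=32):
    """Toy ROI selection: slide a window over a 2-D grayscale array and keep the patch
    with the highest variance, discarding low-information background."""
    best_yx, best_var = (0, 0), -1.0
    for y in range(0, gray.shape[0] - win + 1, stride):
        for x in range(0, gray.shape[1] - win + 1, stride):
            v = float(gray[y:y + win, x:x + win].var())
            if v > best_var:
                best_yx, best_var = (y, x), v
    y, x = best_yx
    return gray[y:y + win, x:x + win]
```

In this sketch, a tuning routine would be calibrated on the images returned by pick_representatives and then applied to every cluster member, while crop_roi would replace full-frame processing with a single representative patch; the actual clustering model, ROI detector, and the speedups reported in the abstract come from the paper itself.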
Author Ahi, Kiarash
Wu, Stewart
Sriram, Satya
Fenger, Germain
Author_xml – sequence: 1
  givenname: Kiarash
  surname: Ahi
  fullname: Ahi, Kiarash
  email: kiarash.ahi@siemens.com
  organization: Siemens EDA, USA
– sequence: 2
  givenname: Stewart
  surname: Wu
  fullname: Wu, Stewart
  organization: Siemens EDA, Belgium
– sequence: 3
  givenname: Satya
  surname: Sriram
  fullname: Sriram, Satya
  organization: Siemens EDA, India
– sequence: 4
  givenname: Germain
  surname: Fenger
  fullname: Fenger, Germain
  organization: Siemens EDA, USA
ContentType Conference Proceeding
DOI 10.1109/ASMC64512.2025.11010527
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan (POP) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP) 1998-present
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
Discipline Engineering
EISBN 9798331531850
EISSN 2376-6697
EndPage 8
ExternalDocumentID 11010527
Genre orig-research
IsPeerReviewed true
IsScholarly true
Language English
PageCount 8
PublicationCentury 2000
PublicationDate 2025-May-5
PublicationDateYYYYMMDD 2025-05-05
PublicationDate_xml – month: 05
  year: 2025
  text: 2025-May-5
  day: 05
PublicationDecade 2020
PublicationTitle ASMC proceedings
PublicationTitleAbbrev ASMC
PublicationYear 2025
Publisher IEEE
Publisher_xml – name: IEEE
SourceID ieee
SourceType Publisher
StartPage 1
SubjectTerms Accuracy
AI clustering
Anomaly detection
Artificial intelligence
autonomous image segmentation
Clustering algorithms
Computational efficiency
deep pattern recognition
edge deployment
Feature extraction
GenAI
Generative AI
GPU-accelerated computing
Image edge detection
LLM Efficiency
real-time AI inference
Real-time systems
Runtime
scalable vision systems
Tuning
URI https://ieeexplore.ieee.org/document/11010527