Method for generating information sequence segments using the quality functional of processing models

Bibliographic details
Published in: Nauchno-tekhnicheskiĭ vestnik informat͡s︡ionnykh tekhnologiĭ, mekhaniki i optiki, Volume 24, Issue 3, pp. 474-482
Main authors: Tikhonov, D.D.; Lebedev, I.S.
Format: Journal Article
Language: English
Published: ITMO University, 1 December 2024
ISSN: 2226-1494, 2500-0373
Abstract: The recurring need to solve classification problems more efficiently and to predict the behavior of observed objects calls for better data processing methods. This article proposes a method for improving the quality indicators of machine learning models in regression and forecasting problems. The proposed processing of information sequences relies on segmenting the input data. Dividing the data produces segments in which the observed objects have different properties. The novelty of the method lies in dividing the sequence into segments using the quality functional of processing models on data subsamples, which makes it possible to apply the best-performing model to each data segment. The resulting segments are separate subsamples to which the best-performing models and machine learning algorithms are assigned. To assess the quality of the proposed solution, an experiment was performed using model data and multiple regression. The RMSE values obtained for various algorithms on the experimental sample with different numbers of segments showed that the quality indicators of individual algorithms improved as the number of segments increased. The proposed method can improve RMSE by an average of 7% by segmenting the data and assigning the models that perform best in individual segments. The results can also be used in the development of models and data processing methods. The proposed solution is aimed at further improving and extending ensemble methods. Forming multi-level model structures that process and analyze incoming information flows and assign the most suitable model for the current problem makes it possible to reduce the complexity and resource intensity of classical ensemble methods. The impact of overfitting is reduced, the dependence of processing results on the base models is reduced, base algorithms can be tuned more efficiently when data properties change, and the interpretability of the results is improved.
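The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how segment boundaries are chosen, so this sketch assumes equal-length contiguous segments, uses hold-out RMSE as a stand-in for the quality functional, and compares two hypothetical candidate models (`MeanModel` and `LinearModel`). The function name `assign_models` and the synthetic data are likewise assumptions made for illustration.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error, used here as the quality indicator."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


class MeanModel:
    """Hypothetical baseline: always predicts the training mean."""

    def fit(self, X, y):
        self.mu = float(np.mean(y))
        return self

    def predict(self, X):
        return np.full(len(X), self.mu)


class LinearModel:
    """Hypothetical multiple linear regression fitted by least squares."""

    def fit(self, X, y):
        A = np.column_stack([X, np.ones(len(X))])  # add intercept column
        self.coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        return self

    def predict(self, X):
        return np.column_stack([X, np.ones(len(X))]) @ self.coef


def assign_models(X, y, n_segments, candidates):
    """Split the sequence into equal contiguous segments (an assumption)
    and pick, per segment, the candidate with the lowest hold-out RMSE."""
    bounds = np.linspace(0, len(y), n_segments + 1).astype(int)
    result = []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        Xs, ys = X[lo:hi], y[lo:hi]
        train = np.arange(len(ys)) % 2 == 0  # even rows train, odd rows validate
        scores = {}
        for name, make in candidates.items():
            model = make().fit(Xs[train], ys[train])
            scores[name] = rmse(ys[~train], model.predict(Xs[~train]))
        best = min(scores, key=scores.get)
        result.append((int(lo), int(hi), best, scores[best]))
    return result


# Synthetic sequence: an exact linear relation in the first half,
# a noisy constant level in the second half.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(100, 2))
linear_part = 2.0 * X[:, 0] - X[:, 1] + 1.0
noisy_level = 5.0 + rng.normal(0.0, 0.1, size=100)
y = np.where(np.arange(100) < 50, linear_part, noisy_level)

candidates = {"mean": MeanModel, "linear": LinearModel}
segments = assign_models(X, y, n_segments=2, candidates=candidates)
for lo, hi, name, score in segments:
    print(f"rows [{lo}, {hi}): best model = {name}, hold-out RMSE = {score:.4f}")
```

Each returned tuple holds the segment bounds, the name of the selected model, and its hold-out RMSE. Replacing the fixed equal-length split with a search over boundaries that minimizes the summed per-segment score would be closer in spirit to segmenting by a quality functional, at the cost of more model fits.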
Authors (ORCID): Tikhonov, D.D. (0009-0008-0128-4144); Lebedev, I.S. (0000-0001-6753-2181)
DOI: 10.17586/2226-1494-2024-24-3-474-482
Discipline: Engineering
Open access: yes (DOAJ: https://doaj.org/article/f8a8a434daef4294b53ea024b3ac65cd)
Peer reviewed: yes
Subject terms: information data sequence; multi-level data processing model; improvement of quality indicators; data segmentation