Two-stage optimization for machine learning workflow
| Published in: | Information systems (Oxford), Volume 92, Article 101483 |
|---|---|
| Main author: | Quemy, Alexandre |
| Format: | Journal Article |
| Language: | English |
| Published: | Oxford: Elsevier Ltd / Elsevier Science Ltd, 01.09.2020 |
| ISSN: | 0306-4379, 1873-6076 |
**Abstract**

Machine learning techniques play a preponderant role in dealing with massive amounts of data and are employed in almost every domain. Building a high-quality machine learning model to be deployed in production is a challenging task for both subject matter experts and machine learning practitioners.

For broader adoption and scalability of machine learning systems, the construction and configuration of machine learning workflows need to become more automated. In the last few years, several techniques, collectively known as AutoML, have been developed in this direction.

In this paper, we present a two-stage optimization process to build data pipelines and configure machine learning algorithms. First, we study the impact of data pipelines compared to algorithm configuration, showing the importance of data preprocessing over hyperparameter tuning. The second part presents policies to efficiently allocate search time between data pipeline construction and algorithm configuration; these policies are agnostic to the metaoptimizer. Last, we present a metric to determine whether a data pipeline is specific to or independent of the algorithm, enabling fine-grained pipeline pruning and meta-learning for the cold-start problem.

**Highlights**

- The importance of optimizing the data pipeline over hyperparameter tuning is studied.
- The results show that data pipelines are often more important than hyperparameter tuning.
- A two-stage optimization process is proposed to search for an ML workflow.
- The process is empirically validated over several time allocation policies.
- Iterative and adaptive policies are more robust than static policies.
- A metric to measure whether a data pipeline is independent of the model is proposed.
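The two-stage process summarized in the abstract, splitting a fixed search budget between data pipeline construction and algorithm configuration, can be sketched as follows. This is an illustrative sketch only, not the paper's implementation: the candidate pipelines, the toy scoring function, and the static 50/50 split policy are all hypothetical stand-ins.

```python
import random

# Hypothetical search spaces: preprocessing-step combinations and a single
# hyperparameter grid. A real system would use actual transformers/models.
PIPELINES = [("baseline",), ("scale",), ("scale", "select"), ("impute", "scale")]
HYPERPARAMS = [{"C": c} for c in (0.01, 0.1, 1.0, 10.0)]

def score(pipeline, params):
    # Toy surrogate for a cross-validated score; a real system would fit
    # the pipeline and model on data here.
    base = 0.6 + 0.05 * len(pipeline) if "scale" in pipeline else 0.6
    bonus = 0.02 if params["C"] == 1.0 else 0.0
    return base + bonus + random.uniform(0.0, 0.01)

def two_stage_search(budget, split=0.5, seed=0):
    """Spend `split` of the budget on pipeline construction (stage 1) and
    the remainder on hyperparameter configuration (stage 2)."""
    random.seed(seed)
    stage1 = max(1, int(budget * split))
    best_pipe, best = None, float("-inf")
    for _ in range(stage1):                       # stage 1: pipeline search
        p = random.choice(PIPELINES)
        s = score(p, HYPERPARAMS[0])              # default configuration
        if s > best:
            best_pipe, best = p, s
    for _ in range(budget - stage1):              # stage 2: algorithm config
        s = score(best_pipe, random.choice(HYPERPARAMS))
        best = max(best, s)
    return best_pipe, best

if __name__ == "__main__":
    print(two_stage_search(budget=40))
```

The `split` parameter stands in for the static time allocation policies mentioned in the highlights; the iterative and adaptive policies the paper finds more robust would adjust this allocation during the search rather than fixing it up front.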
| Author | Quemy, Alexandre (IBM Krakow Software Lab, Cracow, Poland) |
| Copyright | 2019 Elsevier Ltd; Elsevier Science Ltd, Sep 2020 |
| DOI | 10.1016/j.is.2019.101483 |
| Keywords | AutoML; CASH; Data pipelines; Hyperparameter tuning |
| SubjectTerms | Agnosticism; Algorithms; Automation; AutoML; CASH; Configurations; Data; Data pipelines; Data search; Hyperparameter tuning; Information systems; Machine learning; Optimization; Pipelines; Policies; Workflow |