Optimization of Small Sized File Access Efficiency in Hadoop Distributed File System by Integrating Virtual File System Layer
| Published in: | International journal of advanced computer science & applications, Vol. 13, No. 6 |
|---|---|
| Main authors: | Alange, Neeta; Mathur, Anjali |
| Format: | Journal Article |
| Language: | English |
| Published: | West Yorkshire: Science and Information (SAI) Organization Limited, 2022-01-01 |
| Subjects: | Big Data; Chains; Classifiers; Efficiency; Optimization |
| ISSN: | 2158-107X, 2156-5570 |
| Online access: | Full text |
| Abstract | Hadoop was created to address three big data challenges: storing large datasets, handling data in different formats, and coping with data generated at high speed. This paper proposes a solution that improves access efficiency and access time for small files. A novel architecture called VFS-HDFS is designed to optimize small-file access, with significant improvement over existing solutions, i.e. HDFS sequence files, HAR, and NHAR. In the proposed work, a virtual file system layer is added as a wrapper on top of the existing HDFS architecture, so the HDFS architecture itself is not altered. The drawbacks of the existing techniques, i.e. the Flat File Technique and the Table Chain Technique implemented in HDFS HAR, NHAR, and sequence files, are overcome by using the Bucket Chain Technique. The files to merge into a single bucket are selected using an ensemble classifier, which is a combination of different classifiers; combining multiple classifiers gives more accurate results. The proposed system obtains better results than the existing system in terms of access efficiency for small files in HDFS. |
|---|---|
| Author | Mathur, Anjali; Alange, Neeta |
| CitedBy_id | crossref_primary_10_1007_s10586_023_03992_1 |
| ContentType | Journal Article |
| Copyright | 2022. This work is licensed under https://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
| DOI | 10.14569/IJACSA.2022.0130626 |
| Discipline | Computer Science |
| EISSN | 2156-5570 |
| ISICitedReferencesCount | 3 |
| ISSN | 2158-107X |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | false |
| IsScholarly | true |
| Issue | 6 |
| Language | English |
| OpenAccessLink | https://www.proquest.com/docview/2695608244?pq-origsite=%requestingapplication% |
| PublicationCentury | 2000 |
| PublicationDate | 20220101 |
| PublicationDateYYYYMMDD | 2022-01-01 |
| PublicationDecade | 2020 |
| PublicationPlace | West Yorkshire |
| PublicationTitle | International journal of advanced computer science & applications |
| PublicationYear | 2022 |
| Publisher | Science and Information (SAI) Organization Limited |
| SubjectTerms | Big Data; Chains; Classifiers; Efficiency; Optimization |
| Title | Optimization of Small Sized File Access Efficiency in Hadoop Distributed File System by Integrating Virtual File System Layer |
| URI | https://www.proquest.com/docview/2695608244 |
| Volume | 13 |
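
The abstract describes merging small files into buckets behind a wrapper layer, but this record gives no implementation details. As a rough illustration of the general small-file packing idea only (not the authors' Bucket Chain Technique or the VFS-HDFS design), the sketch below stores several small files back to back in one in-memory bucket and keeps an offset index, so a file can be read with one lookup instead of a scan. The `Bucket` class, its methods, and the file names are hypothetical, and no HDFS API is used.

```python
from dataclasses import dataclass, field


@dataclass
class Bucket:
    """Hypothetical container: many small files stored back to back."""
    data: bytearray = field(default_factory=bytearray)
    index: dict = field(default_factory=dict)  # name -> (offset, length)

    def add(self, name: str, content: bytes) -> None:
        # Record where the file starts and how long it is, then append it.
        self.index[name] = (len(self.data), len(content))
        self.data.extend(content)

    def read(self, name: str) -> bytes:
        # One index lookup plus one slice; the other files are not scanned.
        offset, length = self.index[name]
        return bytes(self.data[offset:offset + length])


# Illustrative usage with made-up file names and contents.
bucket = Bucket()
bucket.add("a.txt", b"alpha")
bucket.add("b.txt", b"bravo!")
print(bucket.read("b.txt"))
```

In a real deployment the bucket would be a single large file on the distributed file system and the index would need to be persisted; which files share a bucket is the part the abstract assigns to an ensemble classifier, which this sketch does not attempt to model.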