Distributed Computing Engines for Big Data Analytics
| Published in: | International journal of recent technology and engineering, Vol. 8, No. 2, pp. 5841-5845 |
|---|---|
| Main authors: | Prashanthi, Bh; Sowjanya, G.; Madhuri, D. Krishna |
| Format: | Journal Article |
| Language: | English |
| Published: | 30.07.2019 |
| ISSN: | 2277-3878 |
| Abstract | Technologies such as cloud computing have paved the way for dealing with massive amounts of data. Before the cloud, this was not possible without investing heavily in computing resources. There is now an ecosystem conducive to storing and processing voluminous data that cannot be handled by local computing resources. With this ecosystem, big data technology came into existence. Big data is data characterized by volume, velocity, veracity and variety. It has enabled enterprises to attach more value to every piece of data, which in turn has increased the use of the cloud for both storage and processing. Processing big data requires efficient technologies. The MapReduce programming paradigm, together with the Hadoop distributed programming framework, is widely used. However, emerging frameworks such as Apache Spark and Apache Flink handle big data more efficiently. This paper presents an empirical study of three frameworks, Hadoop, Apache Spark and Apache Flink, under different parameters: type of network, HDFS block size, input data size and other configuration changes. Evaluated on different benchmark big data workloads, the experimental results show that Apache Spark and Apache Flink outperform Hadoop. |
|---|---|
| Author | Prashanthi, Bh; Madhuri, D. Krishna; Sowjanya, G. |
| ContentType | Journal Article |
| CorporateAuthor | Griet, Assistant Professor, Dept of CSE, GRIET |
| DOI | 10.35940/ijrte.B3771.078219 |
| Discipline | Engineering |
| EISSN | 2277-3878 |
| EndPage | 5845 |
| ISSN | 2277-3878 |
| IsDoiOpenAccess | false |
| IsOpenAccess | true |
| IsPeerReviewed | false |
| IsScholarly | true |
| Issue | 2 |
| Language | English |
| OpenAccessLink | https://doi.org/10.35940/ijrte.b3771.078219 |
| PageCount | 5 |
| PublicationCentury | 2000 |
| PublicationDate | 2019-07-30 |
| PublicationDecade | 2010 |
| PublicationTitle | International journal of recent technology and engineering |
| PublicationYear | 2019 |
| StartPage | 5841 |
| Title | Distributed Computing Engines for Big Data Analytics |
| Volume | 8 |