Applying compression algorithms on hadoop cluster implementing through apache tez and hadoop mapreduce

Detailed description

Bibliographic details
Published in: International journal of engineering & technology (Dubai), Vol. 7, No. 2.26, p. 80
Main authors: E. Laxmi Lydia, Dr, Srinivasa Rao, M
Format: Journal Article
Language: English
Published: 2018-05-07
ISSN: 2227-524X
Online access: Full text
Abstract Big Data is a prominent topic in cloud computing research; its defining characteristics are volume, velocity and variety. These characteristics are difficult to manage with traditional software and methodologies. Data arising from the various big data domains is handled with Hadoop, an open-source framework developed to address these problems. Big data analytics on a Hadoop cluster is carried out through the Hadoop MapReduce framework, the cluster's key engine, which is widely used today and follows a batch-processing model. Apache also developed an engine named Tez, which supports interactive queries and does not write temporary data to the Hadoop Distributed File System (HDFS). This paper compares the performance of MapReduce and Tez by running both engines with compressed input files and compressed map output files. The Bzip compression algorithm was used for the input files and Snappy for the map output files. The Word Count and TeraSort benchmarks were used in the experiments. For the Word Count benchmark, the results show that Tez has a shorter execution time than Hadoop MapReduce for both compressed and uncompressed data, reducing execution time by nearly 39% relative to Hadoop MapReduce. For the TeraSort benchmark, however, Tez has a higher execution time than Hadoop MapReduce.
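The setup described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' configuration: the two property keys and the codec class are standard Hadoop MapReduce settings for Snappy-compressed map output, while the helper names `to_generic_options` and `relative_reduction` are hypothetical and introduced only for this example. The record does not say how the authors configured their jobs or how they computed the 39% figure.

```python
# Standard Hadoop MapReduce properties for compressing map output with Snappy.
# Assumption: the paper used equivalent settings; the record does not list them.
MAP_OUTPUT_SNAPPY = {
    "mapreduce.map.output.compress": "true",
    "mapreduce.map.output.compress.codec":
        "org.apache.hadoop.io.compress.SnappyCodec",
}


def to_generic_options(props):
    """Render properties as Hadoop generic options: -D key=value pairs.

    Hypothetical helper, shown only to illustrate how such settings are
    usually passed on a job's command line.
    """
    return [arg for key, value in props.items()
            for arg in ("-D", f"{key}={value}")]


def relative_reduction(baseline_seconds, candidate_seconds):
    """Fractional reduction in execution time relative to a baseline run.

    Hypothetical helper; one plausible way to express a figure such as
    "nearly 39%" from two measured execution times.
    """
    if baseline_seconds <= 0:
        raise ValueError("baseline_seconds must be positive")
    return (baseline_seconds - candidate_seconds) / baseline_seconds
```

As a usage illustration with made-up timings (not taken from the paper), a baseline run of 100 s and a candidate run of 61 s give `relative_reduction(100.0, 61.0)` of about 0.39, which corresponds to a 39% reduction.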
Author Srinivasa Rao, M
E. Laxmi Lydia, Dr
ContentType Journal Article
DOI 10.14419/ijet.v7i2.26.12539
Discipline Engineering
EISSN 2227-524X
ISSN 2227-524X
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed false
IsScholarly true
Issue 2.26
Language English
OpenAccessLink https://doi.org/10.14419/ijet.v7i2.26.12539
PublicationCentury 2000
PublicationDate 2018-05-07
PublicationDateYYYYMMDD 2018-05-07
PublicationDate_xml – month: 05
  year: 2018
  text: 2018-05-07
  day: 07
PublicationDecade 2010
PublicationTitle International journal of engineering & technology (Dubai)
PublicationYear 2018
StartPage 80
Title Applying compression algorithms on hadoop cluster implementing through apache tez and hadoop mapreduce
Volume 7