Block Storage Optimization and Parallel Data Processing and Analysis of Product Big Data Based on the Hadoop Platform

The traditional distributed database storage architecture has the problems of low efficiency and storage capacity in managing data resources of seafood products. We reviewed various storage and retrieval technologies for the big data resources. A block storage layout optimization method based on the...

Celý popis

Uložené v:
Podrobná bibliografia
Vydané v:Mathematical problems in engineering Ročník 2021; s. 1 - 14
Hlavní autori: Wang, Yajun, Cheng, Shengming, Zhang, Xinchen, Leng, Junyu, Liu, Jun
Médium: Journal Article
Jazyk:English
Vydavateľské údaje: New York Hindawi 31.12.2021
John Wiley & Sons, Inc
Predmet:
ISSN:1024-123X, 1563-5147
On-line prístup:Získať plný text
Tagy: Pridať tag
Žiadne tagy, Buďte prvý, kto otaguje tento záznam!
Popis
Shrnutí:The traditional distributed database storage architecture has the problems of low efficiency and storage capacity in managing data resources of seafood products. We reviewed various storage and retrieval technologies for the big data resources. A block storage layout optimization method based on the Hadoop platform and a parallel data processing and analysis method based on the MapReduce model are proposed. A multireplica consistent hashing algorithm based on data correlation and spatial and temporal properties is used in the parallel data processing and analysis method. The data distribution strategy and block size adjustment are studied based on the Hadoop platform. A multidata source parallel join query algorithm and a multi-channel data fusion feature extraction algorithm based on data-optimized storage are designed for the big data resources of seafood products according to the MapReduce parallel frame work. Practical verification shows that the storage optimization and data-retrieval methods provide supports for constructing a big data resource-management platform for seafood products and realize efficient organization and management of the big data resources of seafood products. The execution time of multidata source parallel retrieval is only 32% of the time of the standard Hadoop scheme, and the execution time of the multichannel data fusion feature extraction algorithm is only 35% of the time of the standard Hadoop scheme.
Bibliografia:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
ISSN:1024-123X
1563-5147
DOI:10.1155/2021/3839800