RabbitTrim: An Efficient and Versatile Trimmer on Multi-Core Platforms.

Detailed Bibliography
Title: RabbitTrim: An Efficient and Versatile Trimmer on Multi-Core Platforms.
Authors: Wang M, Yang Y, Yin Z, Yan L, Zhang T, Zhu F, Li X, Duan X, Schmidt B, Liu W
Source: IEEE transactions on computational biology and bioinformatics [IEEE Trans Comput Biol Bioinform] 2025 Nov-Dec; Vol. 22 (6), pp. 2442-2452.
Publication Type: Journal Article
Language: English
Journal Info: Publisher: IEEE Country of Publication: United States NLM ID: 9919068173606676 Publication Model: Print Cited Medium: Internet ISSN: 2998-4165 (Electronic) Linking ISSN: 29984165 NLM ISO Abbreviation: IEEE Trans Comput Biol Bioinform Subsets: MEDLINE
Imprint Name(s): Original Publication: [New York, New York] : IEEE, [2025]-
MeSH Terms: Software* ; Sequence Analysis, DNA*/methods ; High-Throughput Nucleotide Sequencing*/methods ; Computational Biology*/methods ; Humans ; Animals ; Algorithms
Abstract: Trimming is an essential step in sequencing data processing. However, many existing trimming tools, such as Trimmomatic and Ktrim, are limited by suboptimal implementations and fail to fully leverage the computational power of modern multi-core platforms. To address this, we introduce RabbitTrim, a highly optimized and versatile trimming tool that fully supports the functionalities of Trimmomatic and Ktrim. RabbitTrim's performance is enhanced through efficient I/O strategies, parallel (de)compression engines, block-based memory pools, bitwise operations, and vectorization techniques. Compared to Trimmomatic, RabbitTrim (in trimmomatic mode) achieves speedups ranging from 1.8x to 6.0x for plain FASTQ files and 3.7x to 14.0x for gzip-compressed FASTQ files on a 48-core Intel server. Similarly, compared to Ktrim, RabbitTrim (in ktrim mode) achieves speedups ranging from 1.5x to 2.5x for plain FASTQ files and 2.7x to 5.6x for gzip-compressed FASTQ files on the same server. Moreover, RabbitTrim is able to process 101 GB gzip-compressed sequencing data in only 5 minutes while Trimmomatic requires at least 21 minutes.
Entry Date(s): Date Created: 20250814 Date Completed: 20251210 Latest Revision: 20251211
Update Code: 20251211
DOI: 10.1109/TCBBIO.2025.3579070
PMID: 40811278
Database: MEDLINE