IgMAT: immunoglobulin sequence multi-species annotation tool for any species including those with incomplete antibody annotation or unusual characteristics

Bibliographic Details
Published in:BMC bioinformatics Vol. 24; no. 1; pp. 491 - 8
Main Authors: Dorey-Robinson, Daniel, Maccari, Giuseppe, Hammond, John A.
Format: Journal Article
Language:English
Published: London BioMed Central 21.12.2023
BioMed Central Ltd
Springer Nature B.V
BMC
ISSN:1471-2105
Description
Summary:Background: The advent and continual improvement of high-throughput sequencing technologies have made immunoglobulin repertoire sequencing accessible and informative regardless of study species. However, to fully map dynamic changes in polyclonal responses, precise annotation of the framework and complementarity determining regions of rearranging genes is pivotal. Most sequence annotation tools are designed primarily for human and mouse antibody sequences; they rely on databases with fixed species lists and apply very specific assumptions that select against unique structural characteristics. For this reason, data-agnostic tools able to learn from the presented data can be very useful with new species or novel datasets. Results: We have developed IgMAT, which utilises a reduced amino acid alphabet and incorporates multiple HMM alignments into a single consensus to automatically annotate immunoglobulin sequences from most organisms. Additionally, the software allows the incorporation of user-defined databases to better represent the species and/or antibody class of interest. To demonstrate the accuracy and utility of IgMAT, we present an analysis of sequences extracted from structural data and of immunoglobulin sequence datasets from several different species. Conclusions: IgMAT is fully open source and freely available on GitHub (https://github.com/TPI-Immunogenetics/igmat) for download under the GPLv3 license. It can be used as a CLI application or as a Python module to be integrated into custom scripts.
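The two ideas named in the Results (a reduced amino acid alphabet and a consensus over several alignments) can be illustrated with a minimal Python sketch. This is not IgMAT's code or API: the residue grouping in `REDUCED_GROUPS`, the function names `reduce_sequence` and `consensus_labels`, and the per-position majority vote are all assumptions chosen for illustration, and the record does not state which alphabet or consensus rule IgMAT actually uses.

```python
from collections import Counter

# Hypothetical grouping by broad physicochemical class.
# This is an illustrative assumption, NOT the alphabet used by IgMAT.
REDUCED_GROUPS = {
    "A": "AVLIMC",  # aliphatic / hydrophobic
    "F": "FWYH",    # aromatic
    "S": "STNQ",    # polar, uncharged
    "K": "KR",      # positively charged
    "D": "DE",      # negatively charged
    "G": "G",       # glycine kept on its own
    "P": "P",       # proline kept on its own
}

# Invert the grouping: each of the 20 residues maps to its group representative.
REDUCED_MAP = {
    residue: representative
    for representative, members in REDUCED_GROUPS.items()
    for residue in members
}


def reduce_sequence(sequence, unknown="X"):
    """Rewrite a protein sequence in the reduced alphabet.

    Characters outside the 20 standard residues become `unknown`.
    """
    return "".join(REDUCED_MAP.get(residue, unknown) for residue in sequence.upper())


def consensus_labels(label_tracks):
    """Combine equal-length per-residue label strings by majority vote.

    Each track is assumed to hold one region label per residue
    (here 'F' for framework, 'C' for CDR), e.g. one track per alignment.
    """
    if len({len(track) for track in label_tracks}) != 1:
        raise ValueError("label tracks must be non-empty and of equal length")
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*label_tracks)
    )


if __name__ == "__main__":
    print(reduce_sequence("EVQLVESGG"))
    print(consensus_labels(["FFCCF", "FFCFF", "FCCCF"]))
```

Collapsing residues into classes makes a profile less sensitive to substitutions within a class, which is one plausible reason such an alphabet helps with species that are poorly represented in reference databases; the vote simply keeps the label most alignments agree on at each position.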
DOI:10.1186/s12859-023-05624-2