Coded Trace Reconstruction

Bibliographic details
Published in: IEEE Transactions on Information Theory, Vol. 66, No. 10, pp. 6084-6103
Main authors: Cheraghchi, Mahdi; Gabrys, Ryan; Milenkovic, Olgica; Ribeiro, Joao
Format: Journal Article
Language: English
Published: New York, IEEE, 1 October 2020
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 0018-9448, 1557-9654
Description
Summary: Motivated by average-case trace reconstruction and coding for portable DNA-based storage systems, we initiate the study of coded trace reconstruction, the design and analysis of high-rate efficiently encodable codes that can be efficiently decoded with high probability from few reads (also called traces) corrupted by edit errors. Codes used in current portable DNA-based storage systems with nanopore sequencers are largely based on heuristics, and have no provable robustness or performance guarantees even for an error model with i.i.d. deletions and constant deletion probability. Our work is the first step towards the design of efficient codes with provable guarantees for such systems. We consider a constant rate of i.i.d. deletions, and perform an analysis of marker-based code constructions. This gives rise to codes with redundancy $O(n/\log n)$ (resp. $O(n/\log \log n)$) that can be efficiently reconstructed from $\exp(O(\log^{2/3} n))$ (resp. $\exp(O(\log \log n)^{2/3})$) traces, where $n$ is the message length. Then, we give a construction of a code with $O(\log n)$ bits of redundancy that can be efficiently reconstructed from $\text{poly}(n)$ traces if the deletion probability is small enough. Finally, we show how to combine both approaches, giving rise to an efficient code with $O(n/\log n)$ bits of redundancy which can be reconstructed from $\text{poly}(\log n)$ traces for a small constant deletion probability.
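
The error model named in the summary (i.i.d. deletions with a constant deletion probability, observed through several independent traces) can be illustrated with a minimal sketch. This is not the paper's code construction or decoder. The function names `deletion_channel`, `make_traces`, and `is_subsequence` are hypothetical, chosen for illustration only, and the sketch only generates traces and checks that each one is a subsequence of the transmitted word.

```python
import random


def deletion_channel(bits, p, rng):
    """Return one trace: each symbol is deleted independently with probability p."""
    # rng.random() is uniform on [0, 1), so a symbol survives with probability 1 - p.
    return [b for b in bits if rng.random() >= p]


def make_traces(bits, p, num_traces, seed=0):
    """Pass the same word through the channel num_traces times (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [deletion_channel(bits, p, rng) for _ in range(num_traces)]


def is_subsequence(short, long):
    """Check that `short` can be obtained from `long` by deleting symbols."""
    it = iter(long)
    # `symbol in it` advances the iterator, so symbol order is respected.
    return all(symbol in it for symbol in short)


if __name__ == "__main__":
    codeword = [1, 0, 1, 1, 0, 0, 1, 0]
    for trace in make_traces(codeword, p=0.2, num_traces=5):
        print(trace, is_subsequence(trace, codeword))
```

Every trace produced this way is a subsequence of the input, which is the only structural guarantee a decoder for this channel can rely on. The number of traces passed to `make_traces` corresponds to the trace counts quoted in the summary.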
DOI: 10.1109/TIT.2020.2996377