Publications

Fast matching statistics for sets of long similar strings  (2024)

Authors:
Lipták, Zsuzsanna; Lucà, Martina; Masillo, Francesco; Puglisi, Simon J.
Title:
Fast matching statistics for sets of long similar strings
Year:
2024
Type of item:
Conference proceedings paper
ANVUR category:
Conference proceedings paper
Language:
English
Conference:
Prague Stringology Conference
Place:
Prague, Czech Republic
Period:
26.08.2024-27.08.2024
Publisher:
Prague, Czech Republic, August 26-28
Page numbers:
3-15
Keyword:
matching statistics, suffix array, parallel algorithms, LCP-array
Short description of contents:
Matching statistics (MS) computation is at the heart of numerous bioinformatics applications, from read alignment to computing phylogenies of a set of genomes or even speeding up the computation of core data structures on collections of genomes. Many of these datasets have the property of being highly similar to the reference, which itself, however, may not be very repetitive. Some heuristics based on sequence-to-sequence similarity have already been studied in [Lipták et al., Alg. Mol. Biol. 2024], leading to a significant speedup in the computation of the matching statistics. In this paper, we introduce a new heuristic that further speeds up MS computation. The core idea is to take advantage of existing similarities between the input sequences and the reference. We give an implementation making use of this heuristic, which also allows the use of multiple threads to parallelize MS computation. We give an experimental evaluation of our tool, LRF-ms, comparing it to other MS computation tools, on publicly available genomic datasets, and show that it is the fastest when the collection of genomes is highly similar to the reference string, while keeping a comparably low memory footprint.
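For readers unfamiliar with the term, the sketch below illustrates what matching statistics are under one common definition: for each position i of a pattern, the length of the longest prefix of the pattern's suffix starting at i that occurs somewhere in the reference, together with one position where it occurs. This is a deliberately naive quadratic-time illustration and not the heuristic or the LRF-ms implementation described in the abstract; the names matching_statistics, reference and pattern are hypothetical.

```python
def matching_statistics(reference: str, pattern: str) -> list[tuple[int, int]]:
    """Naive matching statistics of `pattern` against `reference`.

    Returns one (pos, length) pair per pattern position i, where `length`
    is the longest prefix of pattern[i:] occurring in `reference` and `pos`
    is the leftmost reference position achieving it (-1 if length is 0).
    Illustrative only: real tools use suffix arrays or similar indexes.
    """
    ms = []
    for i in range(len(pattern)):
        best_pos, best_len = -1, 0
        for j in range(len(reference)):
            # Extend the match between pattern[i:] and reference[j:].
            k = 0
            while (i + k < len(pattern) and j + k < len(reference)
                   and pattern[i + k] == reference[j + k]):
                k += 1
            if k > best_len:
                best_pos, best_len = j, k
        ms.append((best_pos, best_len))
    return ms


if __name__ == "__main__":
    # Match lengths for this toy input are 3, 2, 1, 3, 2, 1.
    for i, (pos, length) in enumerate(matching_statistics("ACGTACGT", "CGTTAC")):
        print(i, pos, length)
```

Note that consecutive match lengths can drop by at most one from one pattern position to the next, a property that efficient index-based algorithms typically exploit.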
Product ID:
143408
Handle IRIS:
11562/1147690
Last Modified:
December 19, 2024
Bibliographic citation:
Lipták, Zsuzsanna; Lucà, Martina; Masillo, Francesco; Puglisi, Simon J., Fast matching statistics for sets of long similar strings, in Proc. of the 27th Prague Stringology Conference (PSC 2024), Prague, Czech Republic, August 26-28, Proceedings of "Prague Stringology Conference", Prague, Czech Republic, 26.08.2024-27.08.2024, 2024, pp. 3-15

See the full record in IRIS, the University's institutional research repository
