Publications

Fast matching statistics for sets of long similar strings (2024)

Authors:
Liptak, Zsuzsanna; Luca', Martina; Masillo, Francesco; Puglisi, Simon J.
Title:
Fast matching statistics for sets of long similar strings
Year:
2024
Product type:
Contribution in conference proceedings
ANVUR type:
Contribution in conference proceedings
Language:
English
Conference title:
Prague Stringology Conference
Location:
Prague, Czech Republic
Dates:
26.08.2024-27.08.2024
Publisher:
Prague, Czech Republic, August 26-28
Page range:
3-15
Keywords:
matching statistics, suffix array, parallel algorithms, LCP-array
Brief description of contents:
Matching statistics (MS) computation is at the heart of numerous bioinformatics applications, from read alignment to computing phylogenies of a set of genomes or even speeding up the computation of core data structures on collections of genomes. Many of these datasets have the property of being highly similar to the reference, which itself, however, may not be very repetitive. Some heuristics based on sequence-to-sequence similarity have already been studied in [Lipták et al., Alg. Mol. Biol. 2024], leading to a significant speedup in the computation of the matching statistics. In this paper, we introduce a new heuristic that further speeds MS computation. The core idea is to take advantage of existing similarities between the input sequences and the reference. We give an implementation making use of this heuristic, which also allows the use of multiple threads to parallelize MS computation. We give an experimental evaluation of our tool, LRF-ms, comparing it to other MS computation tools, on publicly available genomic datasets, and show that it is the fastest when the collection of genomes is highly similar to the reference string, while keeping a comparably low memory footprint.
Product ID:
143408
IRIS handle:
11562/1147690
Last modified:
19 December 2024
Bibliographic citation:
Liptak, Zsuzsanna; Luca', Martina; Masillo, Francesco; Puglisi, Simon J., Fast matching statistics for sets of long similar strings, in Proc. of the 27th Prague Stringology Conference (PSC 2024), Prague, Czech Republic, August 26-28. Proceedings of "Prague Stringology Conference", Prague, Czech Republic, 26.08.2024-27.08.2024, 2024, pp. 3-15

See the full record in IRIS, the university's institutional research repository
