From Conventional Biology to Computational Biology: recent problems on sequences and structures
The discovery of motifs in biosequences is frequently torn between the rigidity of the model on the one hand
and the abundance of candidates on the other. In particular, the variety of motifs described by strings that include ``don't care''
patterns grows exponentially with the length of the motif, and this only gets worse if a don't care is allowed to stretch
up to some prescribed maximum length. This circumstance tends to generate daunting computational burdens, and
often gives rise to tables that are impossible to visualize and digest. While part of the problem is endemic, another part
of it seems rooted in the various characterizations offered for the notion of a motif, which are typically based either on
syntax or on statistics alone. The talk will address the application of these concepts to problems on strings originating
from biological processes. Other problems not related to strings will also be presented, such as the three-dimensional alignment
of proteins and the study of human migration based on genomic variations.
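
As a rough illustration of the combinatorial growth of motifs with don't cares, the following Python sketch counts and enumerates fixed-length patterns whose interior positions may be don't cares. The DNA alphabet, the '.' notation for a don't care, the requirement that a pattern begin and end with a solid character, and all function names are assumptions made for this example; they are not taken from the talk.

```python
import re
from itertools import product

ALPHABET = "ACGT"   # assumed alphabet (DNA); proteins would use 20 letters
DONT_CARE = "."     # assumed notation for a single don't-care position


def count_rigid_patterns(m, sigma=len(ALPHABET)):
    """Number of length-m patterns with solid first and last characters
    and interior positions drawn from the alphabet plus the don't care."""
    if m < 1:
        raise ValueError("pattern length must be at least 1")
    if m == 1:
        return sigma
    return sigma * (sigma + 1) ** (m - 2) * sigma


def enumerate_rigid_patterns(m):
    """Yield every pattern counted by count_rigid_patterns(m)."""
    if m < 1:
        raise ValueError("pattern length must be at least 1")
    if m == 1:
        yield from ALPHABET
        return
    for first in ALPHABET:
        for middle in product(ALPHABET + DONT_CARE, repeat=m - 2):
            for last in ALPHABET:
                yield first + "".join(middle) + last


def occurrences(pattern, text):
    """Start positions of all (possibly overlapping) matches of pattern.

    The don't care '.' coincides with the regex wildcard, and a lookahead
    is used so that overlapping occurrences are all reported."""
    regex = re.compile("(?=" + pattern + ")")
    return [match.start() for match in regex.finditer(text)]


if __name__ == "__main__":
    for m in range(2, 11):
        print(m, count_rigid_patterns(m))
    print(occurrences("A.G", "ACGATGAAG"))
```

Under these assumptions the candidate space for length m is 4 * 5^(m-2) * 4, so it already exceeds six million patterns at m = 10, before don't cares of variable length are even considered.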