Soundness, efficiency and precision are the main challenges in automatic program analysis. Abstract interpretation is a widely recognised and used method for specifying soundNbyNconstruction program analysis algorithms as approximations of the semantics of a programming language. The main contention is in between efficiency, typically achieved by considering simplified abstract domains and widening operators, and precision, typically achieved by considering more complex and refined abstractions. Instead of acting on the abstract domain by refinement to avoid false alarms, in this proposal we will solve this contention by refactoring code to better cope with a given abstraction. We will study the foundational properties of families of programs for which a given analysis is precise (i.e., no false alarms are produced). We will then develop a prototype compiler whose goal is to transform a program P into a semantically equivalent one Q that minimizes the false alarms for a given fixed abstract domain A. This will be achieved by exploiting machine learning techniques on large enclaves of code, where the programs, including P, are related by their similarity. In this case, fragments of code which are equivalent to the ones employed on P, are automatically identified in such a way that this code Q’, when analysed by A, produces less false alarms than the ones in P. Suitable code synthesis will be then implemented to refactor P to Q , which resembles Q’ in structure, to better cope with the abstract domain A.
The result will be a prototype complier which, given in input a program P and an abstraction A, automatically removes the most of false alarms produced by the analysis P by A.