Precise Program Reasoning using Probabilistic Methods
The enormous rise in the scale, scope, and complexity of software projects has created a thriving marketplace for program analysis and verification tools. Despite routine adoption by industry, developing such tools remains challenging, and their designers must carefully balance tradeoffs between false alarms, missed bugs, and scalability to large codebases. Furthermore, when tools fail to verify some program property, they provide only coarse estimates of an alarm's relevance, potential severity, and likelihood of being a real bug, thereby limiting their usefulness in large software projects.
I will present a framework that extends deductive program reasoning systems with rich notions of alarm provenance. These ascriptions of root cause emerge naturally from the program structure; they enable us to identify the alarms of greatest interest to the programmer, and to further refine the deductive process with probabilistic models of alarm confidence. In experiments with large programs, these graphical representations of alarm derivations enable order-of-magnitude reductions in false alarm rates and in invocations of expensive reasoning engines such as SMT solvers.
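To make the idea concrete, here is a minimal sketch of probabilistic inference over an alarm derivation graph. The rule names, probabilities, and derivation structure below are purely illustrative (not from any particular tool): each alarm is derived by clauses of probabilistic rule firings, alarm confidence is the posterior probability that some derivation holds, and user feedback on one alarm is conditioned on to rerank the others.

```python
from itertools import product

# Hypothetical derivation structure (illustrative assumption): each rule
# fires independently with some probability, and an alarm is derived if
# all rules in at least one of its clauses fire.
rules = {"r1": 0.9, "r2": 0.8, "r3": 0.5}
alarms = {
    "alarm_A": [{"r1", "r2"}],  # A is derived iff r1 and r2 both fire
    "alarm_B": [{"r1", "r3"}],  # B shares the root-cause rule r1 with A
}

def posterior(query, evidence):
    """Exact inference by enumerating all rule-firing worlds.

    `evidence` maps alarm names to observed truth values, e.g. user
    feedback that an alarm turned out to be false."""
    num = den = 0.0
    for outcome in product([True, False], repeat=len(rules)):
        world = dict(zip(rules, outcome))
        weight = 1.0
        for r, fired in world.items():
            weight *= rules[r] if fired else 1 - rules[r]
        derived = {a: any(all(world[r] for r in clause) for clause in clauses)
                   for a, clauses in alarms.items()}
        if all(derived[a] == v for a, v in evidence.items()):
            den += weight
            if derived[query]:
                num += weight
    return num / den

prior = posterior("alarm_B", {})                     # 0.9 * 0.5 = 0.45
# Feedback: the user marks alarm_A as false. Confidence in alarm_B drops,
# because the shared rule r1 is now suspect.
updated = posterior("alarm_B", {"alarm_A": False})   # ~0.32
```

Enumeration is exponential in the number of rules and serves only to illustrate the semantics; at the scale of real programs one would use approximate inference (e.g. belief propagation or sampling) over the derivation graph.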
To the analysis user, these techniques offer a lens through which to focus their attention on the most important alarms, and a uniform method for the tool to interactively generalize from human feedback. To the analysis designer, they offer novel opportunities to leverage data-driven approaches in analysis design. And to researchers, they pose new challenges in performing probabilistic inference over models of unprecedented size. I will conclude by describing how these ideas highlight important problems at the frontiers of logical and statistical AI, and promise to underpin the next generation of intelligent programming systems, with applications in diverse areas such as program synthesis, differentiable programming, and fault localization in complex systems.