Unsupervised Learning of API Aliasing Specifications (PLDI 2019 - PLDI Research Papers)

Who

Jan Eberhardt, Samuel Steffen, Veselin Raychev, Martin Vechev

Track

PLDI 2019 PLDI Research Papers

When

Tue 25 Jun 2019 14:00 - 14:20 at 228AB - Learning Specifications Chair(s): Michael Pradel

Abstract

Real-world applications make heavy use of powerful libraries and frameworks, posing a significant challenge for static analysis, as the library implementation may be very complex or unavailable. Thus, obtaining specifications that summarize the behaviors of the library is important, as it enables static analyzers to precisely track the effects of APIs on the client program without requiring the actual API implementation.

In this work, we propose a novel method for discovering aliasing specifications of APIs by learning from a large dataset of programs. Unlike prior work, our method does not require manual annotation, access to the library's source code, or the ability to run its APIs. Instead, it learns specifications in a fully unsupervised manner, by statically observing usages of APIs in the dataset. The core idea is to learn a probabilistic model of interactions between API methods and aliasing objects, enabling identification of additional likely aliasing relations, and then to infer aliasing specifications of APIs that explain these relations. The learned specifications are then used to augment an API-aware points-to analysis.

We implemented our approach in a tool called USpec and used it to automatically learn aliasing specifications from millions of source code files. USpec learned over 2000 specifications of various Java and Python APIs, in the process improving the results of the points-to analysis and its clients.
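
To make the notion of an aliasing specification concrete, the following is a minimal Python sketch of how a points-to analysis might consume one. It is an illustration only, not USpec's representation or algorithm: the names `StoreLoadSpec`, `Call`, `analyze`, and `may_alias` are hypothetical, and the sketch assumes that the first argument of every call is a constant key and that calls are processed in program order.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StoreLoadSpec:
    """Hypothetical spec: `load(key)` may return the object that was passed
    as argument `value_index` to an earlier `store(key, ...)` call on the
    same receiver object."""
    store: str
    load: str
    value_index: int


@dataclass(frozen=True)
class Call:
    """One API call `target = receiver.method(*args)` in a client program.
    By assumption, args[0] is a constant key; other args are variable names."""
    target: Optional[str]
    receiver: str
    method: str
    args: tuple


def analyze(calls, allocs, specs):
    """Toy points-to analysis. `allocs` maps variables to abstract objects.
    Returns a map from each variable to the set of objects it may point to."""
    points_to = {var: {obj} for var, obj in allocs.items()}
    by_store = {s.store: s for s in specs}
    by_load = {s.load: s for s in specs}
    heap = {}  # (receiver object, load method, key) -> stored objects

    for i, call in enumerate(calls):
        receivers = points_to.get(call.receiver, set())
        key = call.args[0]
        if call.method in by_store:
            spec = by_store[call.method]
            stored = points_to.get(call.args[spec.value_index], set())
            for recv in receivers:
                heap.setdefault((recv, spec.load, key), set()).update(stored)
        if call.target is not None:
            result = set()
            if call.method in by_load:
                for recv in receivers:
                    result |= heap.get((recv, call.method, key), set())
            if not result:
                # No spec applies: the API call returns an unknown object.
                result = {f"unknown@{i}"}
            points_to.setdefault(call.target, set()).update(result)
    return points_to


def may_alias(points_to, a, b):
    """Two variables may alias if their points-to sets intersect."""
    return bool(points_to[a] & points_to[b])


# Client program: m.put("k", v); w = m.get("k")
ALLOCS = {"m": "map1", "v": "val1"}
PROGRAM = [
    Call(None, "m", "put", ("k", "v")),
    Call("w", "m", "get", ("k",)),
]

with_spec = analyze(PROGRAM, ALLOCS, [StoreLoadSpec("put", "get", 1)])
without_spec = analyze(PROGRAM, ALLOCS, [])
print(may_alias(with_spec, "v", "w"))
print(may_alias(without_spec, "v", "w"))
```

Without the specification, the toy analysis treats the result of `get` as an unknown object and misses the alias between `v` and `w`; with it, the alias is found without ever inspecting the library's implementation of `put` or `get`.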

Link to Preprint

https://files.sri.inf.ethz.ch/website/papers/unsupervised-learning-of-api-aliasing-specifications-pldi2019.pdf

Jan Eberhardt

DeepCode, Switzerland

Samuel Steffen

ETH Zurich, Switzerland

Veselin Raychev

DeepCode AG

Martin Vechev

ETH Zürich
