Wootz: A Compiler-Based Framework for Fast CNN Pruning via Composability (PLDI 2019 - PLDI Research Papers)

Who

Hui Guan, Xipeng Shen, Seung-Hwan Lim

Track

PLDI 2019 PLDI Research Papers

Time Zone

The program is currently displayed in (GMT-07:00) Tijuana, Baja California.

Use conference time zone: (GMT-07:00) Tijuana, Baja CaliforniaSelect other time zone

The GMT offsets shown reflect the offsets at the moment of the conference.

Time Band

By setting a time band, the program will dim events that are outside this time window. This is useful for (virtual) conferences with a continuous program (with repeated sessions).
The time band will also limit the events that are included in the personal iCalendar subscription service.

Display full programSpecify a time band

Save

When

Tue 25 Jun 2019 14:40 - 15:00 at 224AB - Reasoning and Optimizing ML Models Chair(s): Martin Maas

Abstract

Convolutional Neural Networks (CNN) are widely used for Deep Learning tasks. CNN pruning is an important method to adapt a large CNN model trained on general datasets to fit a more specialized task or a smaller device. The key challenge is on deciding which filters to remove in order to maximize the quality of the pruned networks while satisfying the constraints. It is time-consuming due to the enormous configuration space and the slowness of CNN training.

The problem has drawn many efforts from the machine learning field, which try to reduce the set of network configurations to explore. This work tackles the problem distinctively from a programming systems perspective, trying to speed up the evaluations of the remaining configurations through computation reuse via a compiler-based framework. We empirically uncover the existence of composability in the training of a collection of pruned CNN models, and point out the opportunities for computation reuse. We then propose composability-based CNN pruning, and design a compression-based algorithm to efficiently identify the set of CNN layers to pre-train for maximizing their reuse benefits in CNN pruning. We further develop a compiler-based framework named Wootz, which, for an arbitrary CNN, automatically generates code that builds a Teacher-Student scheme to materialize composability-based pruning. Experiments show that network pruning enabled by Wootz shortens the state-of-art pruning process by up to 186X while producing significantly better pruning results.

File attachments

preprint (pldi19main-p793-p-f8be57a-40852M-final.pdf)	2.65MiB

Hui Guan

North Carolina State University

Xipeng Shen

North Carolina State University

United States

Seung-Hwan Lim

Oak Ridge National Laboratory, USA

Video Abstract