Mon 24 Jun 2019 09:05 - 09:25 at 229AB - Concurrency I Chair(s): Alastair F. Donaldson

A memory consistency model (or simply a memory model) specifies the granularity and the order in which memory accesses by one thread become visible to other threads in the program.
We previously proposed the volatile-by-default (VBD) memory model as a natural form of sequential consistency (SC) for Java. VBD is significantly stronger than the Java memory model (JMM) and incurs relatively modest overheads in a modified HotSpot JVM running on Intel x86 hardware. However, the x86 memory model is already quite close to SC. It is expected that the cost of VBD will be much higher on the other widely used hardware platform today, namely ARM, whose memory model is very weak.
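To make the difference between the JMM and SC concrete, here is a minimal sketch of the classic store-buffering litmus test. The class and method names are hypothetical and do not come from the paper. The fields `x` and `y` are declared `volatile` explicitly, which approximates what VBD would give every field by default. With that declaration, the outcome in which both threads read 0 is forbidden; with plain fields the JMM would permit it.

```java
// Illustrative sketch only: names are hypothetical, not taken from the paper.
final class StoreBufferingLitmus {
    // Under VBD every field would behave as if declared volatile;
    // here the modifier is written out by hand.
    static volatile int x, y;

    // Each result is written by one worker thread and read only after join(),
    // so plain fields are sufficient for them.
    static int r1, r2;

    /** Runs the litmus test repeatedly and counts runs where both reads saw 0. */
    static int countBothZero(int iterations) {
        int bothZero = 0;
        for (int i = 0; i < iterations; i++) {
            x = 0;
            y = 0;
            Thread a = new Thread(() -> { x = 1; r1 = y; });
            Thread b = new Thread(() -> { y = 1; r2 = x; });
            a.start();
            b.start();
            try {
                a.join();
                b.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
            if (r1 == 0 && r2 == 0) {
                bothZero++; // impossible under SC for x and y
            }
        }
        return bothZero;
    }

    public static void main(String[] args) {
        System.out.println("both-zero outcomes: " + countBothZero(1000));
    }
}
```

Removing `volatile` from `x` and `y` makes the both-zero outcome legal under the JMM, although whether it is actually observed depends on the hardware and on the JIT compiler.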

In this paper, we quantify this expectation by building and evaluating a baseline volatile-by-default JVM for ARM called VBDA-HotSpot, using the same technique previously used for x86. Through this baseline we report, to the best of our knowledge, the first comprehensive study of the cost of providing language-level SC for a production compiler on ARM. VBDA-HotSpot indeed incurs a considerable performance penalty on ARM, with average overheads on the DaCapo benchmarks on two ARM servers of 57% and 73%, respectively.

Motivated by these experimental results, we then present a novel speculative technique to optimize language-level SC. While several prior works have shown how to optimize SC in the context of an offline, whole-program compiler, to our knowledge this is the first optimization approach that is compatible with modern implementation technology, including dynamic class loading and just-in-time (JIT) compilation.
The basic idea is to modify the JIT compiler to treat each object as thread-local initially, so accesses to its fields can be compiled without fences. If an object is ever accessed by a second thread, any speculatively compiled code for the object is removed, and future JITed code for the object will include the necessary fences in order to ensure SC. We demonstrate that this technique is effective, reducing the overhead of enforcing VBD by one-third on average, and additional experiments validate the thread-locality hypothesis that underlies the approach.
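The bookkeeping behind this idea can be sketched as a small state machine. The following is a hypothetical, single-threaded model written for illustration; it is not the authors' HotSpot implementation, and the names `LocalityTracker`, `access`, and the counters are assumptions. Thread identities are passed in as plain numbers so that the model stays deterministic. It does not model the synchronization a real JVM would need when it discards compiled code.

```java
// Hypothetical, single-threaded model of the per-object bookkeeping; not HotSpot code.
final class LocalityTracker {
    enum State { THREAD_LOCAL, SHARED }

    private static final long NO_OWNER = -1L;

    private State state = State.THREAD_LOCAL;
    private long ownerId = NO_OWNER; // id of the first thread to touch the object
    private int unfencedAccesses;    // accesses served by speculative, fence-free code
    private int fencedAccesses;      // accesses served by code that includes fences
    private int invalidations;       // times the speculative code had to be discarded

    /** Records one field access by the given thread; returns true if it must be fenced. */
    boolean access(long threadId) {
        if (ownerId == NO_OWNER) {
            ownerId = threadId;      // the first accessor becomes the presumed owner
        }
        if (state == State.THREAD_LOCAL && threadId != ownerId) {
            state = State.SHARED;    // speculation failed: a second thread showed up
            invalidations++;         // stands in for removing the fence-free code
        }
        if (state == State.SHARED) {
            fencedAccesses++;
            return true;
        }
        unfencedAccesses++;
        return false;
    }

    State state() { return state; }
    int unfencedAccesses() { return unfencedAccesses; }
    int fencedAccesses() { return fencedAccesses; }
    int invalidations() { return invalidations; }

    public static void main(String[] args) {
        LocalityTracker t = new LocalityTracker();
        t.access(1); // owner, fence-free
        t.access(1); // owner again, still fence-free
        t.access(2); // second thread: object becomes shared
        t.access(1); // owner now takes the fenced path as well
        System.out.println(t.state()
                + " unfenced=" + t.unfencedAccesses()
                + " fenced=" + t.fencedAccesses()
                + " invalidations=" + t.invalidations());
    }
}
```

The transition is one-way in this model: once an object is marked shared it never returns to the thread-local state, which matches the description above that future JITed code for the object includes the fences.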

Mon 24 Jun

Displayed time zone: Tijuana, Baja California

08:45 - 09:45
Concurrency I - PLDI Research Papers at 229AB
Chair(s): Alastair F. Donaldson Google and Imperial College London
08:45
20m
Talk
Promising-ARM/RISC-V: A Simpler and Faster Operational Concurrency Model
PLDI Research Papers
Christopher Pulte University of Cambridge, Jean Pichon-Pharabod University of Cambridge, Jeehoon Kang KAIST, Sung-Hwan Lee Seoul National University, South Korea, Chung-Kil Hur Seoul National University
09:05
20m
Talk
Accelerating Sequential Consistency for Java with Speculative Compilation
PLDI Research Papers
Lun Liu University of California at Los Angeles, USA, Todd Millstein University of California, Los Angeles, Madan Musuvathi Microsoft Research
09:25
20m
Talk
Renaissance: Benchmarking Suite for Parallel Applications on the JVM
PLDI Research Papers
Aleksandar Prokopec Oracle Labs, Andrea Rosà University of Lugano, Switzerland, David Leopoldseder Johannes Kepler University Linz, Gilles Duboscq Oracle Labs, Petr Tuma Charles University, Martin Studener JKU Linz, Austria, Lubomír Bulej Charles University, Yudi Zheng Oracle Labs, Alex Villazón Universidad Privada Boliviana, Bolivia, Doug Simon Oracle Labs, Thomas Wuerthinger Oracle Labs, Walter Binder University of Lugano, Switzerland