Tue 25 Jun 2019 16:00 - 16:20 at 224AB - Performance Chair(s): Ting Cao

Minimizing cache misses has been the traditional goal in optimizing cache performance using compiler-based techniques. However, continuously increasing dataset sizes, combined with large numbers of cache banks and memory banks connected using on-chip networks in emerging manycores/accelerators, make cache hit–miss latency optimization as important as cache miss rate minimization. In this paper, we propose compiler support that optimizes both the latencies of last-level cache (LLC) hits and the latencies of LLC misses. Our approach tries to achieve this goal by improving the parallelism exhibited by LLC hits and LLC misses. More specifically, it tries to maximize both cache-level parallelism (CLP) and memory-level parallelism (MLP). This paper presents different incarnations of our approach and evaluates them using a set of 12 multithreaded applications. Our results indicate that (i) optimizing MLP first and CLP later brings, on average, an 11.31% performance improvement over an approach that already minimizes the number of LLC misses, and (ii) optimizing CLP first and MLP later brings a 9.43% performance improvement. In comparison, balancing MLP and CLP brings a 17.32% performance improvement on average.
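As a rough illustration of what "parallelism exhibited by LLC hits and LLC misses" can mean, the sketch below counts how many distinct banks a batch of concurrent accesses touches. It is not taken from the paper: the address-to-bank mapping (`bank_of`), the 64-byte line size, the 8-bank configuration, and the use of a distinct-bank count as the parallelism measure are all assumptions made for this example.

```python
def bank_of(address, line_size=64, num_banks=8):
    """Hypothetical mapping: consecutive cache lines interleave across banks."""
    return (address // line_size) % num_banks


def bank_parallelism(addresses, line_size=64, num_banks=8):
    """Count the distinct banks touched by a batch of concurrent accesses.

    Assumption for this sketch: more distinct banks means more requests
    can be serviced at the same time.
    """
    return len({bank_of(a, line_size, num_banks) for a in addresses})


# Four accesses whose stride equals num_banks lines: all map to the same bank.
clustered = [i * 64 * 8 for i in range(4)]

# Four accesses to consecutive lines: each maps to a different bank.
spread = [i * 64 for i in range(4)]

print(bank_parallelism(clustered), bank_parallelism(spread))
```

Under these assumptions, the same number of accesses can be serialized on one bank or spread over several. That difference is the kind of property a compiler could try to influence by changing which accesses are issued together.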

Tue 25 Jun
Times are displayed in time zone: (GMT-07:00) Tijuana, Baja California

16:00 - 17:00: PLDI Research Papers - Performance at 224AB
Chair(s): Ting Cao (Microsoft Research)
16:00 - 16:20
Xulong Tang (Penn State); Mahmut Taylan Kandemir (Pennsylvania State University, USA); Mustafa Karakoy (TOBB University of Economics and Technology, Turkey); Meenakshi Arunachalam (Intel, USA)
16:20 - 16:40
Laxman Dhulipala (Carnegie Mellon University); Guy E. Blelloch (Carnegie Mellon University); Julian Shun (MIT)
16:40 - 17:00
Kirshanthan Sundararajah (Purdue University); Milind Kulkarni (Purdue University)