Implementing Stencil Problems in Chapel: An Experience Report (CHIUW 2019)

Sat 22 - Wed 26 June 2019 Phoenix, Arizona, United States

Who

Per Fuchs, Pieter Hijma, Clemens Grelck

Track

CHIUW 2019

When

Sat 22 Jun 2019 11:45 - 12:10 at 212A - Chapel Performance and Optimizations. Chair(s): David G. Wonnacott

Abstract

Stencil operations represent a fundamental class of algorithms in high-performance computing. We are interested in what level of performance can be expected from a high-productivity language such as Chapel. To this end, we discuss four different implementations of a generic stencil operation with a convergence check after each iteration. We start with a sequential implementation, followed by a global-view implementation that we evaluate both on a 16-core system and, using domain maps, on a cluster with up to 16 such nodes. We finish with a local-view implementation that explicitly encodes all design decisions with respect to parallel execution. This paper is set up as a two-stage experience report: we mainly report our findings from the users’ perspective without any feedback from the Chapel implementers. We then report additional analysis performed under the guidance of the Chapel team. Our experimental findings show that Chapel performs as expected on a single node. However, it does not achieve the expected levels of performance on our multi-node system, either with the data-parallel global-view approach or with the task-parallel local-view code. We discuss the root causes of our reduced performance in detail and report possible solutions.

Per Fuchs

Vrije Universiteit (VU) Amsterdam

Pieter Hijma

Vrije Universiteit (VU) Amsterdam

Clemens Grelck

University of Amsterdam

Germany