Sat 22 Jun 2019 15:00 - 15:25 at 212A - Applications of Chapel Chair(s): Michael Ferguson

Exploratory data analysis (EDA) is the prerequisite for all data science. EDA is non-negotiably interactive (by far the most popular environment for EDA is a Jupyter notebook) and, as datasets grow, increasingly computationally intensive. Several existing projects attempt to combine interactivity and distributed computation using programming paradigms and tools from cloud computing, but none of them has come close to meeting our needs for high-performance EDA. To fill this gap, we have developed a prototype, called arkouda, that allows a user to interactively issue massively parallel computations on distributed data. We designed the API of arkouda to closely mimic NumPy, the underlying computational library used in approximately 80% of EDA workflows (based on a sample of Jupyter notebooks). Our vision is that users will import arkouda as a Python module in place of NumPy (e.g. “import arkouda as np”) and use familiar NumPy functions and syntax to interact with arrays of data residing on an HPC system. The computational heart of arkouda is a Chapel interpreter that accepts a pre-defined set of commands from the Python frontend and uses Chapel’s built-in machinery for multi-locale and multithreaded execution. While arkouda, in our experience, comes closer than anything else to enabling high-performance EDA, the process of developing arkouda has also helped identify ways in which Chapel must improve in order to become a truly productive language for data science.
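
The frontend/backend split described in the abstract can be illustrated with a toy sketch. This is not arkouda's actual API or wire protocol: the names ToyServer, RemoteArray, and arange, the command names, and the space-separated message format are all hypothetical, and a plain in-process Python object stands in for the Chapel backend. It only shows the general idea of a frontend handle that forwards NumPy-style operations as commands from a fixed set, while the data stays on the server.

```python
# Toy illustration only: NOT arkouda's real API or protocol.
# ToyServer, RemoteArray, arange and the message format are hypothetical.
import operator


class ToyServer:
    """Stands in for the backend: stores arrays and interprets a fixed command set."""

    def __init__(self):
        self._store = {}   # array name -> list of values (the "remote" data)
        self._next_id = 0
        # The pre-defined command table; anything else is rejected.
        self._commands = {
            "arange": self._arange,
            "binop": self._binop,
            "sum": self._sum,
        }

    def handle(self, message):
        """Parse a space-separated message and dispatch it to a known command."""
        cmd, *args = message.split()
        if cmd not in self._commands:
            raise ValueError(f"unknown command: {cmd}")
        return self._commands[cmd](*args)

    def _new_name(self, values):
        # Register a new server-side array and return its name to the frontend.
        name = f"id_{self._next_id}"
        self._next_id += 1
        self._store[name] = values
        return name

    def _arange(self, n):
        return self._new_name(list(range(int(n))))

    def _binop(self, op, left, right):
        fn = {"+": operator.add, "*": operator.mul}[op]
        a, b = self._store[left], self._store[right]
        return self._new_name([fn(x, y) for x, y in zip(a, b)])

    def _sum(self, name):
        return str(sum(self._store[name]))


class RemoteArray:
    """Frontend handle: holds only a name; the values never leave the server."""

    def __init__(self, server, name):
        self.server = server
        self.name = name

    def _binop(self, op, other):
        reply = self.server.handle(f"binop {op} {self.name} {other.name}")
        return RemoteArray(self.server, reply)

    def __add__(self, other):
        return self._binop("+", other)

    def __mul__(self, other):
        return self._binop("*", other)

    def sum(self):
        return int(self.server.handle(f"sum {self.name}"))


def arange(server, n):
    """NumPy-style constructor that creates the array on the server side."""
    return RemoteArray(server, server.handle(f"arange {n}"))


server = ToyServer()
a = arange(server, 5)          # data lives in the server's store
total = (a + a * a).sum()      # each operator becomes one command
```

In this sketch only the final reduction returns a value to the frontend; intermediate results remain server-side and are referred to by name, which is one plausible way to keep large arrays off the interactive client.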

15:00 - 15:25: CHIUW 2019 - Applications of Chapel at 212A
Chair(s): Michael Ferguson, Cray Inc.