Sat 22 Jun 2019 14:30 - 15:00 at 106C - Session 4 Chair(s): Tze Meng Low

In a rank-polymorphic programming language, all functions automatically lift to operate on arbitrarily high-dimensional aggregate data. By adding records to such a language, we can support computation on data frames, a tabular data structure containing heterogeneous data but in which individual columns are homogeneous. In such a setting, a data frame is a vector of records, subject to both ordinary array operations (e.g., filtering, reducing, sorting) and lifted record operations: projecting a field lifts to projecting a column. Data frames have become a popular tool for exploratory data analysis, but the fluidity of interacting with data frames via lifted record operations depends on how the language’s records are designed. We investigate three languages with different notions of record data: Racket, Standard ML, and Python. For each, we examine several common tasks for working with data frames and how the language’s records make these tasks easy or hard. Based on their advantages and disadvantages, we synthesize their ideas to produce a design for record types that is flexible for both scalar and lifted computation.
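
As a rough illustration of the "vector of records" view described in the abstract, the sketch below models a data frame as a Python list of dictionaries. It is not taken from the paper: the helper names `project` and `lift` and the sample rows are hypothetical, and the lifting is written out by hand, whereas a rank-polymorphic language would apply it automatically.

```python
from typing import Any, Callable

# Assumption: a record is modeled as a plain dict; the paper's own record
# design is not reproduced here.
Record = dict[str, Any]


def project(field: str) -> Callable[[Record], Any]:
    """Scalar record operation: read one field from a single record."""
    return lambda record: record[field]


def lift(f: Callable[[Any], Any]) -> Callable[[Any], Any]:
    """Hand-written lifting: apply f elementwise over arbitrarily nested lists."""
    def lifted(x: Any) -> Any:
        if isinstance(x, list):
            return [lifted(item) for item in x]
        return f(x)
    return lifted


# Made-up sample data: each row is heterogeneous, each column is homogeneous.
frame = [
    {"name": "Ada", "age": 36, "city": "London"},
    {"name": "Grace", "age": 45, "city": "New York"},
    {"name": "Alan", "age": 41, "city": "London"},
]

# Lifted record operation: projecting a field becomes projecting a column.
ages = lift(project("age"))(frame)

# Ordinary array operations on the same vector of records.
londoners = [r for r in frame if r["city"] == "London"]  # filtering
by_age = sorted(frame, key=project("age"))               # sorting
total_age = sum(ages)                                     # reducing
```

The same `project("age")` function serves as a scalar operation (as the sort key) and, once lifted, as a column projection. That dual use is the kind of interaction between records and lifting the abstract refers to.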

Sat 22 Jun

14:00 - 15:30: ARRAY 2019 - Session 4 at 106C
Chair(s): Tze Meng Low (CMU)
ARRAY-2019-papers, 14:00 - 14:30
Norman A. Rink (TU Dresden, Germany), Jeronimo Castrillon (TU Dresden, Germany)
ARRAY-2019-papers, 14:30 - 15:00
Justin Slepak (Northeastern University), Olin Shivers (Northeastern University, USA), Panagiotis Manolios (Northeastern University)
ARRAY-2019-papers, 15:00 - 15:30
Martin Elsman (University of Copenhagen, Denmark), Troels Henriksen (University of Copenhagen, Denmark), Niels G. W. Serup (DIKU, University of Copenhagen)