There are currently many data programming models and respective frontends, such as relational tables, graphs, tensors, and streams. This has led to a plethora of runtimes that typically focus on the efficient execution of just a single frontend. Today this fragmentation manifests in highly complex pipelines that bundle multiple runtimes to support the necessary models. As a result, joint optimisation and execution of such pipelines across these frontend-bound runtimes is infeasible. We propose Arc as the first unified Intermediate Representation (IR) for data analytics that incorporates stream semantics, based on a modern specification of streams, windows, and stream aggregation, to combine batch and stream computation models. Arc extends Weld, an IR for batch computation, and adds stream interoperability as a natural extension to describe static computational graphs suitable for stream processing.
Sun 23 Jun
Displayed time zone: Tijuana, Baja California
11:20 - 12:20
11:20 | 20m | Talk | Streaming saturation for large RDF graphs with dynamic schema information | DBPL | Mohammad Amin Farvardin (PSL, Université Paris-Dauphine, LAMSADE), Dario Colazzo, Khalid Belhajjame (PSL, Université Paris-Dauphine, LAMSADE), Carlo Sartiani
11:40 | 20m | Talk | Arc: An IR for Batch and Stream Programming | DBPL | Lars Kroll (KTH Royal Institute of Technology, Sweden), Klas Segeljakt (KTH), Paris Carbone (KTH, Sweden), Christian Schulte (KTH Royal Institute of Technology, Sweden), Seif Haridi
12:00 | 20m | Talk | Towards Compiling Graph Queries in Relational Engines | DBPL | Ruby Tahboub (Purdue University), Xilun Wu (Purdue University), Gregory Essertel, Tiark Rompf (Purdue University)