There is currently a large number of data programming models and their respective frontends, such as relational tables, graphs, tensors, and streams. This has led to a plethora of runtimes that typically focus on the efficient execution of just a single frontend. This fragmentation manifests today as highly complex pipelines that bundle multiple runtimes to support the necessary models. Hence, joint optimisation and execution of such pipelines across these frontend-bound runtimes is infeasible. We propose Arc as the first unified Intermediate Representation (IR) for data analytics that incorporates stream semantics, based on a modern specification of streams, windows and stream aggregation, to combine batch and stream computation models. Arc extends Weld, an IR for batch computation, and adds stream interoperability as a natural extension to describe static computational graphs suitable for stream processing.
Sun 23 Jun
|11:40 - 12:00|
Lars Kroll (KTH Royal Institute of Technology, Sweden), Klas Segeljakt (KTH), Paris Carbone (KTH, Sweden), Christian Schulte (KTH Royal Institute of Technology, Sweden), Seif Haridi