There is currently a large number of data programming models and their respective frontends, such as relational tables, graphs, tensors, and streams. This has led to a plethora of runtimes that typically focus on the efficient execution of just a single frontend. This fragmentation manifests today in highly complex pipelines that bundle multiple runtimes to support the necessary models. Hence, joint optimisation and execution of such pipelines across these frontend-bound runtimes is infeasible. We propose Arc as the first unified Intermediate Representation (IR) for data analytics that incorporates stream semantics based on a modern specification of streams, windows, and stream aggregation, to combine batch and stream computation models. Arc extends Weld, an IR for batch computation, and adds stream interoperability as a natural extension to describe static computational graphs suitable for stream processing.
Conference Day: Sun 23 Jun (displayed time zone: Tijuana, Baja California)
11:20 - 12:20
|Streaming saturation for large RDF graphs with dynamic schema information|
|Arc: An IR for Batch and Stream Programming|
Lars Kroll (KTH Royal Institute of Technology, Sweden), Klas Segeljakt (KTH), Paris Carbone (KTH, Sweden), Christian Schulte (KTH Royal Institute of Technology, Sweden), Seif Haridi. Pre-print, media attached.
|Towards Compiling Graph Queries in Relational Engines|