A Middle Layer for Offloading JVM-Based SQL Engines' Execution to Native Engines

Apache Gluten (Incubating)

1. Introduction

Background

Apache Spark is a mature, stable project that has been under continuous development for many years. It is one of the most widely used frameworks for scaling out the processing of petabyte-scale datasets. Over time, the Spark community has had to address significant performance challenges, which required a variety of optimizations. A major milestone came with Spark 2.0, where Whole-Stage Code Generation replaced the Volcano Model, delivering up to a 2× speedup. Since then, most improvements have focused on the query-plan level, while the performance of individual operators has largely plateaued.

In recent year…
