FlowPipe: LLM-Enhanced Conditional Generative Flow Networks for Data Preparation Pipeline Construction
Data preparation pipelines improve data quality in machine learning by transforming raw tables into learning-ready data through sequential cleaning and feature transformation operators. However, automatically constructing such pipelines is computationally difficult because operator sequences are combinatorial and end-to-end evaluation is expensive. Existing state-of-the-art (SOTA) Multi-DQN methods still face three key limitations: decoupled v...