Introduction
A few months ago, I introduced APDTFlow, a modular forecasting framework that combines Neural ODEs, multi-scale decomposition, and transformer architectures to tackle the challenges of time series forecasting.
Today, I’m excited to announce APDTFlow v0.3.0, a major update that bridges the gap between research prototype and production-ready forecasting system. This release focuses on what real-world forecasting actually requires: handling categorical features, quantifying uncertainty with mathematical guarantees, rigorous model validation, and comprehensive diagnostics.
What’s New in v0.3.0?
Version 0.3.0 introduces five major feature categories:
1. Categorical Feature Support: Handle day-of-week effects, holidays, store IDs, and other categorical patterns
2. Conformal Prediction: Distribution-free uncertainty quantification with mathematical coverage guarantees
3. Comprehensive Model Validation: Darts-style backtesting and statistical residual diagnostics
4. Built-in Visualization: Plot forecasts with uncertainty bands and diagnostic plots
5. Industry-Standard Metrics: MASE, sMAPE, and CRPS for proper benchmarking
Each of these features addresses a real gap between academic forecasting models and what’s actually needed in production. Let me walk you through them.
Categorical Features: The Missing Piece
One of the most common patterns in real-world time series data is the effect of categorical variables. Sales are different on Mondays versus Fridays. Energy consumption changes during holidays. Customer behavior varies by store location.
Traditional time series models either ignore these patterns entirely or require manual feature engineering to create dummy variables. APDTFlow v0.3.0 introduces native categorical feature support with two encoding strategies:
One-Hot Encoding
The simplest approach: each category gets its own binary feature. This is interpretable and works well for low-cardinality features like day-of-week (7 categories) or month (12 categories).
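As a quick illustration using pandas directly (APDTFlow handles this internally when categorical_encoding='onehot'), one-hot encoding a day-of-week column expands it into one binary column per observed category:

import pandas as pd

# Hypothetical example: expand a day-of-week column into binary indicator columns
df = pd.DataFrame({"day_of_week": ["Mon", "Tue", "Sat", "Sun", "Mon"]})
encoded = pd.get_dummies(df, columns=["day_of_week"], prefix="dow")
print(encoded.columns.tolist())
# ['dow_Mon', 'dow_Sat', 'dow_Sun', 'dow_Tue']  -- one column per observed category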
Learnable Embeddings
For high-cardinality features like store IDs (potentially hundreds or thousands of unique values), embeddings are more efficient. The model learns dense vector representations that capture similarities between categories — similar stores get similar embeddings.
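To see why embeddings scale better for high-cardinality features, here is a minimal PyTorch sketch (illustrative only, not APDTFlow's internals): a thousand store IDs map into an 8-dimensional space instead of a 1000-column one-hot matrix.

import torch
import torch.nn as nn

num_stores, embed_dim = 1000, 8          # 1000 store IDs -> 8-dimensional vectors
store_embedding = nn.Embedding(num_stores, embed_dim)

store_ids = torch.tensor([3, 17, 998])   # a batch of store IDs
vectors = store_embedding(store_ids)     # shape (3, 8), learned during training
print(vectors.shape)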
How to Use Categorical Features
Here’s a practical example using retail sales data:
from apdtflow.forecaster import APDTFlowForecaster
from apdtflow.preprocessing.categorical_encoder import create_time_features
import pandas as pd

# Load your data
df = pd.read_csv('retail_sales.csv', parse_dates=['date'])

# Automatically create time-based categorical features
df = create_time_features(df, date_col='date')
# This adds: day_of_week, month, quarter, is_weekend

# Initialize model with categorical support
model = APDTFlowForecaster(
    forecast_horizon=7,
    categorical_encoding='onehot',  # or 'embedding'
    hidden_dim=64,
    num_scales=3
)

# Fit with categorical features
model.fit(
    df,
    target_col='sales',
    date_col='date',
    categorical_cols=['day_of_week', 'is_holiday', 'store_id']
)

# Predictions automatically incorporate categorical patterns
predictions = model.predict(future_dates=7)
The model learns that Saturdays have different patterns than Tuesdays, and adjusts predictions accordingly.
Conformal Prediction: Uncertainty with Guarantees
Forecasting the future always involves uncertainty. But how do we quantify that uncertainty in a way we can trust?
Traditional approaches build prediction intervals on distributional assumptions: assume the residuals are Gaussian, use standard deviations, and hope for the best. The problem is that real-world forecast errors rarely follow nice, well-behaved distributions.
Distribution-Free Uncertainty Quantification
APDTFlow v0.3.0 introduces conformal prediction, a framework that provides uncertainty quantification without making distributional assumptions. The key insight: instead of assuming a distribution, we use past forecast errors directly to calibrate prediction intervals.
Here’s what makes conformal prediction special:
- Coverage Guarantees: If you request a 95% prediction interval, you get a mathematical guarantee that 95% of future values will fall within that interval
- Distribution-Free: Works for any data distribution, no Gaussian assumptions required
- Finite-Sample Validity: The guarantees hold even with limited calibration data
Two Flavors: Split and Adaptive
APDTFlow implements two conformal prediction methods:
Split Conformal Prediction
The classic approach: split your data into training and calibration sets. Compute forecast errors on the calibration set, then use the quantile of these errors to construct prediction intervals.
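Under the hood, the calibration step amounts to taking a quantile of held-out errors. Here is a schematic numpy sketch (not APDTFlow's implementation) of how a split conformal interval is built:

import numpy as np

def split_conformal_interval(cal_actuals, cal_preds, new_pred, alpha=0.05):
    # Absolute errors on the calibration set are the conformity scores
    scores = np.abs(cal_actuals - cal_preds)
    n = len(scores)
    # Finite-sample corrected quantile level, capped at 1.0
    q_level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    q = np.quantile(scores, q_level)
    # Symmetric (1 - alpha) interval around the new point forecast
    return new_pred - q, new_pred + q

lower, upper = split_conformal_interval(
    cal_actuals=np.array([102.0, 98.5, 110.2, 95.0]),
    cal_preds=np.array([100.0, 99.0, 108.0, 97.5]),
    new_pred=105.0,
)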
Adaptive Conformal Prediction
More sophisticated: the prediction intervals adapt online based on recent forecast performance. If the model starts making larger errors, the intervals widen automatically. This is particularly useful for non-stationary time series where uncertainty changes over time.
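One common way to implement this, in the spirit of adaptive conformal inference (Gibbs & Candès, 2021; reference 2 below), is to nudge the working miscoverage level after each observation. The following is a schematic sketch, not APDTFlow's exact update rule:

def update_alpha(alpha_t, target_alpha, covered, gamma=0.01):
    # One step of an adaptive conformal update:
    #   alpha_{t+1} = alpha_t + gamma * (target_alpha - err_t)
    # where err_t = 1 if the last interval missed the actual value, else 0.
    # A miss shrinks alpha, widening the next interval; a hit does the opposite.
    err = 0.0 if covered else 1.0
    return alpha_t + gamma * (target_alpha - err)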
Using Conformal Prediction
Here’s how to enable conformal prediction in your forecasts:
# Initialize model with conformal prediction
model = APDTFlowForecaster(
    forecast_horizon=14,
    use_conformal=True,
    conformal_method='adaptive',  # or 'split'
    hidden_dim=64
)

# Fit the model (includes calibration step)
model.fit(df, target_col='sales', date_col='date')

# Get predictions with 95% prediction intervals
predictions, lower_bound, upper_bound = model.predict(
    alpha=0.05,  # 1 - alpha = 95% coverage
    return_intervals='conformal'
)

# Guaranteed: 95% of actual future values will fall in [lower_bound, upper_bound]
For production systems, especially in risk-sensitive applications like energy grid management or supply chain planning, having reliable uncertainty estimates is crucial. Conformal prediction gives you intervals you can actually trust — not just statistical hand-waving, but mathematical guarantees.
The research behind this is quite recent, with key papers from ICLR 2025 and arXiv preprints showing impressive results on real-world forecasting benchmarks.
Model Validation: Backtesting and Residual Diagnostics
One of the biggest gaps between academic papers and production ML is rigorous validation. It's not enough to show good metrics on a single train-test split; you need to prove your model works consistently across different time periods.
Historical Forecasts (Backtesting)
APDTFlow v0.3.0 introduces historical_forecasts(), inspired by the Darts library’s approach to backtesting. This method simulates production forecasting on historical data using a rolling window:
1. Start at some point in your historical data
2. Make a forecast for the next N steps
3. Move forward in time
4. Repeat, collecting all forecasts and actual values
This gives you a realistic picture of how your model would have performed in production.
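Conceptually, the procedure is a short loop over forecast origins. The sketch below is schematic (the hypothetical fit_and_forecast stands in for whatever training and prediction routine you use); the real historical_forecasts() also handles metrics, stride bookkeeping, and optional retraining:

def rolling_backtest(series, start_idx, horizon, stride, fit_and_forecast):
    # Schematic rolling-origin backtest over a 1-D series
    results = []
    t = start_idx
    while t + horizon <= len(series):
        history = series[:t]                          # everything up to the forecast origin
        forecast = fit_and_forecast(history, horizon) # placeholder for your model
        actual = series[t:t + horizon]
        results.append((t, forecast, actual))
        t += stride                                   # move the origin forward and repeat
    return results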
# Run backtesting with weekly forecasts
backtest_results = model.historical_forecasts(
    data=df,
    target_col='sales',
    start=0.7,            # Start at 70% through the data
    forecast_horizon=7,   # 7-day forecasts
    stride=7,             # Make a new forecast every 7 days
    retrain=False,        # Use fixed model (fast)
    metrics=['MAE', 'MASE', 'sMAPE']
)

# Analyze results
print(f"Average MAE: {backtest_results['abs_error'].mean():.2f}")
print(f"MASE: {backtest_results['mase'].mean():.3f}")

# backtest_results is a DataFrame with:
# - timestamp, fold, forecast_step, actual, predicted, error
# Perfect for detailed analysis and visualization
You can also enable retrain=True to retrain the model at each fold, simulating a more realistic production scenario where you periodically retrain on new data.
Residual Diagnostics
Good forecasters don’t just make predictions — they understand when and why their models fail. APDTFlow v0.3.0 includes comprehensive residual analysis tools:
# Compute residuals on test data
residuals, actuals, predictions = model.compute_residuals(test_df)

# Visualize with 4-panel diagnostic plot
fig, axes = model.plot_residuals()
# Shows: residuals over time, distribution, ACF plot, Q-Q plot

# Statistical analysis
diagnostics = model.analyze_residuals()
print(f"Mean residual: {diagnostics['mean']:.4f}")
print(f"Shapiro-Wilk p-value: {diagnostics['shapiro_pvalue']:.4f}")
print(f"Ljung-Box p-value: {diagnostics['ljung_box_pvalue']:.4f}")
The diagnostic suite includes:
- Shapiro-Wilk Test: Are residuals normally distributed?
- Ljung-Box Test: Are residuals autocorrelated (indicating missed patterns)?
- Visual Diagnostics: Time series plot, histogram with fitted normal, ACF plot, Q-Q plot
These tools help you diagnose model deficiencies:
- Systematic patterns in residuals → model is missing something
- Heteroscedasticity → prediction uncertainty varies over time
- Autocorrelation → temporal dependencies not fully captured
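If you want to reproduce the underlying tests yourself, scipy and statsmodels expose them directly. A minimal sketch, assuming residuals is a 1-D numpy array:

import numpy as np
from scipy.stats import shapiro
from statsmodels.stats.diagnostic import acorr_ljungbox

residuals = np.random.normal(size=200)   # stand-in for your model's residuals

# Shapiro-Wilk: a small p-value suggests residuals deviate from normality
_, shapiro_p = shapiro(residuals)

# Ljung-Box: a small p-value suggests residuals are autocorrelated (missed structure)
lb = acorr_ljungbox(residuals, lags=[10], return_df=True)
print(f"Shapiro-Wilk p = {shapiro_p:.3f}, Ljung-Box p = {lb['lb_pvalue'].iloc[0]:.3f}")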
This level of diagnostic rigor is standard in production ML systems at companies like Uber and Amazon, and now it’s built into APDTFlow.
Visualization: Making Forecasts Interpretable
One of the most practical features in APDTFlow is built-in visualization. Understanding your forecasts visually is crucial for both model validation and communicating results to stakeholders.
Simple Forecast Plotting
# Make predictions
predictions = model.predict(future_dates=14)

# Visualize with history
model.plot_forecast(with_history=100)
# Output: Matplotlib plot showing last 100 historical points + 14 future predictions
Forecasts with Uncertainty Bands
When using conformal prediction or any uncertainty estimation method, you can visualize confidence intervals:
# Enable conformal prediction
model = APDTFlowForecaster(
    forecast_horizon=14,
    use_conformal=True,
    conformal_method='adaptive'
)
model.fit(df, target_col='sales', date_col='date')

# Get predictions with intervals
lower, pred, upper = model.predict(
    alpha=0.05,  # 95% confidence
    return_intervals='conformal'
)

# Visualize with uncertainty bands
model.plot_forecast(
    with_history=100,
    show_uncertainty=True
)
# Output: Plot with shaded confidence regions around predictions
Backtesting Visualization
You can also visualize backtesting results to see how your model performed across multiple time periods:
import matplotlib.pyplot as plt

# Run backtesting
backtest_results = model.historical_forecasts(
    data=df,
    target_col='sales',
    start=0.7,
    forecast_horizon=7,
    stride=7
)

# Create custom visualization
plt.figure(figsize=(12, 6))
plt.plot(backtest_results['timestamp'], backtest_results['actual'],
         'o-', label='Actual', alpha=0.7)
plt.plot(backtest_results['timestamp'], backtest_results['predicted'],
         's-', label='Predicted', alpha=0.7)
plt.fill_between(backtest_results['timestamp'], backtest_results['actual'],
                 backtest_results['predicted'], alpha=0.2)
plt.xlabel('Date')
plt.ylabel('Sales')
plt.title('Backtesting Results: Actual vs Predicted')
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()
Industry-Standard Metrics
When benchmarking forecasting models, you need metrics that are comparable across different datasets and scales. APDTFlow v0.3.0 adds four metrics that are widely used in forecasting competitions and production systems:
MASE (Mean Absolute Scaled Error)
Scale-independent metric that normalizes errors relative to a naive seasonal baseline. MASE < 1.0 means you’re beating the naive forecast.
mase = model.score(test_df, metric='mase')
print(f"MASE: {mase:.3f}")  # < 1.0 is good
Hyndman & Koehler (2006) “Another look at measures of forecast accuracy”
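For reference, here is a numpy sketch of the standard MASE definition, scaling by the in-sample error of a lag-m seasonal naive forecast; APDTFlow's score() may differ in implementation details:

import numpy as np

def mase(actual, predicted, train, m=1):
    # Scale the forecast MAE by the MAE of a lag-m naive forecast on the training data
    # (e.g. m=7 for daily data with weekly seasonality)
    naive_mae = np.mean(np.abs(train[m:] - train[:-m]))
    return np.mean(np.abs(actual - predicted)) / naive_mae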
sMAPE (Symmetric Mean Absolute Percentage Error)
Addresses the asymmetry problem in standard MAPE — treats over-forecasts and under-forecasts symmetrically. Bounded between 0% and 200%.
smape = model.score(test_df, metric='smape')
print(f"sMAPE: {smape:.2f}%")
Makridakis (1993) “Accuracy measures: theoretical and practical concerns”
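The standard definition is equally compact; a numpy sketch:

import numpy as np

def smape(actual, predicted):
    # Symmetric MAPE in percent, bounded between 0% and 200%
    denom = (np.abs(actual) + np.abs(predicted)) / 2
    return 100 * np.mean(np.abs(actual - predicted) / denom)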
CRPS (Continuous Ranked Probability Score)
Evaluates probabilistic forecasts by measuring both sharpness (narrow intervals) and calibration (correct coverage). Essential for evaluating conformal prediction.
crps = model.score(test_df, metric='crps')
print(f"CRPS: {crps:.3f}")  # Lower is better
Gneiting & Raftery (2007) “Strictly proper scoring rules, prediction, and estimation”
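When you have forecast samples (for example, draws from a probabilistic model), CRPS can be estimated directly from them. A numpy sketch of the standard sample-based estimator, not necessarily how score() computes it:

import numpy as np

def crps_from_samples(samples, actual):
    # Sample-based CRPS estimate: E|X - y| - 0.5 * E|X - X'| over forecast samples X
    samples = np.asarray(samples)
    term1 = np.mean(np.abs(samples - actual))
    term2 = 0.5 * np.mean(np.abs(samples[:, None] - samples[None, :]))
    return term1 - term2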
Coverage
Measures prediction interval calibration — do your 95% intervals actually contain 95% of observations?
coverage = model.score(test_df, metric='coverage', alpha=0.05)
print(f"Coverage: {coverage:.1%}")  # Should be close to 95%
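Empirical coverage is simply the fraction of actuals that land inside their intervals, which is easy to sanity-check by hand:

import numpy as np

def empirical_coverage(actual, lower, upper):
    # Fraction of actual values falling inside their prediction intervals
    return np.mean((actual >= lower) & (actual <= upper))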
These metrics are the same ones used in the M4 and M5 forecasting competitions, making your results directly comparable to state-of-the-art methods.
Putting It All Together: A Complete Workflow
Here’s what a complete production forecasting workflow looks like with APDTFlow:
from apdtflow.forecaster import APDTFlowForecaster
from apdtflow.preprocessing.categorical_encoder import create_time_features
import pandas as pd

# 1. Load and prepare data
df = pd.read_csv('sales_data.csv', parse_dates=['date'])
df = create_time_features(df, date_col='date')

# 2. Initialize model with all production features
model = APDTFlowForecaster(
    forecast_horizon=14,
    num_scales=3,
    hidden_dim=64,
    use_conformal=True,
    conformal_method='adaptive',
    categorical_encoding='onehot'
)

# 3. Train with categorical features and exogenous variables
model.fit(
    df,
    target_col='sales',
    date_col='date',
    categorical_cols=['day_of_week', 'is_holiday'],
    exog_cols=['temperature', 'promotion'],
    exog_fusion_type='gated'
)

# 4. Rigorous validation via backtesting
backtest_results = model.historical_forecasts(
    data=df,
    target_col='sales',
    start=0.7,
    forecast_horizon=14,
    stride=7,
    metrics=['MAE', 'MASE', 'sMAPE', 'CRPS']
)
print(f"Backtest MASE: {backtest_results['mase'].mean():.3f}")

# 5. Residual diagnostics
diagnostics = model.analyze_residuals()
if diagnostics['ljung_box_pvalue'] < 0.05:
    print("Warning: Residuals show autocorrelation; consider more lags")

# 6. Production forecasts with uncertainty
predictions, lower, upper = model.predict(
    future_dates=14,
    alpha=0.05,  # 95% intervals
    return_intervals='conformal'
)

# 7. Visualize predictions with uncertainty bands
model.plot_forecast(
    with_history=100,       # Show last 100 historical points
    show_uncertainty=True   # Display confidence intervals
)
# Output: Matplotlib plot with history + predictions + uncertainty bands

# 8. Model persistence
model.save('production_model.pkl')

# Later: reload and use
loaded_model = APDTFlowForecaster.load('production_model.pkl')
new_predictions = loaded_model.predict(future_dates=14)
This workflow covers everything you need for a production forecasting system: feature engineering, rigorous validation, uncertainty quantification, diagnostics, and persistence.
Under the Hood: What Makes APDTFlow Different
While v0.3.0 adds many practical features, the core architecture that makes APDTFlow unique remains unchanged:
Neural ODEs for Continuous-Time Modeling
Unlike standard RNNs or transformers that operate on discrete time steps, APDTFlow models continuous dynamics using Neural ODEs. This provides several advantages:
- Irregular Time Steps: Handle missing data and irregular sampling naturally
- Smooth Evolution: Forecasts evolve smoothly rather than jumping discontinuously
- Adaptive Computation: The ODE solver adjusts step size based on dynamics complexity
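To make the continuous-time idea concrete, here is a minimal latent-ODE sketch built on the torchdiffeq library. It is illustrative only; APDTFlow's dynamics network, encoder, and decoder are more involved:

import torch
import torch.nn as nn
from torchdiffeq import odeint  # assumes torchdiffeq is installed

class ODEFunc(nn.Module):
    def __init__(self, hidden_dim=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim), nn.Tanh(),
            nn.Linear(hidden_dim, hidden_dim),
        )

    def forward(self, t, h):
        # dh/dt = f(h, t): the learned continuous dynamics of the latent state
        return self.net(h)

hidden = torch.zeros(1, 32)                    # initial latent state
times = torch.tensor([0.0, 0.4, 1.0, 2.3])     # irregularly spaced observation times
trajectory = odeint(ODEFunc(), hidden, times)  # latent state evaluated at each time
print(trajectory.shape)  # (4, 1, 32)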
Multi-Scale Decomposition
Before feeding data to the Neural ODE, APDTFlow decomposes the signal into multiple scales using a wavelet-like approach. This allows the model to:
- Capture both long-term trends and short-term fluctuations simultaneously
- Process different time scales with different network capacities
- Improve interpretability by separating fast and slow dynamics
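As a rough illustration of the idea (not APDTFlow's actual decomposition), a signal can be peeled into progressively smoother components with average pooling, where each residual captures the faster fluctuations at that scale:

import torch
import torch.nn.functional as F

def multiscale_decompose(x, num_scales=3, kernel=5):
    # Split a (batch, 1, length) signal into detail components plus a final trend
    components = []
    current = x
    for _ in range(num_scales - 1):
        smooth = F.avg_pool1d(current, kernel, stride=1, padding=kernel // 2)
        components.append(current - smooth)  # fast fluctuations at this scale
        current = smooth                     # pass the smoother signal to the next scale
    components.append(current)               # slowest remaining trend
    return components

series = torch.randn(1, 1, 128)
parts = multiscale_decompose(series)
print([p.shape for p in parts])  # three tensors, each (1, 1, 128)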
Probabilistic Fusion
The latent representations from different scales are combined using a probabilistic fusion mechanism that:
- Learns optimal weights for each scale
- Quantifies uncertainty at the fusion level
- Provides a natural framework for conformal prediction
What’s Next?
- Hierarchical Forecasting: Support for hierarchical time series with reconciliation
- Automated Hyperparameter Tuning: Integration with Optuna or Ray Tune
- Extended Exogenous Support: More fusion strategies and covariate handling
Conclusion
APDTFlow started as a research project exploring Neural ODEs for time series forecasting. With v0.3.0, it’s evolved into a production-ready framework that combines cutting-edge research with the practical features that real-world forecasting requires.
The addition of categorical features, conformal prediction, comprehensive validation tools, and industry-standard metrics makes APDTFlow suitable for everything from academic research to production deployment at scale.
If you’re working on time series forecasting problems, I encourage you to give APDTFlow v0.3.0 a try. The framework is open source, actively maintained, and designed to be both powerful and approachable.
Check out the APDTFlow project for documentation and examples, and to contribute!
If you enjoyed this post, please give it a clap. Feel free to follow me on Medium for more articles!
Connect with me on LinkedIn to discuss forecasting challenges and solutions.
References
Conformal Prediction:
1. Shafer, G., & Vovk, V. (2008). *A Tutorial on Conformal Prediction*. Journal of Machine Learning Research, 9, 371–421.
- The foundational paper on conformal prediction, establishing the theoretical framework for distribution-free prediction intervals.
2. Gibbs, I., & Candès, E. (2021). *Adaptive Conformal Inference Under Distribution Shift*. arXiv:2106.00170.
- Introduces adaptive conformal prediction methods that adjust to non-stationarity, directly relevant to APDTFlow’s implementation.
3. Xu, C., & Xie, Y. (2023). *Conformal Prediction for Time Series*. arXiv:2010.09107.
- Comprehensive study of conformal prediction in time series contexts, addressing temporal dependencies and calibration.
4. Zaffran, M., et al. (2025). *Adaptive Conformal Predictions for Time Series*. ICML 2025.
- Recent work on adaptive methods with learning rates, influencing APDTFlow’s adaptive implementation.
Categorical Features in Time Series:
5. Oreshkin, B. N., et al. (2019). *N-BEATS: Neural basis expansion analysis for interpretable time series forecasting*. ICLR 2020.
- While N-BEATS focuses on pure time series, the discussion of feature engineering influenced APDTFlow’s categorical design.
6. Lim, B., et al. (2021). *Temporal Fusion Transformers for Interpretable Multi-horizon Time Series Forecasting*. International Journal of Forecasting.
- Comprehensive treatment of categorical and continuous covariates in deep learning forecasting models.
Backtesting and Model Validation:
7. Cerqueira, V., Torgo, L., & Mozetič, I. (2020). *Evaluating time series forecasting models: An empirical study on performance estimation methods*. Machine Learning, 109, 1997–2028.
- Rigorous study of time series validation methods, establishing best practices for backtesting.
8. Bergmeir, C., & Benítez, J. M. (2012). *On the use of cross-validation for time series predictor evaluation*. Information Sciences, 191, 192–213.
- Critical analysis of cross-validation for time series, informing APDTFlow’s rolling window approach.
Forecasting Metrics:
9. Hyndman, R. J., & Koehler, A. B. (2006). *Another look at measures of forecast accuracy*. International Journal of Forecasting, 22(4), 679–688.
- The definitive paper on MASE and other scale-independent forecasting metrics.
10. Makridakis, S. (1993). *Accuracy measures: theoretical and practical concerns*. International Journal of Forecasting, 9(4), 527–529.
- Discusses sMAPE and the problems with asymmetric percentage errors.
11. Gneiting, T., & Raftery, A. E. (2007). *Strictly proper scoring rules, prediction, and estimation*. Journal of the American Statistical Association, 102(477), 359–378.
- Establishes CRPS as a proper scoring rule for probabilistic forecasts.
Neural ODEs and Time Series:
12. Chen, R. T. Q., et al. (2018). *Neural Ordinary Differential Equations*. NeurIPS 2018.
- The seminal paper introducing Neural ODEs, foundational to APDTFlow’s architecture.
13. Rubanova, Y., Chen, R. T. Q., & Duvenaud, D. (2019). *Latent ODEs for Irregularly-Sampled Time Series*. NeurIPS 2019.
- Extends Neural ODEs to irregularly-sampled time series, directly applicable to forecasting.
Exogenous Variables and Fusion:
14. Chen, Y., et al. (2024). *TimeXer: Empowering Transformers for Time Series Forecasting with Exogenous Variables*. arXiv:2402.19072.
- Recent work on exogenous variable fusion strategies, influencing APDTFlow’s gated and attention mechanisms.
15. Ilbert, A., et al. (2025). *ChronosX: Advancing Time Series Forecasting with Exogenous Variables*. arXiv:2503.21251.
- Very recent preprint on advanced covariate handling in deep learning forecasters.
General Time Series Forecasting:
16. Hyndman, R. J., & Athanasopoulos, G. (2021). *Forecasting: Principles and Practice* (3rd ed.). OTexts.
- The comprehensive textbook on forecasting, covering classical and modern methods. Available free online at [https://otexts.com/fpp3/](https://otexts.com/fpp3/)
17. Januschowski, T., et al. (2020). *Criteria for classifying forecasting methods*. International Journal of Forecasting, 36(1), 167–177.
- Framework for understanding different forecasting paradigms, helping position APDTFlow in the landscape.