Introduction
A few months ago, I introduced APDTFlow, a modular forecasting framework that combines Neural ODEs, multi-scale decomposition, and transformer architectures to tackle the challenges of time series forecasting.
Today, I’m excited to announce APDTFlow v0.3.0, a major update that bridges the gap between research prototype and production-ready forecasting system. This release focuses on what real-world forecasting actually requires: handling categorical features, quantifying uncertainty with mathematical guarantees, rigorous model validation, and comprehensive diagnostics.
What’s New in v0.3.0?
Version 0.3.0 introduces five major feature categories:
1. Categorical Feature Support: Handle day-of-week effects, holidays, store IDs, and other categorical patterns
2. Conformal Prediction: Distribution-free uncertainty quantification with mathematical coverage guarantees
3. Comprehensive Model Validation: Darts-style backtesting and statistical residual diagnostics
4. Built-in Visualization: Plot forecasts with uncertainty bands and diagnostic plots
5. Industry-Standard Metrics: MASE, sMAPE, and CRPS for proper benchmarking
Each of these features addresses a real gap between academic forecasting models and what’s actually needed in production. Let me walk you through them.
Categorical Features: The Missing Piece
One of the most common patterns in real-world time series data is the effect of categorical variables. Sales are different on Mondays versus Fridays. Energy consumption changes during holidays. Customer behavior varies by store location.
Traditional time series models either ignore these patterns entirely or require manual feature engineering to create dummy variables. APDTFlow v0.3.0 introduces native categorical feature support with two encoding strategies:
One-Hot Encoding
The simplest approach: each category gets its own binary feature. This is interpretable and works well for low-cardinality features like day-of-week (7 categories) or month (12 categories).
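As a quick illustration using pandas directly (APDTFlow handles this internally when categorical_encoding='onehot'), one-hot encoding a day-of-week column expands it into one binary column per observed category:

import pandas as pd

# Hypothetical example: expand a day-of-week column into binary indicator columns
df = pd.DataFrame({"day_of_week": ["Mon", "Tue", "Sat", "Sun", "Mon"]})
encoded = pd.get_dummies(df, columns=["day_of_week"], prefix="dow")
print(encoded.columns.tolist())
# ['dow_Mon', 'dow_Sat', 'dow_Sun', 'dow_Tue']  -- one column per observed category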
Learnable Embeddings
For high-cardinality features like store IDs (potentially hundreds or thousands of unique values), embeddings are more efficient. The model learns dense vector representations that capture similarities between categories — similar stores get similar embeddings.
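To see why embeddings scale better for high-cardinality features, here is a minimal PyTorch sketch (illustrative only, not APDTFlow's internals): a thousand store IDs map into an 8-dimensional space instead of a 1000-column one-hot matrix.

import torch
import torch.nn as nn

num_stores, embed_dim = 1000, 8          # 1000 store IDs -> 8-dimensional vectors
store_embedding = nn.Embedding(num_stores, embed_dim)

store_ids = torch.tensor([3, 17, 998])   # a batch of store IDs
vectors = store_embedding(store_ids)     # shape (3, 8), learned during training
print(vectors.shape)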
How to Use Categorical Features
Here’s a practical example using retail sales data:
from apdtflow.forecaster import APDTFlowForecaster
from apdtflow.preprocessing.categorical_encoder import create_time_features
import pandas as pd

# Load your data
df = pd.read_csv('retail_sales.csv', parse_dates=['date'])

# Automatically create time-based categorical features
df = create_time_features(df, date_col='date')
# This adds: day_of_week, month, quarter, is_weekend

# Initialize model with categorical support
model = APDTFlowForecaster(
    forecast_horizon=7,
    categorical_encoding='onehot',  # or 'embedding'
    hidden_dim=64,
    num_scales=3
)

# Fit with categorical features
model.fit(
    df,
    target_col='sales',
    date_col='date',
    categorical_cols=['day_of_week', 'is_holiday', 'store_id']
)

# Predictions automatically incorporate categorical patterns
predictions = model.predict(future_dates=7)
The model learns that Saturdays have different patterns than Tuesdays, and adjusts predictions accordingly.
Conformal Prediction: Uncertainty with Guarantees
Forecasting the future always involves uncertainty. But how do we quantify that uncertainty in a way we can trust?
Traditional approaches build prediction intervals on distributional assumptions: assume the residuals are Gaussian, use standard deviations, and hope for the best. The problem is that real-world forecast errors rarely follow nice, well-behaved distributions.
Distribution-Free Uncertainty Quantification
APDTFlow v0.3.0 introduces conformal prediction, a framework that provides uncertainty quantification without making distributional assumptions. The key insight: instead of assuming a distribution, we use past forecast errors directly to calibrate prediction intervals.
Here’s what makes conformal prediction special:
- Coverage Guarantees: If you request a 95% prediction interval, you get a mathematical guarantee that 95% of future values will fall within that interval
- Distribution-Free: Works for any data distribution, no Gaussian assumptions required
- Finite-Sample Validity: The guarantees hold even with limited calibration data
Two Flavors: Split and Adaptive
APDTFlow implements two conformal prediction methods:
Split Conformal Prediction
The classic approach: split your data into training and calibration sets. Compute forecast errors on the calibration set, then use the quantile of these errors to construct prediction intervals.
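Under the hood, the calibration step amounts to taking a quantile of held-out errors. Here is a schematic numpy sketch (not APDTFlow's implementation) of how a split conformal interval is built:

import numpy as np

def split_conformal_interval(cal_actuals, cal_preds, new_pred, alpha=0.05):
    # Absolute errors on the calibration set are the conformity scores
    scores = np.abs(cal_actuals - cal_preds)
    n = len(scores)
    # Finite-sample corrected quantile level, capped at 1.0
    q_level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    q = np.quantile(scores, q_level)
    # Symmetric (1 - alpha) interval around the new point forecast
    return new_pred - q, new_pred + q

lower, upper = split_conformal_interval(
    cal_actuals=np.array([102.0, 98.5, 110.2, 95.0]),
    cal_preds=np.array([100.0, 99.0, 108.0, 97.5]),
    new_pred=105.0,
)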
Adaptive Conformal Prediction
More sophisticated: the prediction intervals adapt online based on recent forecast performance. If the model starts making larger errors, the intervals widen automatically. This is particularly useful for non-stationary time series where uncertainty changes over time.
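One common way to implement this, in the spirit of adaptive conformal inference (Gibbs & Candès, 2021; reference 2 below), is to nudge the working miscoverage level after each observation. The following is a schematic sketch, not APDTFlow's exact update rule:

def update_alpha(alpha_t, target_alpha, covered, gamma=0.01):
    # One step of an adaptive conformal update:
    #   alpha_{t+1} = alpha_t + gamma * (target_alpha - err_t)
    # where err_t = 1 if the last interval missed the actual value, else 0.
    # A miss shrinks alpha, widening the next interval; a hit does the opposite.
    err = 0.0 if covered else 1.0
    return alpha_t + gamma * (target_alpha - err)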
Using Conformal Prediction
Here’s how to enable conformal prediction in your forecasts:
# Initialize model with conformal prediction
model = APDTFlowForecaster(
    forecast_horizon=14,
    use_conformal=True,
    conformal_method='adaptive',  # or 'split'
    hidden_dim=64
)

# Fit the model (includes calibration step)
model.fit(df, target_col='sales', date_col='date')

# Get predictions with 95% prediction intervals
predictions, lower_bound, upper_bound = model.predict(
    alpha=0.05,  # 1 - alpha = 95% coverage
    return_intervals='conformal'
)

# Guaranteed: 95% of actual future values will fall in [lower_bound, upper_bound]
For production systems, especially in risk-sensitive applications like energy grid management or supply chain planning, having reliable uncertainty estimates is crucial. Conformal prediction gives you intervals you can actually trust — not just statistical hand-waving, but mathematical guarantees.
The research behind this is quite recent, with key papers from ICLR 2025 and arXiv preprints showing impressive results on real-world forecasting benchmarks.
Model Validation: Backtesting and Residual Diagnostics
One of the biggest gaps between academic papers and production ML is rigorous validation. It's not enough to show good metrics on a single train-test split; you need to prove your model works consistently across different time periods.
Historical Forecasts (Backtesting)
APDTFlow v0.3.0 introduces historical_forecasts(), inspired by the Darts library’s approach to backtesting. This method simulates production forecasting on historical data using a rolling window:
1. Start at some point in your historical data
2. Make a forecast for the next N steps
3. Move forward in time
4. Repeat, collecting all forecasts and actual values
This gives you a realistic picture of how your model would have performed in production.
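Conceptually, the procedure is a short loop over forecast origins. The sketch below is schematic (the hypothetical fit_and_forecast stands in for whatever training and prediction routine you use); the real historical_forecasts() also handles metrics, stride bookkeeping, and optional retraining:

def rolling_backtest(series, start_idx, horizon, stride, fit_and_forecast):
    # Schematic rolling-origin backtest over a 1-D series
    results = []
    t = start_idx
    while t + horizon <= len(series):
        history = series[:t]                          # everything up to the forecast origin
        forecast = fit_and_forecast(history, horizon) # placeholder for your model
        actual = series[t:t + horizon]
        results.append((t, forecast, actual))
        t += stride                                   # move the origin forward and repeat
    return results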
# Run backtesting with weekly forecasts
backtest_results = model.historical_forecasts(
    data=df,
    target_col='sales',
    start=0.7,            # Start at 70% through the data
    forecast_horizon=7,   # 7-day forecasts
    stride=7,             # Make a new forecast every 7 days
    retrain=False,        # Use fixed model (fast)
    metrics=['MAE', 'MASE', 'sMAPE']
)

# Analyze results
print(f"Average MAE: {backtest_results['abs_error'].mean():.2f}")
print(f"MASE: {backtest_results['mase'].mean():.3f}")

# backtest_results is a DataFrame with:
# - timestamp, fold, forecast_step, actual, predicted, error
# Perfect for detailed analysis and visualization
You can also enable retrain=True to retrain the model at each fold, simulating a more realistic production scenario where you periodically retrain on new data.
Residual Diagnostics
Good forecasters don’t just make predictions — they understand when and why their models fail. APDTFlow v0.3.0 includes comprehensive residual analysis tools:
# Compute residuals on test data
residuals, actuals, predictions = model.compute_residuals(test_df)

# Visualize with 4-panel diagnostic plot
fig, axes = model.plot_residuals()
# Shows: residuals over time, distribution, ACF plot, Q-Q plot

# Statistical analysis
diagnostics = model.analyze_residuals()
print(f"Mean residual: {diagnostics['mean']:.4f}")
print(f"Shapiro-Wilk p-value: {diagnostics['shapiro_pvalue']:.4f}")
print(f"Ljung-Box p-value: {diagnostics['ljung_box_pvalue']:.4f}")
The diagnostic suite includes:
- Shapiro-Wilk Test: Are residuals normally distributed?
- Ljung-Box Test: Are residuals autocorrelated (indicating missed patterns)?
- Visual Diagnostics: Time series plot, histogram with fitted normal, ACF plot, Q-Q plot
These tools help you diagnose model deficiencies:
- Systematic patterns in residuals → model is missing something
- Heteroscedasticity → prediction uncertainty varies over time
- Autocorrelation → temporal dependencies not fully captured
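If you want to reproduce the underlying tests yourself, scipy and statsmodels expose them directly. A minimal sketch, assuming residuals is a 1-D numpy array:

import numpy as np
from scipy.stats import shapiro
from statsmodels.stats.diagnostic import acorr_ljungbox

residuals = np.random.normal(size=200)   # stand-in for your model's residuals

# Shapiro-Wilk: a small p-value suggests residuals deviate from normality
_, shapiro_p = shapiro(residuals)

# Ljung-Box: a small p-value suggests residuals are autocorrelated (missed structure)
lb = acorr_ljungbox(residuals, lags=[10], return_df=True)
print(f"Shapiro-Wilk p = {shapiro_p:.3f}, Ljung-Box p = {lb['lb_pvalue'].iloc[0]:.3f}")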
This level of diagnostic rigor is standard in production ML systems at companies like Uber and Amazon, and now it’s built into APDTFlow.
Visualization: Making Forecasts Interpretable
One of the most practical features in APDTFlow is built-in visualization. Understanding your forecasts visually is crucial for both model validation and communicating results to stakeholders.
Simple Forecast Plotting
# Make predictions
predictions = model.predict(future_dates=14)

# Visualize with history
model.plot_forecast(with_history=100)
# Output: Matplotlib plot showing last 100 historical points + 14 future predictions
Forecasts with Uncertainty Bands
When using conformal prediction or any uncertainty estimation method, you can visualize confidence intervals:
# Enable conformal prediction
model = APDTFlowForecaster(
    forecast_horizon=14,
    use_conformal=True,
    conformal_method='adaptive'
)
model.fit(df, target_col='sales', date_col='date')

# Get predictions with intervals
lower, pred, upper = model.predict(
    alpha=0.05,  # 95% confidence
    return_intervals='conformal'
)

# Visualize with uncertainty bands
model.plot_forecast(
    with_history=100,
    show_uncertainty=True
)
# Output: Plot with shaded confidence regions around predictions
Backtesting Visualization
You can also visualize backtesting results to see how your model performed across multiple time periods:
import matplotlib.pyplot as plt

# Run backtesting
backtest_results = model.historical_forecasts(
    data=df,
    target_col='sales',
    start=0.7,
    forecast_horizon=7,
    stride=7
)

# Create custom visualization
plt.figure(figsize=(12, 6))
plt.plot(backtest_results['timestamp'], backtest_results['actual'],
         'o-', label='Actual', alpha=0.7)
plt.plot(backtest_results['timestamp'], backtest_results['predicted'],
         's-', label='Predicted', alpha=0.7)
plt.fill_between(backtest_results['timestamp'], backtest_results['actual'],
                 backtest_results['predicted'], alpha=0.2)
plt.xlabel('Date')
plt.ylabel('Sales')
plt.title('Backtesting Results: Actual vs Predicted')
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()
Industry-Standard Metrics
When benchmarking forecasting models, you need metrics that are comparable across different datasets and scales. APDTFlow v0.3.0 adds four metrics that are widely used in forecasting competitions and production systems:
MASE (Mean Absolute Scaled Error)
Scale-independent metric that normalizes errors relative to a naive seasonal baseline. MASE < 1.0 means you’re beating the naive forecast.
mase = model.score(test_df, metric='mase')
print(f"MASE: {mase:.3f}")  # < 1.0 is good
Hyndman & Koehler (2006) “Another look at measures of forecast accuracy”
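For reference, here is a numpy sketch of the standard MASE definition, scaling by the in-sample error of a lag-m seasonal naive forecast; APDTFlow's score() may differ in implementation details:

import numpy as np

def mase(actual, predicted, train, m=1):
    # Scale the forecast MAE by the MAE of a lag-m naive forecast on the training data
    # (e.g. m=7 for daily data with weekly seasonality)
    naive_mae = np.mean(np.abs(train[m:] - train[:-m]))
    return np.mean(np.abs(actual - predicted)) / naive_mae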
sMAPE (Symmetric Mean Absolute Percentage Error)
Addresses the asymmetry problem in standard MAPE — treats over-forecasts and under-forecasts symmetrically. Bounded between 0% and 200%.
smape = model.score(test_df, metric='smape')
print(f"sMAPE: {smape:.2f}%")
Makridakis (1993) “Accuracy measures: theoretical and practical concerns”
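The standard definition is equally compact; a numpy sketch:

import numpy as np

def smape(actual, predicted):
    # Symmetric MAPE in percent, bounded between 0% and 200%
    denom = (np.abs(actual) + np.abs(predicted)) / 2
    return 100 * np.mean(np.abs(actual - predicted) / denom)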
CRPS (Continuous Ranked Probability Score)
Evaluates probabilistic forecasts by measuring both sharpness (narrow intervals) and calibration (correct coverage). Essential for evaluating conformal prediction.
crps = model.score(test_df, metric='crps')
print(f"CRPS: {crps:.3f}")  # Lower is better
Gneiting & Raftery (2007) “Strictly proper scoring rules, prediction, and estimation”
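When you have forecast samples (for example, draws from a probabilistic model), CRPS can be estimated directly from them. A numpy sketch of the standard sample-based estimator, not necessarily how score() computes it:

import numpy as np

def crps_from_samples(samples, actual):
    # Sample-based CRPS estimate: E|X - y| - 0.5 * E|X - X'| over forecast samples X
    samples = np.asarray(samples)
    term1 = np.mean(np.abs(samples - actual))
    term2 = 0.5 * np.mean(np.abs(samples[:, None] - samples[None, :]))
    return term1 - term2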
Coverage
Measures prediction interval calibration — do your 95% intervals actually contain 95% of observations?
coverage = model.score(test_df, metric='coverage', alpha=0.05)
print(f"Coverage: {coverage:.1%}")  # Should be close to 95%
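Empirical coverage is simply the fraction of actuals that land inside their intervals, which is easy to sanity-check by hand:

import numpy as np

def empirical_coverage(actual, lower, upper):
    # Fraction of actual values falling inside their prediction intervals
    return np.mean((actual >= lower) & (actual <= upper))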
These metrics are the same ones used in the M4 and M5 forecasting competitions, making your results directly comparable to state-of-the-art methods.
Putting It All Together: A Complete Workflow
Here’s what a complete production forecasting workflow looks like with APDTFlow:
from apdtflow.forecaster import APDTFlowForecaster
from apdtflow.preprocessing.categorical_encoder import create_time_features
import pandas as pd

# 1. Load and prepare data
df = pd.read_csv('sales_data.csv', parse_dates=['date'])
df = create_time_features(df, date_col='date')

# 2. Initialize model with all production features
model = APDTFlowForecaster(
    forecast_horizon=14,
    num_scales=3,
    hidden_dim=64,
    use_conformal=True,
    conformal_method='adaptive',
    categorical_encoding='onehot'
)

# 3. Train with categorical features and exogenous variables
model.fit(
    df,
    target_col='sales',
    date_col='date',
    categorical_cols=['day_of_week', 'is_holiday'],
    exog_cols=['temperature', 'promotion'],
    exog_fusion_type='gated'
)

# 4. Rigorous validation via backtesting
backtest_results = model.historical_forecasts(
    data=df,
    target_col='sales',
    start=0.7,
    forecast_horizon=14,
    stride=7,
    metrics=['MAE', 'MASE', 'sMAPE', 'CRPS']
)
print(f"Backtest MASE: {backtest_results['mase'].mean():.3f}")

# 5. Residual diagnostics
diagnostics = model.analyze_residuals()
if diagnostics['ljung_box_pvalue'] < 0.05:
    print("Warning: Residuals show autocorrelation; consider more lags")

# 6. Production forecasts with uncertainty
predictions, lower, upper = model.predict(
    future_dates=14,
    alpha=0.05,  # 95% intervals
    return_intervals='conformal'
)

# 7. Visualize predictions with uncertainty bands
model.plot_forecast(
    with_history=100,       # Show last 100 historical points
    show_uncertainty=True   # Display confidence intervals
)
# Output: Matplotlib plot with history + predictions + uncertainty bands

# 8. Model persistence
model.save('production_model.pkl')

# Later: reload and use
loaded_model = APDTFlowForecaster.load('production_model.pkl')
new_predictions = loaded_model.predict(future_dates=14)
This workflow covers everything you need for a production forecasting system: feature engineering, rigorous validation, uncertainty quantification, diagnostics, and persistence.
Under the Hood: What Makes APDTFlow Different
While v0.3.0 adds many practical features, the core architecture that makes APDTFlow unique remains unchanged:
Neural ODEs for Continuous-Time Modeling
Unlike standard RNNs or transformers that operate on discrete time steps, APDTFlow models continuous dynamics using Neural ODEs. This provides several advantages:
- Irregular Time Steps: Handle missing data and irregular sampling naturally
- Smooth Evolution: Forecasts evolve smoothly rather than jumping discontinuously
- Adaptive Computation: The ODE solver adjusts step size based on dynamics complexity
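To make the continuous-time idea concrete, here is a minimal latent-ODE sketch built on the torchdiffeq library. It is illustrative only; APDTFlow's dynamics network, encoder, and decoder are more involved:

import torch
import torch.nn as nn
from torchdiffeq import odeint  # assumes torchdiffeq is installed

class ODEFunc(nn.Module):
    def __init__(self, hidden_dim=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim), nn.Tanh(),
            nn.Linear(hidden_dim, hidden_dim),
        )

    def forward(self, t, h):
        # dh/dt = f(h, t): the learned continuous dynamics of the latent state
        return self.net(h)

hidden = torch.zeros(1, 32)                    # initial latent state
times = torch.tensor([0.0, 0.4, 1.0, 2.3])     # irregularly spaced observation times
trajectory = odeint(ODEFunc(), hidden, times)  # latent state evaluated at each time
print(trajectory.shape)  # (4, 1, 32)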
Multi-Scale Decomposition
Before feeding data to the Neural ODE, APDTFlow decomposes the signal into multiple scales using a wavelet-like approach. This allows the model to:
- Capture both long-term trends and short-term fluctuations simultaneously
- Process different time scales with different network capacities
- Improve interpretability by separating fast and slow dynamics
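As a rough illustration of the idea (not APDTFlow's actual decomposition), a signal can be peeled into progressively smoother components with average pooling, where each residual captures the faster fluctuations at that scale:

import torch
import torch.nn.functional as F

def multiscale_decompose(x, num_scales=3, kernel=5):
    # Split a (batch, 1, length) signal into detail components plus a final trend
    components = []
    current = x
    for _ in range(num_scales - 1):
        smooth = F.avg_pool1d(current, kernel, stride=1, padding=kernel // 2)
        components.append(current - smooth)  # fast fluctuations at this scale
        current = smooth                     # pass the smoother signal to the next scale
    components.append(current)               # slowest remaining trend
    return components

series = torch.randn(1, 1, 128)
parts = multiscale_decompose(series)
print([p.shape for p in parts])  # three tensors, each (1, 1, 128)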
Probabilistic Fusion
The latent representations from different scales are combined using a probabilistic fusion mechanism that:
- Learns optimal weights for each scale
- Quantifies uncertainty at the fusion level
- Provides a natural framework for conformal prediction
What’s Next?
- Hierarchical Forecasting: Support for hierarchical time series with reconciliation
- Automated Hyperparameter Tuning: Integration with Optuna or Ray Tune
- Extended Exogenous Support: More fusion strategies and covariate handling
Conclusion
APDTFlow started as a research project exploring Neural ODEs for time series forecasting. With v0.3.0, it’s evolved into a production-ready framework that combines cutting-edge research with the practical features that real-world forecasting requires.
The addition of categorical features, conformal prediction, comprehensive validation tools, and industry-standard metrics makes APDTFlow suitable for everything from academic research to production deployment at scale.
If you’re working on time series forecasting problems, I encourage you to give APDTFlow v0.3.0 a try. The framework is open source, actively maintained, and designed to be both powerful and approachable.
Check out the APDTFlow project for documentation and examples, and to contribute!
If you enjoyed this post, please give it a clap. Feel free to follow me on Medium for more articles!
Connect with me on LinkedIn to discuss forecasting challenges and solutions.
References
Conformal Prediction:
1. Shafer, G., & Vovk, V. (2008). *A Tutorial on Conformal Prediction*. Journal of Machine Learning Research, 9, 371–421.
- The foundational paper on conformal prediction, establishing the theoretical framework for distribution-free prediction intervals.
2. Gibbs, I., & Candès, E. (2021). *Adaptive Conformal Inference Under Distribution Shift*. arXiv:2106.00170.
- Introduces adaptive conformal prediction methods that adjust to non-stationarity, directly relevant to APDTFlow’s implementation.
3. Xu, C., & Xie, Y. (2023). *Conformal Prediction for Time Series*. arXiv:2010.09107.
- Comprehensive study of conformal prediction in time series contexts, addressing temporal dependencies and calibration.
4. Zaffran, M., et al. (2025). *Adaptive Conformal Predictions for Time Series*. ICML 2025.
- Recent work on adaptive methods with learning rates, influencing APDTFlow’s adaptive implementation.
Categorical Features in Time Series:
5. Oreshkin, B. N., et al. (2019). *N-BEATS: Neural basis expansion analysis for interpretable time series forecasting*. ICLR 2020.
- While N-BEATS focuses on pure time series, the discussion of feature engineering influenced APDTFlow’s categorical design.
6. Lim, B., et al. (2021). *Temporal Fusion Transformers for Interpretable Multi-horizon Time Series Forecasting*. International Journal of Forecasting.
- Comprehensive treatment of categorical and continuous covariates in deep learning forecasting models.
Backtesting and Model Validation:
7. Cerqueira, V., Torgo, L., & Mozetič, I. (2020). *Evaluating time series forecasting models: An empirical study on performance estimation methods*. Machine Learning, 109, 1997–2028.
- Rigorous study of time series validation methods, establishing best practices for backtesting.
8. Bergmeir, C., & Benítez, J. M. (2012). *On the use of cross-validation for time series predictor evaluation*. Information Sciences, 191, 192–213.
- Critical analysis of cross-validation for time series, informing APDTFlow’s rolling window approach.
Forecasting Metrics:
9. Hyndman, R. J., & Koehler, A. B. (2006). *Another look at measures of forecast accuracy*. International Journal of Forecasting, 22(4), 679–688.
- The definitive paper on MASE and other scale-independent forecasting metrics.
10. Makridakis, S. (1993). *Accuracy measures: theoretical and practical concerns*. International Journal of Forecasting, 9(4), 527–529.
- Discusses sMAPE and the problems with asymmetric percentage errors.
11. Gneiting, T., & Raftery, A. E. (2007). *Strictly proper scoring rules, prediction, and estimation*. Journal of the American Statistical Association, 102(477), 359–378.
- Establishes CRPS as a proper scoring rule for probabilistic forecasts.
Neural ODEs and Time Series:
12. Chen, R. T. Q., et al. (2018). *Neural Ordinary Differential Equations*. NeurIPS 2018.
- The seminal paper introducing Neural ODEs, foundational to APDTFlow’s architecture.
13. Rubanova, Y., Chen, R. T. Q., & Duvenaud, D. (2019). *Latent ODEs for Irregularly-Sampled Time Series*. NeurIPS 2019.
- Extends Neural ODEs to irregularly-sampled time series, directly applicable to forecasting.
Exogenous Variables and Fusion:
14. Chen, Y., et al. (2024). *TimeXer: Empowering Transformers for Time Series Forecasting with Exogenous Variables*. arXiv:2402.19072.
- Recent work on exogenous variable fusion strategies, influencing APDTFlow’s gated and attention mechanisms.
15. Ilbert, A., et al. (2025). *ChronosX: Advancing Time Series Forecasting with Exogenous Variables*. arXiv:2503.21251.
- Very recent preprint on advanced covariate handling in deep learning forecasters.
General Time Series Forecasting:
16. Hyndman, R. J., & Athanasopoulos, G. (2021). *Forecasting: Principles and Practice* (3rd ed.). OTexts.
- The comprehensive textbook on forecasting, covering classical and modern methods. Available free online at [https://otexts.com/fpp3/](https://otexts.com/fpp3/)
17. Januschowski, T., et al. (2020). *Criteria for classifying forecasting methods*. International Journal of Forecasting, 36(1), 167–177.
- Framework for understanding different forecasting paradigms, helping position APDTFlow in the landscape.