1. Fundamental Pattern Recognition Issues
- Model fails at basic tabular pattern matching
- Cannot reliably learn deterministic mappings
2. Semantic Understanding Deficits
- Cannot distinguish data types (codes vs IDs)
- No understanding of feature relevance
- Treats all columns as equally important
3. Limited Learning Complexity
- Cannot handle multi-feature interactions
- Defaults to simple, linear patterns
- Insufficient for SAP’s complex business logic
Other tests were successful; I focus only on the failures, mostly related to the architecture of the model.
My OPINION
Before we get into SAP’s new model, we need to understand why tabular data (think 2D database tables) is such a uniquely difficult problem for AI, and why we tried this ourselves in the past and failed.
Tabular data is structured: its meaning comes from the rigid relationship between rows and columns, but it lacks the “inherent structure” of natural language. A “5” in the “Product_Rating” column means something completely different from the same “5” in the “Quantity” column, and AI models have historically struggled with this context. Not to mention that the row below might be identical except for a single digit, which gives it a different unique key.
Tables also change constantly as records are inserted and updated, whereas LLMs have been trained on books, texts, or images that are static.
That said, if you ever want to solve this, you can categorize tabular models, also called LTMs, into two blocks:
1. Gradient-Boosted Trees
These are not deep learning models. **Gradient-Boosted Decision Trees** like XGBoost or CatBoost are traditional ML, and by traditional I mean they require separate training for each dataset.
You need highly curated data, a lot of compute power, and an excellent team of ML engineers to make it work.
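To make the point concrete, here is a minimal sketch of that workflow (assuming xgboost’s scikit-learn-style API and a synthetic dataset): every new dataset means a brand-new training run, and nothing learned here transfers to the next table.

```python
# Minimal sketch of the "traditional" GBDT workflow, assuming xgboost's sklearn-style API.
# The dataset is synthetic; the point is that the model is trained from scratch for THIS table only.
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

X, y = make_classification(n_samples=5_000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = XGBClassifier(n_estimators=300, max_depth=6, learning_rate=0.1)
model.fit(X_train, y_train)                     # separate training run for this dataset
print("held-out accuracy:", accuracy_score(y_test, model.predict(X_test)))
# A second table with a different schema needs a second, completely separate training run.
```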
2. Use LLMs
The other category is to use a general-purpose LLM like Llama (which is open weights) and just feed it the tables. The results are poor because:
- Limited Context Windows: LLMs have a hard limit on how much text they can read at once (the “context window”). This means they might only be able to look at 32 or 64 rows of your data, which is useless for a table with thousands of entries.
- The Wrong Architecture: LLMs are designed for sequential text, not the 2D structure of a table. They are not effective at handling numerical values.
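A rough back-of-envelope shows why the context window hurts so much for wide tables; the tokens-per-cell figure below is my own assumption for illustration, not a measured value.

```python
# Back-of-envelope estimate: how few rows of a wide table fit into an LLM context window.
# TOKENS_PER_CELL is an assumed average for a "column_name: value" serialization.
TOKENS_PER_CELL = 8
COLUMNS = 50
CONTEXT_WINDOW = 32_000          # tokens

tokens_per_row = TOKENS_PER_CELL * COLUMNS          # 400 tokens per row
rows_that_fit = CONTEXT_WINDOW // tokens_per_row    # ~80 rows
print(f"{tokens_per_row} tokens per row -> roughly {rows_that_fit} rows fit in the prompt")
```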
What Research is Trying Now: “Table-Native” Foundation Models
Because of the limitations of GBDTs (no pretraining) and LLMs (wrong architecture), research has moved toward “table-native” foundation models.
These models are designed to learn from many tables and then apply that knowledge to new, unseen tables using in-context learning (ICL), which is essentially what Snowflake and Databricks already tried (and not many cared).
This was tried and explained in this blog post.
This is where the new models, including SAP’s ConTextTab, come in.
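To make in-context learning concrete before we get to ConTextTab, here is a minimal sketch using the open-source TabPFN classifier as a stand-in for this family of models (assuming the tabpfn package’s scikit-learn-style interface). The key point: “fit” does not run gradient descent, it only stores the labeled rows as context, and prediction is a single forward pass of a pretrained transformer.

```python
# Sketch of tabular in-context learning with TabPFN as a stand-in (assumes the tabpfn package).
# No training happens on your data: the labeled rows act as the "prompt" for one forward pass.
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from tabpfn import TabPFNClassifier

X, y = load_breast_cancer(return_X_y=True)
X_ctx, X_query, y_ctx, y_query = train_test_split(X, y, test_size=0.3, random_state=0)

clf = TabPFNClassifier()              # transformer pretrained on many (synthetic) tables
clf.fit(X_ctx, y_ctx)                 # stores the context rows, no gradient updates
preds = clf.predict(X_query)          # one forward pass over context + query rows
print("accuracy:", accuracy_score(y_query, preds))
```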
I will repeat this again: the problem with models pretrained on synthetic data is that they ignore the rich meaning in our data.
SAP’s ConTextTab tries to solve this; it’s a “table-native” model like TabICL (so it’s efficient), but it adds a crucial new ingredient:
- Semantic understanding: instead of synthetic data, it’s trained on a large (public) dataset of real tables. GREAT!
- It uses specialized encoders to understand different data types, including text, dates, and numbers. FANTASTIC!
- ALSO, it reads the *column names* to understand context. SUPERB!
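To give you an intuition of what “type-specific encoders plus column names” means, here is a conceptual sketch. This is my own illustration, not SAP’s implementation; every class name and dimension is an assumption. The point is that the same “5” gets a different embedding under “Product_Rating” than under “Quantity”.

```python
# Conceptual sketch only (NOT ConTextTab's actual code): fuse a type-specific value
# embedding with an embedding of the column header, so identical values in different
# columns get different representations.
import torch
import torch.nn as nn

D = 64  # shared embedding width, arbitrary for this sketch

class NumericEncoder(nn.Module):
    """Encodes raw numeric cell values; a real model would also handle dates, text, categories."""
    def __init__(self, d=D):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(1, d), nn.ReLU(), nn.Linear(d, d))

    def forward(self, x):               # x: (batch, 1) floats
        return self.net(x)

class ColumnNameEncoder(nn.Module):
    """Toy stand-in for a language-model embedding of the header: hashed bag of words."""
    def __init__(self, d=D, buckets=10_000):
        super().__init__()
        self.emb = nn.EmbeddingBag(buckets, d)
        self.buckets = buckets

    def forward(self, names):           # names: list of header strings
        ids, offsets = [], []
        for name in names:
            offsets.append(len(ids))
            ids += [hash(tok) % self.buckets for tok in name.lower().replace("_", " ").split()]
        return self.emb(torch.tensor(ids), torch.tensor(offsets))

class CellEncoder(nn.Module):
    """Combines the value embedding with the column-name embedding into one cell embedding."""
    def __init__(self, d=D):
        super().__init__()
        self.value_enc = NumericEncoder(d)
        self.name_enc = ColumnNameEncoder(d)
        self.mix = nn.Linear(2 * d, d)

    def forward(self, values, column_name):
        value_emb = self.value_enc(values)
        name_emb = self.name_enc([column_name] * values.shape[0])
        return self.mix(torch.cat([value_emb, name_emb], dim=-1))

enc = CellEncoder()
fives = torch.tensor([[5.0], [5.0]])
rating = enc(fives, "Product_Rating")
quantity = enc(fives, "Quantity")
print(rating.shape, torch.allclose(rating, quantity))  # same "5", different embeddings -> False
```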
This is the new frontier: creating models that are both as efficient as **GBDTs** and as “smart” about language as LLMs, all while being built specifically for the structure of a table. I think this could work!
BUT
This is where I turn critic, especially since we tried this ourselves and failed, and it has been a year since we knew SAP was moving in this direction. So let’s strip away the marketing narrative.
- It’s not trained on SAP data: the paper is very clear that ConTextTab was pretrained on the T4 dataset, a large, public dataset of tables scraped from the internet, not an SAP-specific dataset. This is bad, because the first thing we failed on when we wanted to build something real was that SAP datasets don’t exist. I kept hoping that SAP would one day create an AI model and publish the datasets it was trained on, including S/4HANA or even ECC data. Well, apparently not today.
- It was evaluated on other public benchmarks like CARTE and OpenML. There is no indication that any of SAP’s valuable enterprise data was used for the evaluation.
- It was not trained at scale; the compute budget is tiny! When Databricks and Snowflake announced their models (while Meta was already investing billions), DBX and SNOW said, “It didn’t cost us a bunch of money, just a dozen million USD.” SAP is way cheaper than that: the authors state the “base” model was trained on a “single H100 GPU between 4 to 12 days” (roughly 1k USD), a scale more aligned with a university project. Personally, I spend almost 1k USD every month on SageMaker, and I am not doing much, unfortunately.
- It doesn’t scale. Not only are we limited by the context window (2,073 rows × 50 columns), but any enterprise attempt to pretrain a model on real tables with an extremely large context window, which does not exist today, will fall under the rules of the scaling gods.
- SAP’s contribution to research is reasonably small; they’re essentially fine-tuning existing architectures on public data with minimal compute resources. For a company of SAP’s size, with its treasure trove of enterprise data, this is surprisingly underwhelming.
A Little bit of HOPE
What Phillipp Herzig has not mentioned in his blog post is that SAP is exploring another domain, called RELATE (Relational Encoder for Latent Aggregation of Typed Entities), which goes in the direction of graph neural networks on relational databases, not traditional tabular data.
In English, this is a completely different problem than the one **ConTextTab** is addressing.
- **ConTextTab** works with single tables (traditional tabular data), doing classification and regression on rows.
- **RELATE** works with multi-table relational databases converted to graphs. Graphs are MUCH more valuable for AI chatbots than tables; this is more useful, and it is proven technology.
[Image by the author]
RELATE is also not highly innovative in itself; it uses Perceiver-style cross-attention to create node embeddings for GNNs, a technique that has existed for many years. BUT it is extremely useful if you control the domain of the data, which SAP does, and that makes the difference, because the schema-agnostic encoding works across different database schemas. Here, SAP adds value; in table rows, SAP does NOT add value.
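For intuition, here is a minimal sketch of what Perceiver-style cross-attention pooling looks like (my own illustration, not SAP’s RELATE code; sizes and names are assumptions). A small, fixed set of learned latent vectors attends over a variable number of typed attribute embeddings for one database row and returns a fixed-size node embedding, no matter which schema the row came from, which is exactly the schema-agnostic property mentioned above.

```python
# Conceptual sketch of Perceiver-style cross-attention pooling for node embeddings (not RELATE itself).
import torch
import torch.nn as nn

class PerceiverPooler(nn.Module):
    def __init__(self, d=64, n_latents=8, n_heads=4):
        super().__init__()
        self.latents = nn.Parameter(torch.randn(n_latents, d) * 0.02)   # learned latent queries
        self.cross_attn = nn.MultiheadAttention(d, n_heads, batch_first=True)
        self.out = nn.Linear(n_latents * d, d)

    def forward(self, attr_emb):        # attr_emb: (batch, n_attrs, d); n_attrs varies per table
        queries = self.latents.unsqueeze(0).expand(attr_emb.shape[0], -1, -1)
        pooled, _ = self.cross_attn(queries, attr_emb, attr_emb)   # latents attend over the attributes
        return self.out(pooled.flatten(1))                          # fixed-size node embedding for a GNN

pooler = PerceiverPooler()
orders = torch.randn(32, 12, 64)    # rows from an "orders" table: 12 attribute embeddings each
vendors = torch.randn(32, 5, 64)    # rows from a "vendors" table: 5 attribute embeddings each
print(pooler(orders).shape, pooler(vendors).shape)  # both (32, 64): schema-agnostic output
```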
Think of RELATE as an expert in complex multi-table relational structures that can address more realistic enterprise challenges (multi-table databases are core to SAP systems).
It remains to be seen whether, and how, RELATE will actually use SAP data to build its GNN.
My To-Do list for SAP
My hope is that someone at SAP might read this. Here are some suggestions and technical improvements I would expect from SAP’s AI research for the next generation of models:
1. Domain-Specific Pretraining
Design pretraining tasks that mirror actual ERP workflows — like predicting financial period closes, detecting anomalous journal entries, etc… These tasks should encode business logic constraints (e.g., double-entry bookkeeping rules, material master dependencies).
1.1. Business Rule Integration Layers
Create neural architectures with explicit modules for incorporating hard business constraints — like currency conversion rules, tax calculations, or industry-specific regulations — as differentiable components.
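As an illustration of what a “differentiable business constraint” could look like (a sketch of the idea, not an SAP API; all names are mine), double-entry bookkeeping can be expressed as a penalty term: the signed line items of every document must sum to zero, and the squared imbalance is simply added to the training loss.

```python
# Sketch: double-entry bookkeeping as a differentiable penalty (illustrative, not an SAP component).
import torch

def double_entry_penalty(predicted_amounts, debit_credit_sign, document_ids):
    """
    predicted_amounts: (n_line_items,) predicted absolute amounts
    debit_credit_sign: (n_line_items,) +1.0 for debit lines, -1.0 for credit lines
    document_ids:      (n_line_items,) long tensor grouping line items into documents
    Returns the mean squared per-document imbalance: zero iff every document balances.
    """
    signed = predicted_amounts * debit_credit_sign
    n_docs = int(document_ids.max().item()) + 1
    balance = torch.zeros(n_docs).scatter_add(0, document_ids, signed)   # per-document sum
    return (balance ** 2).mean()

amounts = torch.tensor([100.0, 60.0, 40.0], requires_grad=True)   # one document, three line items
signs = torch.tensor([1.0, -1.0, -1.0])
docs = torch.tensor([0, 0, 0])
print(double_entry_penalty(amounts, signs, docs))   # 0.0 -> the document balances

# In a training loop the rule becomes part of the objective, e.g.:
# loss = task_loss + 0.1 * double_entry_penalty(pred_amounts, signs, doc_ids)
```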
2. Multi-Tenant Scalable Architectures
Develop models that can handle multi-tenant scenarios efficiently — sharing learned representations across customers while maintaining strict data isolation. This includes techniques like federated learning adaptations or parameter-efficient fine-tuning per tenant.
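A hedged sketch of the parameter-efficient direction (all names and sizes are illustrative assumptions, not an SAP design): one shared, frozen backbone plus a tiny low-rank adapter per tenant, so learned representations are shared while each tenant’s trainable weights stay isolated.

```python
# Sketch: per-tenant LoRA-style adapters on a shared, frozen layer (illustrative only).
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen shared linear layer plus a small trainable low-rank update for one tenant."""
    def __init__(self, shared: nn.Linear, rank=4):
        super().__init__()
        self.shared = shared
        for p in self.shared.parameters():
            p.requires_grad_(False)                          # the shared backbone never changes
        self.A = nn.Parameter(torch.randn(shared.in_features, rank) * 0.01)
        self.B = nn.Parameter(torch.zeros(rank, shared.out_features))

    def forward(self, x):
        return self.shared(x) + (x @ self.A) @ self.B        # backbone output + tenant-specific delta

shared_layer = nn.Linear(128, 128)                            # trained once, shared by everyone
tenant_adapters = {t: LoRALinear(shared_layer) for t in ["tenant_a", "tenant_b"]}
x = torch.randn(4, 128)
print(tenant_adapters["tenant_a"](x).shape)                   # each tenant trains only ~1k extra parameters
```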
The ConTextTab paper admits their model (and others like it) “fail to scale their performance to very large datasets”. This is a non-starter for enterprise use. We should expect SAP to develop architectures that solve this. Other “column-then-row attention” mechanisms like TabICL already handle up to 500,000 samples.
3. Business Process Models
Create architectures that understand business process sequences:
orders -> deliveries -> invoices -> payments
This means modeling long-range temporal dependencies across months/years, not just timestamp features.
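A conceptual sketch of what that could mean (my assumption, not an SAP model): treat the document flow as a typed event sequence with explicit time gaps, so the model can relate a payment to an order placed months earlier.

```python
# Sketch: encoding a business-process sequence (order -> delivery -> invoice -> payment)
# as typed events plus time gaps, with a small transformer (illustrative only).
import torch
import torch.nn as nn

EVENT_TYPES = ["order", "delivery", "invoice", "payment"]

class ProcessEncoder(nn.Module):
    def __init__(self, d=64):
        super().__init__()
        self.type_emb = nn.Embedding(len(EVENT_TYPES), d)
        self.gap_proj = nn.Linear(1, d)                       # log of days since the previous event
        layer = nn.TransformerEncoderLayer(d, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, event_types, gap_days):                 # both: (batch, seq_len)
        x = self.type_emb(event_types) + self.gap_proj(torch.log1p(gap_days).unsqueeze(-1))
        return self.encoder(x)                                # contextual embedding per event

enc = ProcessEncoder()
types = torch.tensor([[0, 1, 2, 3]])                          # order -> delivery -> invoice -> payment
gaps = torch.tensor([[0.0, 3.0, 10.0, 45.0]])                 # days between consecutive events
print(enc(types, gaps).shape)                                 # (1, 4, 64)
```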
4. Cross-Module Representation Learning
Build models that learn unified representations across SAP modules (FI/CO, MM, SD, HR). A purchase in MM should inform predictions in FI. This requires handling heterogeneous schemas and business objects simultaneously.
Enterprise data is almost never one flat table. It’s a complex, multi-table relational database. SAP’s own RELATE paper introduces an encoder for “multimodal relational graphs”. We should expect any flagship model to integrate this, moving from single-table prediction to true multi-table graph-based learning.
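As a sketch of that direction (assuming PyTorch Geometric’s HeteroData container; the node and edge types below are illustrative, not SAP’s real schema), a multi-table extract can be represented as a heterogeneous graph, with one node type per table and foreign keys as typed edges.

```python
# Sketch: a two-table extract as a heterogeneous graph (assumes torch_geometric is installed).
import torch
from torch_geometric.data import HeteroData

data = HeteroData()
data["purchase_order"].x = torch.randn(100, 16)        # 100 MM purchase orders, 16 placeholder features
data["invoice"].x = torch.randn(80, 12)                # 80 FI invoices, 12 placeholder features

# A foreign-key relationship becomes a typed edge: purchase_order[i] is billed by invoice[j].
src = torch.randint(0, 100, (150,))                    # purchase_order row indices
dst = torch.randint(0, 80, (150,))                     # invoice row indices
data["purchase_order", "billed_by", "invoice"].edge_index = torch.stack([src, dst])

print(data)   # a heterogeneous GNN would consume this graph for cross-module predictions
```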