Why is AI Generated Rust slow when compared with Go/C#/Node/JavaScript

This article has been written by me with AI Assistance.

Why is Rust Sometimes Slower Than Expected?

Rust is widely known for its performance and safety, often being compared to C and C++. However, there are scenarios where Rust code may not perform as fast as anticipated. This article delves into the micro benchmark performance of various language implementations of the same abstraction and the surprising results they produce.

⚠️ Performance Results Not Reliable

This is a micro benchmark relevant to the abstractions I am building and is not suitable for general decision-making about performance.

Since I have not reviewed any of the implementations, AI may have made mistakes that invalidate these performance testing results and the observations drawn from them.

I believe further optimization prompts could improve the current AI-generated implementations, as could re-implementing the assembler logic by hand. Either would invalidate these testing results and the current observations.

Disclaimer

Note: I am only familiar with C#. I am not familiar with idioms or conventions in other languages.

The implementations used to generate the performance data are not handwritten; they were all fully generated using different models (GPT 4.1 and Claude Sonnet 4 and 4.5) with zero review of the code.

I have checked that the implementations produce correct output by using AI-generated tests to verify the results across all engines and languages, and by visually confirming the expected output, which is published on fly.io.

No optimizations have been done using AI to date.

Datasource

See the performance comparison report for detailed metrics.

The performance reports can also be generated from the deployed language assemblers:

Language	Assembler Demo in fly.io
Javascript	javascript Assembler Web Client Side Assembly
C#	csharp Assembler Web Server Side Assembly
Rust	rust Assembler Web Server Side Assembly
Go	go Assembler Web Server Side Assembly
Node.js	node Assembler Web Server Side Assembly
PHP	PHP Assembler Web Server Side Assembly

The output generated on my PC, which is used for the analysis in this article, is also available in All Performance Results. It will be regenerated periodically as the implementations improve and as newer rules are added.

Interesting Observations (Generated by Claude Sonnet 4.5)

The PreProcess Paradox: While preprocessing improves performance in most tests, C# and Node.js show regressions in some HtmlRule1 tests (C# group avg: 3.08ms → 3.68ms; Node HtmlRule1B: 5.59ms → 6.11ms, even though the Node group avg improves from 4.96ms to 4.42ms). This suggests that for simpler templates, the preprocessing overhead can exceed the merging benefits, challenging the “always preprocess” assumption.
The Sub-Millisecond Club: Node.js achieves an exclusive performance tier with five PreProcess tests under 1ms: four JSON rules (JsonRule1A 0.57ms, JsonRule1D 0.90ms, JsonRule2C 0.92ms, JsonRule2D 0.99ms) and HtmlRule3A (0.80ms). No other server-side language goes below 1ms in any test, and in these five tests Node.js is roughly 1.6x to 3x faster than the next-fastest server-side language.
Variance as a Performance Indicator: Go shows tight min/max ranges on the HTML rules (often within a 2-3ms spread), while C# shows extreme outliers (HtmlRule3 group max: 23.47ms vs avg: 6.56ms, driven by the HtmlRule3A → Html3A test). This variance suggests that Go’s performance is more predictable under varying system loads, whereas C# may experience JIT compilation or garbage collection pauses.
Client-Side JavaScript’s Hidden Strength: Despite running in a browser environment with additional overhead, client-side JavaScript wins 10 out of 24 Normal Engine tests, often beating server-side Node.js. This suggests the V8 JIT optimizations are highly effective, and the browser’s rendering isolation doesn’t significantly impact pure computation tasks.
PHP’s Consistent Penalty: PHP trails the fastest language by 2x or more in every test but one across both engines; the closest it gets is JsonRule2B in the Normal Engine (160.15ms vs Node’s 91.61ms, about 1.75x). It does beat C# on the Normal Engine JsonRule2 tests, but it never wins a test. Unlike other languages that show workload-specific advantages, PHP’s broad slowdown points to interpreter overhead that preprocessing does not remove.
The Rust Preprocessing Coefficient: Rust shows a dramatic preprocessing gain on the simpler HTML rules (about 3x for HtmlRule1: avg 9.00ms → 3.01ms), but also the widest spread in PreProcess mode for complex rules (JsonRule2 ranges from 1.55ms to 15.13ms, with JsonRule2B the slowest). This suggests Rust’s AI-generated parsing code has optimization opportunities that, once addressed, could make it consistently competitive with Go.

Interesting Observations (Written by me)

Recursion Has a Cost: Based on the results for the two Rule 1 scenarios, HtmlRule1A and HtmlRule1B, recursion appears to carry a cost, since the only difference between the two scenarios is the nesting of the direct assembly of components. Using iteration instead of recursion might improve performance.
Testing in Debug: Running the tests in debug mode is worse in Rust than in any of the other languages.
Backup Folder Size: I used to think node_modules folders were huge and slowed down backups unless they were ignored, but Rust target folders are even larger, and backing up without first running cargo clean will increase backup time.
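
The recursion-versus-iteration idea above can be sketched as follows. This is only an illustration under assumptions: the `Component` type and both render functions are hypothetical and are not taken from the assembler, whose real data structures I have not reviewed.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """Hypothetical component: opening text, nested children, closing text."""
    open_text: str
    close_text: str = ""
    children: list = field(default_factory=list)


def render_recursive(node: Component) -> str:
    # One function call per nesting level.
    inner = "".join(render_recursive(child) for child in node.children)
    return node.open_text + inner + node.close_text


def render_iterative(root: Component) -> str:
    # Explicit stack instead of the call stack. Strings on the stack are
    # pending output; Components still need to be expanded.
    out = []
    stack = [root]
    while stack:
        item = stack.pop()
        if isinstance(item, str):
            out.append(item)
            continue
        # Push in reverse so that items pop in document order.
        stack.append(item.close_text)
        stack.extend(reversed(item.children))
        stack.append(item.open_text)
    return "".join(out)
```

Both functions produce the same string for the same tree, so they can be swapped and benchmarked against each other. Whether the iterative form is actually faster depends on the language and on the real implementation.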

Interesting Observations (Generated by GPT4.1)

Consistency Across Languages in PreProcess Engine: The performance gap between languages narrows significantly when using the PreProcess Engine. This suggests that the initial parsing and data structure setup is the main bottleneck, and once that is optimized, most modern languages can achieve similar throughput for the merging step.
Go’s Predictable Performance: Go consistently delivers low and stable times across both Normal and PreProcess Engine scenarios, especially for HTML rules. This highlights Go’s efficiency in handling both parsing and merging workloads, making it a strong candidate for server-side template processing.
Node.js and JavaScript Lead in JSON Tasks: Node.js and client-side JavaScript are often the fastest for JSON-based rules, likely due to the highly optimized V8 engine and native JSON parsing. This makes them particularly suitable for workloads that are JSON-heavy.
Rust’s Performance Highly Dependent on Optimization: Rust’s performance is competitive in the PreProcess Engine but lags in the Normal Engine, especially without manual optimization. This underscores the importance of reviewing and tuning Rust code, as the default or AI-generated implementations may not leverage Rust’s full potential.
PHP Lags Behind: PHP is consistently the slowest across most scenarios, both in Normal and PreProcess Engine. This is expected due to its interpreted nature and less optimized data handling for these workloads.
Impact of Output Size: For rules with larger output sizes, the performance differences between languages can become more pronounced, especially in the Normal Engine. This suggests that memory management and string handling efficiency play a bigger role as output grows.

Interesting Observations (Generated by Gemini AI)

Pre-processing is a Game Changer: The most significant observation is the dramatic performance increase across all languages when using the “PreProcess Engine”. For example, in JsonRule2A, C#’s time drops from ~266ms to ~4ms, and Node.js drops from ~58ms to ~2ms. This indicates that for this workload, the initial parsing and setup of the templates is the most expensive part of the operation.
Rust is Not a Silver Bullet: In the “Normal Engine” tests, which involve file I/O and parsing, Rust is frequently outperformed. Go is often the winner in HTML-based rules, while Node.js and client-side JavaScript dominate the JSON-based rules. This challenges the assumption that Rust is automatically the fastest for all workloads. The performance of its initial, AI-generated code was surprisingly poor, requiring specific optimization just to beat PHP.
Go and Node.js Shine in “Cold Starts”: In the “Normal Engine” scenario, Go shows exceptional performance on HTML processing tasks. Node.js (and by extension, the V8 JavaScript engine) is the clear winner for JSON-related tasks, likely due to its highly optimized, built-in JSON parser.
Pre-processing as the Great Equalizer: While performance differences are stark in the “Normal Engine”, they shrink considerably in the “PreProcess Engine”. Once the initial parsing is done, most of the languages perform within a few milliseconds of each other for many tests. However, Node.js and client-side JavaScript often maintain a slight edge, with Node.js clocking in at under a millisecond for several pre-processed JSON rules.
The Cost of Abstraction: The results suggest that the specific abstractions used in this Assembler project have a significant performance overhead that is language-agnostic. The massive speedup from pre-processing implies that the way templates are parsed and prepared is far more critical than the language executing the final assembly. For this specific, AI-generated, non-hand-optimized workload, the choice of algorithm and I/O handling has a much larger impact on performance than the choice of language itself.
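
To put a number on the pre-processing gain, the speedup per language can be computed directly from the published times. The sketch below uses the JsonRule2A values from the per-test tables in this article; the variable names are my own.

```python
# JsonRule2A times in ms, copied from the Normal and PreProcess Engine tables.
normal = {"CSharp": 265.76, "Rust": 84.13, "Go": 128.60,
          "Node": 57.51, "PHP": 153.75, "Javascript": 54.10}
preprocess = {"CSharp": 4.12, "Rust": 9.25, "Go": 4.00,
              "Node": 1.82, "PHP": 9.44, "Javascript": 1.60}

# Speedup = Normal Engine time divided by PreProcess Engine time.
speedup = {lang: normal[lang] / preprocess[lang] for lang in normal}

for lang, ratio in sorted(speedup.items(), key=lambda item: item[1], reverse=True):
    print(f"{lang:<10} {ratio:5.1f}x")
```

For this one test, C# gains the most (about 64x) and Rust the least (about 9x), which is a reminder that the gain from pre-processing varies a lot by language even on the same rule.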

Consolidated Performance Results

The tables below show the time in milliseconds for 1000 iterations in each language to perform a series of template assembly tasks.

Normal Engine: The Normal Engine does both the parsing and the merging for each rule; during loading, only the raw template is loaded.

PreProcess Engine: The PreProcess Engine does only the merging for each rule; during loading, the template is parsed into a structure that the engine then uses to merge.

The results are split into two main scenarios:

Normal Engine: This measures the performance of loading the template, parsing it, and applying the rules of assembly from scratch for every operation.

PreProcess Engine: This measures performance when the template is loaded and parsed once, and the resulting structure is cached. Subsequent operations only need to apply the rules of assembly, which should be significantly faster.
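
The difference between the two engines can be sketched as below. This is a minimal illustration, not the actual assembler: the `{{Name}}` placeholder syntax, the `parse` and `merge` helpers, and both class names are assumptions made for the example.

```python
import re

# Hypothetical placeholder syntax {{Name}}; the real assembler's syntax may differ.
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def parse(template: str) -> list:
    """Split a template into ("text", literal) and ("slot", name) parts."""
    parts, pos = [], 0
    for match in PLACEHOLDER.finditer(template):
        if match.start() > pos:
            parts.append(("text", template[pos:match.start()]))
        parts.append(("slot", match.group(1)))
        pos = match.end()
    if pos < len(template):
        parts.append(("text", template[pos:]))
    return parts


def merge(parts: list, values: dict) -> str:
    """Fill each slot from values; unknown slots become empty strings."""
    return "".join(v if kind == "text" else values.get(v, "") for kind, v in parts)


class NormalEngine:
    """Keeps the raw template and re-parses it on every call."""

    def __init__(self, template: str):
        self.template = template

    def assemble(self, values: dict) -> str:
        return merge(parse(self.template), values)


class PreProcessEngine:
    """Parses once at load time and only merges on each call."""

    def __init__(self, template: str):
        self.parts = parse(template)

    def assemble(self, values: dict) -> str:
        return merge(self.parts, values)
```

Both engines return the same output for the same template and values; the PreProcess variant simply moves the `parse` cost out of the per-call path, which is the cost the benchmark tables isolate.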

The OutputSize column indicates the size of the generated content in bytes. The fastest time for each test is highlighted in bold.

Generated: 2025-10-21 08:41:01 UTC | Iterations: 1000, Warmup: 100 | All times in milliseconds (ms)
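
A timing loop matching the header above (100 warmup calls, then 1000 timed iterations) might look like the following. How the real harness measures time is not shown in this article, so the `benchmark` function and its use of `time.perf_counter` are assumptions.

```python
import time


def benchmark(fn, iterations: int = 1000, warmup: int = 100) -> float:
    """Return total wall-clock milliseconds for `iterations` calls to fn."""
    for _ in range(warmup):  # warmup calls are executed but not timed
        fn()
    start = time.perf_counter()
    for _ in range(iterations):
        fn()
    return (time.perf_counter() - start) * 1000.0
```

Warmup matters most for the JIT-compiled runtimes (C#, Node.js, browser JavaScript), where the first calls may run before the code has been optimized.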

Grouped View (Min/Avg/Max by Rule Groups)

HtmlRule1

Language	Normal Engine (Min/Avg/Max)	PreProcess Engine (Min/Avg/Max)
CSharp	2.24 / 3.08 / 3.92	2.39 / 3.68 / 4.97
Rust	5.92 / 9.00 / 12.07	1.92 / 3.01 / 4.10
Go	1.64 / 2.68 / 3.72	1.65 / 2.44 / 3.23
Node	4.33 / 4.96 / 5.59	2.72 / 4.42 / 6.11
PHP	12.91 / 17.85 / 22.79	7.70 / 10.47 / 13.25
Javascript	3.30 / 4.40 / 5.50	2.70 / 2.95 / 3.20

HtmlRule2

Language	Normal Engine (Min/Avg/Max)	PreProcess Engine (Min/Avg/Max)
CSharp	3.11 / 6.37 / 8.54	2.45 / 4.05 / 6.10
Rust	9.94 / 15.28 / 22.66	5.61 / 10.56 / 17.07
Go	2.55 / 4.81 / 8.60	1.62 / 2.61 / 3.85
Node	3.83 / 5.75 / 8.07	1.37 / 2.12 / 2.77
PHP	19.82 / 34.46 / 53.27	7.26 / 10.27 / 13.81
Javascript	3.60 / 6.23 / 9.20	1.30 / 2.43 / 3.70

HtmlRule3

Language	Normal Engine (Min/Avg/Max)	PreProcess Engine (Min/Avg/Max)
CSharp	2.49 / 6.56 / 23.47	3.07 / 5.11 / 7.82
Rust	5.57 / 7.52 / 9.22	2.17 / 2.96 / 3.76
Go	2.01 / 2.40 / 2.70	2.12 / 2.39 / 3.06
Node	1.28 / 2.03 / 3.52	0.80 / 2.39 / 3.05
PHP	13.79 / 17.05 / 21.34	7.60 / 11.75 / 17.67
Javascript	1.40 / 1.95 / 2.50	0.90 / 1.77 / 2.80

JsonRule1

Language	Normal Engine (Min/Avg/Max)	PreProcess Engine (Min/Avg/Max)
CSharp	13.81 / 34.47 / 64.41	2.98 / 4.42 / 6.97
Rust	10.53 / 16.01 / 23.08	1.54 / 2.06 / 3.40
Go	9.54 / 20.46 / 32.20	1.61 / 2.53 / 4.00
Node	3.99 / 9.28 / 17.48	0.57 / 1.51 / 3.56
PHP	20.36 / 50.75 / 113.19	7.24 / 9.64 / 15.45
Javascript	3.30 / 6.97 / 13.20	1.20 / 1.85 / 2.90

JsonRule2

Language	Normal Engine (Min/Avg/Max)	PreProcess Engine (Min/Avg/Max)
CSharp	114.97 / 233.16 / 398.90	2.80 / 4.50 / 8.23
Rust	38.21 / 80.90 / 140.74	1.55 / 6.89 / 15.13
Go	61.34 / 118.28 / 199.56	2.00 / 3.75 / 7.00
Node	25.94 / 54.27 / 91.61	0.92 / 1.74 / 3.23
PHP	91.04 / 127.02 / 160.15	7.96 / 10.07 / 14.52
Javascript	26.10 / 57.52 / 107.60	1.10 / 2.25 / 5.20

Rule1

Language	Normal Engine (Min/Avg/Max)	PreProcess Engine (Min/Avg/Max)
CSharp	147.89 / 182.35 / 216.81	4.70 / 5.48 / 6.27
Rust	39.80 / 47.47 / 55.13	4.34 / 5.89 / 7.44
Go	69.73 / 88.52 / 107.30	3.50 / 5.00 / 6.50
Node	29.60 / 36.02 / 42.45	1.90 / 3.57 / 5.24
PHP	170.26 / 206.63 / 243.01	14.09 / 18.02 / 21.96
Javascript	24.60 / 31.70 / 38.80	2.40 / 4.15 / 5.90
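
The grouped rows appear to be the minimum, average, and maximum of the per-test times within each rule group. A small sketch that reproduces three of the HtmlRule1 Normal Engine entries from the per-test values (the `summarize` helper is my own naming):

```python
from statistics import mean

# Normal Engine times (ms) for the HtmlRule1 group: [HtmlRule1A, HtmlRule1B].
html_rule1 = {
    "CSharp": [2.24, 3.92],
    "Go": [1.64, 3.72],
    "Node": [4.33, 5.59],
}


def summarize(times: list) -> str:
    """Format a group's times as 'min / avg / max' with two decimals."""
    return f"{min(times):.2f} / {mean(times):.2f} / {max(times):.2f}"


for language, times in html_rule1.items():
    print(language, summarize(times))
```

Because each group holds only two to six tests, a single slow test can move the group average considerably.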

Normal Engine

AppSite/AppView	CSharp	Rust	Go	Node	PHP	Javascript	OutputSize
HtmlRule1A	2.24	5.92	1.64	4.33	12.91	5.50	1264
HtmlRule1B	3.92	12.07	3.72	5.59	22.79	3.30	2123
HtmlRule2A	3.11	9.94	2.55	4.50	19.82	3.60	1910
HtmlRule2B	6.80	12.87	3.83	6.92	29.04	5.90	2365
HtmlRule2C	6.96	11.60	3.69	5.19	32.26	5.20	1920
HtmlRule2D	5.77	14.33	3.62	3.83	28.86	4.70	2083
HtmlRule2E	7.04	20.26	6.58	5.99	43.49	9.20	2840
HtmlRule2F	8.54	22.66	8.60	8.07	53.27	8.80	2874
HtmlRule3A	2.49	6.72	2.70	1.28	13.79	1.40	1428
HtmlRule3A → Html3A	23.47	6.68	2.69	1.80	14.53	1.60	1428
HtmlRule3A → Html3B	2.64	5.57	2.14	1.51	14.04	1.40	1428
HtmlRule3B	2.86	8.25	2.15	2.04	18.93	2.40	1406
HtmlRule3B → Html3A	3.79	8.70	2.70	2.05	19.64	2.40	1406
HtmlRule3B → Html3B	4.10	9.22	2.01	3.52	21.34	2.50	1406
JsonRule1A	29.70	10.53	23.57	5.92	20.36	4.20	1417
JsonRule1B	13.81	11.08	9.54	3.99	21.61	3.30	1924
JsonRule1C	29.97	23.08	16.53	9.73	47.85	7.20	3798
JsonRule1D	64.41	19.35	32.20	17.48	113.19	13.20	320
JsonRule2A	265.76	84.13	128.60	57.51	153.75	54.10	2355
JsonRule2B	398.90	140.74	199.56	91.61	160.15	107.60	2906
JsonRule2C	114.97	38.21	61.34	25.94	91.04	26.10	2799
JsonRule2D	152.99	60.51	83.60	42.01	103.13	42.30	3217
Rule1A	147.89	39.80	69.73	29.60	170.26	24.60	2543
Rule1B	216.81	55.13	107.30	42.45	243.01	38.80	4772
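
Claims of the form "language X wins N tests" can be checked by tallying the fastest language per row. The sketch below copies the Normal Engine times from the table above (the OutputSize column is dropped, and the arrow in test names is written as `->`); the function name is my own.

```python
from collections import Counter

LANGUAGES = ["CSharp", "Rust", "Go", "Node", "PHP", "Javascript"]

# Normal Engine times (ms), one test per line, columns in LANGUAGES order.
NORMAL_ENGINE = """
HtmlRule1A 2.24 5.92 1.64 4.33 12.91 5.50
HtmlRule1B 3.92 12.07 3.72 5.59 22.79 3.30
HtmlRule2A 3.11 9.94 2.55 4.50 19.82 3.60
HtmlRule2B 6.80 12.87 3.83 6.92 29.04 5.90
HtmlRule2C 6.96 11.60 3.69 5.19 32.26 5.20
HtmlRule2D 5.77 14.33 3.62 3.83 28.86 4.70
HtmlRule2E 7.04 20.26 6.58 5.99 43.49 9.20
HtmlRule2F 8.54 22.66 8.60 8.07 53.27 8.80
HtmlRule3A 2.49 6.72 2.70 1.28 13.79 1.40
HtmlRule3A->Html3A 23.47 6.68 2.69 1.80 14.53 1.60
HtmlRule3A->Html3B 2.64 5.57 2.14 1.51 14.04 1.40
HtmlRule3B 2.86 8.25 2.15 2.04 18.93 2.40
HtmlRule3B->Html3A 3.79 8.70 2.70 2.05 19.64 2.40
HtmlRule3B->Html3B 4.10 9.22 2.01 3.52 21.34 2.50
JsonRule1A 29.70 10.53 23.57 5.92 20.36 4.20
JsonRule1B 13.81 11.08 9.54 3.99 21.61 3.30
JsonRule1C 29.97 23.08 16.53 9.73 47.85 7.20
JsonRule1D 64.41 19.35 32.20 17.48 113.19 13.20
JsonRule2A 265.76 84.13 128.60 57.51 153.75 54.10
JsonRule2B 398.90 140.74 199.56 91.61 160.15 107.60
JsonRule2C 114.97 38.21 61.34 25.94 91.04 26.10
JsonRule2D 152.99 60.51 83.60 42.01 103.13 42.30
Rule1A 147.89 39.80 69.73 29.60 170.26 24.60
Rule1B 216.81 55.13 107.30 42.45 243.01 38.80
"""


def fastest_per_test(table: str) -> Counter:
    """Count how many tests each language wins (lowest time in the row)."""
    wins = Counter()
    for line in table.strip().splitlines():
        _name, *cells = line.split()
        times = [float(cell) for cell in cells]
        wins[LANGUAGES[times.index(min(times))]] += 1
    return wins


print(fastest_per_test(NORMAL_ENGINE).most_common())
```

On this data the tally is Javascript 10, Node 8, and Go 6; C#, Rust, and PHP do not win any Normal Engine test.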

PreProcess Engine

AppSite/AppView	CSharp	Rust	Go	Node	PHP	Javascript	OutputSize
HtmlRule1A	2.39	1.92	1.65	2.72	7.70	3.20	1264
HtmlRule1B	4.97	4.10	3.23	6.11	13.25	2.70	2123
HtmlRule2A	2.45	5.61	2.16	2.14	7.26	2.40	1910
HtmlRule2B	3.59	10.05	2.11	1.79	8.78	1.60	2365
HtmlRule2C	4.21	8.88	1.62	1.37	7.82	1.30	1920
HtmlRule2D	4.25	8.95	2.66	1.87	10.47	3.70	2083
HtmlRule2E	3.67	17.07	3.29	2.77	13.48	3.00	2840
HtmlRule2F	6.10	12.83	3.85	2.77	13.81	2.60	2874
HtmlRule3A	3.07	3.29	2.16	0.80	7.60	0.90	1428
HtmlRule3A → Html3A	3.39	2.39	2.19	2.60	11.04	1.60	1428
HtmlRule3A → Html3B	3.39	2.44	2.12	2.46	9.76	1.00	1428
HtmlRule3B	5.43	2.17	2.19	3.05	10.38	2.10	1406
HtmlRule3B → Html3A	7.82	3.70	2.61	2.52	14.06	2.80	1406
HtmlRule3B → Html3B	7.56	3.76	3.06	2.90	17.67	2.20	1406
JsonRule1A	2.98	1.77	2.00	0.57	7.35	1.50	1417
JsonRule1B	3.53	1.54	2.52	1.01	7.24	1.20	1924
JsonRule1C	6.97	3.40	4.00	3.56	15.45	2.90	3798
JsonRule1D	4.19	1.54	1.61	0.90	8.53	1.80	320
JsonRule2A	4.12	9.25	4.00	1.82	9.44	1.60	2355
JsonRule2B	8.23	15.13	7.00	3.23	14.52	5.20	2906
JsonRule2C	2.86	1.55	2.00	0.92	7.96	1.10	2799
JsonRule2D	2.80	1.62	2.00	0.99	8.35	1.10	3217
Rule1A	4.70	4.34	3.50	1.90	14.09	2.40	2543
Rule1B	6.27	7.44	6.50	5.24	21.96	5.90	4772
