Abstract
In untrusted execution environments such as web browsers, code from remote sources is regularly executed. To harden these environments against attacks, constituent programming languages and their implementations must uphold certain safety properties, such as memory safety. These properties must be maintained across the entire compilation stack, which may include intermediate languages that do not provide the same safety guarantees. Any case where properties are not preserved could lead to a serious security vulnerability.
In this work, we identify a specification vulnerability in the WebGPU Shading Language (WGSL) where code with data races can be compiled to intermediate representations in which an optimizing compiler could legitimately remove memory safety guardrails. To address this, we present SafeRace, a collection of threat assessments and specification proposals across the WGSL execution stack. While our threat assessment showed that this vulnerability does not appear to be exploitable on current systems, it creates a “ticking time bomb”, especially as compilers in this area are rapidly evolving. Given this, we introduce the SafeRace Memory Safety Guarantee (SMSG), two components that preserve memory safety in the WGSL execution stack even in the presence of data races. The first component specifies that program slices contributing to memory indexing must be race free and is implemented via a compiler pass for WGSL programs. The second component is a requirement on intermediate representations that limits the effects of data races so that they cannot impact race-free program slices. While the first component is not guaranteed to apply to all possible WGSL programs due to limitations on how some data types can be accessed, we show that existing language constructs are sufficient to implement this component with minimal performance overhead on many existing important WebGPU applications. We test the second component by performing a fuzzing campaign of 81 hours across 21 compilation stacks; our results show violations on only one (likely buggy) machine, thus providing evidence that lower-level GPU frameworks could relatively straightforwardly support this constraint. Finally, our assessments discovered GPU memory isolation vulnerabilities in Apple and AMD GPUs, as well as a security-critical miscompilation of WGSL in a pre-release version of Firefox.