Parallel Computing
Multiversion Concurrency Control for Multiversion B-Trees
🗄️Database Recovery Content type: Academic

When More Cores Hurts: The Vector Database Scaling Paradox in HPC
🗂️Vector Databases Content type: Academic

APEX4: Efficient Pure W4A4 LLM Inference via Intra-SM Compute Rebalancing
🎨LUT Compression Content type: Academic

zhongkaifu/TensorSharp: A C# inference engine for running large language models (LLMs) locally using GGUF model files. TensorSharp provides a console application, a web-based chatbot interface, and Ollama/OpenAI-compatible HTTP APIs for programmatic access. It supports Windows/MacOS/Linux with full GPU capability
💻Local LLMs Content type: Code

SET: Stream-Event-Triggered Scheduling for Efficient CUDA Graph Pipelines
💻Operating System, OS Content type: Academic

CodegenBench: Can LLMs Write Efficient Code Across Architectures?
💻Operating System, OS Content type: Academic