Data Locality
Less-relevant results
Making Locality-aware GEMM Compatible with Page-Granularity Placement on Chiplet GPUs
🤖AI Content type: Academic
coherentforge/CambiOS: Zero-trust, capability-based Rust microkernel targeting formal verification. Tri-arch (x86_64 / AArch64 / RISC-V). Sovereign and generative: no telemetry, user owns keys and data. Early-stage — see STATUS.md. Inspired by seL4, Hubris, and Redox.
🧠Memory Management Content type: Code
Neural Field Tokenizations with Hierarchy and Spatial Locality Priors
🔍Search Indexing Content type: Academic
bigattichouse/packed-twin-inference: PTI achieves ~2× throughput using a single quantized model (Q5_K_M or better) by running 4 generation streams in one batched decode call. The GPU loads model weights once per step and produces 4 predictions simultaneously. KV cache overhead is ~0.8 GiB total for all 4 streams. No draft model. No quality loss.
🤖AI Content type: Code