NLP

Natural Language Processing, Text Analysis, Language Models, Transformers

How LLMs Actually Work: A Friendly Map for Humans • oreoro

 🤖Transformers

Doubling Qwen3.6-27B on One RTX 3090: ollama llama.cpp + MTP, Lever by Lever (35.7 → 80.2 tok/s)

 🤖LLMs  Content type: Blog
dev.to · DEV

A retrieval conditioned rebinding circuit for dynamic entity tracking in large language models

 🤖LLMs  Content type: Academic
arxiv.org

Research Proposal: Decoupled RISC-LLM Architectures via Circadian Synaptic Consolidation

 🤖Machine Learning
aermia.com · Hacker News

Qwen 3.6 27B AutoRound GGUF, need your feedback

 🤖LLMs

Auditing Training Data in Domain-adapted LLMs: LoRA-MINT

 🤖LLMs  Content type: Academic
arxiv.org

Run Gemma-4 12B on WSL2 with llama.cpp

 🔧Developer Tools  Content type: Blog
dev.to · DEV

bigattichouse/packed-twin-inference: PTI achieves ~2× throughput using a single quantized model (Q5_K_M or better) by running 4 generation streams in one batched decode call. The GPU loads model weights once per step and produces 4 predictions simultaneously. KV cache overhead is ~0.8 GiB total for all 4 streams. No draft model. No quality loss

 🤖LLMs  Content type: Code
github.com · r/LocalLLaMA

LangChain Series #2: Models Explained — LLMs, Chat Models, and Embeddings with Practical…

 🤖LLMs
pub.towardsai.net

nex-agi/Nex-N2-mini • Huggingface

 🤖Transformers

AI-Driven Test Case Generation from Natural Language Requirements: A Survey of Techniques and Research Gaps

 Code Generation  Content type: Academic
arxiv.org

Vector Embeddings Explained: How AI Actually Understands Meaning

 🤖LLMs  Content type: Blog
dev.to · DEV

[PoC] server: support requantizing kv cache by wadealexc · Pull Request #24134 · ggml-org/llama.cpp

 🤖LLMs  Content type: Code
github.com · r/LocalLLaMA

The Model Was the Easy Part: A Practitioner’s Guide to AI Licenses

 🤖LLMs
pub.towardsai.net

Analyzing the Correlation Between Hallucinations and Knowledge Conflicts in Large Language Models

 🤖LLMs  Content type: Academic
arxiv.org

BUDDY: BUdget-Driven DYnamic Depth Routing for Adaptive Large Language Model Inference

 🤖LLMs  Content type: Academic
arxiv.org

Qwen3.6 + MTP: Calculated context size is smaller when I use `--spec-draft-type-* q4_0`. is this normal? · ggml-org llama.cpp · Discussion #24102

 🤖LLMs  Content type: Discussion  Content type: Code
github.com · r/LocalLLaMA

Can LLMs save themselves from verbosity?

 🤖LLMs  Content type: Blog
dev.to · DEV

TrustMargin: Training-Free Arbitration between Parametric Memory and Retrieved Evidence in Large Language Models

 🤖LLMs  Content type: Academic
arxiv.org

Unlocking the Power of RAG Systems with LangChain and Vector Databases

 🔍RAG  Content type: Blog
dev.to · DEV
