Scour
🧠 LLMs
large language models, GPT, prompt engineering, inference

Scoured 3178 posts in 5.4 ms

harshuljain13/llm-inference-at-scale: A Practitioner handbook for production llm serving.
🍎 Apple · Content type: Code · github.com · 4 days ago · Hacker News

Ollama 0.30 GPU Boost: Faster local Qwen inference on NVIDIA
🖥️ Retro Computing · everylocalai.com · 11 hours ago · DEV

The Shibboleth Effect: Auditing the Cross-Lingual Distributional Skew of Large Language Models
🤨 AI Criticism · Content type: Academic · arxiv.org · 1 day ago

What Ollama Reveals About Local AI, Agents, and Open Models
🤨 AI Criticism · Content type: Blog · odsc.medium.com · 9 hours ago

lightmetal: GPU LLM Inference From a Single Java 25 JAR
🍎 Apple · Content type: Blog · adambien.blog · 2 days ago

Using Scikit-LLM with Open-Source LLMs
🐍 Python · machinelearningmastery.com · 6 days ago

How Large Language Models Are Creating New Security Challenges
🤨 AI Criticism · Content type: Blog · medium.com · 1 hour ago

Inferoa AI harness claimed 90% cache savings. We ran it and measured 97.8%
⚙️ Systems Programming · zozo123.github.io · 21 hours ago · Hacker News

Why LLMs (still) lack taste
🤨 AI Criticism · beyondtheprior.com · 2 days ago · Hacker News

CommBench: Can LLMs Write Correct and Efficient GPU Communication Code?
⚙️ Systems Programming · uccl-project.github.io · 51 minutes ago · Hacker News

Running LLM Inference on Kubernetes: What It Actually Takes
🤨 AI Criticism · Content type: Blog · fairwinds.com · 5 days ago

Fixing a stuck Ollama runner and building a GPU watchdog
🦀 Rust · patrickmccanna.net · 2 days ago · Hacker News

Fine-tuning Large Language Models (LLMs) using PEFT
🤨 AI Criticism · Content type: Blog · medium.com · 6 hours ago

LLM Routing: From Strategy Selection to Production Architecture
🕸️ Networking · Content type: Blog · blog.n8n.io · 16 hours ago

RAG Pipeline Explained: From Query to Answer, Step by Step
🖥️ Retro Computing · Content type: Blog · medium.com · 2 days ago

How we fight GPU scarcity without compromise
🔐 Cybersecurity · Content type: Blog · equixly.com · 5 days ago · Hacker News

I've tested so many desktop AI tools, but Hermes with Ollama is my new favorite - here's why
🍎 Apple · Content type: News, Tutorial · zdnet.com · 17 hours ago

LangChain Explained: Understanding Models, Prompts, Chains, Memory, Indexes, and Agents
🤨 AI Criticism · Content type: Blog · towardsai.net · 2 days ago

How to Build a Deterministic RAG Testing Tool — and Use LLM as an Advisor, Not a Judge
🤨 AI Criticism · Content type: Blog · medium.com · 5 hours ago

LLMs Are Brilliant. But They Can Be Fooled.
🤨 AI Criticism · Content type: Blog · medium.com · 22 hours ago