gautam6599123's Feed

Rust port of transformers (1M lines of code)

High-performance, memory-safe Rust implementation of Hugging Face Transformers. TrustformeRS brings the power of transformer models to the Rust ecosystem with zero-cost abstractions, fearless concu... Read more ›

Discussed on Hacker News

📡Information Theory binghamton.edu·

Researchers used math to crack Wordle

Using information theory, a team of researchers at Binghamton University has developed a method to solve the popular New York Times puzzle game Wordle with a 99% success rate. Read more ›

Covered by Popular Science

Discussed on Hacker News

📈Data Visualization ggplot2-book.org·

ggplot2: Colour Scales and Legends

You are reading the work-in-progress third edition of the ggplot2 book. This chapter should be readable but is currently undergoing final polishing. Read more ›

Discussed on Hacker News

🌐Distributed Systems hackernoon.com·

How to Plan a Distributed Database Migration Without Any Surprises

Before migrating a distributed database to the cloud, do not start with node count or capacity sizing. Start with the failure model. Decide replication strategy first, then consistency level, then network topology, and only then capacity. A safe migration also needs clear acceptance criteria, a quantitative replication-lag cutover gate, and a tested rollback plan before the maintenance window begins. Capacity is the consequence of architecture, not the starting point. Read more ›
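The "quantitative replication-lag cutover gate" in the blurb above could be sketched roughly as follows. This is a minimal illustration, not the article's method: the `CutoverGate` type, the nearest-rank p99 rule, and the threshold values are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CutoverGate:
    # Hypothetical acceptance criteria; real values depend on the workload.
    max_p99_lag_s: float
    min_samples: int


def percentile_nearest_rank(samples, pct):
    """Nearest-rank percentile (pct is an integer from 1 to 100)."""
    ordered = sorted(samples)
    rank = (pct * len(ordered) + 99) // 100  # ceil(pct * n / 100) with integers
    return ordered[rank - 1]


def may_cut_over(lag_samples_s, gate, rollback_tested):
    """Return True only if every precondition for cutover holds."""
    if not rollback_tested:
        return False  # a tested rollback plan is required before the window
    if len(lag_samples_s) < gate.min_samples:
        return False  # not enough evidence to judge replication lag
    return percentile_nearest_rank(lag_samples_s, 99) <= gate.max_p99_lag_s


gate = CutoverGate(max_p99_lag_s=1.0, min_samples=100)
healthy = [0.1] * 99 + [5.0]       # a single outlier stays outside the p99
lagging = [0.1] * 98 + [5.0] * 2   # two outliers push the p99 over the limit
print(may_cut_over(healthy, gate, rollback_tested=True))
print(may_cut_over(lagging, gate, rollback_tested=True))
```

Making the gate a pure function of measured lag samples keeps the go/no-go decision reproducible, so it can be agreed on before the maintenance window begins.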

🤖AI arxiv.org·

Zero-order Parameter-free Optimization for LMO-based Methods: Novel Approach for Efficient Fine-tuning

Fine-tuning large language models (LLMs) has become a central application of modern optimization, enabling pretrained models to adapt to diverse downstream tasks and domain-specific data. A major obstacle in large-scale fine-tuning is the memory overhead of backpropagation, which requires storing activations, gradients, and optimizer states. Zeroth-order (ZO) optimization offers a memory-efficient alternative, but its performance is highly sensi... Read more ›

🧠Neural Networks shonczinner.github.io·

Neural Cellular Automata and Recurrent Architectures

Teaching cellular automata to actually do things Read more ›

Covers Unstoppable Upward Spiral

Discussed on Hacker News

🔢Mathematics jirka.org·

Basic Analysis: Introduction to Real Analysis

Free online mathematics textbook for basic real analysis. Read more ›

Discussed on Hacker News

⏱️Time Series Analysis forecastion.com·

A forecasting workbench for analysts and operators

Build powerful forecasts in seconds. Turn raw data into forecasts, scenarios, and shareable models: fast enough for ad hoc analysis, powerful enough for real work. Paste data, compare methods, and export forecasts. Read more ›

Discussed on Hacker News

🗣️Large Language Models fareedkhan-dev.github.io·

Train LLM from Scratch

From pretraining to RLHF/GRPO — every algorithm hand-written in pure PyTorch. Read more ›

Discussed on Hacker News

🎮Reinforcement Learning jacobxli.com·

Machine Studying

We increasingly need AI agents to work in domains they never saw during training, like using a new programming library or leveraging the emerging literature around a new disease. Such domains most naturally appear as a corpus of documents, like a textbook on a technical subject or the manual describing a new tool. Read more ›

Covers 8 stories including Personal AI Assistant

Discussed on Hacker News

📐Linear Algebra rocm.blogs.amd.com·

FP8 GEMM Optimization on AMD CDNA4 Architecture

Learn how to build high-performance FP8 GEMM kernels on AMD CDNA™4 GPUs using MFMA, LDS swizzling, and double-buffering. Read more ›

Discussed on Hacker News

🐍Python Python Releases·

Computer Programming for Everybody

The official home of the Python Programming Language Read more ›

Discussed on Hacker News

🔥PyTorch pyrefly.org·

Pyrefly v1.1 is here: 27% faster, refactoring tools, tensor shapes

Pyrefly v1.1 brings faster type checking, new IDE refactoring tools, and usability improvements. Read more ›

Discussed on Hacker News

🧠Deep Learning people.idsia.ch·

Munich 1991: The Roots of the Current AI Boom

1991: T and P of ChatGPT, distillation, deep residual learning, LSTM, GAN Read more ›

Covers 2 stories including DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Discussed on Hacker News

💻Computer Science Brain Inspired·

BI 240 Cristopher Moore: Cognition and Computational Complexity

Support the show to get full episodes, the full archive, and join the Discord community. The Transmitter is an online publication that aims to deliver useful information, insights and tools to build bridges across neuroscience and advance research. Visit thetransmitter.org to explore the latest neuroscience news and perspectives, written by journalists and scientists. Sign up to be notified every time a new Brain Inspired episode is released. To explore more neuroscience news and perspective... Read more ›

Discussed on Hacker News

🤖AI heyneo.com·

Extend Claude limits by offloading AI tasks to Neo

Install neo-mcp, register NEO with Claude Code, and delegate RAG audits, fine-tunes, evals, and pipeline debugging without leaving the terminal. Read more ›

Covered by DEV Community

Discussed on Hacker News

📊Optimization arxiv.org·

Functional Gradient Descent with Adaptive Representations

Functional optimization problems are typically solved by optimizing the parameters of a fixed representation, such as a neural network, resulting in highly nonconvex losses that complicate both training and theoretical analysis. An interesting alternative is functional gradient descent (FGD), that is, gradient descent directly in function space, which benefits from strong convergence results and admits a clean theory. However, FGD is difficult... Read more ›

🎮Reinforcement Learning research.nvidia.com·

ENPIRE: Agentic Robot Policy Self-Improvement in the Real World

Anonymous ENPIRE project website for agentic robot policy self-improvement in the real world. Read more ›

Covered by 6 sources including Tom's Hardware, Ars Technica

Discussed on Hacker News

💬Natural Language Processing venturebeat.com·

Z.ai’s open-weights GLM-5.2 beats GPT-5.5 on multiple long-horizon coding benchmarks for 1/6th the cost

Today, Chinese AI startup Z.ai (formerly Zhipu AI) released GLM-5.2, a 753-billion-parameter open-weights large language model (LLM) engineered specifically to dominate "long-horizon" autonomous coding and engineering tasks. Available immediately in more than 20 third-party coding environments, among other platforms, the model boasts a highly stable 1-million-token context window alongside enterprise subscription tiers starting at just $12.60 per month. In excellent news for cost- and security-conscious businesses, Z.ai ... Read more ›

Covers 8 stories including GLM-5.2 (6 minute read)

Covered by 4 sources including vettedconsumer.com, AI Changes Everything

🕸️Graph Theory arxiv.org·

Designing Efficient and Reachable Routes: The $k$-Step-Central Shortest Path Problem

Designing rapid transportation routes requires balancing efficiency and reachability. Shortest-path models ensure direct, cost-efficient routes but ignore coverage, while centrality-based approaches maximize accessibility but do not enforce operational constraints. We study the problem of selecting a shortest path that maximizes reachability, measured as the number of nodes within a fixed distance of the path. To do this, we introduce the $k$-St... Read more ›
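One way to read the problem statement above is as a brute-force baseline: enumerate every shortest path between two endpoints and keep the one with the most nodes within k hops. The sketch below assumes an unweighted, undirected graph given as an adjacency dict and fixed endpoints `s` and `t`. These are illustrative assumptions, the function names are hypothetical, and this is not the paper's algorithm (the abstract is truncated before it describes one).

```python
from collections import deque


def bfs_dist(adj, sources, limit=None):
    """Hop distances from a set of sources, optionally capped at `limit`."""
    dist = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        u = queue.popleft()
        if limit is not None and dist[u] == limit:
            continue  # do not expand beyond the allowed radius
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def shortest_paths(adj, s, t):
    """Yield every shortest s-t path by walking back from t along BFS layers."""
    dist = bfs_dist(adj, [s])
    if t not in dist:
        return
    stack = [[t]]
    while stack:
        path = stack.pop()
        u = path[-1]
        if u == s:
            yield path[::-1]
            continue
        for v in adj[u]:
            if dist.get(v) == dist[u] - 1:
                stack.append(path + [v])


def best_k_central_path(adj, s, t, k):
    """Return (covered_count, path) for the shortest path covering most nodes."""
    best = None
    for path in shortest_paths(adj, s, t):
        covered = len(bfs_dist(adj, path, limit=k))  # nodes within k hops of path
        if best is None or covered > best[0]:
            best = (covered, path)
    return best


# Two shortest routes from 0 to 3; the one through node 2 reaches more nodes.
adj = {0: [1, 2], 1: [0, 3], 2: [0, 3, 4, 5], 3: [1, 2], 4: [2], 5: [2]}
print(best_k_central_path(adj, 0, 3, k=1))  # (6, [0, 2, 3])
```

Enumerating all shortest paths can take exponential time on dense graphs, so this only serves to make the objective concrete on small inputs.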