MegaTrain: Full Precision Training of 100B+ Parameter LLMs on a Single GPU

Corgi v3: Binary Indexes and What a Tiny LLM Learned About VINs

slopc: a proc macro that replaces todo!() with LLM-generated code at compile time. I am not sorry.

MLX for Java? Running LLMs on Apple Silicon GPUs (Metal) directly from Java in GPULlama3.java

LLMs do not like lifetimes and the borrow checker

The AI kill switch just got harder to find: LLM-powered chatbots will defy orders and deceive users if asked to delete another model, study finds

Built a token-aware L7 load balancer for LLM clusters in Go that routes by in-flight tokens, not connections (-12% P99 latency)

llmlite: the first unified LLM (Large Language Model) provider library for the Zig programming language

Show HN: Agent-cache – Multi-tier LLM/tool/session caching for Valkey and Redis

Where does your LLM API bill actually go? I profiled mine and the results were embarrassing

I built yet another LLM multi-provider framework

Any Python library for LLM conversation storage + summarization (not memory/agent systems)?

LLM credit management and payment handling

Implemented a Rust-based CLI tool that supports Ollama's local LLMs.

Why LLM-Generated Passwords Are Dangerously Insecure

Dumbo-RS: A fast CLI to SMARTLY feed your entire codebase to LLMs

Trooper – A Go proxy that falls back to local Ollama when any LLM quota runs out

Hand-written vs TVM-autotuned WGSL for in-browser LLM inference: a 10-vs-85 comparison on Phi-3

Rust codebase to LLM synthetic data

Looking for contributors for ziggy-llm, a new GGUF Apple Metal inference engine written in Zig

[Open Source] Built a Go LLM gateway with ~5µs overhead (≈250x lower than LiteLLM)

I built an open-source Python package that scans LLM inputs and outputs for injections — pydefend

A semantic diff that fills the missing layer of structural understanding of Go for LLMs

Show /r/golang: Rein, a ~3k-line reverse proxy for LLM API traffic (no CGO, 1 direct dep)

Agentic Code Optimization via Compiler-LLM Cooperation

pneuma: Intent to LLM to Rust to WASM sandbox

snip: an rtk alternative in Go that reduces LLM token usage by 60-90% with declarative YAML filters

We built a dual-backend LLM inference engine (Metal + Vulkan) in Zig. Here is what we learned

I built an open-source graph memory SDK for AI agents: 1 LLM call to store, 0 to recall

I built a CLI tool in Rust that saves 58% tokens for LLM coding agents by filtering terminal noise