Apertus 70B: Truly Open - Swiss LLM by ETH, EPFL and CSCS

Wan2.2-S2V-14B – audio-driven cinematic video generation model

grok-2 on Hugging Face

Qwen Image

LFM2 WebGPU

Qwen3-4B-Thinking-2507

Introducing Pivotal Token Search (PTS): Targeting Critical Decision Points in LLM Training

Unsupervised Model Improvement via Internal Coherence Maximization: Outperforming Human-Supervised Methods Through Self-Elicitation

Implemented the research paper “Memorizing Transformers” from scratch with my own architectural modifications and a customized training pipeline

Beyond Python: AI Agents in JavaScript with KaibanJS

LLM Embeddings Explained: A Visual and Intuitive Guide

Qwen3-Coder-30B-A3B-Instruct

Qwen3 235B beats Claude on some code benchmarks

Qwen3 30B-A3B

Voxtral-Mini-3B-2507 – Open source speech understanding model

Reachy Mini – The Open-Source Robot for Today's and Tomorrow's AI Builders

Qwen3-235B-A22B-Thinking-2507

Qwen3-235B-A22B-Instruct-2507

DeepSeek-TNG-R1T2-Chimera

SmolLM3: Smol, multilingual, long-context reasoner LLM

Kyutai 1.6B Streaming TTS

Open Source 1.7 TB Dataset of What AI Crawlers Are Doing

DiffuCoder-7B-CpGRPO: A code generation LLM developed by Apple

Evolutionary Algorithm Automatically Discovers GPU Optimizations Beating Expert Code

Jan-nano-128k: A 4B Model with a Super-Long Context Window (Still Outperforms 671B [in MCP])

Nanonets-OCR-s – OCR model that transforms documents into structured markdown

Show HN: ChatToSTL – AI text-to-CAD for 3D printing

Qwen3 embedding models

Show HN: Penny-1.7B Irish Penny Journal style transfer

DeepSeek R1-0528
