DeepSeek-TNG-R1T2-Chimera

Kyutai 1.6B Streaming TTS

Open Source 1.7TB Dataset of What AI Crawlers Are Doing

DiffuCoder-7B-CpGRPO: A code generation LLM developed by Apple

Evolutionary Algorithm Automatically Discovers GPU Optimizations Beating Expert Code

Jan-nano-128k: A 4B Model with a Super-Long Context Window (Still Outperforms 671B [in MCP])

Nanonets-OCR-s – OCR model that transforms documents into structured markdown

Show HN: ChatToSTL – AI text-to-CAD for 3D printing

Qwen3 embedding models

Show HN: Penny-1.7B Irish Penny Journal style transfer

Deepseek R1-0528

Qwen3 0.6B now on HuggingFace (quantized)

Hugging Face to sell open-source robots thanks to Pollen Robotics acquisition

FUTO open-sources 1M row keyboard swipe dataset

Understanding MCP Evals: Why Evals Matter for MCP

Qwen2.5-Omni Technical Report

Co-Doodle with Gemini

Open-sourcing 5,000hrs of self-driving dataset

Hugging Face datasets and models for cybersecurity/software vulnerabilities

Vector Search with DuckDB

The Ultra-Scale Playbook: Training LLMs on GPU Clusters

Autonomous AI Agents Should Not Be Developed

Kokoro WebGPU: Real-time text-to-speech 100% locally in the browser

Open-R1: an open reproduction of DeepSeek-R1

Janus-Pro: Autoregressive framework unifying multimodal understanding & generation

Finally, a Replacement for BERT: Introducing ModernBERT

DeepSeek R1

DeepSeek-R1-Distill-Qwen-1.5B Surpasses GPT-4o in certain benchmarks

Train faster static embedding models with sentence transformers

smolagents: A simple library to build AI agents
