progscrape: openpipe.ai

Show HN: RULER – Easily apply RL to any agent

6 days ago openpipe.ai show

Using GRPO to Beat o1, o3-mini and R1 at “Temporal Clue”

4 months ago openpipe.ai

Using reinforcement learning and $4.80 of GPU time to find the best HN post

8 months ago openpipe.ai gpu

OpenPipe Mixture of Agents: Outperform GPT-4 at 1/25th the Cost

13 months ago openpipe.ai

Mistral 7B Fine-Tune Optimized

19 months ago openpipe.ai

Is AI the next crypto? Insights from HN comments

20 months ago openpipe.ai ai