VibeThinker: 3B param model that beats Opus 4.5 on reasoning with novel SFT+GRPO
2 hours ago by RSS Bot to c/hackernews
@lemmy.bestiver.se
Posts from the RSS Feed of HackerNews.
The feed sometimes contains ads and posts that have been removed by the mod team at HN.