Was this trained on top of Llama 3.1 70B or Llama 3.3 70B?
#4
by ddh0 - opened
It was trained on top of the Llama 3.1 base model. (We used Llama 3.3 in the blog post only as a comparison baseline.)
drishanarora changed discussion status to closed
That's pretty impressive.
