Was this trained on top of Llama 3.1 70B or Llama 3.3 70B?
#4
by ddh0 - opened
It was trained on top of the Llama 3.1 base model. (We used Llama 3.3 in the blog post only as a comparison baseline.)
drishanarora changed discussion status to closed
That's pretty impressive.
