add everything but lm eval harness
c3b20da verified
uv run hellaswag.py logs DEBUG