Add continuous evaluation + regression tests
#29 opened by jbakerx
For every new adapter version:
- run the same 50–100 prompt suite
- track perplexity, repetition, toxicity and anachronism rates, and a human preference sample
This prevents silent quality regressions.
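A minimal sketch of how the automated part of this check might look, assuming all tracked metrics are "lower is better" and that a new adapter version is compared against the previous version's stored numbers. The function names, the word-level 3-gram definition of repetition, and the 5% relative tolerance are illustrative assumptions, not something this project defines. Toxicity/anachronism scoring and the human preference sample are not covered here.

```python
import math


def perplexity(token_logprobs):
    """Perplexity from natural-log token probabilities: exp(-mean log p)."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def repetition_rate(text, n=3):
    """Fraction of word n-grams that duplicate another n-gram in the text."""
    words = text.split()
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return 0.0
    return 1.0 - len(set(ngrams)) / len(ngrams)


def find_regressions(baseline, candidate, tolerance=0.05):
    """Return metrics where the candidate exceeds the baseline by more than
    the relative tolerance. Assumes lower is better for every metric.
    A baseline of 0 flags any increase at all."""
    regressions = {}
    for name, base in baseline.items():
        new = candidate[name]
        if new > base * (1 + tolerance):
            regressions[name] = (base, new)
    return regressions


# Hypothetical numbers for two adapter versions on the same prompt suite.
baseline = {"perplexity": 10.0, "repetition": 0.10}
candidate = {"perplexity": 10.2, "repetition": 0.20}
print(find_regressions(baseline, candidate))
# → {'repetition': (0.1, 0.2)}
```

In this example the perplexity change stays within tolerance while the repetition rate doubles, so only repetition is flagged. A CI job could fail the build whenever the returned dictionary is non-empty.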
We will consider this enhancement for inclusion in version 2.0.0.