Model Card

Mistral 7b Thespis CurtainCall v0.2.2 ultrafeedback DPO QLoRA v0.1

Praxis Maldevide

A nice artifact from a test I ran.

This is a rank 64 QLoRA adapter for Mistral 7b, trained using Thespis CurtainCall v0.2.2 with ultrafeedback DPO.
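For a rough sense of what rank 64 means in adapter size, here is a minimal sketch that counts LoRA parameters for Mistral 7b. The layer shapes are the published Mistral 7b dimensions. The list of adapted modules is an assumption (all seven attention and MLP projections), since this card does not say which modules were targeted, and `lora_param_count` is a hypothetical helper written for illustration.

```python
# Illustrative only: shapes are Mistral 7b's published dimensions.
# ASSUMPTION: which modules the adapter targets is not stated in this card.
HIDDEN = 4096      # model hidden size
KV_DIM = 1024      # 8 key/value heads * 128 head dim (grouped-query attention)
MLP_DIM = 14336    # feed-forward intermediate size
NUM_LAYERS = 32

# (in_features, out_features) for each linear projection in one decoder layer
MODULE_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, MLP_DIM),
    "up_proj": (HIDDEN, MLP_DIM),
    "down_proj": (MLP_DIM, HIDDEN),
}


def lora_param_count(rank, modules=tuple(MODULE_SHAPES), num_layers=NUM_LAYERS):
    """Count trainable LoRA parameters.

    Each adapted linear layer gets two matrices, A (rank x in) and
    B (out x rank), so it adds rank * (in + out) parameters.
    """
    per_layer = sum(
        rank * (MODULE_SHAPES[name][0] + MODULE_SHAPES[name][1])
        for name in modules
    )
    return per_layer * num_layers


if __name__ == "__main__":
    all_modules = lora_param_count(64)
    attention_only = lora_param_count(64, ("q_proj", "k_proj", "v_proj", "o_proj"))
    print(f"all seven projections: {all_modules:,}")
    print(f"attention only:        {attention_only:,}")
```

Under the all-modules assumption this comes to 167,772,160 trainable parameters. If only the attention projections were adapted, the count would be 54,525,952.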

Trained with unsloth for 3 epochs on around 400 samples. The dataset is pretty small, and the learning rate dropped in the expected way over the run. The result is a bit tamer than the base model, but still well spoken and descriptive.
