hamishivi/tess2-v0.3

hamishivi committed on Feb 17

Commit 1541377 · verified · 1 Parent(s): 33949b4

Create README.md

Files changed (1)

README.md +29 -0

README.md ADDED

	@@ -0,0 +1,29 @@

+---
+license: apache-2.0
+datasets:
+- allenai/tulu-3-sft-mixture
+language:
+- en
+base_model:
+- hamishivi/tess2-v0.3-base
+---
+# TESS 2 v0.3 - A Generalist Instruction Tuned Diffusion LM
+This is the instruction-tuned TESS 2, a simplex-based diffusion model adapted from Mistral v0.1 7B and further trained on Dolma 1.7 and Tulu 3 SFT data.
+For more details, please check out our paper [TESS-2: A Large-Scale, Generalist Diffusion Language Model](https://todo).
+This is the model based on Mistral v0.3 and Tulu 3.
+This model only works with our custom codebase, found [here](https://github.com/hamishivi/tess-2); see that repository for details on running training and inference.
+## Using this model
+To run this model, first clone https://github.com/hamishivi/tess-2.
+Then, after creating a Python environment with the required packages, you can run inference via a UI with:
+```sh
+./shell_scripts/run_interactive_demo.sh hamishivi/tess2-v0.3
+```
+This allows you to directly interact with the model, and shows the diffusion generation process.
+For training or other evaluations, please see our main repository.