hamishivi committed on
Commit
1541377
·
verified ·
1 Parent(s): 33949b4

Create README.md

Files changed (1)
  1. README.md +29 -0
README.md ADDED
@@ -0,0 +1,29 @@
+ ---
+ license: apache-2.0
+ datasets:
+ - allenai/tulu-3-sft-mixture
+ language:
+ - en
+ base_model:
+ - hamishivi/tess2-v0.3-base
+ ---
+ # TESS 2 v0.3 - A Generalist Instruction Tuned Diffusion LM
+
+ This model is the instruction-tuned TESS 2, a simplex-based diffusion model adapted from Mistral 7B and further trained on Dolma 1.7 and Tulu 3 SFT data.
+ For more details, please see our paper [TESS-2: A Large-Scale, Generalist Diffusion Language Model](https://todo).
+ This is the variant based on Mistral v0.3 and Tulu 3.
+
+ This model only works with our custom codebase, found [here](https://github.com/hamishivi/tess-2). Please see that repository for details on how to run training and inference.
+
+
+ ## Using this model
+
+ To run this model, first clone https://github.com/hamishivi/tess-2.
+
+ Then, after creating a Python environment with the required packages, you can run inference through a UI with:
+ ```sh
+ ./shell_scripts/run_interactive_demo.sh hamishivi/tess2
+ ```
+
+ This lets you interact with the model directly and shows the diffusion generation process as it runs.
+ For training or other evaluations, please see our main repository.
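
The setup steps in the README (clone, create an environment, launch the demo) could be strung together roughly as follows. This is a minimal sketch, not the authors' documented procedure. The names `CLONE_DIR`, `VENV_DIR`, `RUN_DEMO`, and `setup_and_run` are hypothetical, and using `venv` is an assumption. The dependency install step is left as a comment because the README does not say how the packages are installed.

```sh
#!/bin/sh
# Sketch only: the variable and function names below are illustrative assumptions.
REPO_URL="https://github.com/hamishivi/tess-2"   # repository named in the README
MODEL="hamishivi/tess2"                          # model id used in the README's command
CLONE_DIR="tess-2"                               # hypothetical local directory
VENV_DIR=".venv"                                 # hypothetical environment location

setup_and_run() {
    # Fetch the custom codebase the model depends on.
    git clone "$REPO_URL" "$CLONE_DIR" || return 1
    cd "$CLONE_DIR" || return 1

    # Create and activate an isolated Python environment (venv is an assumption).
    python3 -m venv "$VENV_DIR" || return 1
    . "$VENV_DIR/bin/activate" || return 1

    # Install the project's dependencies here, following the repository's
    # own instructions; the exact command is not given in the README.

    # Launch the interactive demo, as shown in the README.
    ./shell_scripts/run_interactive_demo.sh "$MODEL"
}

# Nothing runs unless explicitly requested, e.g. RUN_DEMO=1 sh setup.sh
if [ "${RUN_DEMO:-0}" = "1" ]; then
    setup_and_run
fi
```

Guarding the call behind `RUN_DEMO` keeps the script safe to source or inspect before it downloads anything.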