Commit 70486d2 (verified, 0 parents) by pszemraj with SFconvertbot

Super-squash branch 'main' using huggingface_hub

Co-authored-by: SFconvertbot <SFconvertbot@users.noreply.huggingface.co>

.gitattributes ADDED
@@ -0,0 +1,34 @@
+ *.7z filter=lfs diff=lfs merge=lfs -text
+ *.arrow filter=lfs diff=lfs merge=lfs -text
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
+ *.ftz filter=lfs diff=lfs merge=lfs -text
+ *.gz filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.joblib filter=lfs diff=lfs merge=lfs -text
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
+ *.model filter=lfs diff=lfs merge=lfs -text
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
+ *.npy filter=lfs diff=lfs merge=lfs -text
+ *.npz filter=lfs diff=lfs merge=lfs -text
+ *.onnx filter=lfs diff=lfs merge=lfs -text
+ *.ot filter=lfs diff=lfs merge=lfs -text
+ *.parquet filter=lfs diff=lfs merge=lfs -text
+ *.pb filter=lfs diff=lfs merge=lfs -text
+ *.pickle filter=lfs diff=lfs merge=lfs -text
+ *.pkl filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *.rar filter=lfs diff=lfs merge=lfs -text
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
+ *.tflite filter=lfs diff=lfs merge=lfs -text
+ *.tgz filter=lfs diff=lfs merge=lfs -text
+ *.wasm filter=lfs diff=lfs merge=lfs -text
+ *.xz filter=lfs diff=lfs merge=lfs -text
+ *.zip filter=lfs diff=lfs merge=lfs -text
+ *.zst filter=lfs diff=lfs merge=lfs -text
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
.gitignore ADDED
@@ -0,0 +1 @@
+ checkpoint-*/
README.md ADDED
@@ -0,0 +1,106 @@
+ ---
+ license: apache-2.0
+ tags:
+ - generated_from_trainer
+ - alpaca
+ - self-instruct
+ - instruction generation
+ - instructiongen
+ - longform
+ - prompt-generation
+ metrics:
+ - rouge
+ datasets:
+ - akoksal/LongForm
+ - pszemraj/fleece2instructions
+ widget:
+ - text: >-
+ You'll need to start by choosing the right venue. Consider the type of
+ atmosphere and the size of the area that will be suitable for the number of
+ guests you plan to invite. Choose the right decorations based on your
+ brother's interests, such as balloons in his favorite colors, banners, and
+ streamers. Next, decide on the food and drinks, making sure they are tasty
+ and appropriate for the occasion. Then decide on the other games, music, and
+ entertainment that will make the party memorable. Finally, involve your
+ brother's friends and family to help create the perfect surprise.
+ example_title: birthday party
+ - text: 1) cookies and cream 2) chocolate chip 3) mint chip 4) oreo
+ example_title: ice cream
+ - text: >-
+ Start by selecting a scale model of a building that fits the theme. Use a
+ hobby knife and glue to cut and assemble the model into a ruined or
+ abandoned version of itself, adding details like broken windows and
+ graffiti. Create a base for the diorama using foam, plaster, or other
+ materials, and paint it to resemble a ruined street or sidewalk. Add
+ miniature vehicles, debris, and figures to complete the scene, and use
+ weathering techniques like dry brushing and rust washes to add realism.
+ Display the diorama in a shadow box or other protective case to showcase
+ your work.
+ example_title: Miniature diorama creation
+ - text: >-
+ Start by selecting clothing that is futuristic and edgy, such as leather
+ jackets, neon-colored accessories, and tech-inspired patterns. Add
+ accessories like goggles, cybernetic implants, and LED lights to enhance the
+ cyberpunk vibe. Use makeup and body paint to create a futuristic look, such
+ as metallic skin or neon makeup. Consider adding functional elements to your
+ costume, such as a built-in backpack or hidden pockets for your tech
+ gadgets. Finally, practice your confident walk and embrace your inner
+ cyberpunk for a memorable and immersive costume experience.
+ example_title: Cyberpunk costume design
+ - text: >-
+ Start by creating a base terrain with mountains, valleys, and other natural
+ features. Use fractal noise and displacement mapping to add texture and
+ detail to the terrain, and experiment with different materials like rock,
+ grass, and water. Add surreal elements like floating islands, giant
+ mushrooms, or impossible geometry to create a dreamlike atmosphere. Use
+ lighting and color grading to enhance the mood and tone of the scene, and
+ render the final image at a high resolution for maximum impact. Share your
+ surreal landscape with the world and inspire others to explore the
+ possibilities of 3D art.
+ example_title: Surreal 3D landscape creation
+ - text: >-
+ Start by setting a realistic goal and creating a training plan. Build up
+ your mileage gradually over time, and incorporate cross-training and
+ strength exercises to prevent injury and improve endurance. Be sure to stay
+ hydrated and properly fuel your body with nutritious foods. Listen to your
+ body and adjust your training as needed to avoid overexertion or burnout.
+ Finally, taper your training in the weeks leading up to the race to give
+ your body time to rest and recover before the big day.
+ example_title: Marathon training
+ inference:
+ parameters:
+ max_length: 96
+ num_beams: 4
+ ---
+
+
+ # bart-base-instructiongen + LongForm
+
+ Instead of generating questions from text, generate instructions for LLMs!
+
+ - Check out a [basic demo on Spaces](https://huggingface.co/spaces/pszemraj/generate-instructions)
+ - An example of how to use instructiongen models in a CLI script can be found [here](https://gist.github.com/pszemraj/8b0213e700763106074d3ac15d041c14)
+ - You can find other models fine-tuned for instruction generation by [searching for the instructiongen tag](https://huggingface.co/models?other=instructiongen).
+
+ ## about
+
+ This model is a fine-tuned version of [pszemraj/bart-base-instructiongen](https://huggingface.co/pszemraj/bart-base-instructiongen) on the `akoksal/LongForm` dataset.
+
+ It was trained on a dataset of **only** instructions+outputs, with any `inputs` filtered out. This means that input text like *1) cookies and cream 2) chocolate chip 3) mint chip 4) oreo* will **not** produce *"Rank the following ice cream flavors: oreo, mint chip, chocolate chip, cookies and cream"*.
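The checkpoint can be driven with the standard `transformers` seq2seq API. A minimal sketch follows; the model id shown is the base model from the card and is an assumption — substitute this repository's own id. The generation settings mirror the card's inference parameters (`max_length: 96`, `num_beams: 4`).

```python
def generate_instruction(text: str, model_id: str = "pszemraj/bart-base-instructiongen") -> str:
    """Generate a plausible instruction for a piece of output text.

    model_id defaults to the base model named in this card; replace it
    with this repository's id to use the LongForm fine-tune.
    """
    # Imports are local so the sketch can be defined without transformers
    # installed; `pip install transformers torch` to actually run it.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=1024)
    # Same settings as the card's `inference.parameters` block
    output_ids = model.generate(**inputs, max_length=96, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

# Usage (downloads the checkpoint on first call):
# print(generate_instruction("1) cookies and cream 2) chocolate chip 3) mint chip 4) oreo"))
```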
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 8e-05
+ - train_batch_size: 4
+ - eval_batch_size: 4
+ - seed: 42
+ - distributed_type: multi-GPU
+ - gradient_accumulation_steps: 16
+ - total_train_batch_size: 64
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: cosine
+ - lr_scheduler_warmup_ratio: 0.02
+ - num_epochs: 3.0
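The reported `total_train_batch_size` follows from the per-device batch size and gradient accumulation. Note that 4 × 16 = 64 implies a world size of 1 device in this run (despite the `multi-GPU` distributed-type label), an inference from the numbers rather than a stated fact:

```python
# Effective batch size = per-device batch * accumulation steps * world size.
train_batch_size = 4
gradient_accumulation_steps = 16
world_size = 1  # implied: 4 * 16 * 1 matches the reported total of 64

total_train_batch_size = train_batch_size * gradient_accumulation_steps * world_size
print(total_train_batch_size)  # 64
```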
config.json ADDED
@@ -0,0 +1,75 @@
+ {
+ "_name_or_path": "pszemraj/bart-base-instructiongen",
+ "activation_dropout": 0.1,
+ "activation_function": "gelu",
+ "add_bias_logits": false,
+ "add_final_layer_norm": false,
+ "architectures": [
+ "BartForConditionalGeneration"
+ ],
+ "attention_dropout": 0.1,
+ "bos_token_id": 0,
+ "classif_dropout": 0.1,
+ "classifier_dropout": 0.0,
+ "d_model": 768,
+ "decoder_attention_heads": 12,
+ "decoder_ffn_dim": 3072,
+ "decoder_layerdrop": 0.0,
+ "decoder_layers": 6,
+ "decoder_start_token_id": 2,
+ "dropout": 0.1,
+ "early_stopping": true,
+ "encoder_attention_heads": 12,
+ "encoder_ffn_dim": 3072,
+ "encoder_layerdrop": 0.0,
+ "encoder_layers": 6,
+ "eos_token_id": 2,
+ "forced_bos_token_id": 0,
+ "forced_eos_token_id": 2,
+ "gradient_checkpointing": false,
+ "id2label": {
+ "0": "LABEL_0",
+ "1": "LABEL_1",
+ "2": "LABEL_2"
+ },
+ "init_std": 0.02,
+ "is_encoder_decoder": true,
+ "label2id": {
+ "LABEL_0": 0,
+ "LABEL_1": 1,
+ "LABEL_2": 2
+ },
+ "max_position_embeddings": 1024,
+ "model_type": "bart",
+ "no_repeat_ngram_size": 3,
+ "normalize_before": false,
+ "normalize_embedding": true,
+ "num_beams": 4,
+ "num_hidden_layers": 6,
+ "pad_token_id": 1,
+ "scale_embedding": false,
+ "task_specific_params": {
+ "summarization": {
+ "length_penalty": 1.0,
+ "max_length": 128,
+ "min_length": 12,
+ "num_beams": 4
+ },
+ "summarization_cnn": {
+ "length_penalty": 2.0,
+ "max_length": 142,
+ "min_length": 56,
+ "num_beams": 4
+ },
+ "summarization_xsum": {
+ "length_penalty": 1.0,
+ "max_length": 62,
+ "min_length": 11,
+ "num_beams": 6
+ }
+ },
+ "torch_dtype": "float32",
+ "transformers_version": "4.29.0.dev0",
+ "use_cache": true,
+ "vocab_size": 50265
+ }
generation_config.json ADDED
@@ -0,0 +1,12 @@
+ {
+ "bos_token_id": 0,
+ "decoder_start_token_id": 2,
+ "early_stopping": true,
+ "eos_token_id": 2,
+ "forced_bos_token_id": 0,
+ "forced_eos_token_id": 2,
+ "no_repeat_ngram_size": 3,
+ "num_beams": 4,
+ "pad_token_id": 1,
+ "transformers_version": "4.29.0.dev0"
+ }
merges.txt ADDED
The diff for this file is too large to render. See raw diff
 
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:18ec5da5c132087567425ea0521eb5db86225af8956ec11ff316eb28e5dd53d0
+ size 557912620
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f346717b0be73ec240d71c865e34ba2caf7cd9c9c5cf2235e5fa180b6cb6448f
+ size 557971229
special_tokens_map.json ADDED
@@ -0,0 +1,15 @@
+ {
+ "bos_token": "<s>",
+ "cls_token": "<s>",
+ "eos_token": "</s>",
+ "mask_token": {
+ "content": "<mask>",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "pad_token": "<pad>",
+ "sep_token": "</s>",
+ "unk_token": "<unk>"
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,15 @@
+ {
+ "add_prefix_space": false,
+ "bos_token": "<s>",
+ "clean_up_tokenization_spaces": true,
+ "cls_token": "<s>",
+ "eos_token": "</s>",
+ "errors": "replace",
+ "mask_token": "<mask>",
+ "model_max_length": 1024,
+ "pad_token": "<pad>",
+ "sep_token": "</s>",
+ "tokenizer_class": "BartTokenizer",
+ "trim_offsets": true,
+ "unk_token": "<unk>"
+ }
training_args.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6e61fa7bb8d66fb50f9693021c17fd04ad818ce50b98f34f3a28a6fcca8fc545
+ size 4155
vocab.json ADDED
The diff for this file is too large to render. See raw diff