Base Model for TransMLA
mengfanxu
fxmeng
Recent Activity
Updated a model 16 days ago: fxmeng/TransMLA-llama3-8b-32k
Updated a model 16 days ago: fxmeng/TransMLA-llama3-8b-8k
Updated a collection about 1 month ago: TransMLA-base
Models (53)
fxmeng/TransMLA-llama3-8b-32k • 8B • 40
fxmeng/TransMLA-llama3-8b-8k • 8B • 55
fxmeng/PiSSA-llama-7b-commonsense-148k • 7B • 7
fxmeng/PiSSA-Llama-3-8b-commonsense-148k • 8B • 6
fxmeng/PiSSA-Llama-2-7b-commonsense-148k • 7B • 6
fxmeng/PiSSA-llama-13b-commonsense-148k • 13B • 8
fxmeng/CLOVER-llama-3-8b-commonsense-148k • 8B • 6
fxmeng/CLOVER-llama-2-7b-commonsense-148k • 7B • 8
fxmeng/CLOVER-llama-13b-commonsense-148k • 13B • 8
fxmeng/CLOVER-llama-7b-commonsense-148k • 7B • 12
Datasets (12)
fxmeng/transmla_pretrain_100m_tokens • 100k • 19
fxmeng/transmla_pretrain_1B_tokens • 1.14M • 146
fxmeng/transmla_pretrain_6B_tokens • 5.94M • 2.03k
fxmeng/pissa-dataset • 844k • 1.93k • 3
fxmeng/big-bench-hard-continue-finetuning • 10.3k • 71 • 1
fxmeng/commonsense_filtered • 170k • 121 • 1
fxmeng/MetaMath-GSM240K • 240k • 25 • 1
fxmeng/MetaMath-MATH155K • 155k • 47
fxmeng/CodeFeedback-Python105K • 105k • 167 • 6
fxmeng/llava_finetune_336x336 • 25