tuandunghcmut's Collections
Multilingual Pretraining Corpus for Southeast Asian Language
Updated 5 days ago
aisingapore/SEA-PILE-v2 • Viewer • Updated Apr 14 • 187M • 1.88k • 4
aisingapore/SEA-PILE-v1 • Viewer • Updated 5 days ago • 636M • 1.27k • 17
airesearch/scb_mt_enth_2020 • Updated Jan 18, 2024 • 217 • 9
aisingapore/WangchanLION-Web • Viewer • Updated Sep 3 • 19.8M • 104 • 3
aisingapore/WangchanLION-Curated • Viewer • Updated Sep 3 • 402k • 168 • 3
tuandunghcmut/PhoMT-MTet-Mixture • Viewer • Updated Aug 11 • 7.62M • 147 • 1
HuggingFaceFW/clean-wikipedia • Viewer • Updated Oct 21 • 61.2M • 983 • 23
uonlp/CulturaX • Viewer • Updated Dec 16, 2024 • 7.18B • 10.7k • 556
allenai/c4 • Viewer • Updated Jan 9, 2024 • 10.4B • 685k • 485