FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language Paper • 2506.20920 • Published Jun 26, 2025 • 75
ComplexFuncBench: Exploring Multi-Step and Constrained Function Calling under Long-Context Scenario Paper • 2501.10132 • Published Jan 17, 2025 • 22
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning Paper • 2507.01006 • Published Jul 1, 2025 • 250
Article Introducing the Synthetic Data Generator - Build Datasets with Natural Language Dec 16, 2024 • 152
Article How to generate text: using different decoding methods for language generation with Transformers Mar 1, 2020 • 280