Beyond LLM-as-a-Judge: Deterministic Metrics for Multilingual Generative Text Evaluation Paper • 2604.05083 • Published Apr 6
Contrastive Representation Learning: A Framework and Review Paper • 2010.05113 • Published Oct 10, 2020 • 1
NeurIPS 2025 E2LM Competition : Early Training Evaluation of Language Models Paper • 2506.07731 • Published Jun 9, 2025 • 2
Falcon-H1: A Family of Hybrid-Head Language Models Redefining Efficiency and Performance Paper • 2507.22448 • Published Jul 30, 2025 • 71
What Really Controls Temporal Reasoning in Large Language Models: Tokenisation or Representation of Time? Paper • 2603.19017 • Published Mar 19 • 3
What Really Controls Temporal Reasoning in Large Language Models: Tokenisation or Representation of Time? Paper • 2603.19017 • Published Mar 19 • 3
From RAG to Agentic RAG for Faithful Islamic Question Answering Paper • 2601.07528 • Published Jan 12 • 4
Prototypicality Bias Reveals Blindspots in Multimodal Evaluation Metrics Paper • 2601.04946 • Published Jan 8
Distributional Semantics Tracing: A Framework for Explaining Hallucinations in Large Language Models Paper • 2510.06107 • Published Oct 7, 2025 • 3
view post Post 8943 We're kick-starting the process of Transformers v5, with @ArthurZ and @cyrilvallez !v5 should be significant: we're using it as a milestone for performance optimizations, saner defaults, and a much cleaner code base worthy of 2025.Fun fact: v4.0.0-rc-1 came out on Nov 19, 2020, nearly five years ago! See translation 6 replies · 🚀 20 20 👍 10 10 🔥 6 6 + Reply
What Language Model to Train if You Have One Million GPU Hours? Paper • 2210.15424 • Published Oct 27, 2022 • 2
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model Paper • 2211.05100 • Published Nov 9, 2022 • 39
Enhancing Few-shot Text-to-SQL Capabilities of Large Language Models: A Study on Prompt Design Strategies Paper • 2305.12586 • Published May 21, 2023
TESS 2: A Large-Scale Generalist Diffusion Language Model Paper • 2502.13917 • Published Feb 19, 2025 • 6