Flax Community

non-profit

https://github.com/huggingface/transformers/tree/master/examples/research_projects/jax-projects

AI & ML interests

JAX, Flax, TPU, 🤗

Recent Activity

w11wo authored a paper about 16 hours ago

AnyMo: Geometry-Aware Setup-Agnostic Modeling of Human Motion in the Wild

w11wo authored a paper about 16 hours ago

TrajPrism: A Multi-Task Benchmark for Language-Grounded Urban Trajectory Understanding

w11wo authored a paper 16 days ago

TrajDLM: Topology-Aware Block Diffusion Language Model for Trajectory Generation

authored a paper about 1 month ago

Beyond LLM-as-a-Judge: Deterministic Metrics for Multilingual Generative Text Evaluation

Paper • 2604.05083 • Published Apr 6

authored 4 papers about 2 months ago

Contrastive Representation Learning: A Framework and Review

Paper • 2010.05113 • Published Oct 10, 2020 • 1

NeurIPS 2025 E2LM Competition : Early Training Evaluation of Language Models

Paper • 2506.07731 • Published Jun 9, 2025 • 2

Falcon-H1: A Family of Hybrid-Head Language Models Redefining Efficiency and Performance

Paper • 2507.22448 • Published Jul 30, 2025 • 71

Falcon Perception

Paper • 2603.27365 • Published Mar 28 • 16

submitted a paper to Daily Papers about 2 months ago

Composer 2 Technical Report

Paper • 2603.24477 • Published Mar 25 • 18

authored 2 papers 2 months ago

Fanar-Sadiq: A Multi-Agent Architecture for Grounded Islamic QA

Paper • 2603.08501 • Published Mar 9

What Really Controls Temporal Reasoning in Large Language Models: Tokenisation or Representation of Time?

Paper • 2603.19017 • Published Mar 19 • 3

submitted 2 papers to Daily Papers 2 months ago

What Really Controls Temporal Reasoning in Large Language Models: Tokenisation or Representation of Time?

Paper • 2603.19017 • Published Mar 19 • 3

Fanar-Sadiq: A Multi-Agent Architecture for Grounded Islamic QA

Paper • 2603.08501 • Published Mar 9

authored 2 papers 4 months ago

From RAG to Agentic RAG for Faithful Islamic Question Answering

Paper • 2601.07528 • Published Jan 12 • 4

Prototypicality Bias Reveals Blindspots in Multimodal Evaluation Metrics

Paper • 2601.04946 • Published Jan 8

authored 2 papers 6 months ago

On Space Folds of ReLU Neural Networks

Paper • 2502.09954 • Published Feb 14, 2025

The Space Between: On Folding, Symmetries and Sampling

Paper • 2503.08502 • Published Mar 11, 2025

authored a paper 8 months ago

Distributional Semantics Tracing: A Framework for Explaining Hallucinations in Large Language Models

Paper • 2510.06107 • Published Oct 7, 2025 • 3

posted an update 8 months ago

We're kick-starting the process of Transformers v5, with @ArthurZ and @cyrilvallez!

v5 should be significant: we're using it as a milestone for performance optimizations, saner defaults, and a much cleaner code base worthy of 2025.

Fun fact: v4.0.0-rc-1 came out on Nov 19, 2020, nearly five years ago!

authored 4 papers 9 months ago

What Language Model to Train if You Have One Million GPU Hours?

Paper • 2210.15424 • Published Oct 27, 2022 • 2

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Paper • 2211.05100 • Published Nov 9, 2022 • 39

Enhancing Few-shot Text-to-SQL Capabilities of Large Language Models: A Study on Prompt Design Strategies

Paper • 2305.12586 • Published May 21, 2023

TESS 2: A Large-Scale Generalist Diffusion Language Model

Paper • 2502.13917 • Published Feb 19, 2025 • 6