🚀 Flux Attention: Context-Aware Hybrid Attention for Efficient LLMs Inference

- Flux Attention: Context-Aware Hybrid Attention for Efficient LLMs Inference
  Paper • 2604.07394 • Published • 9
- QQTang1223/full_streaming_Llama-3.1-8B-Instruct
  Text Generation • 8B • Updated • 38
- QQTang1223/full_xattn_Qwen3-8B
  Text Generation • 8B • Updated • 37 • 1
- QQTang1223/full_xattn_Qwen3-4B
  Text Generation • 4B • Updated • 35