Axel

axellabs

https://axelsimms.com

AI & ML interests

None yet

Recent Activity

reacted to flozi00's post with 👍 20 days ago

Running large language models efficiently is more than just raw GPU power. The latest guide breaks down the essential math to determine if your LLM workload is compute-bound or memory-bound. We apply these principles to a real-world example: Qwen's 32B parameter model on the new NVIDIA RTX PRO 6000 Blackwell Edition.

In this guide, you will learn how to:
- Calculate your GPU's operational intensity (Ops:Byte Ratio)
- Determine your model's arithmetic intensity
- Identify whether your workload is memory-bound or compute-bound

Read the full guide here: https://flozi.net/en/guides/ai/llm-inference-math

updated a Space about 1 month ago

axellabs/Remara

published a Space about 1 month ago

axellabs/Remara

Organizations

spaces 1

Remara

A chatbot that learns and self-improves

models 0

None public yet

datasets 0

None public yet