Collections
Collections including paper arxiv:2503.21144
- One Shot, One Talk: Whole-body Talking Avatar from a Single Image
  Paper • 2412.01106 • Published • 24
- MEMO: Memory-Guided Diffusion for Expressive Talking Video Generation
  Paper • 2412.04448 • Published • 10
- IDOL: Instant Photorealistic 3D Human Creation from a Single Image
  Paper • 2412.14963 • Published • 6
- OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models
  Paper • 2502.01061 • Published • 222

- DREAM-Talk: Diffusion-based Realistic Emotional Audio-driven Method for Single Image Talking Face Generation
  Paper • 2312.13578 • Published • 29
- Splatter Image: Ultra-Fast Single-View 3D Reconstruction
  Paper • 2312.13150 • Published • 16
- Gaussian Head Avatar: Ultra High-fidelity Head Avatar via Dynamic Gaussians
  Paper • 2312.03029 • Published • 26
- Relightable Gaussian Codec Avatars
  Paper • 2312.03704 • Published • 33

- Animate-X: Universal Character Image Animation with Enhanced Motion Representation
  Paper • 2410.10306 • Published • 56
- ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning
  Paper • 2411.05003 • Published • 71
- TIP-I2V: A Million-Scale Real Text and Image Prompt Dataset for Image-to-Video Generation
  Paper • 2411.04709 • Published • 26
- IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation
  Paper • 2410.07171 • Published • 43

- GAIA: a benchmark for General AI Assistants
  Paper • 2311.12983 • Published • 241
- Language Models as Compilers: Simulating Pseudocode Execution Improves Algorithmic Reasoning in Language Models
  Paper • 2404.02575 • Published • 50
- Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs
  Paper • 2404.05719 • Published • 83
- LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA
  Paper • 2409.02897 • Published • 47