Xi

xi0v

AI & ML interests

RL, Model merging, Model Editing and Vision/Multimodal Model Fine-tuning.

Recent Activity

liked a dataset 6 days ago

MingSafeR/The_full_pure_image_of_FLUX-Reason-6M

liked a dataset 11 days ago

Shio-Koube/Danbooru_filter

liked a model 11 days ago

MiniT2I/MiniT2I

upvoted a paper 3 months ago

REAP the Experts: Why Pruning Prevails for One-Shot MoE compression

Paper • 2510.13999 • Published Oct 15, 2025 • 20

upvoted a paper 4 months ago

One Model, Many Budgets: Elastic Latent Interfaces for Diffusion Transformers

Paper • 2603.12245 • Published Mar 12 • 18

upvoted a collection 4 months ago

Qwen3.5

21 items • Updated Mar 9 • 1.7k

upvoted an article 6 months ago

The Optimal Architecture for Small Language Models

Article • codelion • Dec 26, 2025 • 121

upvoted a paper 7 months ago

Souper-Model: How Simple Arithmetic Unlocks State-of-the-Art LLM Performance

Paper • 2511.13254 • Published Nov 17, 2025 • 140

upvoted a collection 8 months ago

timm DINOv3

Meta AI's DINOv3 weights in timm. ViTs with `qkvb` have a zero QV bias present, otherwise bias is disabled. QKV bias are all 0 in original weights. • 18 items • Updated Sep 19, 2025 • 38

upvoted a paper 8 months ago

Black-Box On-Policy Distillation of Large Language Models

Paper • 2511.10643 • Published Nov 13, 2025 • 54

upvoted an article 8 months ago

Projected Abliteration

Article • grimjim • Oct 25, 2025 • 45

upvoted a paper 8 months ago

π_RL: Online RL Fine-tuning for Flow-based Vision-Language-Action Models

Paper • 2510.25889 • Published Oct 29, 2025 • 66

upvoted a collection 9 months ago

_Originals

0 items • Updated Mar 2 • 1

upvoted a paper 10 months ago

AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning

Paper • 2509.08755 • Published Sep 10, 2025 • 56

upvoted 2 papers 11 months ago

Puppeteer: Rig and Animate Your 3D Models

Paper • 2508.10898 • Published Aug 14, 2025 • 33

Seeing, Listening, Remembering, and Reasoning: A Multimodal Agent with Long-Term Memory

Paper • 2508.09736 • Published Aug 13, 2025 • 58

upvoted a collection 11 months ago

Hybrid Linear Attention Research

All 1.3B & 340M hybrid linear-attention experiments. • 62 items • Updated Sep 11, 2025 • 14

upvoted a paper 11 months ago

Geometric-Mean Policy Optimization

Paper • 2507.20673 • Published Jul 28, 2025 • 32

upvoted an article 12 months ago

Vibe coding for data science: how to label a dataset with Kimi K2

Article • dvilasuero • Jul 22, 2025 • 22

upvoted 4 papers 12 months ago

Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities

Paper • 2507.13158 • Published Jul 17, 2025 • 24

One Token to Fool LLM-as-a-Judge

Paper • 2507.08794 • Published Jul 11, 2025 • 32

Lost in Latent Space: An Empirical Study of Latent Diffusion Models for Physics Emulation

Paper • 2507.02608 • Published Jul 3, 2025 • 22

Fast and Simplex: 2-Simplicial Attention in Triton

Paper • 2507.02754 • Published Jul 3, 2025 • 25