Xiangpeng Yang's picture

4 16 5

Xiangpeng Yang PRO

XiangpengYang

·

https://xiangpengyang.github.io/

AI & ML interests

diffusion models, video generaiton, video editing

Recent Activity

replied to their post 1 day ago

🚀 Introducing VideoCoF: Unified Video Editing with a Temporal Reasoner (Chain-of-Frames)! We’re excited to introduce VideoCoF, a unified framework for instruction-based video editing that enables temporal reasoning and ~4× video length extrapolation, trained with only 50k video pairs. 🔥 🔍 What makes VideoCoF different? 🧠 Chain-of-Frames reasoning , mimic human thinking process like Seeing → Reasoning → Editing to apply edits accurately over time without external masks, ensuring physically plausible results. 📈 Strong length generalization — trained on 33-frame clips, yet supports multi-shot editing and long-video extrapolation (~4×). 🎯 Unified fine-grained editing — Object Removal, Addition, Swap, and Local Style Transfer, with instance-level & part-level, spatial-aware control. ⚡ Fast inference update 🚀 H100: ~20s / video with 4-step inference, making high-quality video editing far more practical for real-world use. 🔗 Links 📄 Paper: https://arxiv.org/abs/2512.07469 💻 Code: https://github.com/knightyxp/VideoCoF 🤗 Demo: https://huggingface.co/spaces/XiangpengYang/VideoCoF 🧩 Models: https://huggingface.co/XiangpengYang/VideoCoF 🌐 Project Page: https://videocof.github.io/ #VideoEditing #DiffusionModels #GenerativeAI #ComputerVision #AI

replied to their post 1 day ago

🚀 Introducing VideoCoF: Unified Video Editing with a Temporal Reasoner (Chain-of-Frames)! We’re excited to introduce VideoCoF, a unified framework for instruction-based video editing that enables temporal reasoning and ~4× video length extrapolation, trained with only 50k video pairs. 🔥 🔍 What makes VideoCoF different? 🧠 Chain-of-Frames reasoning , mimic human thinking process like Seeing → Reasoning → Editing to apply edits accurately over time without external masks, ensuring physically plausible results. 📈 Strong length generalization — trained on 33-frame clips, yet supports multi-shot editing and long-video extrapolation (~4×). 🎯 Unified fine-grained editing — Object Removal, Addition, Swap, and Local Style Transfer, with instance-level & part-level, spatial-aware control. ⚡ Fast inference update 🚀 H100: ~20s / video with 4-step inference, making high-quality video editing far more practical for real-world use. 🔗 Links 📄 Paper: https://arxiv.org/abs/2512.07469 💻 Code: https://github.com/knightyxp/VideoCoF 🤗 Demo: https://huggingface.co/spaces/XiangpengYang/VideoCoF 🧩 Models: https://huggingface.co/XiangpengYang/VideoCoF 🌐 Project Page: https://videocof.github.io/ #VideoEditing #DiffusionModels #GenerativeAI #ComputerVision #AI

updated a Space 1 day ago

XiangpengYang/VideoCoF

View all activity

Organizations

XiangpengYang 's models 2

XiangpengYang/VideoCoF

Video-to-Video • Updated 8 days ago • 57 • 15

XiangpengYang/videox-fun-env