Yauhen Yavorski

slappatuski

AI & ML interests

image generation, image-to-image, text-to-image, inpainting, and video generation

Recent Activity

liked a Space 16 days ago

Mihir1107/TheSnitch

upvoted 2 papers 1 day ago

RAG-Anything: All-in-One RAG Framework

Paper • 2510.12323 • Published Oct 14, 2025 • 82

SAM 3: Segment Anything with Concepts

Paper • 2511.16719 • Published Nov 20, 2025 • 137

upvoted a paper 16 days ago

Hierarchical Abstract Tree for Cross-Document Retrieval-Augmented Generation

Paper • 2605.00529 • Published 21 days ago • 5

upvoted 3 papers 17 days ago

upvoted an article 17 days ago

LightOnOCR-2-1B: a lightweight high-performance end-to-end OCR model family

Article • lightonai • Jan 19 • 94

upvoted a paper 20 days ago

Recursive Multi-Agent Systems

Paper • 2604.25917 • Published 24 days ago • 273

upvoted a paper 22 days ago

LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model

Paper • 2604.20796 • Published 30 days ago • 240

upvoted a paper 23 days ago

Operationalizing a National Digital Library: The Case for a Norwegian Transformer Model

Paper • 2104.09617 • Published Apr 19, 2021 • 2

upvoted an article 24 days ago

Introducing the Synthetic Data Generator - Build Datasets with Natural Language

Article • davidberenstein1957, sdiazlor, Leiyre, dvilasuero, Ameeeee, burtenshaw • Dec 16, 2024 • 158

upvoted a collection 27 days ago

BERT release

Collection • 8 items • Updated Mar 12 • 44

Groups the original BERT models released by the Google team. Unless marked otherwise, the checkpoints support English.

upvoted a collection about 1 month ago

Gemma 4

Collection • 12 items • Updated 16 days ago • 835

upvoted an article 2 months ago

Using LoRA for Efficient Stable Diffusion Fine-Tuning

Article • pcuenq, sayakpaul • Jan 26, 2023 • 82

upvoted a paper 8 months ago

SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion

Paper • 2503.11576 • Published Mar 14, 2025 • 158

upvoted 2 articles 8 months ago

Vision Language Models (Better, faster, stronger)

Article • merve, sergiopaniego, ariG23498, pcuenq, andito • May 12, 2025 • 612

TimeScope: How Long Can Your Video Large Multimodal Model Go?

Article • orrzohar, ruili0, andito, nicholswang • Jul 23, 2025 • 48

upvoted 2 articles 9 months ago

NVIDIA Releases 3 Million Sample Dataset for OCR, Visual Question Answering, and Captioning Tasks

Article • nvidia • Aug 11, 2025 • 76

Introducing Idefics2: A Powerful 8B Vision-Language Model for the community

Article • Leyo, HugoLaurencon, VictorSanh • Apr 15, 2024 • 191

upvoted a paper 9 months ago

PaliGemma: A versatile 3B VLM for transfer

Paper • 2407.07726 • Published Jul 10, 2024 • 73
