Qi Liu

baiall

AI & ML interests

None yet

Organizations

None yet

Recent Activity

liked a model 29 days ago

numind/NuExtract3

Image-to-Text • 5B • Updated 21 days ago • 111k • 259

liked 2 models 2 months ago

kai-os/Carnice-9b

Text Generation • 9B • Updated Apr 4 • 160 • 184

netflix/void-model

Video-to-Video • Updated Apr 6 • 947

liked a model 3 months ago

Tesslate/OmniCoder-9B

Text Generation • 9B • Updated Mar 13 • 4.14k • 647

liked a model 5 months ago

FutureMa/Eva-4B

Text Generation • 4B • Updated Jan 18 • 35 • 78

liked 2 models 7 months ago

ApsaraStackMaaS/EvoQwen2.5-VL-Retriever-7B-v1

Visual Document Retrieval • 8B • Updated Apr 9 • 34 • 17

aliRafik/invoices-donut-finetuned-lora

Updated Aug 30, 2025 • 1

liked a model 9 months ago

moondream/moondream3-preview

Image-Text-to-Text • 9B • Updated Apr 9 • 304k • 658

liked a Space 9 months ago

Moondream3 Preview

🐠

Process images and text to answer questions, caption, detect objects, and find points

New activity in numind/NuMarkdown-8B-Thinking 10 months ago

NuMarkdown-8B-reasoning on A100 40GB is extremely slow (even for 1 token)

👍 1

#4 opened 10 months ago by Fedoration

liked a model 10 months ago

google/embeddinggemma-300m

New activity in numind/NuMarkdown-8B-Thinking 10 months ago

Quantizations version

#5 opened 10 months ago by baiall

New activity in futurehouse/ether0 10 months ago

Does it work in vLLM?

#3 opened 10 months ago by baiall

New activity in numind/NuExtract-2.0-4B 11 months ago

Why is NuExtract-2.0-8B inferior to 4B?

#1 opened 11 months ago by ikiransuryavanshi

reacted to anakin87's post with ❤️ 11 months ago

Haystack can now see 👀

The latest release of the Haystack OSS LLM framework adds a long-requested feature: image support!

📓 Notebooks below

This isn't just about passing images to an LLM. We built several features to enable practical multimodal use cases.

What's new?
🧠 Support for multiple LLM providers: OpenAI, Amazon Bedrock, Google Gemini, Mistral, NVIDIA, OpenRouter, Ollama and more (support for Hugging Face API coming 🔜)
🎛️ Prompt template language to handle structured inputs, including images
📄 PDF and image converters
🔍 Image embedders using CLIP-like models
🧾 LLM-based extractor to pull text from images
🧩 Components to build multimodal RAG pipelines and Agents

I had the chance to lead this effort with @sjrhuschlee (great collab).

📓 Below you can find two notebooks to explore the new features:
• Introduction to Multimodal Text Generation: https://haystack.deepset.ai/cookbook/multimodal_intro
• Creating Vision+Text RAG Pipelines: https://haystack.deepset.ai/tutorials/46_multimodal_rag

(🖼️ image by @bilgeyucel)