Instructions for using wraps/moondream-caption with libraries, inference providers, notebooks, and local apps.
- Libraries
- Transformers
How to use wraps/moondream-caption with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="wraps/moondream-caption", trust_remote_code=True)

# Load model directly
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("wraps/moondream-caption", trust_remote_code=True, dtype="auto")
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use wraps/moondream-caption with vLLM:
Install from pip and serve the model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "wraps/moondream-caption"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "wraps/moondream-caption",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker
```shell
docker model run hf.co/wraps/moondream-caption
```
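Because the vLLM server exposes an OpenAI-compatible API, the curl request above can also be issued from plain Python using only the standard library. A minimal sketch (the `build_payload` and `complete` helpers are illustrative, not part of vLLM, and it assumes a server started as above is listening on localhost:8000):

```python
import json
import urllib.request


def build_payload(prompt: str, max_tokens: int = 512, temperature: float = 0.5) -> dict:
    """Assemble the JSON body for the OpenAI-compatible /v1/completions endpoint."""
    return {
        "model": "wraps/moondream-caption",
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def complete(prompt: str, base_url: str = "http://localhost:8000") -> str:
    """POST a completion request to the server and return the generated text."""
    req = urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["text"]


# With the server running: print(complete("Once upon a time,"))
```

Since the SGLang server below exposes the same API, the identical helper should also work against it by passing `base_url="http://localhost:30000"`.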
- SGLang
How to use wraps/moondream-caption with SGLang:
Install from pip and serve the model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "wraps/moondream-caption" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "wraps/moondream-caption",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker images
```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "wraps/moondream-caption" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "wraps/moondream-caption",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
- Docker Model Runner
How to use wraps/moondream-caption with Docker Model Runner:
```shell
docker model run hf.co/wraps/moondream-caption
```
Moondream-Caption: Custom Small Vision Model based on Moondream2
Moondream-Caption is a custom small vision model based on moondream2 by vikhyatk. It has been fine-tuned on a specific dataset to enhance its image description capabilities.
Key Features:
- Based on the moondream2 architecture
- Fine-tuned for image caption generation
- Trained on a high-quality custom dataset
Dataset
The dataset used for training Moondream-Caption is specifically designed for image captioning tasks. It has the following characteristics:
- Images generated with flux1_dev
- Highly accurate and verified descriptive captions
- Wide variety of visual content
Usage
You can use Moondream-Caption for image captioning tasks by leveraging the Hugging Face Transformers library. Here's a quick example of how to generate captions for an image:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from PIL import Image

moondream = AutoModelForCausalLM.from_pretrained(
    "wraps/moondream-caption", trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained("wraps/moondream-caption")

# Encode the image once, then query it with a captioning prompt.
image = Image.open("path/to/your/image.jpg")
enc_image = moondream.encode_image(image)
caption = moondream.answer_question(enc_image, "Write a long caption for this image", tokenizer)
print(caption)
```
Example
Output Caption: A close-up portrait of a green alien with a large oval head, enormous black almond-shaped eyes, small nostrils, and a tiny mouth. The alien has a long, thin neck and is wearing a black t-shirt with white text that reads 'humans scare me'. The background shows a pale blue sky with soft, wispy clouds.
Limitations
While Moondream-Caption is designed to generate accurate and relevant image captions, it may not perform well on images that differ significantly from the training dataset, and it may struggle with complex or abstract visual content. Please open an issue on the model's repository if you encounter any limitations or problems.