title: Pip - Emotional AI Companion
emoji: 🫧
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 6.0.1
app_file: app.py
pinned: true
license: mit
short_description: A blob friend who transforms your feelings into visual art
tags:
- mcp-in-action-track-creative
- mcp-in-action-track-consumer
- agents
- mcp
🎥 Demo Video: https://youtu.be/bWDj4gyngNI
📢 Social Post: https://x.com/07amit10/status/1995270517251801162
👥 Team: @Itsjustamit
🫧 Pip - Your Emotional AI Companion
Pip is a cute blob companion that understands your emotions and responds with conversation, context-specific imagery, and a soothing voice.
Not a generic assistant - Pip is an emotional friend who knows when to reflect, celebrate, or gently intervene.
✨ What Makes Pip Special
Emotional Intelligence
Pip doesn't just respond - it understands. Using Claude's nuanced emotional analysis, Pip detects:
- Multiple co-existing emotions
- Emotional intensity
- Underlying needs (validation, comfort, celebration)
- When gentle intervention might help
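The signals listed above could be carried in a small structured result. The sketch below is one possible shape, not Pip's actual schema: the class name `EmotionAnalysis`, its field names, and the rule that the first listed emotion is the primary one are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


# Hypothetical container for the signals Pip is described as detecting.
@dataclass
class EmotionAnalysis:
    emotions: list[str]                              # co-existing emotions
    intensity: int                                   # assumed 1-10 scale
    needs: list[str] = field(default_factory=list)   # e.g. "validation", "comfort"
    intervention_needed: bool = False

    def __post_init__(self) -> None:
        if not self.emotions:
            raise ValueError("at least one emotion is required")
        if not 1 <= self.intensity <= 10:
            raise ValueError("intensity must be between 1 and 10")

    @property
    def primary(self) -> str:
        """Treat the first listed emotion as the primary one (an assumption)."""
        return self.emotions[0]


analysis = EmotionAnalysis(
    emotions=["relief", "guilt"], intensity=6, needs=["validation"]
)
print(analysis.primary, analysis.intensity)
```

Validating the result at construction time means a malformed model response fails early instead of reaching the image or voice stages.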
Context-Specific Imagery
Every image Pip creates is unique to your conversation. Not generic stock photos - visual art that captures YOUR emotional moment:
- Mood Alchemist: Transform emotions into magical artifacts
- Day's Artist: Turn your day into impressionistic art
- Dream Weaver: Visualize thoughts in surreal imagery
- Night Companion: Calming visuals for 3am moments
Multi-Service Architecture
Pip splits the work across several AI services, each with a defined role:
| Service | Role |
|---|---|
| Anthropic Claude | Deep emotional analysis, intervention logic |
| SambaNova | Fast acknowledgments, prompt enhancement |
| OpenAI | Image generation, speech-to-text (Whisper) |
| Google Gemini | Image generation (load balanced) |
| Flux/SDXL | Artistic image generation (via Modal/HuggingFace) |
| ElevenLabs | Expressive voice with emotional tone matching |
Low-Latency Design
Pip is designed for responsiveness:
- Quick acknowledgment (< 500ms)
- Progressive state changes while processing
- Parallel task execution
- Streaming responses
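A minimal sketch of how the first three points might fit together with `asyncio`. Every coroutine here is a hypothetical stand-in for a real API call, the sleep durations are invented, and the state names passed to `on_state` are assumptions. Streaming is left out to keep the example short.

```python
import asyncio
from typing import Callable


# Hypothetical stand-ins for real service calls; delays are invented.
async def quick_ack(message: str) -> str:
    await asyncio.sleep(0.01)   # fast model
    return "I hear you."


async def analyze(message: str) -> dict:
    await asyncio.sleep(0.05)   # slower emotional analysis
    return {"emotions": ["tired"], "intensity": 5}


async def generate_image(prompt: str) -> str:
    await asyncio.sleep(0.05)
    return f"image for: {prompt}"


async def generate_reply(message: str, analysis: dict) -> str:
    await asyncio.sleep(0.05)
    return f"That sounds {analysis['emotions'][0]}."


async def respond(message: str, on_state: Callable[[str], None]) -> dict:
    # Start the slow analysis right away, then acknowledge without waiting for it.
    analysis_task = asyncio.create_task(analyze(message))
    on_state("listening")
    ack = await quick_ack(message)
    on_state("thinking")
    analysis = await analysis_task
    # Image and reply do not depend on each other, so they run in parallel.
    image, reply = await asyncio.gather(
        generate_image(message), generate_reply(message, analysis)
    )
    on_state("speaking")
    return {"ack": ack, "reply": reply, "image": image}


states: list[str] = []
result = asyncio.run(respond("long day", states.append))
print(states, result["reply"])
```

The `on_state` callback is where a UI could swap Pip's expression while the slower calls are still running.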
🎮 How to Use
Chat Interface
- Type how you're feeling or what's on your mind
- Watch Pip's expression change as it processes
- Receive a thoughtful response + custom image
- Optionally enable voice to hear Pip speak
Voice Input
- Click the microphone button
- Speak your thoughts
- Pip transcribes and responds with voice
Modes
- Auto: Pip decides the best visualization style
- Alchemist: Emotions become magical artifacts
- Artist: Your day becomes a painting
- Dream: Thoughts become surreal visions
- Night: Calming imagery for late hours
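The README does not say how Auto picks a style, so the sketch below is only a guess at what such a rule could look like. The `Mode` enum mirrors the five modes above; `resolve_mode`, the hour cutoffs, and the keyword checks are hypothetical.

```python
from enum import Enum


class Mode(Enum):
    AUTO = "auto"
    ALCHEMIST = "alchemist"
    ARTIST = "artist"
    DREAM = "dream"
    NIGHT = "night"


def resolve_mode(requested: Mode, hour: int, text: str) -> Mode:
    """Pick a concrete mode. The Auto rules here are illustrative guesses."""
    if requested is not Mode.AUTO:
        return requested          # an explicit choice always wins
    if hour >= 23 or hour < 5:
        return Mode.NIGHT         # late hours get calming imagery
    lowered = text.lower()
    if "dream" in lowered:
        return Mode.DREAM
    if "today" in lowered or "my day" in lowered:
        return Mode.ARTIST
    return Mode.ALCHEMIST         # assumed default


print(resolve_mode(Mode.AUTO, hour=3, text="can't sleep"))  # Mode.NIGHT
```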
🤖 MCP Integration
Pip is available as an MCP (Model Context Protocol) server. Connect your AI agent!
For SSE-compatible clients (Cursor, Windsurf, Cline):
{
  "mcpServers": {
    "Pip": {
      "url": "https://YOUR-SPACE.hf.space/gradio_api/mcp/"
    }
  }
}
For stdio clients (Claude Desktop):
{
  "mcpServers": {
    "Pip": {
      "command": "npx",
      "args": [
        "mcp-remote",
        "https://YOUR-SPACE.hf.space/gradio_api/mcp/sse",
        "--transport",
        "sse-only"
      ]
    }
  }
}
Available MCP Tools
- chat_with_pip(message, session_id): Talk to Pip
- generate_mood_artifact(emotion, context): Create emotional art
- get_pip_gallery(session_id): View conversation history
- set_pip_mode(mode, session_id): Change interaction mode
🧠 The Architecture
User Input
    ↓
┌─────────────────────────────────────┐
│  SambaNova: Quick Acknowledgment    │ ← Immediate response
└─────────────────────────────────────┘
    ↓
┌─────────────────────────────────────┐
│  Claude: Emotion Analysis           │ ← Deep understanding
│  - Primary emotions                 │
│  - Intensity (1-10)                 │
│  - Intervention needed?             │
└─────────────────────────────────────┘
    ↓
┌─────────────────────────────────────┐
│  Claude: Action Decision            │ ← What should Pip do?
│  - reflect / celebrate / comfort    │
│  - calm / energize / intervene      │
└─────────────────────────────────────┘
    ↓
┌─────────────────────────────────────┐
│  SambaNova: Prompt Enhancement      │ ← Create vivid image prompt
│  (Context-specific, never generic)  │
└─────────────────────────────────────┘
    ↓
┌─────────────────────────────────────┐
│  Image Generation (Load Balanced)   │
│  ┌────────┐ ┌────────┐ ┌────────┐   │
│  │ OpenAI │ │ Gemini │ │  Flux  │   │
│  └────────┘ └────────┘ └────────┘   │
└─────────────────────────────────────┘
    ↓
┌─────────────────────────────────────┐
│  Claude/SambaNova: Response         │ ← Streaming text
│  (Load balanced for efficiency)     │
└─────────────────────────────────────┘
    ↓
┌─────────────────────────────────────┐
│  ElevenLabs: Voice (Optional)       │ ← Emotional tone matching
└─────────────────────────────────────┘
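The README does not describe the load-balancing policy for image generation. One common approach is round-robin rotation with failover, sketched below. The class name `RoundRobinBalancer` and the fake provider functions are hypothetical; real providers would call the OpenAI, Gemini, or Flux APIs.

```python
import itertools
from typing import Callable

Provider = Callable[[str], str]


class RoundRobinBalancer:
    """Rotate through providers; on failure, try the next one in order."""

    def __init__(self, providers: dict[str, Provider]) -> None:
        if not providers:
            raise ValueError("at least one provider is required")
        self._providers = providers
        self._names = itertools.cycle(list(providers))

    def generate(self, prompt: str) -> tuple[str, str]:
        errors = []
        # Try each provider at most once per request, starting from the next in rotation.
        for _ in range(len(self._providers)):
            name = next(self._names)
            try:
                return name, self._providers[name](prompt)
            except Exception as exc:  # a real client would catch narrower errors
                errors.append(f"{name}: {exc}")
        raise RuntimeError("all providers failed: " + "; ".join(errors))


# Stand-in providers for illustration only.
def fake_openai(prompt: str) -> str:
    return f"openai:{prompt}"


def fake_gemini(prompt: str) -> str:
    raise TimeoutError("simulated outage")


def fake_flux(prompt: str) -> str:
    return f"flux:{prompt}"


balancer = RoundRobinBalancer(
    {"openai": fake_openai, "gemini": fake_gemini, "flux": fake_flux}
)
print(balancer.generate("a lantern in fog"))  # served by openai
print(balancer.generate("a lantern in fog"))  # gemini fails, flux answers
```

Rotation spreads requests across quotas, while the failover loop means a single provider outage does not leave the user without an image.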
🎨 Pip's Expressions
Pip has 10 distinct emotional states with unique animations:
- Neutral (gentle wobble)
- Happy (bouncing)
- Sad (drooping)
- Thinking (looking up, swaying)
- Concerned (worried eyebrows, shaking)
- Excited (energetic bouncing with sparkles)
- Sleepy (half-closed eyes, breathing)
- Listening (wide eyes, pulsing)
- Attentive (leaning forward)
- Speaking (animated mouth)
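The ten states above map naturally onto an enum. The sketch below also guesses at how an action chosen by the pipeline could select a face. The action-to-expression mapping and the `pip-` CSS class prefix are assumptions, not taken from Pip's source.

```python
from enum import Enum


class Expression(Enum):
    NEUTRAL = "neutral"
    HAPPY = "happy"
    SAD = "sad"
    THINKING = "thinking"
    CONCERNED = "concerned"
    EXCITED = "excited"
    SLEEPY = "sleepy"
    LISTENING = "listening"
    ATTENTIVE = "attentive"
    SPEAKING = "speaking"


# Hypothetical mapping from Pip's chosen action to the face it shows.
ACTION_TO_EXPRESSION = {
    "reflect": Expression.ATTENTIVE,
    "celebrate": Expression.EXCITED,
    "comfort": Expression.SAD,
    "calm": Expression.SLEEPY,
    "energize": Expression.HAPPY,
    "intervene": Expression.CONCERNED,
}


def css_class(action: str) -> str:
    """Return a class name such as 'pip-excited'; unknown actions fall back to neutral."""
    expression = ACTION_TO_EXPRESSION.get(action, Expression.NEUTRAL)
    return f"pip-{expression.value}"


print(css_class("celebrate"))  # pip-excited
```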
💡 Key Features
Intervention Without Preaching
When Pip detects concerning emotional signals, it doesn't lecture. Instead:
- Brief acknowledgment
- Gentle redirect to curiosity/wonder
- Show something beautiful or intriguing
- Invite engagement, not advice
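One way to encode these four steps is as ordered guidance that gets appended to the response prompt only when intervention is flagged. This is a sketch under that assumption: the step wording, `InterventionStep`, and `build_guidance` are hypothetical and not Pip's actual prompt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InterventionStep:
    name: str
    instruction: str


# The four steps mirror the list above; the instruction wording is an assumption.
INTERVENTION_STEPS = (
    InterventionStep("acknowledge", "Acknowledge the feeling in one short sentence."),
    InterventionStep("redirect", "Gently shift toward curiosity or wonder."),
    InterventionStep("show", "Describe something beautiful or intriguing."),
    InterventionStep("invite", "End with an open invitation to engage."),
)


def build_guidance(intervention_needed: bool) -> str:
    """Return numbered guidance for the reply, or an empty string if none is needed."""
    if not intervention_needed:
        return ""
    lines = [
        f"{i}. {step.instruction}"
        for i, step in enumerate(INTERVENTION_STEPS, start=1)
    ]
    lines.append("Do not give advice or lecture.")
    return "\n".join(lines)


print(build_guidance(True))
```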
Not Generic
Every image prompt is crafted from YOUR specific words and context. Pip extracts:
- Specific details you mentioned
- Emotional undertones
- Time/context clues
- Your unique situation
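A minimal sketch of how extracted context could be assembled into an image prompt. The style fragments, the `ConversationContext` fields, and the prompt layout are hypothetical, since the real prompts are not published in this README. The one design point it illustrates is refusing to build a prompt when no specific detail is available.

```python
from dataclasses import dataclass, field

# Hypothetical style fragments per mode.
STYLE_BY_MODE = {
    "alchemist": "a glowing magical artifact",
    "artist": "an impressionistic painting",
    "dream": "a surreal dreamscape",
    "night": "a calm, dimly lit night scene",
}


@dataclass
class ConversationContext:
    details: list[str]                                    # specifics the user mentioned
    undertones: list[str] = field(default_factory=list)   # emotional undertones
    time_hint: str = ""                                   # e.g. "3am", "after work"


def build_image_prompt(ctx: ConversationContext, mode: str = "alchemist") -> str:
    if not ctx.details:
        # Refuse to fall back to a generic prompt.
        raise ValueError("need at least one specific detail from the conversation")
    style = STYLE_BY_MODE.get(mode, STYLE_BY_MODE["alchemist"])
    parts = [f"{style} featuring {', '.join(ctx.details)}"]
    if ctx.undertones:
        parts.append(f"mood: {', '.join(ctx.undertones)}")
    if ctx.time_hint:
        parts.append(f"setting: {ctx.time_hint}")
    return "; ".join(parts)


ctx = ConversationContext(
    details=["cold coffee", "unread emails"], undertones=["weary"], time_hint="3am"
)
print(build_image_prompt(ctx, mode="night"))
```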
🛠️ Tech Stack
- Frontend: Gradio
- Character: SVG + CSS animations
- LLMs: Anthropic Claude, SambaNova (Llama)
- Images: OpenAI DALL-E 3, Google Imagen, Flux
- Voice: ElevenLabs (Flash v2.5 for speed, v3 for expression)
- STT: OpenAI Whisper
- Compute: Modal (for Flux/SDXL)
- Hosting: HuggingFace Spaces
🔧 Environment Variables
ANTHROPIC_API_KEY=your_key
SAMBANOVA_API_KEY=your_key
OPENAI_API_KEY=your_key
GOOGLE_API_KEY=your_key
ELEVENLABS_API_KEY=your_key
HF_TOKEN=your_token  # optional, for HuggingFace models
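A startup check along these lines can report missing keys before the first request fails. It is a sketch: `missing_keys` is a hypothetical helper, and treating the five API keys as required is an assumption based only on `HF_TOKEN` being the one marked optional.

```python
import os
from typing import Mapping, Optional

# Assumed required because only HF_TOKEN is marked optional above.
REQUIRED = (
    "ANTHROPIC_API_KEY",
    "SAMBANOVA_API_KEY",
    "OPENAI_API_KEY",
    "GOOGLE_API_KEY",
    "ELEVENLABS_API_KEY",
)
OPTIONAL = ("HF_TOKEN",)


def missing_keys(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED if not env.get(name)]


missing = missing_keys()
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All required keys are set.")
```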
License
MIT License - Feel free to use, modify, and share!
Built with love for MCP's 1st Birthday Hackathon 2025
Pip uses: Anthropic ($25K), OpenAI ($25), HuggingFace ($25), SambaNova ($25), ElevenLabs ($44), Modal ($250), Blaxel ($250)