---
title: Pip - Emotional AI Companion
emoji: 🫧
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 6.0.1
app_file: app.py
pinned: true
license: mit
short_description: A blob friend who transforms your feelings into visual art
tags:
  - mcp-in-action-track-creative
  - mcp-in-action-track-consumer
  - agents
  - mcp
---

🎥 Demo Video: https://youtu.be/bWDj4gyngNI

📢 Social Post: https://x.com/07amit10/status/1995270517251801162

👥 Team: @Itsjustamit

🫧 Pip - Your Emotional AI Companion

Pip is a cute blob companion that understands your emotions and responds with conversation, context-specific imagery, and a soothing voice.

Not a generic assistant - Pip is an emotional friend who knows when to reflect, celebrate, or gently intervene.


✨ What Makes Pip Special

Emotional Intelligence

Pip doesn't just respond - it understands. Using Claude's nuanced emotional analysis, Pip detects (see the sketch after this list):

  • Multiple co-existing emotions
  • Emotional intensity
  • Underlying needs (validation, comfort, celebration)
  • When gentle intervention might help
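
A minimal sketch of what this analysis call could look like with the Anthropic Python SDK. The prompt, JSON schema, and model alias below are illustrative assumptions, not Pip's actual code:

```python
# Illustrative only: ask Claude for a structured read of the message.
import json
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def analyze_emotions(message: str) -> dict:
    """Return co-existing emotions, intensity (1-10), needs, and an intervention flag."""
    response = client.messages.create(
        model="claude-3-5-sonnet-latest",  # placeholder model alias
        max_tokens=300,
        system=(
            "Analyze the user's message. Reply with JSON only: "
            '{"emotions": [...], "intensity": 1-10, '
            '"needs": [...], "intervention": true|false}'
        ),
        messages=[{"role": "user", "content": message}],
    )
    return json.loads(response.content[0].text)

print(analyze_emotions("I aced the interview but I can't stop worrying about money"))
```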

Context-Specific Imagery

Every image Pip creates is unique to your conversation. Not generic stock photos - visual art that captures YOUR emotional moment (prompt building is sketched after this list):

  • Mood Alchemist: Transform emotions into magical artifacts
  • Day's Artist: Turn your day into impressionistic art
  • Dream Weaver: Visualize thoughts in surreal imagery
  • Night Companion: Calming visuals for 3am moments
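
As a hedged illustration of how the four styles could turn into image prompts, here is a tiny template lookup. These strings are invented for this README and are not the prompts Pip actually uses:

```python
# Hypothetical per-mode templates; the real prompts live inside the app.
MODE_TEMPLATES = {
    "alchemist": "a glowing magical artifact embodying {emotion}, forged from {details}",
    "artist": "an impressionistic painting of a day spent {details}, suffused with {emotion}",
    "dream": "a surreal dreamscape where {details} drift through {emotion}-colored light",
    "night": "a calm, dimly lit scene of {details}, soft {emotion} tones, 3am stillness",
}

def build_image_prompt(mode: str, emotion: str, details: str) -> str:
    # Fall back to the alchemist style when the requested mode is unknown.
    template = MODE_TEMPLATES.get(mode, MODE_TEMPLATES["alchemist"])
    return template.format(emotion=emotion, details=details)

print(build_image_prompt("night", "quiet relief", "an empty kitchen after the guests leave"))
```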

Multi-Service Architecture

Pip uses multiple AI services intelligently:

| Service | Role |
| --- | --- |
| Anthropic Claude | Deep emotional analysis, intervention logic |
| SambaNova | Fast acknowledgments, prompt enhancement |
| OpenAI | Image generation, speech-to-text (Whisper) |
| Google Gemini | Image generation (load balanced) |
| Flux/SDXL | Artistic image generation (via Modal/HuggingFace) |
| ElevenLabs | Expressive voice with emotional tone matching |

Low-Latency Design

Pip is designed for responsiveness (see the concurrency sketch after this list):

  • Quick acknowledgment (< 500ms)
  • Progressive state changes while processing
  • Parallel task execution
  • Streaming responses
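
A rough sketch of that pattern with asyncio. The helper functions and timings are stand-ins, not Pip's actual pipeline:

```python
# Illustrative concurrency pattern: answer fast, then do the slow work in parallel.
import asyncio

async def quick_ack(message: str) -> str:
    return "I'm here - give me a second..."   # the sub-500ms path

async def analyze(message: str) -> str:
    await asyncio.sleep(1.0)                  # stand-in for the Claude call
    return "quiet pride"

async def make_image(message: str) -> str:
    await asyncio.sleep(1.5)                  # stand-in for an image API call
    return "artifact.png"

async def handle_message(message: str) -> None:
    print(await quick_ack(message))           # immediate acknowledgment
    mood, image = await asyncio.gather(analyze(message), make_image(message))
    print(f"Pip senses {mood} and painted {image}")

asyncio.run(handle_message("long day, but I finally shipped the project"))
```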

🎮 How to Use

Chat Interface

  1. Type how you're feeling or what's on your mind
  2. Watch Pip's expression change as it processes
  3. Receive a thoughtful response + custom image
  4. Optionally enable voice to hear Pip speak

Voice Input

  1. Click the microphone button
  2. Speak your thoughts
  3. Pip transcribes and responds with voice (the transcription step is sketched below)
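
Under the hood, the transcription step maps to OpenAI's Whisper endpoint. A minimal sketch, assuming the openai Python SDK and an audio file path handed over by the Gradio microphone component:

```python
# Minimal speech-to-text sketch (illustrative; not copied from app.py).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def transcribe(audio_path: str) -> str:
    """Send the recorded clip to Whisper and return the transcript text."""
    with open(audio_path, "rb") as audio_file:
        result = client.audio.transcriptions.create(model="whisper-1", file=audio_file)
    return result.text
```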

Modes

  • Auto: Pip decides the best visualization style
  • Alchemist: Emotions become magical artifacts
  • Artist: Your day becomes a painting
  • Dream: Thoughts become surreal visions
  • Night: Calming imagery for late hours

🤖 MCP Integration

Pip is available as an MCP (Model Context Protocol) server. Connect your AI agent!

For SSE-compatible clients (Cursor, Windsurf, Cline):

{
  "mcpServers": {
    "Pip": {
      "url": "https://YOUR-SPACE.hf.space/gradio_api/mcp/"
    }
  }
}

For stdio clients (Claude Desktop):

{
  "mcpServers": {
    "Pip": {
      "command": "npx",
      "args": [
        "mcp-remote",
        "https://YOUR-SPACE.hf.space/gradio_api/mcp/sse",
        "--transport",
        "sse-only"
      ]
    }
  }
}
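
You can also call the server straight from Python. A hedged sketch using the official MCP Python SDK's SSE client; the URL placeholder and tool arguments mirror the config above:

```python
# Illustrative MCP client; replace the URL with your Space's endpoint.
import asyncio
from mcp import ClientSession
from mcp.client.sse import sse_client

async def main() -> None:
    url = "https://YOUR-SPACE.hf.space/gradio_api/mcp/sse"
    async with sse_client(url) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.call_tool(
                "chat_with_pip",
                {"message": "rough day, talk to me", "session_id": "demo"},
            )
            print(result)

asyncio.run(main())
```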

Available MCP Tools

  • chat_with_pip(message, session_id) - Talk to Pip
  • generate_mood_artifact(emotion, context) - Create emotional art
  • get_pip_gallery(session_id) - View conversation history
  • set_pip_mode(mode, session_id) - Change interaction mode
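
On the server side, Gradio can expose functions like these as MCP tools automatically. A minimal sketch, assuming a Gradio release with MCP support; the function body is a placeholder, not Pip's logic:

```python
# Gradio derives the MCP tool name and description from the function and its docstring.
import gradio as gr

def chat_with_pip(message: str, session_id: str) -> str:
    """Talk to Pip and receive an emotionally aware reply."""
    return f"Pip (session {session_id}) heard: {message}"  # placeholder response

demo = gr.Interface(fn=chat_with_pip, inputs=["text", "text"], outputs="text")

if __name__ == "__main__":
    demo.launch(mcp_server=True)  # serves the tools under /gradio_api/mcp/
```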

🧠 The Architecture

User Input
    ↓
┌─────────────────────────────────────┐
│  SambaNova: Quick Acknowledgment    │ ← Immediate response
└─────────────────────────────────────┘
    ↓
┌─────────────────────────────────────┐
│  Claude: Emotion Analysis           │ ← Deep understanding
│  - Primary emotions                 │
│  - Intensity (1-10)                 │
│  - Intervention needed?             │
└─────────────────────────────────────┘
    ↓
┌─────────────────────────────────────┐
│  Claude: Action Decision            │ ← What should Pip do?
│  - reflect / celebrate / comfort    │
│  - calm / energize / intervene      │
└─────────────────────────────────────┘
    ↓
┌─────────────────────────────────────┐
│  SambaNova: Prompt Enhancement      │ ← Create vivid image prompt
│  (Context-specific, never generic)  │
└─────────────────────────────────────┘
    ↓
┌─────────────────────────────────────┐
│  Image Generation (Load Balanced)   │
│  ┌────────┐ ┌────────┐ ┌────────┐   │
│  │ OpenAI │ │ Gemini │ │  Flux  │   │
│  └────────┘ └────────┘ └────────┘   │
└─────────────────────────────────────┘
    ↓
┌─────────────────────────────────────┐
│  Claude/SambaNova: Response         │ ← Streaming text
│  (Load balanced for efficiency)     │
└─────────────────────────────────────┘
    ↓
┌─────────────────────────────────────┐
│  ElevenLabs: Voice (Optional)       │ ← Emotional tone matching
└─────────────────────────────────────┘
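
The "Load Balanced" box can be read as a shuffle-and-fallback loop over the image backends. A rough sketch with placeholder provider functions, not the app's real client code:

```python
# Try the image backends in a shuffled order; fall through on failure.
import random

def via_openai(prompt: str) -> str:
    raise RuntimeError("quota exceeded")      # pretend this backend is down

def via_gemini(prompt: str) -> str:
    return f"gemini_image_for({prompt})"

def via_flux(prompt: str) -> str:
    return f"flux_image_for({prompt})"

def generate_image(prompt: str) -> str:
    providers = [via_openai, via_gemini, via_flux]
    random.shuffle(providers)                 # spread load across backends
    for provider in providers:
        try:
            return provider(prompt)
        except Exception:
            continue                          # move on to the next backend
    raise RuntimeError("all image backends failed")

print(generate_image("a teacup holding a tiny thunderstorm"))
```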

🎨 Pip's Expressions

Pip has 10 distinct emotional states with unique animations (a small mapping sketch follows the list):

  • Neutral (gentle wobble)
  • Happy (bouncing)
  • Sad (drooping)
  • Thinking (looking up, swaying)
  • Concerned (worried eyebrows, shaking)
  • Excited (energetic bouncing with sparkles)
  • Sleepy (half-closed eyes, breathing)
  • Listening (wide eyes, pulsing)
  • Attentive (leaning forward)
  • Speaking (animated mouth)
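
One plausible way to wire detected emotions to these states; the emotion keys and animation names here are invented for illustration:

```python
# Hypothetical emotion-to-expression lookup; not the app's real mapping.
EXPRESSIONS = {
    "joy": ("happy", "bounce"),
    "sadness": ("sad", "droop"),
    "worry": ("concerned", "shake"),
    "excitement": ("excited", "sparkle-bounce"),
    "fatigue": ("sleepy", "breathe"),
}

def pick_expression(emotion: str) -> tuple[str, str]:
    # Default to the neutral wobble when an emotion isn't mapped.
    return EXPRESSIONS.get(emotion, ("neutral", "wobble"))
```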

💡 Key Features

Intervention Without Preaching

When Pip detects concerning emotional signals, it doesn't lecture. Instead:

  • Brief acknowledgment
  • Gentle redirect to curiosity/wonder
  • Show something beautiful or intriguing
  • Invite engagement, not advice

Not Generic

Every image prompt is crafted from YOUR specific words and context. Pip extracts:

  • Specific details you mentioned
  • Emotional undertones
  • Time/context clues
  • Your unique situation

🛠️ Tech Stack

  • Frontend: Gradio
  • Character: SVG + CSS animations
  • LLMs: Anthropic Claude, SambaNova (Llama)
  • Images: OpenAI DALL-E 3, Google Imagen, Flux
  • Voice: ElevenLabs (Flash v2.5 for speed, v3 for expression)
  • STT: OpenAI Whisper
  • Compute: Modal (for Flux/SDXL)
  • Hosting: HuggingFace Spaces

🔧 Environment Variables

ANTHROPIC_API_KEY=your_key
SAMBANOVA_API_KEY=your_key
OPENAI_API_KEY=your_key
GOOGLE_API_KEY=your_key
ELEVENLABS_API_KEY=your_key
HF_TOKEN=your_token (optional, for HuggingFace models)
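
A small startup check for those keys might look like this; it is illustrative, and the real app may handle missing keys differently:

```python
# Fail fast if a required key is missing; warn for the optional one.
import os
import sys

REQUIRED = ["ANTHROPIC_API_KEY", "SAMBANOVA_API_KEY", "OPENAI_API_KEY",
            "GOOGLE_API_KEY", "ELEVENLABS_API_KEY"]

missing = [key for key in REQUIRED if not os.getenv(key)]
if missing:
    sys.exit(f"Missing required keys: {', '.join(missing)}")
if not os.getenv("HF_TOKEN"):
    print("Note: HF_TOKEN not set; HuggingFace-hosted models may be unavailable.")
```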

📝 License

MIT License - Feel free to use, modify, and share!


Built with 💙 for MCP's 1st Birthday Hackathon 2025

Pip uses: Anthropic ($25K), OpenAI ($25), HuggingFace ($25), SambaNova ($25), ElevenLabs ($44), Modal ($250), Blaxel ($250)