Spaces: Bellok/warbler-cda

Bellok committed 11 days ago

Commit f22e6ff (1 parent: 1635a41)

docs: enhance README with search mode guides and app info updates, add entanglement resonance feature

- Add detailed explanations for semantic and hybrid search modes in README
- Update query examples with semantic and FractalStat hybrid searches
- Enhance app description with document count and fallback features
- Implement entanglement_resonance function for cross-coordinate conceptual connections in fractalstat_rag_bridge.py

Files changed (3)

README.md +33 -4
app.py +7 -6
warbler_cda/fractalstat_rag_bridge.py +149 -0

README.md CHANGED

@@ -140,12 +140,22 @@ cd warbler-cda-package/k8s
 # Health check
 curl http://localhost:8000/health
-# Query the system
 curl -X POST http://localhost:8000/query \
   -H "Content-Type: application/json" \
   -d '{
-    "query_id": "test1",
-    "semantic_query": "hello world",
     "max_results": 5
   }'
@@ -153,6 +163,25 @@ curl -X POST http://localhost:8000/query \
 curl http://localhost:8000/metrics
 ```
 ### Using Python Programmatically
 ```python
@@ -395,4 +424,4 @@ MIT License - see [LICENSE](LICENSE) for details.
 ---
-### **Made with ❤️ by Tiny Walnut Games**

 # Health check
 curl http://localhost:8000/health
+# Semantic search (plain English queries)
 curl -X POST http://localhost:8000/query \
   -H "Content-Type: application/json" \
   -d '{
+    "query_id": "semantic1",
+    "semantic_query": "dancing under the moon",
+    "max_results": 5
+  }'
+# FractalStat hybrid search (technical/science with dimensional awareness)
+curl -X POST http://localhost:8000/query \
+  -H "Content-Type: application/json" \
+  -d '{
+    "query_id": "hybrid1",
+    "semantic_query": "interplanetary approach maneuvers",
+    "fractalstat_hybrid": true,
     "max_results": 5
   }'
 curl http://localhost:8000/metrics
 ```
+### Understanding Search Modes
+The system provides two search approaches with intelligent fallback:
+#### Semantic Search (Default)
+- **Use for**: Plain English queries, casual search, general questions
+- **Behavior**: Pure semantic similarity matching
+- **Examples**: "How does gravity work?", "tell me about dancing", "operating a spaceship"
+- **Results**: Always returns matches when available, best for natural language
+#### FractalStat Hybrid Search
+- **Use for**: Technical/scientific queries, specific terminology, multi-dimensional search
+- **Behavior**: Combines semantic similarity with 8D FractalStat resonance
+- **Examples**: "rotation dynamics of Saturn's moons", "quantum chromodynamics", "interplanetary approach maneuvers"
+- **Results**: Superior for technical content, may filter out general results
+- **Fallback**: Automatically switches to semantic search if hybrid returns no results
+**Pro Tip**: When hybrid search fails (scores below the 0.3 threshold), the system automatically falls back to semantic search, so you still get results whenever semantic matches are available.
 ### Using Python Programmatically
 ```python
 ---
+### **Made with ❤️ by Tiny Walnut Games**
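
The two request bodies added to the README above can also be built from Python. The following is a minimal sketch, not the project's own client: `build_query` and `post_query` are hypothetical helper names, the field names are taken from the curl examples, and the shape of the JSON response is not shown in this commit, so it is returned unparsed beyond `json.loads`.

```python
import json
import urllib.request


def build_query(query_id, text, hybrid=False, max_results=5):
    """Build a /query request body using the fields shown in the README curl examples."""
    payload = {
        "query_id": query_id,
        "semantic_query": text,
        "max_results": max_results,
    }
    if hybrid:
        # Only the hybrid example in the README sets this flag.
        payload["fractalstat_hybrid"] = True
    return payload


def post_query(payload, base_url="http://localhost:8000"):
    """POST the payload to /query and return the decoded JSON (response shape is an assumption)."""
    request = urllib.request.Request(
        f"{base_url}/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


semantic_payload = build_query("semantic1", "dancing under the moon")
hybrid_payload = build_query("hybrid1", "interplanetary approach maneuvers", hybrid=True)
```

With the service running locally, `post_query(hybrid_payload)` would send the hybrid request. According to the README text, the fallback to semantic search happens on the server, so the client does not need to retry.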

app.py CHANGED

@@ -282,14 +282,15 @@ def get_system_stats() -> str:
 with gr.Blocks(title="Warbler CDA - FractalStat RAG") as demo:
     gr.Markdown("""
     # 🦜 Warbler CDA - FractalStat RAG System
-    Semantic retrieval with 8D FractalStat multi-dimensional addressing.
     **Features:**
-    - 100k+ documents from arXiv, education, fiction, and more
-    - Hybrid semantic + FractalStat scoring
     - Bob the Skeptic bias detection
-    - Narrative coherence analysis
     """)
     with gr.Tab("Query"):

 with gr.Blocks(title="Warbler CDA - FractalStat RAG") as demo:
     gr.Markdown("""
     # 🦜 Warbler CDA - FractalStat RAG System
+    Semantic retrieval with 8D FractalStat multi-dimensional addressing and intelligent fallback.
     **Features:**
+    - 165,000+ documents from arXiv, novels, education, and fiction
+    - Hybrid semantic + FractalStat scoring with automatic fallback
+    - Smart scoring: semantic search works for plain English, hybrid excels at technical queries
     - Bob the Skeptic bias detection
+    - ZeroGPU compatible for reliable HuggingFace Spaces deployment
     """)
     with gr.Tab("Query"):

warbler_cda/fractalstat_rag_bridge.py CHANGED

@@ -129,6 +129,155 @@ def cosine_similarity(a: List[float], b: List[float]) -> float:
     return dot / denom
 def fractalstat_resonance(
     query_fractalstat: FractalStatAddress,
     doc_fractalstat: FractalStatAddress

     return dot / denom
+# ============================================================================
+# ENTANGLEMENT RESONANCE: Cross-Coordinate Conceptual Connections
+# ============================================================================
+def entanglement_resonance(
+    query_text: str,
+    doc_content: str,
+    query_fractalstat: FractalStatAddress,
+    doc_fractalstat: FractalStatAddress
+) -> float:
+    """
+    Compute ENTANGLEMENT resonance between query and document.
+    Entanglement connects ACROSS coordinate gaps - it finds conceptual relationships
+    that transcend mathematical positioning. This is the "telepathic" dimension.
+    Examples:
+    - "sound barrier" ↔ "sound waves" (share "sound" concept despite different coords)
+    - "dance" ↔ "dancing" ↔ "danced" (morphological entanglement)
+    - "moon's orbit" ↔ "planetary dynamics" can entangle through "celestial mechanics"
+    Algorithm:
+    1. Extract semantic concepts from text
+    2. Find coordinate-independent relationships
+    3. Score entanglement strength
+    """
+    # Semantic field associations (expandable knowledge base)
+    SEMANTIC_FIELDS = {
+        # Physics/Sound
+        "sound": {"sound", "sonic", "audio", "noise", "waves", "vibration", "frequency"},
+        "motion": {"move", "kinetic", "dynamic", "rotation", "orbit", "trajectory", "velocity", "acceleration"},
+        "space": {"space", "planetary", "astronomical", "celestial", "cosmological", "orbital", "satellite"},
+        # Actions
+        "dance": {"dance", "dancing", "danced", "choreography", "movement", "rhythm"},
+        "walk": {"walk", "walking", "walked", "step", "stride", "amble", "stroll"},
+        # Emotion/Knowledge
+        "wisdom": {"wisdom", "knowledge", "understanding", "insight", "sapience", "discernment"},
+        "learn": {"learn", "learning", "learned", "education", "study", "teach", "knowledge"},
+        # Technology
+        "machine": {"machine", "mechanical", "automated", "robotic", "device", "apparatus"},
+        "compute": {"compute", "calculate", "algorithm", "process", "logic", "programming"},
+    }
+    # Add morphological variations (basic suffix stripping, not true stemming).
+    # Each stemmer is only called on words that end with its suffix, so slicing
+    # removes the suffix alone and leaves earlier occurrences untouched.
+    MORPHOLOGICAL_PATTERNS = {
+        "ing": lambda word: word[:-3],  # walking → walk
+        "ed": lambda word: word[:-2],   # walked → walk
+        "er": lambda word: word[:-2],   # learner → learn
+        "est": lambda word: word[:-3],  # fastest → fast
+        "s": lambda word: word[:-1],    # runs → run
+    }
+    def extract_concepts(text: str) -> set:
+        """Extract semantic concepts from text including morphological variations."""
+        words = set(text.lower().split())
+        concepts = set(words)  # Start with raw words
+        # Add stemmed variations
+        for word in words:
+            for suffix, stemmer in MORPHOLOGICAL_PATTERNS.items():
+                if word.endswith(suffix) and len(word) > len(suffix) + 1:
+                    stemmed = stemmer(word)
+                    if len(stemmed) > 2:  # Avoid too-short stems
+                        concepts.add(stemmed)
+        # Add semantic field memberships
+        for word in list(concepts):
+            for field, field_words in SEMANTIC_FIELDS.items():
+                if word in field_words:
+                    # Add the field concept itself
+                    concepts.add(field)
+        return concepts
+    def calculate_concept_overlap(query_concepts: set, doc_concepts: set) -> float:
+        """Calculate overlap between concept sets."""
+        if not query_concepts or not doc_concepts:
+            return 0.0
+        intersection = query_concepts & doc_concepts
+        union = query_concepts | doc_concepts
+        overlap_score = len(intersection) / len(union) if union else 0.0
+        # Bonus for semantic field overlaps (deeper conceptual connection)
+        semantic_field_overlaps = sum(1 for concept in intersection if concept in SEMANTIC_FIELDS)
+        field_bonus = semantic_field_overlaps * 0.2  # +0.2 per shared field; the sum is capped at 1.0 below
+        return min(overlap_score + field_bonus, 1.0)
+    def calculate_bridge_distance(concept1: str, concept2: str) -> int:
+        """Calculate how many "bridges" separate concepts."""
+        # Two concepts are bridged when a single semantic field contains both of them
+        for words in SEMANTIC_FIELDS.values():
+            if concept1 in words and concept2 in words:
+                return 1  # Same field = direct entanglement
+        return 3  # No shared field = distant entanglement
+    # Extract concepts
+    query_concepts = extract_concepts(query_text)
+    doc_concepts = extract_concepts(doc_content)
+    # Calculate direct overlap
+    direct_overlap = calculate_concept_overlap(query_concepts, doc_concepts)
+    # Calculate bridged connections (entanglement can span)
+    total_bridge_strength = 0.0
+    bridge_count = 0
+    for q_concept in query_concepts:
+        for d_concept in doc_concepts:
+            if q_concept != d_concept:  # Avoid double-counting direct overlaps
+                bridge_distance = calculate_bridge_distance(q_concept, d_concept)
+                if bridge_distance > 0:
+                    bridge_strength = 1.0 / bridge_distance  # Closer = stronger
+                    total_bridge_strength += bridge_strength
+                    bridge_count += 1
+    # Average bridge strength across all concept pairs
+    avg_bridge_strength = total_bridge_strength / max(bridge_count, 1)
+    # Combine direct overlap with bridged connections
+    total_entanglement = direct_overlap * 0.7 + avg_bridge_strength * 0.3
+    # Scale based on concept set sizes (richer entanglement = higher scale)
+    concept_richness = min(len(query_concepts), len(doc_concepts)) / 10.0
+    scale_factor = 0.8 + concept_richness  # 0.8 and up (1.8 at 10 concepts); the result is clamped to 1.0 below
+    final_entanglement = min(total_entanglement * scale_factor, 1.0)
+    return final_entanglement
 def fractalstat_resonance(
     query_fractalstat: FractalStatAddress,
     doc_fractalstat: FractalStatAddress
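
To make the scoring arithmetic in `entanglement_resonance` easier to follow, here is a small worked sketch of its last two stages: the Jaccard overlap with the per-field bonus, and the 0.7/0.3 blend with size-based scaling. It is a simplification under stated assumptions, not the committed code: `concept_overlap` and `combine` are hypothetical names, concept extraction is skipped (the sets are given directly), and the average bridge strength of 1/3 is an illustrative input (the value a pair with no shared field contributes).

```python
def concept_overlap(query_concepts, doc_concepts, field_names):
    """Jaccard overlap plus 0.2 for each shared concept that is a field name, capped at 1.0."""
    if not query_concepts or not doc_concepts:
        return 0.0
    shared = query_concepts & doc_concepts
    jaccard = len(shared) / len(query_concepts | doc_concepts)
    field_bonus = 0.2 * sum(1 for concept in shared if concept in field_names)
    return min(jaccard + field_bonus, 1.0)


def combine(direct_overlap, avg_bridge_strength, query_size, doc_size):
    """Blend direct overlap (70%) with bridge strength (30%), then scale by concept-set size."""
    total = direct_overlap * 0.7 + avg_bridge_strength * 0.3
    scale = 0.8 + min(query_size, doc_size) / 10.0
    return min(total * scale, 1.0)


# Hand-picked concept sets; the real function derives these from the texts.
query = {"sound", "barrier"}
doc = {"sound", "waves"}

direct = concept_overlap(query, doc, field_names={"sound", "motion", "space"})
score = combine(direct, avg_bridge_strength=1 / 3, query_size=len(query), doc_size=len(doc))
print(f"direct overlap = {direct:.3f}, entanglement = {score:.3f}")
```

Here one of three distinct concepts is shared, giving a Jaccard score of 1/3, and `sound` is a field name, which adds 0.2. Blending with the bridge term gives 0.7 × 0.533 + 0.3 × 0.333 ≈ 0.473, and with two concepts on each side the scale factor is 0.8 + 0.2 = 1.0, so the final score stays at about 0.473.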