Bellok committed · Commit f22e6ff · 1 Parent(s): 1635a41
docs: enhance README with search mode guides and app info updates, add entanglement resonance feature
- Add detailed explanations for semantic and hybrid search modes in README
- Update query examples with semantic and FractalStat hybrid searches
- Enhance app description with document count and fallback features
- Implement entanglement_resonance function for cross-coordinate conceptual connections in fractalstat_rag_bridge.py
- README.md +33 -4
- app.py +7 -6
- warbler_cda/fractalstat_rag_bridge.py +149 -0
README.md
CHANGED

    @@ -140,12 +140,22 @@ cd warbler-cda-package/k8s
     # Health check
     curl http://localhost:8000/health
     
    -#
    +# Semantic search (plain English queries)
     curl -X POST http://localhost:8000/query \
       -H "Content-Type: application/json" \
       -d '{
    -    "query_id": "
    -    "semantic_query": "
    +    "query_id": "semantic1",
    +    "semantic_query": "dancing under the moon",
    +    "max_results": 5
    +  }'
    +
    +# FractalStat hybrid search (technical/science with dimensional awareness)
    +curl -X POST http://localhost:8000/query \
    +  -H "Content-Type: application/json" \
    +  -d '{
    +    "query_id": "hybrid1",
    +    "semantic_query": "interplanetary approach maneuvers",
    +    "fractalstat_hybrid": true,
         "max_results": 5
       }'
     
    @@ -153,6 +163,25 @@ curl -X POST http://localhost:8000/query \
     curl http://localhost:8000/metrics
     ```
     
    +### Understanding Search Modes
    +
    +The system provides two search approaches with intelligent fallback:
    +
    +#### Semantic Search (Default)
    +- **Use for**: Plain English queries, casual search, general questions
    +- **Behavior**: Pure semantic similarity matching
    +- **Examples**: "How does gravity work?", "tell me about dancing", "operating a spaceship"
    +- **Results**: Always returns matches when available, best for natural language
    +
    +#### FractalStat Hybrid Search
    +- **Use for**: Technical/scientific queries, specific terminology, multi-dimensional search
    +- **Behavior**: Combines semantic similarity with 8D FractalStat resonance
    +- **Examples**: "rotation dynamics of Saturn's moons", "quantum chromodynamics", "interplanetary approach maneuvers"
    +- **Results**: Superior for technical content, may filter out general results
    +- **Fallback**: Automatically switches to semantic search if hybrid returns no results
    +
    +**Pro Tip**: When hybrid search fails (threshold below 0.3), the system automatically falls back to semantic search, ensuring you always get relevant results.
    +
     ### Using Python Programmatically
     
     ```python
    @@ -395,4 +424,4 @@ MIT License - see [LICENSE](LICENSE) for details.
     
     ---
     
    -### **Made with ❤️ by Tiny Walnut Games**
    +### **Made with ❤️ by Tiny Walnut Games**

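The fallback described in the README hunk (try hybrid first, use semantic search when hybrid returns nothing) can be sketched on the client side as follows. The payload fields are taken from the curl examples in the diff. `build_query`, `search_with_fallback`, and the `send` callable are hypothetical names, and the sketch assumes `send` returns a list of results, because the response format is not shown in this commit. The README states that the service performs this fallback itself, so this only illustrates the logic and is not the project's implementation.

```python
from typing import Any, Callable, Dict, List


def build_query(query_id: str, text: str, hybrid: bool = False,
                max_results: int = 5) -> Dict[str, Any]:
    """Build a /query payload using the field names from the curl examples."""
    payload: Dict[str, Any] = {
        "query_id": query_id,
        "semantic_query": text,
        "max_results": max_results,
    }
    if hybrid:
        # Only the hybrid example in the README sets this flag.
        payload["fractalstat_hybrid"] = True
    return payload


def search_with_fallback(send: Callable[[Dict[str, Any]], List[Any]],
                         query_id: str, text: str,
                         max_results: int = 5) -> List[Any]:
    """Try hybrid search first; retry as plain semantic search if it is empty.

    `send` is an assumed transport function (for example a thin wrapper
    around an HTTP POST) that returns a list of results.
    """
    results = send(build_query(query_id, text, hybrid=True,
                               max_results=max_results))
    if results:
        return results
    return send(build_query(query_id, text, hybrid=False,
                            max_results=max_results))
```

Passing the transport in as a callable keeps the sketch independent of any HTTP library and makes the fallback path easy to exercise with a stub.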
app.py
CHANGED

    @@ -282,14 +282,15 @@ def get_system_stats() -> str:
     with gr.Blocks(title="Warbler CDA - FractalStat RAG") as demo:
         gr.Markdown("""
         # 🐦 Warbler CDA - FractalStat RAG System
    -
    -    Semantic retrieval with 8D FractalStat multi-dimensional addressing.
    -
    +
    +    Semantic retrieval with 8D FractalStat multi-dimensional addressing and intelligent fallback.
    +
         **Features:**
    -    -
    -    - Hybrid semantic + FractalStat scoring
    +    - 165,000+ documents from arXiv, novels, education, and fiction
    +    - Hybrid semantic + FractalStat scoring with automatic fallback
    +    - Smart scoring: semantic search works for plain English, hybrid excels at technical queries
         - Bob the Skeptic bias detection
    -    -
    +    - ZeroGPU compatible for reliable HuggingFace Spaces deployment
         """)
     
         with gr.Tab("Query"):

warbler_cda/fractalstat_rag_bridge.py
CHANGED

    @@ -129,6 +129,155 @@ def cosine_similarity(a: List[float], b: List[float]) -> float:
         return dot / denom
     
     
    +# ============================================================================
    +# ENTANGLEMENT RESONANCE: Cross-Coordinate Conceptual Connections
    +# ============================================================================
    +
    +def entanglement_resonance(
    +    query_text: str,
    +    doc_content: str,
    +    query_fractalstat: FractalStatAddress,
    +    doc_fractalstat: FractalStatAddress
    +) -> float:
    +    """
    +    Compute ENTANGLEMENT resonance between query and document.
    +
    +    Entanglement connects ACROSS coordinate gaps - it finds conceptual relationships
    +    that transcend mathematical positioning. This is the "telepathic" dimension.
    +
    +    Examples:
    +    - "sound barrier" ↔ "sound waves" (share "sound" concept despite different coords)
    +    - "dance" ↔ "dancing" ↔ "danced" (morphological entanglement)
    +    - "moon's orbit" ↔ "planetary dynamics" can entangle through "celestial mechanics"
    +
    +    Algorithm:
    +    1. Extract semantic concepts from text
    +    2. Find coordinate-independent relationships
    +    3. Score entanglement strength
    +    """
    +    # Semantic field associations (expandable knowledge base)
    +    SEMANTIC_FIELDS = {
    +        # Physics/Sound
    +        "sound": {"sound", "sonic", "audio", "noise", "waves", "sonic", "vibration", "frequency"},
    +        "motion": {"move", "kinetic", "dynamic", "rotation", "orbit", "trajectory", "velocity", "acceleration"},
    +        "space": {"space", "planetary", "astronomical", "celestial", "cosmological", "orbital", "satellite"},
    +
    +        # Actions
    +        "dance": {"dance", "dancing", "danced", "choreography", "movement", "rhythm"},
    +        "walk": {"walk", "walking", "walked", "step", "stride", "amble", "stroll"},
    +
    +        # Emotion/Knowledge
    +        "wisdom": {"wisdom", "knowledge", "understanding", "insight", "sapience", "discernment"},
    +        "learn": {"learn", "learning", "learned", "education", "study", "teach", "knowledge"},
    +
    +        # Technology
    +        "machine": {"machine", "mechanical", "automated", "robotic", "device", "apparatus"},
    +        "compute": {"compute", "calculate", "algorithm", "process", "logic", "programming"},
    +    }
    +
    +    # Add morphological variations (basic stemming)
    +    MORPHOLOGICAL_PATTERNS = {
    +        "ing": lambda word: word.replace("ing", ""),  # dancing → dance
    +        "ed": lambda word: word.replace("ed", ""),  # walked → walk
    +        "er": lambda word: word.replace("er", ""),  # learner → learn
    +        "est": lambda word: word.replace("est", ""),  # fastest → fast
    +        "s": lambda word: word.rstrip("s"),  # runs → run
    +    }
    +
    +    def extract_concepts(text: str) -> set:
    +        """Extract semantic concepts from text including morphological variations."""
    +        words = set(text.lower().split())
    +        concepts = set(words)  # Start with raw words
    +
    +        # Add stemmed variations
    +        for word in words:
    +            for suffix, stemmer in MORPHOLOGICAL_PATTERNS.items():
    +                if word.endswith(suffix) and len(word) > len(suffix) + 1:
    +                    stemmed = stemmer(word)
    +                    if len(stemmed) > 2:  # Avoid too-short stems
    +                        concepts.add(stemmed)
    +
    +        # Add semantic field memberships
    +        for word in list(concepts):
    +            for field, field_words in SEMANTIC_FIELDS.items():
    +                if word in field_words:
    +                    # Add the field concept itself
    +                    concepts.add(field)
    +
    +        return concepts
    +
    +    def calculate_concept_overlap(query_concepts: set, doc_concepts: set) -> float:
    +        """Calculate overlap between concept sets."""
    +        if not query_concepts or not doc_concepts:
    +            return 0.0
    +
    +        intersection = query_concepts & doc_concepts
    +        union = query_concepts | doc_concepts
    +
    +        overlap_score = len(intersection) / len(union) if union else 0.0
    +
    +        # Weight by concept quality (prefer multi-word overlaps and semantic fields)
    +        quality_weight = 1.0
    +
    +        # Bonus for semantic field overlaps (deeper conceptual connection)
    +        semantic_field_overlaps = 0
    +        for concept in intersection:
    +            if concept in SEMANTIC_FIELDS:
    +                semantic_field_overlaps += 1
    +
    +        field_bonus = semantic_field_overlaps * 0.2  # Up to 20% bonus per field overlap
    +
    +        return min(overlap_score + field_bonus, 1.0)
    +
    +    def calculate_bridge_distance(concept1: str, concept2: str) -> int:
    +        """Calculate how many "bridges" separate concepts."""
    +        # Find semantic fields that can bridge the gap
    +        bridge_fields = []
    +        for field, words in SEMANTIC_FIELDS.items():
    +            if concept1 in words or concept2 in words:
    +                bridge_fields.append(field)
    +
    +        # Concepts in same field have distance 1 (direct entanglement)
    +        if len(bridge_fields) > 0:
    +            return 1
    +        else:
    +            return 3  # No direct bridge = distant entanglement
    +
    +    # Extract concepts
    +    query_concepts = extract_concepts(query_text)
    +    doc_concepts = extract_concepts(doc_content)
    +
    +    # Calculate direct overlap
    +    direct_overlap = calculate_concept_overlap(query_concepts, doc_concepts)
    +
    +    # Calculate bridged connections (entanglement can span)
    +    total_bridge_strength = 0.0
    +    bridge_count = 0
    +
    +    for q_concept in query_concepts:
    +        for d_concept in doc_concepts:
    +            if q_concept != d_concept:  # Avoid double-counting direct overlaps
    +                bridge_distance = calculate_bridge_distance(q_concept, d_concept)
    +                if bridge_distance > 0:
    +                    bridge_strength = 1.0 / bridge_distance  # Closer = stronger
    +                    total_bridge_strength += bridge_strength
    +                    bridge_count += 1
    +
    +    # Average bridge strength across all concept pairs
    +    avg_bridge_strength = total_bridge_strength / max(bridge_count, 1)
    +
    +    # Combine direct overlap with bridged connections
    +    total_entanglement = direct_overlap * 0.7 + avg_bridge_strength * 0.3
    +
    +    # Scale based on concept set sizes (richer entanglement = higher scale)
    +    concept_richness = min(len(query_concepts), len(doc_concepts)) / 10.0
    +    scale_factor = 0.8 + concept_richness  # 0.8 to 1.8 scaling
    +
    +    final_entanglement = min(total_entanglement * scale_factor, 1.0)
    +
    +    return final_entanglement
    +
    +
     def fractalstat_resonance(
         query_fractalstat: FractalStatAddress,
         doc_fractalstat: FractalStatAddress

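Three details of the committed helpers are worth a second look. `word.replace("ing", "")` removes every occurrence of the substring rather than only the trailing suffix, so "singing" becomes "s" and "dancing" becomes "danc" rather than the "dance" promised by the comment. `word.rstrip("s")` strips all trailing "s" characters, not just one. And `calculate_bridge_distance` returns 1 when either concept appears in any field, although its comment describes concepts in the same field. A minimal sketch of one possible tightening follows. `strip_suffix`, `share_field`, `SUFFIXES`, and the trimmed `SEMANTIC_FIELDS` table are illustrative names that are not part of the commit, and unlike the committed loop this version strips only the first matching suffix.

```python
from typing import Dict, Set

# Trimmed, illustrative copy of two entries from the SEMANTIC_FIELDS table in the diff.
SEMANTIC_FIELDS: Dict[str, Set[str]] = {
    "sound": {"sound", "sonic", "audio", "noise", "waves", "vibration", "frequency"},
    "dance": {"dance", "dancing", "danced", "choreography", "movement", "rhythm"},
}

# Longer suffixes are listed first so "est" is tried before "s".
SUFFIXES = ("ing", "est", "ed", "er", "s")


def strip_suffix(word: str) -> str:
    """Remove at most one trailing suffix, leaving the rest of the word intact."""
    for suffix in SUFFIXES:
        # Require a stem longer than two characters, as the committed code does.
        if word.endswith(suffix) and len(word) - len(suffix) > 2:
            return word[: -len(suffix)]
    return word


def share_field(concept1: str, concept2: str) -> bool:
    """Return True only when both concepts belong to the same semantic field."""
    return any(
        concept1 in words and concept2 in words
        for words in SEMANTIC_FIELDS.values()
    )
```

Slicing still yields "danc" for "dancing"; recovering "dance" would need a real stemmer or lemmatizer, which is outside the scope of this sketch.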