Bellok committed on
Commit
f22e6ff
·
1 Parent(s): 1635a41

docs: enhance README with search mode guides and app info updates, add entanglement resonance feature


- Add detailed explanations for semantic and hybrid search modes in README
- Update query examples with semantic and FractalStat hybrid searches
- Enhance app description with document count and fallback features
- Implement entanglement_resonance function for cross-coordinate conceptual connections in fractalstat_rag_bridge.py
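
As an illustration of the fallback behavior described in these notes, here is a minimal sketch. It is not code from this commit: the name `query_with_fallback`, the two search callables, and reading the README's 0.3 figure as a minimum hybrid score are all assumptions.

```python
from typing import Callable, List, Tuple

Result = Tuple[str, float]  # (document id, score); hypothetical result shape


def query_with_fallback(
    query: str,
    hybrid_search: Callable[[str], List[Result]],    # assumed callable, not a real API
    semantic_search: Callable[[str], List[Result]],  # assumed callable, not a real API
    threshold: float = 0.3,  # assumption: the README's 0.3 is a minimum score
    max_results: int = 5,
) -> Tuple[str, List[Result]]:
    """Try hybrid search first; use semantic search if no hit clears the threshold."""
    hits = [r for r in hybrid_search(query) if r[1] >= threshold]
    mode = "hybrid"
    if not hits:
        # Nothing survived the hybrid filter, so fall back to plain semantic search.
        hits = semantic_search(query)
        mode = "semantic_fallback"
    # Highest score first, truncated to the requested number of results.
    hits = sorted(hits, key=lambda r: r[1], reverse=True)[:max_results]
    return mode, hits
```

Returning the mode alongside the results lets a caller report which path produced them.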

Files changed (3)
  1. README.md +33 -4
  2. app.py +7 -6
  3. warbler_cda/fractalstat_rag_bridge.py +149 -0
README.md CHANGED
@@ -140,12 +140,22 @@ cd warbler-cda-package/k8s
140
  # Health check
141
  curl http://localhost:8000/health
142
 
143
- # Query the system
144
  curl -X POST http://localhost:8000/query \
145
  -H "Content-Type: application/json" \
146
  -d '{
147
- "query_id": "test1",
148
- "semantic_query": "hello world",
149
  "max_results": 5
150
  }'
151
 
@@ -153,6 +163,25 @@ curl -X POST http://localhost:8000/query \
153
  curl http://localhost:8000/metrics
154
  ```
155
 
156
  ### Using Python Programmatically
157
 
158
  ```python
@@ -395,4 +424,4 @@ MIT License - see [LICENSE](LICENSE) for details.
395
 
396
  ---
397
 
398
- ### **Made with ❤️ by Tiny Walnut Games**
 
140
  # Health check
141
  curl http://localhost:8000/health
142
 
143
+ # Semantic search (plain English queries)
144
  curl -X POST http://localhost:8000/query \
145
  -H "Content-Type: application/json" \
146
  -d '{
147
+ "query_id": "semantic1",
148
+ "semantic_query": "dancing under the moon",
149
+ "max_results": 5
150
+ }'
151
+
152
+ # FractalStat hybrid search (technical/science with dimensional awareness)
153
+ curl -X POST http://localhost:8000/query \
154
+ -H "Content-Type: application/json" \
155
+ -d '{
156
+ "query_id": "hybrid1",
157
+ "semantic_query": "interplanetary approach maneuvers",
158
+ "fractalstat_hybrid": true,
159
  "max_results": 5
160
  }'
161
 
 
163
  curl http://localhost:8000/metrics
164
  ```
165
 
166
+ ### Understanding Search Modes
167
+
168
+ The system provides two search approaches with intelligent fallback:
169
+
170
+ #### Semantic Search (Default)
171
+ - **Use for**: Plain English queries, casual search, general questions
172
+ - **Behavior**: Pure semantic similarity matching
173
+ - **Examples**: "How does gravity work?", "tell me about dancing", "operating a spaceship"
174
+ - **Results**: Always returns matches when available, best for natural language
175
+
176
+ #### FractalStat Hybrid Search
177
+ - **Use for**: Technical/scientific queries, specific terminology, multi-dimensional search
178
+ - **Behavior**: Combines semantic similarity with 8D FractalStat resonance
179
+ - **Examples**: "rotation dynamics of Saturn's moons", "quantum chromodynamics", "interplanetary approach maneuvers"
180
+ - **Results**: Superior for technical content, may filter out general results
181
+ - **Fallback**: Automatically switches to semantic search if hybrid returns no results
182
+
183
+ **Pro Tip**: When hybrid search fails (threshold below 0.3), the system automatically falls back to semantic search, ensuring you always get relevant results.
184
+
185
  ### Using Python Programmatically
186
 
187
  ```python
 
424
 
425
  ---
426
 
427
+ ### **Made with ❤️ by Tiny Walnut Games**
app.py CHANGED
@@ -282,14 +282,15 @@ def get_system_stats() -> str:
282
  with gr.Blocks(title="Warbler CDA - FractalStat RAG") as demo:
283
  gr.Markdown("""
284
  # 🦜 Warbler CDA - FractalStat RAG System
285
-
286
- Semantic retrieval with 8D FractalStat multi-dimensional addressing.
287
-
288
  **Features:**
289
- - 100k+ documents from arXiv, education, fiction, and more
290
- - Hybrid semantic + FractalStat scoring
 
291
  - Bob the Skeptic bias detection
292
- - Narrative coherence analysis
293
  """)
294
 
295
  with gr.Tab("Query"):
 
282
  with gr.Blocks(title="Warbler CDA - FractalStat RAG") as demo:
283
  gr.Markdown("""
284
  # 🦜 Warbler CDA - FractalStat RAG System
285
+
286
+ Semantic retrieval with 8D FractalStat multi-dimensional addressing and intelligent fallback.
287
+
288
  **Features:**
289
+ - 165,000+ documents from arXiv, novels, education, and fiction
290
+ - Hybrid semantic + FractalStat scoring with automatic fallback
291
+ - Smart scoring: semantic search works for plain English, hybrid excels at technical queries
292
  - Bob the Skeptic bias detection
293
+ - ZeroGPU compatible for reliable HuggingFace Spaces deployment
294
  """)
295
 
296
  with gr.Tab("Query"):
warbler_cda/fractalstat_rag_bridge.py CHANGED
@@ -129,6 +129,155 @@ def cosine_similarity(a: List[float], b: List[float]) -> float:
129
  return dot / denom
130
 
131
 
132
  def fractalstat_resonance(
133
  query_fractalstat: FractalStatAddress,
134
  doc_fractalstat: FractalStatAddress
 
129
  return dot / denom
130
 
131
 
132
+ # ============================================================================
133
+ # ENTANGLEMENT RESONANCE: Cross-Coordinate Conceptual Connections
134
+ # ============================================================================
135
+
136
+ def entanglement_resonance(
137
+ query_text: str,
138
+ doc_content: str,
139
+ query_fractalstat: FractalStatAddress,
140
+ doc_fractalstat: FractalStatAddress
141
+ ) -> float:
142
+ """
143
+ Compute ENTANGLEMENT resonance between query and document.
144
+
145
+ Entanglement connects ACROSS coordinate gaps - it finds conceptual relationships
146
+ that transcend mathematical positioning. This is the "telepathic" dimension.
147
+
148
+ Examples:
149
+ - "sound barrier" ↔ "sound waves" (share "sound" concept despite different coords)
150
+ - "dance" ↔ "dancing" ↔ "danced" (morphological entanglement)
151
+ - "moon's orbit" ↔ "planetary dynamics" can entangle through "celestial mechanics"
152
+
153
+ Algorithm:
154
+ 1. Extract semantic concepts from text
155
+ 2. Find coordinate-independent relationships
156
+ 3. Score entanglement strength
157
+ """
158
+ # Semantic field associations (expandable knowledge base)
159
+ SEMANTIC_FIELDS = {
160
+ # Physics/Sound
161
+ "sound": {"sound", "sonic", "audio", "noise", "waves", "vibration", "frequency"},
162
+ "motion": {"move", "kinetic", "dynamic", "rotation", "orbit", "trajectory", "velocity", "acceleration"},
163
+ "space": {"space", "planetary", "astronomical", "celestial", "cosmological", "orbital", "satellite"},
164
+
165
+ # Actions
166
+ "dance": {"dance", "dancing", "danced", "choreography", "movement", "rhythm"},
167
+ "walk": {"walk", "walking", "walked", "step", "stride", "amble", "stroll"},
168
+
169
+ # Emotion/Knowledge
170
+ "wisdom": {"wisdom", "knowledge", "understanding", "insight", "sapience", "discernment"},
171
+ "learn": {"learn", "learning", "learned", "education", "study", "teach", "knowledge"},
172
+
173
+ # Technology
174
+ "machine": {"machine", "mechanical", "automated", "robotic", "device", "apparatus"},
175
+ "compute": {"compute", "calculate", "algorithm", "process", "logic", "programming"},
176
+ }
177
+
178
+ # Add morphological variations (basic stemming)
179
+ MORPHOLOGICAL_PATTERNS = {
180
+ "ing": lambda word: word[:-3], # walking → walk (strip the suffix only)
181
+ "ed": lambda word: word[:-2], # walked → walk
182
+ "er": lambda word: word[:-2], # learner → learn
183
+ "est": lambda word: word[:-3], # fastest → fast
184
+ "s": lambda word: word[:-1], # runs → run
185
+ }
186
+
187
+ def extract_concepts(text: str) -> set:
188
+ """Extract semantic concepts from text including morphological variations."""
189
+ words = set(text.lower().split())
190
+ concepts = set(words) # Start with raw words
191
+
192
+ # Add stemmed variations
193
+ for word in words:
194
+ for suffix, stemmer in MORPHOLOGICAL_PATTERNS.items():
195
+ if word.endswith(suffix) and len(word) > len(suffix) + 1:
196
+ stemmed = stemmer(word)
197
+ if len(stemmed) > 2: # Avoid too-short stems
198
+ concepts.add(stemmed)
199
+
200
+ # Add semantic field memberships
201
+ for word in list(concepts):
202
+ for field, field_words in SEMANTIC_FIELDS.items():
203
+ if word in field_words:
204
+ # Add the field concept itself
205
+ concepts.add(field)
206
+
207
+ return concepts
208
+
209
+ def calculate_concept_overlap(query_concepts: set, doc_concepts: set) -> float:
210
+ """Calculate overlap between concept sets."""
211
+ if not query_concepts or not doc_concepts:
212
+ return 0.0
213
+
214
+ intersection = query_concepts & doc_concepts
215
+ union = query_concepts | doc_concepts
216
+
217
+ overlap_score = len(intersection) / len(union) if union else 0.0
218
+
219
+ # Weight by concept quality (prefer multi-word overlaps and semantic fields)
220
+ quality_weight = 1.0
221
+
222
+ # Bonus for semantic field overlaps (deeper conceptual connection)
223
+ semantic_field_overlaps = 0
224
+ for concept in intersection:
225
+ if concept in SEMANTIC_FIELDS:
226
+ semantic_field_overlaps += 1
227
+
228
+ field_bonus = semantic_field_overlaps * 0.2 # Up to 20% bonus per field overlap
229
+
230
+ return min(overlap_score + field_bonus, 1.0)
231
+
232
+ def calculate_bridge_distance(concept1: str, concept2: str) -> int:
233
+ """Calculate how many "bridges" separate concepts."""
234
+ # Find semantic fields that can bridge the gap
235
+ bridge_fields = []
236
+ for field, words in SEMANTIC_FIELDS.items():
237
+ if concept1 in words and concept2 in words:
238
+ bridge_fields.append(field)
239
+
240
+ # Concepts in same field have distance 1 (direct entanglement)
241
+ if len(bridge_fields) > 0:
242
+ return 1
243
+ else:
244
+ return 3 # No direct bridge = distant entanglement
245
+
246
+ # Extract concepts
247
+ query_concepts = extract_concepts(query_text)
248
+ doc_concepts = extract_concepts(doc_content)
249
+
250
+ # Calculate direct overlap
251
+ direct_overlap = calculate_concept_overlap(query_concepts, doc_concepts)
252
+
253
+ # Calculate bridged connections (entanglement can span)
254
+ total_bridge_strength = 0.0
255
+ bridge_count = 0
256
+
257
+ for q_concept in query_concepts:
258
+ for d_concept in doc_concepts:
259
+ if q_concept != d_concept: # Avoid double-counting direct overlaps
260
+ bridge_distance = calculate_bridge_distance(q_concept, d_concept)
261
+ if bridge_distance > 0:
262
+ bridge_strength = 1.0 / bridge_distance # Closer = stronger
263
+ total_bridge_strength += bridge_strength
264
+ bridge_count += 1
265
+
266
+ # Average bridge strength across all concept pairs
267
+ avg_bridge_strength = total_bridge_strength / max(bridge_count, 1)
268
+
269
+ # Combine direct overlap with bridged connections
270
+ total_entanglement = direct_overlap * 0.7 + avg_bridge_strength * 0.3
271
+
272
+ # Scale based on concept set sizes (richer entanglement = higher scale)
273
+ concept_richness = min(len(query_concepts), len(doc_concepts)) / 10.0
274
+ scale_factor = 0.8 + concept_richness # 0.8 and up; the min() below caps the result at 1.0
275
+
276
+ final_entanglement = min(total_entanglement * scale_factor, 1.0)
277
+
278
+ return final_entanglement
279
+
280
+
281
  def fractalstat_resonance(
282
  query_fractalstat: FractalStatAddress,
283
  doc_fractalstat: FractalStatAddress
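
The direct-overlap part of `entanglement_resonance` is a Jaccard similarity over concept sets plus a flat bonus for each shared semantic field. The standalone sketch below shows that idea on a toy vocabulary. The names `FIELDS`, `concepts`, and `overlap` are hypothetical, and stemming and bridge scoring are left out, so this is a simplified illustration rather than the function from the diff.

```python
from typing import Dict, Set

# Toy semantic fields; a small hypothetical subset for illustration only.
FIELDS: Dict[str, Set[str]] = {
    "sound": {"sound", "sonic", "waves"},
    "dance": {"dance", "dancing", "danced"},
}


def concepts(text: str) -> Set[str]:
    """Lowercased words plus the name of every field that contains one of them."""
    words = set(text.lower().split())
    fields = {name for name, members in FIELDS.items() if words & members}
    return words | fields


def overlap(query: str, doc: str, field_bonus: float = 0.2) -> float:
    """Jaccard overlap of concept sets, plus a bonus per shared field name, capped at 1.0."""
    q, d = concepts(query), concepts(doc)
    if not q or not d:
        return 0.0
    shared = q & d
    jaccard = len(shared) / len(q | d)
    bonus = field_bonus * sum(1 for c in shared if c in FIELDS)
    return min(jaccard + bonus, 1.0)
```

In this sketch "dancing tonight" and "she danced" share no literal word, yet both map to the `dance` field, so their overlap is nonzero.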