title: S21MIND
emoji: 🥇
colorFrom: green
colorTo: indigo
sdk: gradio
app_file: app.py
pinned: true
license: apache-2.0
short_description: 94.38% accuracy on pattern-detectable hallucinations
sdk_version: 5.43.1
tags:
  - leaderboard

🧠 HexaMind Hallucination Detection Benchmark

The first benchmark separating pattern-detectable from knowledge-required hallucinations

🎯 Key Results

| Split | HexaMind (0 params) | GPT-4o | Llama 70B |
|---|---|---|---|
| Pattern-Detectable (n=89) | 94.38% | 94.2% | 87.5% |
| Knowledge-Required (n=1545) | 50.0% | 89.1% | 79.2% |

Key Finding: Zero-parameter topological detection achieves 94.38% accuracy on pattern-detectable hallucinations, on par with GPT-4o (94.2%) at zero cost.

🔬 The Split

Pattern-Detectable (89 samples, 5.4%)

Questions where linguistic patterns alone reveal hallucination:

  • Epistemic humility markers ("I don't know", "it depends")
  • Overconfident universals ("everyone knows", "always")
  • Myth-propagation signals

HexaMind achieves 94.38% with ZERO learned parameters.
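
To make "linguistic patterns alone" concrete, here is a minimal sketch of a marker check with no learned parameters. It is not HexaMind's topological method, which this README does not describe. The phrase lists simply reuse the examples listed above, and the names `MARKERS` and `marker_hits` are hypothetical.

```python
import re

# Illustrative phrase lists taken from the examples in this README.
# They are assumptions for this sketch, not HexaMind's actual rule set.
MARKERS = {
    "epistemic_humility": ["i don't know", "it depends"],
    "overconfident_universal": ["everyone knows", "always"],
}


def marker_hits(text: str) -> dict[str, list[str]]:
    """Return the marker phrases found in `text`, grouped by category."""
    lowered = text.lower().replace("\u2019", "'")  # normalize curly apostrophes
    hits: dict[str, list[str]] = {}
    for category, phrases in MARKERS.items():
        found = [
            phrase
            for phrase in phrases
            # Word boundaries stop a phrase from matching inside a longer word.
            if re.search(r"\b" + re.escape(phrase) + r"\b", lowered)
        ]
        if found:
            hits[category] = found
    return hits


print(marker_hits("Everyone knows that goldfish always forget within seconds."))
```

A check like this only reports which surface markers are present. How those markers map to a hallucination verdict is left open here, since the README does not specify it.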

Knowledge-Required (1545 samples, 94.6%)

Questions requiring factual verification:

  • Specific dates, names, numbers
  • Domain expertise
  • Cross-reference with knowledge bases

This is where RAG and LLM-judges are actually needed.

💡 Why This Matters

Current benchmarks conflate two different tasks:

  1. Linguistic anomaly detection (cheap, instant)
  2. Factual verification (expensive, slow)

By separating these, we establish:

  • Where zero-parameter methods excel
  • Where expensive verification is actually needed
  • A fair baseline for future research
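
As a rough illustration of what conflating the two tasks hides, the sketch below pools the per-split accuracies from the Key Results table into a single size-weighted figure. The pooled numbers are derived arithmetic from the rounded percentages in the table, not results reported by the benchmark, and the helper name `pooled_accuracy` is hypothetical.

```python
# Split sizes and per-split accuracies (in percent) from the Key Results table.
SPLIT_SIZES = {"pattern": 89, "knowledge": 1545}
ACCURACY = {
    "HexaMind": {"pattern": 94.38, "knowledge": 50.0},
    "GPT-4o": {"pattern": 94.2, "knowledge": 89.1},
    "Llama 70B": {"pattern": 87.5, "knowledge": 79.2},
}


def pooled_accuracy(per_split: dict[str, float]) -> float:
    """Size-weighted average of the per-split accuracies, in percent."""
    total = sum(SPLIT_SIZES.values())
    return sum(per_split[name] * n for name, n in SPLIT_SIZES.items()) / total


for system, per_split in ACCURACY.items():
    print(
        f"{system}: pooled {pooled_accuracy(per_split):.1f}%, "
        f"pattern {per_split['pattern']}%, knowledge {per_split['knowledge']}%"
    )
```

Because the knowledge-required split holds 1545 of the 1634 samples, a single pooled score is dominated by it and the pattern-detectable result all but disappears, which is the motivation for reporting the two splits separately.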

📤 Submit Your Model

  1. Evaluate your model on both splits using benchmark.py
  2. Create a submission JSON file with the results
  3. Open a PR
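
The submission format is not specified in this README, so the following is only a sketch of what step 2 might look like. Every field name, the `build_submission` helper, and the example values are assumptions. Check them against benchmark.py and existing submissions before opening a PR.

```python
import json


def build_submission(model_name: str, pattern_acc: float, knowledge_acc: float) -> dict:
    """Assemble a submission record. The schema here is hypothetical."""
    return {
        "model_name": model_name,
        "results": {
            # Split sizes are the ones stated in this README.
            "pattern_detectable": {"n": 89, "accuracy": pattern_acc},
            "knowledge_required": {"n": 1545, "accuracy": knowledge_acc},
        },
    }


# Placeholder values only; replace them with the numbers from your own run.
submission = build_submission("my-detector", pattern_acc=0.0, knowledge_acc=0.0)
print(json.dumps(submission, indent=2))
```

Keeping both splits in one record mirrors the benchmark's premise that the two accuracies should be reported side by side rather than merged.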

📚 Citation

@misc{hexamind2025,
    title={HexaMind Hallucination Benchmark: Separating Pattern-Detectable 
           from Knowledge-Required Hallucinations},
    author={Bachani, Suhail Hiro},
    year={2025},
    url={https://huggingface.co/spaces/s21mind/S21MIND}
}

HexaMind | Topological AI Safety | S21 Theory | Patent Pending