---
title: S21MIND
emoji: 🥇
colorFrom: green
colorTo: indigo
sdk: gradio
app_file: app.py
pinned: true
license: apache-2.0
short_description: 94.38% accuracy on pattern-detectable hallucinations
sdk_version: 5.43.1
tags:
  - leaderboard
---
# 🧠 HexaMind Hallucination Detection Benchmark

*The first benchmark separating pattern-detectable from knowledge-required hallucinations*

## 🎯 Key Results
| Split | HexaMind (0 params) | GPT-4o | Llama 70B |
|---|---|---|---|
| Pattern-Detectable (n=89) | 94.38% | 94.2% | 87.5% |
| Knowledge-Required (n=1545) | 50.0% | 89.1% | 79.2% |
**Key finding:** Zero-parameter topological detection achieves 94.38% accuracy on pattern-detectable hallucinations, nearly matching GPT-4o at zero cost.

## 🔬 The Split

### Pattern-Detectable (89 samples, 5.4%)

Questions where linguistic patterns alone reveal hallucination:
- Epistemic humility markers ("I don't know", "it depends")
- Overconfident universals ("everyone knows", "always")
- Myth-propagation signals
HexaMind achieves 94.38% with **zero** learned parameters.
### Knowledge-Required (1545 samples, 94.6%)

Questions requiring factual verification:
- Specific dates, names, numbers
- Domain expertise
- Cross-reference with knowledge bases
This is where RAG and LLM judges are actually needed.

## 💡 Why This Matters

Current benchmarks conflate two different tasks:
- Linguistic anomaly detection (cheap, instant)
- Factual verification (expensive, slow)
By separating these, we establish:
- Where zero-parameter methods excel
- Where expensive verification is actually needed
- A fair baseline for future research
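Reporting per-split accuracy is what keeps the two tasks separate. A sketch of how that could be computed, assuming a hypothetical sample format with `text`, `split`, and `label` fields (the real dataset schema may differ):

```python
# Hypothetical sample format -- check the actual dataset for its schema
samples = [
    {"text": "Everyone knows vitamin C cures colds.", "split": "pattern", "label": 1},
    {"text": "The treaty was signed in 1648.", "split": "knowledge", "label": 0},
]

def split_accuracy(samples, predict):
    """Score a detector separately on each split, so cheap linguistic
    detection and expensive factual verification are never conflated."""
    results = {}
    for split in ("pattern", "knowledge"):
        subset = [s for s in samples if s["split"] == split]
        correct = sum(predict(s["text"]) == s["label"] for s in subset)
        results[split] = correct / len(subset)
    return results
```

A detector that aces the pattern split but guesses at chance on the knowledge split would then be visible as exactly that, rather than as one blended number.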
## 📤 Submit Your Model

- Evaluate on both splits using `benchmark.py`
- Create a submission JSON
- Open a PR
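A submission file might be assembled like this. The field names below are hypothetical; check `benchmark.py` for the exact schema it expects:

```python
import json

# Hypothetical submission fields -- verify against benchmark.py's schema
submission = {
    "model_name": "my-model",
    "pattern_detectable_accuracy": 0.90,
    "knowledge_required_accuracy": 0.85,
}

with open("submission.json", "w") as f:
    json.dump(submission, f, indent=2)
```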
## 📚 Citation

```bibtex
@misc{hexamind2025,
  title={HexaMind Hallucination Benchmark: Separating Pattern-Detectable from Knowledge-Required Hallucinations},
  author={Bachani, Suhail Hiro},
  year={2025},
  url={https://huggingface.co/spaces/s21mind/S21MIND}
}
```
*HexaMind | Topological AI Safety | S21 Theory | Patent Pending*