Wikontic: Constructing Wikidata-Aligned, Ontology-Aware Knowledge Graphs with Large Language Models Paper • 2512.00590 • Published 9 days ago • 35
Multimodal Evaluation of Russian-language Architectures Paper • 2511.15552 • Published 19 days ago • 78
When Models Lie, We Learn: Multilingual Span-Level Hallucination Detection with PsiloQA Paper • 2510.04849 • Published Oct 6 • 113
HUME: Measuring the Human-Model Performance Gap in Text Embedding Task Paper • 2510.10062 • Published Oct 11 • 8
Beyond Memorization: Extending Reasoning Depth with Recurrence, Memory and Test-Time Compute Scaling Paper • 2508.16745 • Published Aug 22 • 29
Maintaining MTEB: Towards Long Term Usability and Reproducibility of Embedding Benchmarks Paper • 2506.21182 • Published Jun 26 • 2
Will It Still Be True Tomorrow? Multilingual Evergreen Question Classification to Improve Trustworthy QA Paper • 2505.21115 • Published May 27 • 140
Diagonal Batching Unlocks Parallelism in Recurrent Memory Transformers for Long Contexts Paper • 2506.05229 • Published Jun 5 • 38