AI & ML interests
Dutch NLP · PII detection · LLM workflow tooling · AI agent governance · RAG groundedness · Prompt-injection detection · Local-first & on-device ML · GDPR & EU AI Act compliance
🏠 LokaalHub
Small, local-first models for LLM workflows — currently focused on Dutch.
We publish compact, open models that help developers and organisations work with LLMs responsibly: privacy, grounding, governance, routing. The approach is language-agnostic; our current models are tuned for Dutch, with other languages planned. Everything runs on a laptop; nothing needs a cloud.
Current models
| Model | Task | Size | Score |
|---|---|---|---|
| nl-lokaal-middel | Dutch PII NER | 473 MB | F1 0.84 |
| nl-lokaal-klein | Dutch PII NER (fast) | 181 MB | F1 0.78 |
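
As a rough illustration of how PII NER output can be put to use, the sketch below redacts detected spans from a text. It assumes the detections are available as a list of dicts with `start`, `end` and `entity_group` keys, which is the shape the Hugging Face `transformers` token-classification pipeline returns when aggregation is enabled; whether these models are loaded that way is not stated here. The `redact` helper, the sample sentence and the `PERSON` / `LOCATION` labels are all hypothetical.

```python
from typing import Iterable, Mapping


def redact(text: str, entities: Iterable[Mapping]) -> str:
    """Replace each detected span with a [LABEL] placeholder."""
    # Work from the last span to the first so that earlier character
    # offsets stay valid after each replacement.
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        placeholder = f"[{ent['entity_group']}]"
        text = text[: ent["start"]] + placeholder + text[ent["end"] :]
    return text


# Hand-written spans standing in for real model output (hypothetical labels).
sample = "Jan de Vries woont in Utrecht."
spans = [
    {"entity_group": "PERSON", "start": 0, "end": 12},
    {"entity_group": "LOCATION", "start": 22, "end": 29},
]

print(redact(sample, spans))  # [PERSON] woont in [LOCATION].
```

Because the redaction step is plain string handling, it can run on the same laptop as the model and the original text never has to leave the machine.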
Naming and size tiers
All models follow a two-part convention: <task-in-local-language>-<size-tier>.
- klein: up to ~200 MB, throughput tier (runs fast on older laptops).
- middel: up to ~500 MB, accuracy tier (recommended default).
- groot: reserved for larger models.
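
The convention and tiers above can be sketched in a few lines of Python. This is only an illustration: the sizes are given as approximate ("~"), so the hard 200 MB and 500 MB cut-offs are an assumption, and the helper names `tier_for_size` and `split_model_name` are hypothetical.

```python
from typing import Tuple

# Approximate upper bounds per tier in MB (assumed hard cut-offs).
TIER_LIMITS_MB = [("klein", 200), ("middel", 500)]


def tier_for_size(size_mb: float) -> str:
    """Return the size tier a model of the given size would fall into."""
    for tier, limit in TIER_LIMITS_MB:
        if size_mb <= limit:
            return tier
    return "groot"  # anything larger than the middel bound


def split_model_name(name: str) -> Tuple[str, str]:
    """Split a model name into (task part, size tier) at the last hyphen."""
    task, tier = name.rsplit("-", 1)
    return task, tier


print(tier_for_size(181))                    # klein
print(tier_for_size(473))                    # middel
print(split_model_name("nl-lokaal-middel"))  # ('nl-lokaal', 'middel')
```

Splitting at the last hyphen keeps the task part intact even when it contains hyphens of its own.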
Tools
- Filenthropist — local-first file scanner and labeler that lets you work safely with autonomous AI agents. Available on PyPI.
What's next
- Dutch NLI / groundedness scoring (nl-nli-*): verify that RAG outputs are actually supported by their sources.
- Prompt-injection detection (nl-injectie-*): detect adversarial prompts in Dutch inputs.
- Intent / domain routing (nl-router-*): classify queries for agentic workflows.
Why local-first?
SMEs and public institutions operating under GDPR and the EU AI Act often can't send documents or queries to foreign cloud APIs for compliance or procurement reasons. We build models small enough to run on a laptop so local deployment is a first-class option, not a compromise. We're starting with Dutch because that's the market we know; the architecture generalises to other languages.
LokaalHub is an independent open-source initiative. Models are Apache-2.0 where possible.