Ai4Privacy has been working on this for the past year.
Today we're releasing the PII Masking 2M Series, the world's largest open-source privacy masking dataset. (Again.)
2M+ synthetic examples
32 locales across Europe
98 entity types
5 industry verticals: Health, Finance, Digital, Work, Location
1M+ entries freely available on Hugging Face
Every example is 100% synthetic. No real personal data. Built so you can train and evaluate PII detection models without the legal headaches.
Thank you for 15,000,000+ downloads across our datasets, models, and libraries. This one's for you. ❤️