Controlling the Extraction of Memorized Data from Large Language Models via Prompt-Tuning Paper • 2305.11759 • Published May 19, 2023 • 2
Using multiple ASR hypotheses to boost i18n NLU performance Paper • 2012.04099 • Published Dec 7, 2020
When Benchmarks are Targets: Revealing the Sensitivity of Large Language Model Leaderboards Paper • 2402.01781 • Published Feb 1, 2024 • 4
AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model Paper • 2208.01448 • Published Aug 2, 2022
A Systematic Survey and Critical Review on Evaluating Large Language Models: Challenges, Limitations, and Recommendations Paper • 2407.04069 • Published Jul 4, 2024
ALLaM: Large Language Models for Arabic and English Paper • 2407.15390 • Published Jul 22, 2024 • 3
ArabicMMLU: Assessing Massive Multitask Language Understanding in Arabic Paper • 2402.12840 • Published Feb 20, 2024 • 1
Aya Dataset: An Open-Access Collection for Multilingual Instruction Tuning Paper • 2402.06619 • Published Feb 9, 2024 • 56