BigScience Workshop

non-profit

https://bigscience.huggingface.co

bigscienceW

bigscience-workshop

AI & ML interests

A one-year-long research workshop on large language models: the Summer of Language Models 21 🌸

Recent Activity

shubhamagarwal92 authored a paper 17 days ago

BhashaKritika: Building Synthetic Pretraining Data at Scale for Indic Languages

craffel authored a paper 20 days ago

TokSuite: Measuring the Impact of Tokenizer Choice on Language Model Behavior

christopher new activity about 1 month ago

bigscience/bloomz-560m: Fails to load with transformers v4.57+

ybelkada

authored a paper 6 days ago

Learnable Multipliers: Freeing the Scale of Language Model Matrix Layers

Paper • 2601.04890 • Published 7 days ago • 39

christopher

in bigscience/bloomz-560m about 1 month ago

Fails to load with transformers v4.57+

#14 opened about 1 month ago by

monsoon-nlp

posted an update about 2 months ago

Post

401

PatchDNA, a DNA foundation model based on Meta's BLT tokenization strategy https://www.biorxiv.org/content/10.1101/2025.11.28.691095v1

christopher

in bigscience/petals-api 2 months ago

Bloom

#2 opened 2 months ago by

rabiulawal

authored a paper 2 months ago

Grounding Computer Use Agents on Human Demonstrations

Paper • 2511.07332 • Published Nov 10, 2025 • 105

Zaid

authored a paper 3 months ago

Global PIQA: Evaluating Physical Commonsense Reasoning Across 100+ Languages and Cultures

Paper • 2510.24081 • Published Oct 28, 2025 • 18

teelinsan

authored a paper 3 months ago

Language Models are Injective and Hence Invertible

Paper • 2510.15511 • Published Oct 17, 2025 • 69

Zaid

authored a paper 3 months ago

MeXtract: Light-Weight Metadata Extraction from Scientific Papers

Paper • 2510.06889 • Published Oct 8, 2025 • 1

monsoon-nlp

posted an update 4 months ago

Post

461

Bio LLMs train on many genomes, but can we encode differences within a species? TomatoTomato adds pangenome tokens to represent a domestic tomato and a wild tomato in one sequence 🍅 🧬
monsoon-nlp/tomatotomato-gLM2-150M-v0.1

christopher

in bigscience/bloom 5 months ago

Let's talk about the model

#284 opened 5 months ago by

ybelkada

authored 2 papers 6 months ago

NeurIPS 2025 E2LM Competition : Early Training Evaluation of Language Models

Paper • 2506.07731 • Published Jun 9, 2025 • 2

Falcon-H1: A Family of Hybrid-Head Language Models Redefining Efficiency and Performance

Paper • 2507.22448 • Published Jul 30, 2025 • 69

RTT1

authored a paper 6 months ago

Agent KB: Leveraging Cross-Domain Experience for Agentic Problem Solving

Paper • 2507.06229 • Published Jul 8, 2025 • 75

WojciechKusa

authored a paper 7 months ago

PL-Guard: Benchmarking Language Model Safety for Polish

Paper • 2506.16322 • Published Jun 19, 2025 • 1

christopher

in bigscience/bloom-1b1-intermediate 7 months ago

Issue with step 400_000

#3 opened 7 months ago by

Muennighoff

in bigscience/bloom-1b1-intermediate 7 months ago

Issue with step 400_000

#3 opened 7 months ago by

WojciechKusa

authored 4 papers 7 months ago

ConECT Dataset: Overcoming Data Scarcity in Context-Aware E-Commerce MT

Paper • 2506.04929 • Published Jun 5, 2025 • 2

Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order

Paper • 2404.00399 • Published Mar 30, 2024 • 42

BigBIO: A Framework for Data-Centric Biomedical Natural Language Processing

Paper • 2206.15076 • Published Jun 30, 2022 • 5

CSMeD: Bridging the Dataset Gap in Automated Citation Screening for Systematic Literature Reviews

Paper • 2311.12474 • Published Nov 21, 2023 • 1