
# Complete Model Version Information

*Discovered via CBORG API Testing, October 29, 2025*

This document shows the complete mapping from CBORG model aliases to their underlying versions, including all version dates discovered through API testing.


## Models with Version Dates

### Anthropic Claude Models

| Model Alias | Display Name | Underlying Version | Version Date |
| --- | --- | --- | --- |
| `anthropic/claude-haiku:latest` | Claude Haiku 4.5 (2025-10-01) | `claude-haiku-4-5@20251001` | Oct 1, 2025 |
| `anthropic/claude-opus:latest` | Claude Opus 4.1 (2025-08-05) | `us.anthropic.claude-opus-4-1-20250805-v1:0` | Aug 5, 2025 |
| `anthropic/claude-sonnet:latest` | Claude Sonnet 4.5 (2025-09-29) | `claude-sonnet-4-5@20250929` | Sep 29, 2025 |
| `claude-3-5-haiku-latest` | Claude 3.5 Haiku (2024-10-22) | `claude-3-5-haiku@20241022` | Oct 22, 2024 |

### OpenAI Models (via Azure)

| Model Alias | Display Name | Underlying Version | Version Date |
| --- | --- | --- | --- |
| `openai/gpt-5` | GPT-5 (2025-08-07) | `gpt-5-2025-08-07` | Aug 7, 2025 |
| `openai/gpt-5-mini` | GPT-5 Mini (2025-08-07) | `gpt-5-mini-2025-08-07` | Aug 7, 2025 |
| `openai/o:latest` | O3 (2025-04-16) | `azure/o3-2025-04-16` | Apr 16, 2025 |
| `openai/o3` | O3 (2025-04-16) | `azure/o3-2025-04-16` | Apr 16, 2025 |
| `openai/o3-mini` | O3 Mini (2025-01-31) | `azure/o3-mini-2025-01-31` | Jan 31, 2025 |
| `openai/o4-mini` | O4 Mini (2025-04-16) | `azure/o4-mini-2025-04-16` | Apr 16, 2025 |

**Key Finding:** Both `openai/o:latest` and `openai/o3` map to the same underlying version (`azure/o3-2025-04-16`).


## Models with Model Size Information

### AWS Llama Models

| Model Alias | Display Name | Underlying Version |
| --- | --- | --- |
| `aws/llama-4-maverick` | Llama-4 Maverick (17B) | `us.meta.llama4-maverick-17b-instruct-v1:0` |
| `aws/llama-4-scout` | Llama-4 Scout (17B) | `us.meta.llama4-scout-17b-instruct-v1:0` |

**Key Finding:** Both model IDs indicate 17-billion-parameter variants. For these mixture-of-experts models, 17B is the active parameter count per token, not the total.

### GCP Models

| Model Alias | Display Name | Underlying Version |
| --- | --- | --- |
| `gcp/qwen-3` | Qwen-3 (235B) | `qwen/qwen3-235b-a22b-instruct-2507-maas` |

**Key Finding:** At 235 billion total parameters, this is the largest model in the set; the `a22b` suffix in the version ID indicates roughly 22B parameters active per token.


## Google Gemini Models

| Model Alias | Display Name | Underlying Version | Notes |
| --- | --- | --- | --- |
| `google/gemini:latest` | Gemini 2.5 Pro | `gemini-2.5-pro` | Latest generation |
| `google/gemini-flash` | Gemini 2.5 Flash | `gemini-2.5-flash` | Fast variant |
| `gemini-2.0-flash-lite` | Gemini 2.0 Flash Lite | (no alias; direct name) | Lightweight variant |

## xAI Grok Models

| Model Alias | Display Name | Underlying Version | Notes |
| --- | --- | --- | --- |
| `xai/grok:latest` | Grok-3 | `grok-3` | Latest generation |
| `xai/grok-mini` | Grok Mini | (rate-limited during test) | Smaller variant |
| `xai/grok-code-fast-1` | Grok Code Fast 1 | (rate-limited during test) | Code-focused fast variant |

## Other Models

| Model Alias | Display Name | Underlying Version | Notes |
| --- | --- | --- | --- |
| `gpt-oss-120b` | GPT-OSS-120B | `hosted_vllm/hosted_vllm/gpt-oss-120b` | Open source, hosted via vLLM |
| `gpt-5-codex` | GPT-5 Codex | (not accessible during test) | Code-focused variant |
| `deepseek-r1` | DeepSeek-R1 | `MAI-DS-R1` | DeepSeek reasoning model |

## Key Insights

### Version Date Patterns

1. **Most recent Claude models** (August–October 2025):
   - Sonnet 4.5: Sep 29, 2025
   - Haiku 4.5: Oct 1, 2025
   - Opus 4.1: Aug 5, 2025
2. **Most recent OpenAI models** (January–August 2025):
   - GPT-5: Aug 7, 2025
   - O4 Mini: Apr 16, 2025
   - O3: Apr 16, 2025
   - O3 Mini: Jan 31, 2025
3. **Older models still in use:**
   - Claude 3.5 Haiku: Oct 22, 2024 (just over a year old)

### Model Sizes Discovered

- **235B parameters:** Qwen-3 (largest)
- **120B parameters:** GPT-OSS-120B
- **17B parameters:** Llama-4 Maverick, Llama-4 Scout

### `:latest` Aliases

All `:latest` suffixes were resolved to concrete versions:

- `anthropic/claude-*:latest` → specific dated versions
- `google/gemini:latest` → `gemini-2.5-pro`
- `xai/grok:latest` → `grok-3`
- `openai/o:latest` → `azure/o3-2025-04-16`
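One practical use of these resolutions is to freeze them into a lookup table, so analysis code is pinned to concrete snapshots rather than mutable `:latest` aliases. A minimal sketch (the dictionary and the `resolve` helper are illustrative, not part of any CBORG client):

```python
# Snapshot of alias -> underlying version, as resolved on October 29, 2025.
RESOLVED_VERSIONS = {
    "anthropic/claude-haiku:latest": "claude-haiku-4-5@20251001",
    "anthropic/claude-sonnet:latest": "claude-sonnet-4-5@20250929",
    "anthropic/claude-opus:latest": "us.anthropic.claude-opus-4-1-20250805-v1:0",
    "google/gemini:latest": "gemini-2.5-pro",
    "xai/grok:latest": "grok-3",
    "openai/o:latest": "azure/o3-2025-04-16",
    "openai/o3": "azure/o3-2025-04-16",
}

def resolve(alias: str) -> str:
    """Return the pinned underlying version for an alias.

    Aliases that were already concrete pass through unchanged.
    """
    return RESOLVED_VERSIONS.get(alias, alias)
```

This also makes the key finding above checkable in code: `resolve("openai/o:latest")` and `resolve("openai/o3")` return the same string.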

## Usage in Notebook

The notebook now displays all these version dates and model sizes in plot titles and legends, making it clear exactly which model versions were used in the experiments.

Example plot titles:

- "Claude Haiku 4.5 (2025-10-01)" instead of `anthropic/claude-haiku:latest`
- "O3 (2025-04-16)" instead of `openai/o3`
- "GPT-5 Mini (2025-08-07)" instead of `openai/gpt-5-mini`
- "Qwen-3 (235B)" instead of `gcp/qwen-3`
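That alias-to-title substitution can be captured in a small mapping. The sketch below (display names copied from the tables above; the `plot_title` helper and its use with matplotlib are an assumption about how the notebook might apply it) falls back to the raw alias for unknown models:

```python
# Alias -> human-readable display name with version date or model size.
DISPLAY_NAMES = {
    "anthropic/claude-haiku:latest": "Claude Haiku 4.5 (2025-10-01)",
    "openai/o3": "O3 (2025-04-16)",
    "openai/gpt-5-mini": "GPT-5 Mini (2025-08-07)",
    "gcp/qwen-3": "Qwen-3 (235B)",
}

def plot_title(alias: str) -> str:
    """Title for a plot: the display name if known, else the raw alias."""
    return DISPLAY_NAMES.get(alias, alias)

# In a notebook this would feed e.g. ax.set_title(plot_title(alias)).
```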

This provides complete transparency about exactly which model snapshots were used in the analysis.