MIMIC Checkpoints

Trained model checkpoints from MIMIC experiments. Each checkpoint contains the model weights, optimizer state, scheduler state, normalization stats, and full model config.

All checkpoints were trained on erickfm/frame-melee (~95k tournament replays).
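
To pull the training data locally, a minimal sketch using the standard huggingface_hub API (this assumes frame-melee is hosted as an ordinary dataset repo; adjust revision or allow_patterns as needed):

from huggingface_hub import snapshot_download

# Downloads the dataset repo into the local Hugging Face cache and
# returns the path to the snapshot directory.
data_dir = snapshot_download(repo_id="erickfm/frame-melee", repo_type="dataset")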

Naming Convention

Filenames encode the full config:

d{d_model}_L{layers}_ff{feedforward}_{encoder}_pos-{pos_enc}_seq{seq_len}_sl-{stick_loss}_bl-{btn_loss}_step{step}.pt
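
A small sketch for parsing a filename back into its fields (the regex and field names below follow the pattern above but are illustrative, not utilities shipped with this repo):

import re

# Mirrors the naming convention above.
PATTERN = re.compile(
    r"d(?P<d_model>\d+)_L(?P<layers>\d+)_ff(?P<feedforward>\d+)"
    r"_(?P<encoder>[^_]+)_pos-(?P<pos_enc>[^_]+)_seq(?P<seq_len>\d+)"
    r"_sl-(?P<stick_loss>[^_]+)_bl-(?P<btn_loss>[^_]+)_step(?P<step>\d+)\.pt"
)

def parse_checkpoint_name(name: str) -> dict:
    match = PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized checkpoint name: {name}")
    return match.groupdict()

# Illustrative filename (ff and bl values are made up for the example):
# parse_checkpoint_name("d512_L4_ff2048_hybrid16_pos-rope_seq60_sl-mse_bl-bce_step15625.pt")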

Checkpoints

| Checkpoint | d_model | Layers | Encoder | Pos Enc | Seq Len | Stick Loss | Size |
|---|---|---|---|---|---|---|---|
| d768_L4_..._seq180_sl-mse_..._step15625 | 768 | 4 | hybrid16 | learned | 180 | mse | 391MB |
| d768_L4_..._seq60_sl-huber_..._step5208 | 768 | 4 | hybrid16 | learned | 60 | huber | 389MB |
| d768_L4_..._seq60_sl-mse_..._step18000 | 768 | 4 | hybrid16 | learned | 60 | mse | 389MB |
| d768_L4_..._seq90_..._step7812 | 768 | 4 | hybrid16 | learned | 90 | mse | 390MB |
| d768_L4_..._seq120_..._step10416 | 768 | 4 | hybrid16 | learned | 120 | mse | 390MB |
| d768_L4_..._seq30_..._step2604 | 768 | 4 | hybrid16 | learned | 30 | mse | 389MB |
| d768_L4_..._sl-discrete_..._step5208 | 768 | 4 | hybrid16 | learned | 60 | discrete | 393MB |
| d768_L8_..._step10416 | 768 | 8 | hybrid16 | learned | 60 | mse | 730MB |
| d1024_L4_..._hybrid16_..._step24000 | 1024 | 4 | hybrid16 | learned | 60 | mse | 659MB |
| d1024_L4_..._default_..._step30000 | 1024 | 4 | default | learned | 60 | mse | 708MB |
| d1024_L8_..._step34375 | 1024 | 8 | hybrid16 | learned | 60 | mse | 1.3GB |
| d1536_L2_..._step62500 | 1536 | 2 | hybrid16 | learned | 60 | mse | 744MB |
| d1536_L8_..._step37500 | 1536 | 8 | hybrid16 | learned | 60 | mse | 2.8GB |
| d512_L4_..._pos-alibi_..._step15625 | 512 | 4 | hybrid16 | alibi | 60 | mse | 195MB |
| d512_L4_..._pos-rope_..._step15625 | 512 | 4 | hybrid16 | rope | 60 | mse | 195MB |
| d512_L4_..._seq60_..._step30000 | 512 | 4 | hybrid16 | learned | 60 | mse | 196MB |
| d512_L4_..._seq120_..._step31240 | 512 | 4 | hybrid16 | learned | 120 | mse | 196MB |
| d512_L4_..._seq180_..._step41666 | 512 | 4 | hybrid16 | learned | 180 | mse | 196MB |
| d512_L4_..._seq240_..._step62500 | 512 | 4 | hybrid16 | learned | 240 | mse | 197MB |
| d512_L4_..._seq360_..._step62500 | 512 | 4 | hybrid16 | learned | 360 | mse | 197MB |
| d512_L8_..._hybrid16_..._step31250 | 512 | 8 | hybrid16 | learned | 60 | mse | 347MB |
| d512_L8_..._composite8_..._step30000 | 512 | 8 | composite8 | learned | 60 | mse | 341MB |

Loading

import torch
from model import FramePredictor, ModelConfig

# map_location="cpu" lets the checkpoint load on machines without a GPU.
ckpt = torch.load("checkpoint.pt", map_location="cpu")

# Rebuild the architecture from the config saved in the checkpoint,
# then load the trained weights and switch to inference mode.
cfg = ModelConfig(**ckpt["config"])
model = FramePredictor(cfg)
model.load_state_dict(ckpt["model_state_dict"])
model.eval()
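
The checkpoint also carries optimizer state, scheduler state, and normalization stats. For inference, input frames need to be normalized with the stored stats; the key names ("norm_stats", "mean", "std"), the config attributes, and the input shape below are assumptions for illustration, so inspect ckpt.keys() to see the actual save format:

# See what the checkpoint actually stores before relying on key names.
print(ckpt.keys())

# Hypothetical forward pass: key names, config attributes, and the
# dummy input shape are assumptions, not this repo's documented format.
stats = ckpt["norm_stats"]                            # assumed key
frames = torch.randn(1, cfg.seq_len, cfg.input_dim)   # assumed attributes
frames = (frames - stats["mean"]) / stats["std"]      # assumed keys

with torch.no_grad():
    pred = model(frames)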

Note on Missing Checkpoints

Multiple runs on the same machine shared a single checkpoints/ directory and saved under step-number filenames, so runs that finished at the same step count overwrote each other. The checkpoints listed above are the surviving unique configs. Some notable runs (depth-2L, depth-6L, focal-btn, baseline at step 5208) were overwritten.
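
If you reproduce these runs, folding a unique run identifier into each checkpoint path avoids this collision. A minimal sketch (illustrative, not code from this repo):

import time
from pathlib import Path

# Compute the run id once at startup so every checkpoint from a run
# shares it; two runs that reach the same step then write to distinct
# files instead of overwriting each other.
RUN_ID = time.strftime("%Y%m%d-%H%M%S")

def checkpoint_path(base: Path, config_tag: str, step: int) -> Path:
    return base / f"{config_tag}_{RUN_ID}_step{step}.pt"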

Related

erickfm/frame-melee — training data
