# MIMIC Checkpoints
Trained model checkpoints from MIMIC experiments. Each checkpoint contains the model weights, optimizer state, scheduler state, normalization stats, and full model config.
Trained on erickfm/frame-melee (~95k tournament replays).
## Naming Convention

Filenames encode the full config:

```
d{d_model}_L{layers}_ff{feedforward}_{encoder}_pos-{pos_enc}_seq{seq_len}_sl-{stick_loss}_bl-{btn_loss}_step{step}.pt
```
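As an illustration (not part of the MIMIC codebase), a filename can be decoded back into its fields with a regex. The example filename below is hypothetical: its `ff3072` and `bl-bce` values are placeholders, since those fields are elided in the table that follows.

```python
import re

# Assumed decoder for the naming convention above. The example filename is
# illustrative; its ff/bl values are placeholders, not from a real checkpoint.
PATTERN = re.compile(
    r"d(?P<d_model>\d+)_L(?P<layers>\d+)_ff(?P<feedforward>\d+)"
    r"_(?P<encoder>[^_]+)_pos-(?P<pos_enc>[^_]+)_seq(?P<seq_len>\d+)"
    r"_sl-(?P<stick_loss>[^_]+)_bl-(?P<btn_loss>[^_]+)_step(?P<step>\d+)\.pt"
)

name = "d768_L4_ff3072_hybrid16_pos-learned_seq60_sl-mse_bl-bce_step18000.pt"
print(PATTERN.match(name).groupdict())
# {'d_model': '768', 'layers': '4', 'feedforward': '3072', 'encoder': 'hybrid16',
#  'pos_enc': 'learned', 'seq_len': '60', 'stick_loss': 'mse',
#  'btn_loss': 'bce', 'step': '18000'}
```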
## Checkpoints
| Checkpoint | d_model | Layers | Encoder | Pos Enc | Seq Len | Stick Loss | Size |
|---|---|---|---|---|---|---|---|
| d768_L4_..._seq180_sl-mse_..._step15625 | 768 | 4 | hybrid16 | learned | 180 | mse | 391MB |
| d768_L4_..._seq60_sl-huber_..._step5208 | 768 | 4 | hybrid16 | learned | 60 | huber | 389MB |
| d768_L4_..._seq60_sl-mse_..._step18000 | 768 | 4 | hybrid16 | learned | 60 | mse | 389MB |
| d768_L4_..._seq90_..._step7812 | 768 | 4 | hybrid16 | learned | 90 | mse | 390MB |
| d768_L4_..._seq120_..._step10416 | 768 | 4 | hybrid16 | learned | 120 | mse | 390MB |
| d768_L4_..._seq30_..._step2604 | 768 | 4 | hybrid16 | learned | 30 | mse | 389MB |
| d768_L4_..._sl-discrete_..._step5208 | 768 | 4 | hybrid16 | learned | 60 | discrete | 393MB |
| d768_L8_..._step10416 | 768 | 8 | hybrid16 | learned | 60 | mse | 730MB |
| d1024_L4_..._hybrid16_..._step24000 | 1024 | 4 | hybrid16 | learned | 60 | mse | 659MB |
| d1024_L4_..._default_..._step30000 | 1024 | 4 | default | learned | 60 | mse | 708MB |
| d1024_L8_..._step34375 | 1024 | 8 | hybrid16 | learned | 60 | mse | 1.3GB |
| d1536_L2_..._step62500 | 1536 | 2 | hybrid16 | learned | 60 | mse | 744MB |
| d1536_L8_..._step37500 | 1536 | 8 | hybrid16 | learned | 60 | mse | 2.8GB |
| d512_L4_..._pos-alibi_..._step15625 | 512 | 4 | hybrid16 | alibi | 60 | mse | 195MB |
| d512_L4_..._pos-rope_..._step15625 | 512 | 4 | hybrid16 | rope | 60 | mse | 195MB |
| d512_L4_..._seq60_..._step30000 | 512 | 4 | hybrid16 | learned | 60 | mse | 196MB |
| d512_L4_..._seq120_..._step31240 | 512 | 4 | hybrid16 | learned | 120 | mse | 196MB |
| d512_L4_..._seq180_..._step41666 | 512 | 4 | hybrid16 | learned | 180 | mse | 196MB |
| d512_L4_..._seq240_..._step62500 | 512 | 4 | hybrid16 | learned | 240 | mse | 197MB |
| d512_L4_..._seq360_..._step62500 | 512 | 4 | hybrid16 | learned | 360 | mse | 197MB |
| d512_L8_..._hybrid16_..._step31250 | 512 | 8 | hybrid16 | learned | 60 | mse | 347MB |
| d512_L8_..._composite8_..._step30000 | 512 | 8 | composite8 | learned | 60 | mse | 341MB |
## Loading
```python
import torch

from model import FramePredictor, ModelConfig

# Each checkpoint is a dict holding the weights, optimizer/scheduler state,
# normalization stats, and the full model config.
ckpt = torch.load("checkpoint.pt", map_location="cpu")

# Rebuild the architecture from the stored config, then restore the weights.
cfg = ModelConfig(**ckpt["config"])
model = FramePredictor(cfg)
model.load_state_dict(ckpt["model_state_dict"])
model.eval()
```
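Because each checkpoint also carries optimizer and scheduler state, a run can in principle be resumed. The sketch below assumes the key name `optimizer_state_dict` and default AdamW hyperparameters; neither is confirmed by this repo, so inspect `ckpt.keys()` on a real file first.

```python
import torch
from torch.optim import AdamW

from model import FramePredictor, ModelConfig

ckpt = torch.load("checkpoint.pt", map_location="cpu")

model = FramePredictor(ModelConfig(**ckpt["config"]))
model.load_state_dict(ckpt["model_state_dict"])
model.train()

# "optimizer_state_dict" is an assumed key name for the optimizer state
# described above; the AdamW hyperparameters are placeholders.
optimizer = AdamW(model.parameters())
optimizer.load_state_dict(ckpt["optimizer_state_dict"])
```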
## Note on Missing Checkpoints
Multiple runs per machine shared the same `checkpoints/` directory, with filenames keyed only by step number, so runs that finished at the same step count overwrote each other's files. The 23 checkpoints above are the surviving unique configs; some notable runs (depth-2L, depth-6L, focal-btn, the baseline at step 5208) were lost to overwrites.
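If you reuse this training setup, one simple way to avoid the same collisions is to namespace each run's checkpoint directory with a unique run id. A minimal sketch, not the MIMIC training code itself:

```python
import os
import time

import torch

def save_checkpoint(state: dict, step: int, run_id: str) -> str:
    """Write to checkpoints/<run_id>/step{step}.pt so concurrent runs that
    reach the same step count no longer overwrite each other."""
    ckpt_dir = os.path.join("checkpoints", run_id)
    os.makedirs(ckpt_dir, exist_ok=True)
    path = os.path.join(ckpt_dir, f"step{step}.pt")
    torch.save(state, path)
    return path

# Pick a unique id once per run, then reuse it for every save.
run_id = time.strftime("%Y%m%d-%H%M%S")
save_checkpoint({"model_state_dict": {}}, step=0, run_id=run_id)  # illustrative
```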
## Related
- MIMIC -- Training and inference code
- erickfm/frame-melee -- Training dataset
- RESULTS.md -- Full experiment results