Commit bd1192c (verified) by nicklashansen · 1 parent: fd6cb63

Update README.md

Files changed (1)
  1. README.md +72 -3
README.md CHANGED
@@ -1,3 +1,72 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ tags:
+ - reinforcement learning
+ - world model
+ - continuous control
+ - robotics
+ pipeline_tag: reinforcement-learning
+ ---
+
+ # Newt
+
+ Official release of Newt model checkpoints for the paper
+
+ [Learning Massively Multitask World Models for Continuous Control](https://www.nicklashansen.com/NewtWM) by
+
+ [Nicklas Hansen](https://nicklashansen.github.io), [Hao Su](https://cseweb.ucsd.edu/~haosu)\*, [Xiaolong Wang](https://xiaolonw.github.io)\* (UC San Diego)
+
+ **Quick links:** [[Website]](https://www.nicklashansen.com/NewtWM) [[Paper]](https://arxiv.org/abs/2310.16828) [[Dataset]](https://huggingface.co/datasets/nicklashansen/tdmpc2)
+
+
+ ## Model details
+
+ We open-source 200+ model checkpoints, including a multitask Newt agent trained on 200 tasks simultaneously. We are excited to see what the community will do with these models, and hope that our release will encourage other research labs to open-source their checkpoints as well. This section provides further details about the released models.
+
+
+ ### Model description
+
+ - **Developed by:** [Nicklas Hansen](https://nicklashansen.github.io) (UC San Diego)
+ - **Model type:** TD-MPC2 and Newt models trained on tasks from MMBench (DMControl, Meta-World, ManiSkill3, MiniArcade, Atari, and more).
+ - **License:** MIT
+
+ ### Model sources
+
+ - **Repository:** [https://github.com/nicklashansen/newt](https://github.com/nicklashansen/newt)
+ - **Paper:** [https://arxiv.org/abs/2310.16828](https://arxiv.org/abs/2310.16828)
+
+ ## Uses
+
+ This is one of the first major releases of model checkpoints for reinforcement learning, so the use of our TD-MPC2 and Newt checkpoints is fairly open-ended. We envision that they will be useful to researchers interested in training, fine-tuning, evaluating, and analyzing single-task and multitask models on any of the 200 continuous control tasks for which we release models. We also expect the community to discover new use cases for these checkpoints.
+
+ ### Direct use
+
+ Model checkpoints can be loaded with the [official implementation](https://github.com/nicklashansen/newt) and then used to reproduce our results and/or generate trajectories for any of the supported tasks.
+
+ ### Out-of-scope use
+
+ We do not expect our model checkpoints to reliably generalize to new (unseen) tasks as is. Such use will most likely require some amount of fine-tuning on target-task data.
+
+ ## How to get started with the models
+
+ Refer to the [official implementation](https://github.com/nicklashansen/newt) for installation instructions and example usage.
+
+ ## Citation
+
+ If you find our work useful, please consider citing the paper as follows:
+
+ **BibTeX:**
+ ```
+ @misc{Hansen2025Newt,
+ title={Learning Massively Multitask World Models for Continuous Control},
+ author={Nicklas Hansen and Hao Su and Xiaolong Wang},
+ year={2025},
+ eprint={2310.16828},
+ archivePrefix={arXiv},
+ primaryClass={cs.LG}
+ }
+ ```
+
+ ## Contact
+
+ Correspondence to: [Nicklas Hansen](https://nicklashansen.github.io)
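
The README added in this commit defers to the official implementation for loading checkpoints and does not describe how to fetch the files themselves. As a minimal sketch of that first step, and not the authors' documented workflow, the snippet below downloads a snapshot of a Hugging Face model repository with `huggingface_hub.snapshot_download` and lists checkpoint-like files in it. The helper names `download_checkpoints` and `find_checkpoints` are hypothetical, the `repo_id` must be supplied by the reader (the README does not state it), and the `.pt`/`.pth` suffixes are an assumption based on common PyTorch conventions rather than anything the README confirms.

```python
from pathlib import Path


def find_checkpoints(root, suffixes=(".pt", ".pth")):
    """Return checkpoint-like files under `root`, sorted by path.

    The default suffixes are an assumption (common PyTorch conventions);
    the README does not say how the released files are named.
    """
    root = Path(root)
    return sorted(
        p for p in root.rglob("*") if p.is_file() and p.suffix in suffixes
    )


def download_checkpoints(repo_id, local_dir):
    """Download a snapshot of a Hugging Face model repo into `local_dir`.

    `repo_id` is a placeholder the caller must provide; it is not given
    in the README.
    """
    # Imported lazily so that find_checkpoints works even when
    # huggingface_hub is not installed.
    from huggingface_hub import snapshot_download

    return Path(snapshot_download(repo_id=repo_id, local_dir=str(local_dir)))
```

A typical call would be `root = download_checkpoints("<user>/<repo>", "checkpoints")` followed by `find_checkpoints(root)`. Loading the resulting files into a TD-MPC2 or Newt agent is left to the official implementation, as the README instructs.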