Update README.md
README.md
CHANGED
@@ -14,7 +14,7 @@ model-index:
 
 
 
-***It's garbage,
+***It's garbage, this learning settings are wrong.***
 
 ```sh
 docker run --rm --runtime nvidia --ipc=host --gpus 'all' \
@@ -32,12 +32,16 @@ docker run --rm --runtime nvidia --ipc=host --gpus 'all' \
 
 ## Qwen3-4B BFCL GT (w/o thinking)
 ```
+🔍 Running test: irrelevance
+✅ Test completed: irrelevance. 🎯 Accuracy: 0.875
+🔍 Running test: multi_turn_base
+✅ Test completed: multi_turn_base. 🎯 Accuracy: 0.085
 🔍 Running test: parallel_multiple
 ✅ Test completed: parallel_multiple. 🎯 Accuracy: 0.89
 🔍 Running test: parallel
-✅ Test completed: parallel. 🎯 Accuracy: 0.
+✅ Test completed: parallel. 🎯 Accuracy: 0.885
 🔍 Running test: simple
-✅ Test completed: simple. 🎯 Accuracy: 0.
+✅ Test completed: simple. 🎯 Accuracy: 0.9325
 🔍 Running test: multiple
 ✅ Test completed: multiple. 🎯 Accuracy: 0.92
 ```