[
{
"test": "Compréhension MCP de base",
"response": "```json\n{\n \"call\": \"get_game_state\"\n}\n```",
"response_time": 0.9023318290710449,
"score": 5,
"success": true
},
{
"test": "Traduction d'action simple",
"response": "```json\n{\n \"move_units\": {\n \"unit_ids\": [\"infanterie\"],\n \"target_x\": 150,\n \"target_y\": 200\n }\n}\n```",
"response_time": 1.97584867477417,
"score": 5,
"success": true
},
{
"test": "Extraction de paramètres complexes",
"response": "```json\n{\n \"move_units\": {\n \"units\": [\n \"tank_1\",\n \"tank_3\",\n \"tank_7\"\n ],\n \"target_x\": 120,\n \"target_y\": 80\n }\n}\n```",
"response_time": 2.287006378173828,
"score": 5,
"success": true
},
{
"test": "Planification stratégique",
"response": "```json\n{\n \"get_game_state\": {},\n \"move_units\": {\n \"units\": [\n {\n \"name\": \"Base\",\n \"position\": {\n \"x\": 10,\n \"y\": 10\n }\n },\n {\n \"name\": \"Minerai\",\n \"position\": {\n \"x\": 10,\n \"y\": 10\n }\n }\n ]\n },\n \"attack_unit\": {\n \"units\": [\n {\n \"name\": \"Minerai\",\n \"position\": {\n \"x\": 10,\n \"y\": 10\n }\n }\n ]\n }\n}\n```",
"response_time": 5.687268257141113,
"score": 3,
"success": false
},
{
"test": "Gestion d'ambiguïté",
"response": "Voici comment je gérerais cette instruction ambiguë :\n\n```json\n{\n \"move_units\": {\n \"unit_ids\": [\"unit1_id\", \"unit2_id\", \"unit3_id\"],\n \"target_x\": 0,\n \"target_y\": 0\n }\n}\n```\n\nDans ce cas, je vais déplacer les unités vers la base ennemie.",
"response_time": 3.0712451934814453,
"score": 3,
"success": false
}
]