episode_id stringlengths 9 9 | messages listlengths 7 11 | reward float64 0 1 | done bool 1
class | tool_trace listlengths 3 5 | metrics dict | verify_metrics dict | artifacts dict | task null | extra unknown |
|---|---|---|---|---|---|---|---|---|---|
ep-000000 | [
{
"role": "user",
"content": "You are playing tic_tac_toe.\nCall the play_move tool with an action_id drawn from legal_actions.\nLegal actions: [0, 1, 2, 3, 4, 5, 6, 7, 8]\n\nBoard:\n0 | 1 | 2\n---------\n3 | 4 | 5\n---------\n6 | 7 | 8"
},
{
"role": "assistant",
"content": "Playing action_id=0 ... | 1 | true | [
{
"tool_name": "play_move",
"arguments": {
"action_id": 0
},
"result": {
"data": {
"action_id": 0,
"legal_actions": [
1,
2,
3,
4,
5,
6,
7
],
"reward": 0,
"done": false,
"... | {
"turns": 3,
"tool_calls": 3
} | {
"step_count": 5
} | {
"final_state": {
"episode_id": "61776ad1-1d9b-441a-b16c-cae15cc924ac",
"step_count": 5,
"game_name": "tic_tac_toe",
"agent_player": 0,
"opponent_policy": "random",
"game_params": {},
"num_players": 2
},
"transcript_length": 7
} | null | {} |
ep-000003 | [
{
"role": "user",
"content": "You are playing tic_tac_toe.\nCall the play_move tool with an action_id drawn from legal_actions.\nLegal actions: [0, 1, 2, 3, 4, 5, 6, 7, 8]\n\nBoard:\n0 | 1 | 2\n---------\n3 | 4 | 5\n---------\n6 | 7 | 8"
},
{
"role": "assistant",
"content": "Playing action_id=0 ... | 1 | true | [
{
"tool_name": "play_move",
"arguments": {
"action_id": 0
},
"result": {
"data": {
"action_id": 0,
"legal_actions": [
1,
2,
3,
4,
5,
6,
8
],
"reward": 0,
"done": false,
"... | {
"turns": 3,
"tool_calls": 3
} | {
"step_count": 5
} | {
"final_state": {
"episode_id": "369b3d15-fe49-431c-abd8-1dda48a23eb8",
"step_count": 5,
"game_name": "tic_tac_toe",
"agent_player": 0,
"opponent_policy": "random",
"game_params": {},
"num_players": 2
},
"transcript_length": 7
} | null | {} |
ep-000007 | [
{
"role": "user",
"content": "You are playing tic_tac_toe.\nCall the play_move tool with an action_id drawn from legal_actions.\nLegal actions: [0, 1, 2, 3, 4, 5, 6, 7, 8]\n\nBoard:\n0 | 1 | 2\n---------\n3 | 4 | 5\n---------\n6 | 7 | 8"
},
{
"role": "assistant",
"content": "Playing action_id=0 ... | 1 | true | [
{
"tool_name": "play_move",
"arguments": {
"action_id": 0
},
"result": {
"data": {
"action_id": 0,
"legal_actions": [
1,
2,
3,
4,
5,
6,
8
],
"reward": 0,
"done": false,
"... | {
"turns": 3,
"tool_calls": 3
} | {
"step_count": 5
} | {
"final_state": {
"episode_id": "74b9dcd0-e7aa-4db2-ab42-21f8ff6a985f",
"step_count": 5,
"game_name": "tic_tac_toe",
"agent_player": 0,
"opponent_policy": "random",
"game_params": {},
"num_players": 2
},
"transcript_length": 7
} | null | {} |
ep-000008 | [
{
"role": "user",
"content": "You are playing tic_tac_toe.\nCall the play_move tool with an action_id drawn from legal_actions.\nLegal actions: [0, 1, 2, 3, 4, 5, 6, 7, 8]\n\nBoard:\n0 | 1 | 2\n---------\n3 | 4 | 5\n---------\n6 | 7 | 8"
},
{
"role": "assistant",
"content": "Playing action_id=0 ... | 1 | true | [
{
"tool_name": "play_move",
"arguments": {
"action_id": 0
},
"result": {
"data": {
"action_id": 0,
"legal_actions": [
1,
2,
3,
4,
5,
6,
8
],
"reward": 0,
"done": false,
"... | {
"turns": 3,
"tool_calls": 3
} | {
"step_count": 5
} | {
"final_state": {
"episode_id": "1641dcd1-7c67-4c80-878c-90db2f6d604a",
"step_count": 5,
"game_name": "tic_tac_toe",
"agent_player": 0,
"opponent_policy": "random",
"game_params": {},
"num_players": 2
},
"transcript_length": 7
} | null | {} |
ep-000009 | [
{
"role": "user",
"content": "You are playing tic_tac_toe.\nCall the play_move tool with an action_id drawn from legal_actions.\nLegal actions: [0, 1, 2, 3, 4, 5, 6, 7, 8]\n\nBoard:\n0 | 1 | 2\n---------\n3 | 4 | 5\n---------\n6 | 7 | 8"
},
{
"role": "assistant",
"content": "Playing action_id=0 ... | 1 | true | [
{
"tool_name": "play_move",
"arguments": {
"action_id": 0
},
"result": {
"data": {
"action_id": 0,
"legal_actions": [
1,
2,
4,
5,
6,
7,
8
],
"reward": 0,
"done": false,
"... | {
"turns": 3,
"tool_calls": 3
} | {
"step_count": 5
} | {
"final_state": {
"episode_id": "b0157f83-d54a-43b1-980b-be1f1588f628",
"step_count": 5,
"game_name": "tic_tac_toe",
"agent_player": 0,
"opponent_policy": "random",
"game_params": {},
"num_players": 2
},
"transcript_length": 7
} | null | {} |
ep-000011 | [
{
"role": "user",
"content": "You are playing tic_tac_toe.\nCall the play_move tool with an action_id drawn from legal_actions.\nLegal actions: [0, 1, 2, 3, 4, 5, 6, 7, 8]\n\nBoard:\n0 | 1 | 2\n---------\n3 | 4 | 5\n---------\n6 | 7 | 8"
},
{
"role": "assistant",
"content": "Playing action_id=0 ... | 1 | true | [
{
"tool_name": "play_move",
"arguments": {
"action_id": 0
},
"result": {
"data": {
"action_id": 0,
"legal_actions": [
1,
2,
3,
4,
5,
7,
8
],
"reward": 0,
"done": false,
"... | {
"turns": 5,
"tool_calls": 5
} | {
"step_count": 9
} | {
"final_state": {
"episode_id": "55e1d862-8f9f-4ef0-a699-3f4b3fddc57e",
"step_count": 9,
"game_name": "tic_tac_toe",
"agent_player": 0,
"opponent_policy": "random",
"game_params": {},
"num_players": 2
},
"transcript_length": 11
} | null | {} |
ep-000012 | [
{
"role": "user",
"content": "You are playing tic_tac_toe.\nCall the play_move tool with an action_id drawn from legal_actions.\nLegal actions: [0, 1, 2, 3, 4, 5, 6, 7, 8]\n\nBoard:\n0 | 1 | 2\n---------\n3 | 4 | 5\n---------\n6 | 7 | 8"
},
{
"role": "assistant",
"content": "Playing action_id=0 ... | 0 | true | [
{
"tool_name": "play_move",
"arguments": {
"action_id": 0
},
"result": {
"data": {
"action_id": 0,
"legal_actions": [
2,
3,
4,
5,
6,
7,
8
],
"reward": 0,
"done": false,
"... | {
"turns": 5,
"tool_calls": 5
} | {
"step_count": 9
} | {
"final_state": {
"episode_id": "9504fc08-45f6-482f-abc7-77aade3fc46c",
"step_count": 9,
"game_name": "tic_tac_toe",
"agent_player": 0,
"opponent_policy": "random",
"game_params": {},
"num_players": 2
},
"transcript_length": 11
} | null | {} |
ep-000013 | [
{
"role": "user",
"content": "You are playing tic_tac_toe.\nCall the play_move tool with an action_id drawn from legal_actions.\nLegal actions: [0, 1, 2, 3, 4, 5, 6, 7, 8]\n\nBoard:\n0 | 1 | 2\n---------\n3 | 4 | 5\n---------\n6 | 7 | 8"
},
{
"role": "assistant",
"content": "Playing action_id=0 ... | 1 | true | [
{
"tool_name": "play_move",
"arguments": {
"action_id": 0
},
"result": {
"data": {
"action_id": 0,
"legal_actions": [
1,
2,
4,
5,
6,
7,
8
],
"reward": 0,
"done": false,
"... | {
"turns": 3,
"tool_calls": 3
} | {
"step_count": 5
} | {
"final_state": {
"episode_id": "706cf82d-fb2e-4510-a63a-c36ec76a389b",
"step_count": 5,
"game_name": "tic_tac_toe",
"agent_player": 0,
"opponent_policy": "random",
"game_params": {},
"num_players": 2
},
"transcript_length": 7
} | null | {} |
ep-000015 | [
{
"role": "user",
"content": "You are playing tic_tac_toe.\nCall the play_move tool with an action_id drawn from legal_actions.\nLegal actions: [0, 1, 2, 3, 4, 5, 6, 7, 8]\n\nBoard:\n0 | 1 | 2\n---------\n3 | 4 | 5\n---------\n6 | 7 | 8"
},
{
"role": "assistant",
"content": "Playing action_id=0 ... | 1 | true | [
{
"tool_name": "play_move",
"arguments": {
"action_id": 0
},
"result": {
"data": {
"action_id": 0,
"legal_actions": [
1,
2,
3,
4,
5,
6,
8
],
"reward": 0,
"done": false,
"... | {
"turns": 3,
"tool_calls": 3
} | {
"step_count": 5
} | {
"final_state": {
"episode_id": "1564b0b6-8fb4-4a38-8e71-3cf4731781ff",
"step_count": 5,
"game_name": "tic_tac_toe",
"agent_player": 0,
"opponent_policy": "random",
"game_params": {},
"num_players": 2
},
"transcript_length": 7
} | null | {} |
ep-000016 | [
{
"role": "user",
"content": "You are playing tic_tac_toe.\nCall the play_move tool with an action_id drawn from legal_actions.\nLegal actions: [0, 1, 2, 3, 4, 5, 6, 7, 8]\n\nBoard:\n0 | 1 | 2\n---------\n3 | 4 | 5\n---------\n6 | 7 | 8"
},
{
"role": "assistant",
"content": "Playing action_id=0 ... | 1 | true | [
{
"tool_name": "play_move",
"arguments": {
"action_id": 0
},
"result": {
"data": {
"action_id": 0,
"legal_actions": [
2,
3,
4,
5,
6,
7,
8
],
"reward": 0,
"done": false,
"... | {
"turns": 5,
"tool_calls": 5
} | {
"step_count": 9
} | {
"final_state": {
"episode_id": "9b758a46-2fc1-45b4-94cf-32f07bf4fcfe",
"step_count": 9,
"game_name": "tic_tac_toe",
"agent_player": 0,
"opponent_policy": "random",
"game_params": {},
"num_players": 2
},
"transcript_length": 11
} | null | {} |
ep-000017 | [
{
"role": "user",
"content": "You are playing tic_tac_toe.\nCall the play_move tool with an action_id drawn from legal_actions.\nLegal actions: [0, 1, 2, 3, 4, 5, 6, 7, 8]\n\nBoard:\n0 | 1 | 2\n---------\n3 | 4 | 5\n---------\n6 | 7 | 8"
},
{
"role": "assistant",
"content": "Playing action_id=0 ... | 1 | true | [
{
"tool_name": "play_move",
"arguments": {
"action_id": 0
},
"result": {
"data": {
"action_id": 0,
"legal_actions": [
1,
2,
3,
4,
5,
6,
7
],
"reward": 0,
"done": false,
"... | {
"turns": 3,
"tool_calls": 3
} | {
"step_count": 5
} | {
"final_state": {
"episode_id": "95f0dd59-a449-4b9b-8056-523ad8c803f5",
"step_count": 5,
"game_name": "tic_tac_toe",
"agent_player": 0,
"opponent_policy": "random",
"game_params": {},
"num_players": 2
},
"transcript_length": 7
} | null | {} |
ep-000018 | [
{
"role": "user",
"content": "You are playing tic_tac_toe.\nCall the play_move tool with an action_id drawn from legal_actions.\nLegal actions: [0, 1, 2, 3, 4, 5, 6, 7, 8]\n\nBoard:\n0 | 1 | 2\n---------\n3 | 4 | 5\n---------\n6 | 7 | 8"
},
{
"role": "assistant",
"content": "Playing action_id=0 ... | 1 | true | [
{
"tool_name": "play_move",
"arguments": {
"action_id": 0
},
"result": {
"data": {
"action_id": 0,
"legal_actions": [
1,
2,
3,
4,
5,
7,
8
],
"reward": 0,
"done": false,
"... | {
"turns": 3,
"tool_calls": 3
} | {
"step_count": 5
} | {
"final_state": {
"episode_id": "232f353b-64c2-46f4-ae55-759e9afe1345",
"step_count": 5,
"game_name": "tic_tac_toe",
"agent_player": 0,
"opponent_policy": "random",
"game_params": {},
"num_players": 2
},
"transcript_length": 7
} | null | {} |
ep-000019 | [
{
"role": "user",
"content": "You are playing tic_tac_toe.\nCall the play_move tool with an action_id drawn from legal_actions.\nLegal actions: [0, 1, 2, 3, 4, 5, 6, 7, 8]\n\nBoard:\n0 | 1 | 2\n---------\n3 | 4 | 5\n---------\n6 | 7 | 8"
},
{
"role": "assistant",
"content": "Playing action_id=0 ... | 1 | true | [
{
"tool_name": "play_move",
"arguments": {
"action_id": 0
},
"result": {
"data": {
"action_id": 0,
"legal_actions": [
1,
3,
4,
5,
6,
7,
8
],
"reward": 0,
"done": false,
"... | {
"turns": 5,
"tool_calls": 5
} | {
"step_count": 9
} | {
"final_state": {
"episode_id": "6ebde04e-586a-4a43-a332-3632d8c0c4b9",
"step_count": 9,
"game_name": "tic_tac_toe",
"agent_player": 0,
"opponent_policy": "random",
"game_params": {},
"num_players": 2
},
"transcript_length": 11
} | null | {} |
ep-000001 | [
{
"role": "user",
"content": "You are playing tic_tac_toe.\nCall the play_move tool with an action_id drawn from legal_actions.\nLegal actions: [0, 1, 2, 3, 4, 5, 6, 7, 8]\n\nBoard:\n0 | 1 | 2\n---------\n3 | 4 | 5\n---------\n6 | 7 | 8"
},
{
"role": "assistant",
"content": "Playing action_id=0 ... | 1 | true | [
{
"tool_name": "play_move",
"arguments": {
"action_id": 0
},
"result": {
"data": {
"action_id": 0,
"legal_actions": [
1,
2,
3,
4,
6,
7,
8
],
"reward": 0,
"done": false,
"... | {
"turns": 3,
"tool_calls": 3
} | {
"step_count": 5
} | {
"final_state": {
"episode_id": "92942fef-4aa4-44f9-a9d4-1fc4596826e1",
"step_count": 5,
"game_name": "tic_tac_toe",
"agent_player": 0,
"opponent_policy": "random",
"game_params": {},
"num_players": 2
},
"transcript_length": 7
} | null | {} |
ep-000002 | [
{
"role": "user",
"content": "You are playing tic_tac_toe.\nCall the play_move tool with an action_id drawn from legal_actions.\nLegal actions: [0, 1, 2, 3, 4, 5, 6, 7, 8]\n\nBoard:\n0 | 1 | 2\n---------\n3 | 4 | 5\n---------\n6 | 7 | 8"
},
{
"role": "assistant",
"content": "Playing action_id=0 ... | 1 | true | [
{
"tool_name": "play_move",
"arguments": {
"action_id": 0
},
"result": {
"data": {
"action_id": 0,
"legal_actions": [
1,
2,
3,
4,
5,
6,
8
],
"reward": 0,
"done": false,
"... | {
"turns": 3,
"tool_calls": 3
} | {
"step_count": 5
} | {
"final_state": {
"episode_id": "3af13450-3f5c-4591-979f-7ab8bdf1c047",
"step_count": 5,
"game_name": "tic_tac_toe",
"agent_player": 0,
"opponent_policy": "random",
"game_params": {},
"num_players": 2
},
"transcript_length": 7
} | null | {} |
ep-000004 | [
{
"role": "user",
"content": "You are playing tic_tac_toe.\nCall the play_move tool with an action_id drawn from legal_actions.\nLegal actions: [0, 1, 2, 3, 4, 5, 6, 7, 8]\n\nBoard:\n0 | 1 | 2\n---------\n3 | 4 | 5\n---------\n6 | 7 | 8"
},
{
"role": "assistant",
"content": "Playing action_id=0 ... | 1 | true | [
{
"tool_name": "play_move",
"arguments": {
"action_id": 0
},
"result": {
"data": {
"action_id": 0,
"legal_actions": [
1,
2,
3,
5,
6,
7,
8
],
"reward": 0,
"done": false,
"... | {
"turns": 3,
"tool_calls": 3
} | {
"step_count": 5
} | {
"final_state": {
"episode_id": "354cbe7e-9844-49b5-9305-4bc5700163a3",
"step_count": 5,
"game_name": "tic_tac_toe",
"agent_player": 0,
"opponent_policy": "random",
"game_params": {},
"num_players": 2
},
"transcript_length": 7
} | null | {} |
ep-000005 | [
{
"role": "user",
"content": "You are playing tic_tac_toe.\nCall the play_move tool with an action_id drawn from legal_actions.\nLegal actions: [0, 1, 2, 3, 4, 5, 6, 7, 8]\n\nBoard:\n0 | 1 | 2\n---------\n3 | 4 | 5\n---------\n6 | 7 | 8"
},
{
"role": "assistant",
"content": "Playing action_id=0 ... | 1 | true | [
{
"tool_name": "play_move",
"arguments": {
"action_id": 0
},
"result": {
"data": {
"action_id": 0,
"legal_actions": [
1,
2,
4,
5,
6,
7,
8
],
"reward": 0,
"done": false,
"... | {
"turns": 3,
"tool_calls": 3
} | {
"step_count": 5
} | {
"final_state": {
"episode_id": "1ec9bf9e-8217-463c-8b6c-ebd387a653f6",
"step_count": 5,
"game_name": "tic_tac_toe",
"agent_player": 0,
"opponent_policy": "random",
"game_params": {},
"num_players": 2
},
"transcript_length": 7
} | null | {} |
ep-000006 | [
{
"role": "user",
"content": "You are playing tic_tac_toe.\nCall the play_move tool with an action_id drawn from legal_actions.\nLegal actions: [0, 1, 2, 3, 4, 5, 6, 7, 8]\n\nBoard:\n0 | 1 | 2\n---------\n3 | 4 | 5\n---------\n6 | 7 | 8"
},
{
"role": "assistant",
"content": "Playing action_id=0 ... | 1 | true | [
{
"tool_name": "play_move",
"arguments": {
"action_id": 0
},
"result": {
"data": {
"action_id": 0,
"legal_actions": [
1,
2,
3,
4,
6,
7,
8
],
"reward": 0,
"done": false,
"... | {
"turns": 3,
"tool_calls": 3
} | {
"step_count": 5
} | {
"final_state": {
"episode_id": "d37ce7b0-660d-4964-b7e2-228ad4b96e79",
"step_count": 5,
"game_name": "tic_tac_toe",
"agent_player": 0,
"opponent_policy": "random",
"game_params": {},
"num_players": 2
},
"transcript_length": 7
} | null | {} |
ep-000010 | [
{
"role": "user",
"content": "You are playing tic_tac_toe.\nCall the play_move tool with an action_id drawn from legal_actions.\nLegal actions: [0, 1, 2, 3, 4, 5, 6, 7, 8]\n\nBoard:\n0 | 1 | 2\n---------\n3 | 4 | 5\n---------\n6 | 7 | 8"
},
{
"role": "assistant",
"content": "Playing action_id=0 ... | 1 | true | [
{
"tool_name": "play_move",
"arguments": {
"action_id": 0
},
"result": {
"data": {
"action_id": 0,
"legal_actions": [
1,
2,
3,
4,
6,
7,
8
],
"reward": 0,
"done": false,
"... | {
"turns": 3,
"tool_calls": 3
} | {
"step_count": 5
} | {
"final_state": {
"episode_id": "664b364f-e7ef-4bb2-9709-1f931e9fddea",
"step_count": 5,
"game_name": "tic_tac_toe",
"agent_player": 0,
"opponent_policy": "random",
"game_params": {},
"num_players": 2
},
"transcript_length": 7
} | null | {} |
ep-000014 | [
{
"role": "user",
"content": "You are playing tic_tac_toe.\nCall the play_move tool with an action_id drawn from legal_actions.\nLegal actions: [0, 1, 2, 3, 4, 5, 6, 7, 8]\n\nBoard:\n0 | 1 | 2\n---------\n3 | 4 | 5\n---------\n6 | 7 | 8"
},
{
"role": "assistant",
"content": "Playing action_id=0 ... | 1 | true | [
{
"tool_name": "play_move",
"arguments": {
"action_id": 0
},
"result": {
"data": {
"action_id": 0,
"legal_actions": [
1,
2,
3,
4,
6,
7,
8
],
"reward": 0,
"done": false,
"... | {
"turns": 3,
"tool_calls": 3
} | {
"step_count": 5
} | {
"final_state": {
"episode_id": "61841cce-9a56-4226-bf89-bebb6d76d098",
"step_count": 5,
"game_name": "tic_tac_toe",
"agent_player": 0,
"opponent_policy": "random",
"game_params": {},
"num_players": 2
},
"transcript_length": 7
} | null | {} |
OpenEnv rollouts
Collected with OpenEnv (openenv collect).
Episodes: 20
Run metadata:
key value envopenspiel:tic_tac_toeenv_base_urlhttps://sergiopaniego-openspiel-ttt-env.hf.spaceproviderscriptedmodelNonenum_episodes_requested20temperature0.2keep_lossesFalse
Schema
Each line of results.jsonl is one episode:
episode_id(string)messages(chat transcript; TRLSFTTrainer-compatible)reward(float)done(bool)tool_trace(list of{tool_name, arguments, result})metrics,verify_metrics,artifactstask,extra(optional call-site annotations)
Load
from datasets import load_dataset
ds = load_dataset("<user>/<repo>", split="train")
- Downloads last month
- 20