Dataset Viewer
Auto-converted to Parquet Duplicate
episode_id
stringlengths
9
9
messages
listlengths
7
11
reward
float64
0
1
done
bool
1 class
tool_trace
listlengths
3
5
metrics
dict
verify_metrics
dict
artifacts
dict
task
null
extra
unknown
ep-000000
[ { "role": "user", "content": "You are playing tic_tac_toe.\nCall the play_move tool with an action_id drawn from legal_actions.\nLegal actions: [0, 1, 2, 3, 4, 5, 6, 7, 8]\n\nBoard:\n0 | 1 | 2\n---------\n3 | 4 | 5\n---------\n6 | 7 | 8" }, { "role": "assistant", "content": "Playing action_id=0 ...
1
true
[ { "tool_name": "play_move", "arguments": { "action_id": 0 }, "result": { "data": { "action_id": 0, "legal_actions": [ 1, 2, 3, 4, 5, 6, 7 ], "reward": 0, "done": false, "...
{ "turns": 3, "tool_calls": 3 }
{ "step_count": 5 }
{ "final_state": { "episode_id": "61776ad1-1d9b-441a-b16c-cae15cc924ac", "step_count": 5, "game_name": "tic_tac_toe", "agent_player": 0, "opponent_policy": "random", "game_params": {}, "num_players": 2 }, "transcript_length": 7 }
null
{}
ep-000003
[ { "role": "user", "content": "You are playing tic_tac_toe.\nCall the play_move tool with an action_id drawn from legal_actions.\nLegal actions: [0, 1, 2, 3, 4, 5, 6, 7, 8]\n\nBoard:\n0 | 1 | 2\n---------\n3 | 4 | 5\n---------\n6 | 7 | 8" }, { "role": "assistant", "content": "Playing action_id=0 ...
1
true
[ { "tool_name": "play_move", "arguments": { "action_id": 0 }, "result": { "data": { "action_id": 0, "legal_actions": [ 1, 2, 3, 4, 5, 6, 8 ], "reward": 0, "done": false, "...
{ "turns": 3, "tool_calls": 3 }
{ "step_count": 5 }
{ "final_state": { "episode_id": "369b3d15-fe49-431c-abd8-1dda48a23eb8", "step_count": 5, "game_name": "tic_tac_toe", "agent_player": 0, "opponent_policy": "random", "game_params": {}, "num_players": 2 }, "transcript_length": 7 }
null
{}
ep-000007
[ { "role": "user", "content": "You are playing tic_tac_toe.\nCall the play_move tool with an action_id drawn from legal_actions.\nLegal actions: [0, 1, 2, 3, 4, 5, 6, 7, 8]\n\nBoard:\n0 | 1 | 2\n---------\n3 | 4 | 5\n---------\n6 | 7 | 8" }, { "role": "assistant", "content": "Playing action_id=0 ...
1
true
[ { "tool_name": "play_move", "arguments": { "action_id": 0 }, "result": { "data": { "action_id": 0, "legal_actions": [ 1, 2, 3, 4, 5, 6, 8 ], "reward": 0, "done": false, "...
{ "turns": 3, "tool_calls": 3 }
{ "step_count": 5 }
{ "final_state": { "episode_id": "74b9dcd0-e7aa-4db2-ab42-21f8ff6a985f", "step_count": 5, "game_name": "tic_tac_toe", "agent_player": 0, "opponent_policy": "random", "game_params": {}, "num_players": 2 }, "transcript_length": 7 }
null
{}
ep-000008
[ { "role": "user", "content": "You are playing tic_tac_toe.\nCall the play_move tool with an action_id drawn from legal_actions.\nLegal actions: [0, 1, 2, 3, 4, 5, 6, 7, 8]\n\nBoard:\n0 | 1 | 2\n---------\n3 | 4 | 5\n---------\n6 | 7 | 8" }, { "role": "assistant", "content": "Playing action_id=0 ...
1
true
[ { "tool_name": "play_move", "arguments": { "action_id": 0 }, "result": { "data": { "action_id": 0, "legal_actions": [ 1, 2, 3, 4, 5, 6, 8 ], "reward": 0, "done": false, "...
{ "turns": 3, "tool_calls": 3 }
{ "step_count": 5 }
{ "final_state": { "episode_id": "1641dcd1-7c67-4c80-878c-90db2f6d604a", "step_count": 5, "game_name": "tic_tac_toe", "agent_player": 0, "opponent_policy": "random", "game_params": {}, "num_players": 2 }, "transcript_length": 7 }
null
{}
ep-000009
[ { "role": "user", "content": "You are playing tic_tac_toe.\nCall the play_move tool with an action_id drawn from legal_actions.\nLegal actions: [0, 1, 2, 3, 4, 5, 6, 7, 8]\n\nBoard:\n0 | 1 | 2\n---------\n3 | 4 | 5\n---------\n6 | 7 | 8" }, { "role": "assistant", "content": "Playing action_id=0 ...
1
true
[ { "tool_name": "play_move", "arguments": { "action_id": 0 }, "result": { "data": { "action_id": 0, "legal_actions": [ 1, 2, 4, 5, 6, 7, 8 ], "reward": 0, "done": false, "...
{ "turns": 3, "tool_calls": 3 }
{ "step_count": 5 }
{ "final_state": { "episode_id": "b0157f83-d54a-43b1-980b-be1f1588f628", "step_count": 5, "game_name": "tic_tac_toe", "agent_player": 0, "opponent_policy": "random", "game_params": {}, "num_players": 2 }, "transcript_length": 7 }
null
{}
ep-000011
[ { "role": "user", "content": "You are playing tic_tac_toe.\nCall the play_move tool with an action_id drawn from legal_actions.\nLegal actions: [0, 1, 2, 3, 4, 5, 6, 7, 8]\n\nBoard:\n0 | 1 | 2\n---------\n3 | 4 | 5\n---------\n6 | 7 | 8" }, { "role": "assistant", "content": "Playing action_id=0 ...
1
true
[ { "tool_name": "play_move", "arguments": { "action_id": 0 }, "result": { "data": { "action_id": 0, "legal_actions": [ 1, 2, 3, 4, 5, 7, 8 ], "reward": 0, "done": false, "...
{ "turns": 5, "tool_calls": 5 }
{ "step_count": 9 }
{ "final_state": { "episode_id": "55e1d862-8f9f-4ef0-a699-3f4b3fddc57e", "step_count": 9, "game_name": "tic_tac_toe", "agent_player": 0, "opponent_policy": "random", "game_params": {}, "num_players": 2 }, "transcript_length": 11 }
null
{}
ep-000012
[ { "role": "user", "content": "You are playing tic_tac_toe.\nCall the play_move tool with an action_id drawn from legal_actions.\nLegal actions: [0, 1, 2, 3, 4, 5, 6, 7, 8]\n\nBoard:\n0 | 1 | 2\n---------\n3 | 4 | 5\n---------\n6 | 7 | 8" }, { "role": "assistant", "content": "Playing action_id=0 ...
0
true
[ { "tool_name": "play_move", "arguments": { "action_id": 0 }, "result": { "data": { "action_id": 0, "legal_actions": [ 2, 3, 4, 5, 6, 7, 8 ], "reward": 0, "done": false, "...
{ "turns": 5, "tool_calls": 5 }
{ "step_count": 9 }
{ "final_state": { "episode_id": "9504fc08-45f6-482f-abc7-77aade3fc46c", "step_count": 9, "game_name": "tic_tac_toe", "agent_player": 0, "opponent_policy": "random", "game_params": {}, "num_players": 2 }, "transcript_length": 11 }
null
{}
ep-000013
[ { "role": "user", "content": "You are playing tic_tac_toe.\nCall the play_move tool with an action_id drawn from legal_actions.\nLegal actions: [0, 1, 2, 3, 4, 5, 6, 7, 8]\n\nBoard:\n0 | 1 | 2\n---------\n3 | 4 | 5\n---------\n6 | 7 | 8" }, { "role": "assistant", "content": "Playing action_id=0 ...
1
true
[ { "tool_name": "play_move", "arguments": { "action_id": 0 }, "result": { "data": { "action_id": 0, "legal_actions": [ 1, 2, 4, 5, 6, 7, 8 ], "reward": 0, "done": false, "...
{ "turns": 3, "tool_calls": 3 }
{ "step_count": 5 }
{ "final_state": { "episode_id": "706cf82d-fb2e-4510-a63a-c36ec76a389b", "step_count": 5, "game_name": "tic_tac_toe", "agent_player": 0, "opponent_policy": "random", "game_params": {}, "num_players": 2 }, "transcript_length": 7 }
null
{}
ep-000015
[ { "role": "user", "content": "You are playing tic_tac_toe.\nCall the play_move tool with an action_id drawn from legal_actions.\nLegal actions: [0, 1, 2, 3, 4, 5, 6, 7, 8]\n\nBoard:\n0 | 1 | 2\n---------\n3 | 4 | 5\n---------\n6 | 7 | 8" }, { "role": "assistant", "content": "Playing action_id=0 ...
1
true
[ { "tool_name": "play_move", "arguments": { "action_id": 0 }, "result": { "data": { "action_id": 0, "legal_actions": [ 1, 2, 3, 4, 5, 6, 8 ], "reward": 0, "done": false, "...
{ "turns": 3, "tool_calls": 3 }
{ "step_count": 5 }
{ "final_state": { "episode_id": "1564b0b6-8fb4-4a38-8e71-3cf4731781ff", "step_count": 5, "game_name": "tic_tac_toe", "agent_player": 0, "opponent_policy": "random", "game_params": {}, "num_players": 2 }, "transcript_length": 7 }
null
{}
ep-000016
[ { "role": "user", "content": "You are playing tic_tac_toe.\nCall the play_move tool with an action_id drawn from legal_actions.\nLegal actions: [0, 1, 2, 3, 4, 5, 6, 7, 8]\n\nBoard:\n0 | 1 | 2\n---------\n3 | 4 | 5\n---------\n6 | 7 | 8" }, { "role": "assistant", "content": "Playing action_id=0 ...
1
true
[ { "tool_name": "play_move", "arguments": { "action_id": 0 }, "result": { "data": { "action_id": 0, "legal_actions": [ 2, 3, 4, 5, 6, 7, 8 ], "reward": 0, "done": false, "...
{ "turns": 5, "tool_calls": 5 }
{ "step_count": 9 }
{ "final_state": { "episode_id": "9b758a46-2fc1-45b4-94cf-32f07bf4fcfe", "step_count": 9, "game_name": "tic_tac_toe", "agent_player": 0, "opponent_policy": "random", "game_params": {}, "num_players": 2 }, "transcript_length": 11 }
null
{}
ep-000017
[ { "role": "user", "content": "You are playing tic_tac_toe.\nCall the play_move tool with an action_id drawn from legal_actions.\nLegal actions: [0, 1, 2, 3, 4, 5, 6, 7, 8]\n\nBoard:\n0 | 1 | 2\n---------\n3 | 4 | 5\n---------\n6 | 7 | 8" }, { "role": "assistant", "content": "Playing action_id=0 ...
1
true
[ { "tool_name": "play_move", "arguments": { "action_id": 0 }, "result": { "data": { "action_id": 0, "legal_actions": [ 1, 2, 3, 4, 5, 6, 7 ], "reward": 0, "done": false, "...
{ "turns": 3, "tool_calls": 3 }
{ "step_count": 5 }
{ "final_state": { "episode_id": "95f0dd59-a449-4b9b-8056-523ad8c803f5", "step_count": 5, "game_name": "tic_tac_toe", "agent_player": 0, "opponent_policy": "random", "game_params": {}, "num_players": 2 }, "transcript_length": 7 }
null
{}
ep-000018
[ { "role": "user", "content": "You are playing tic_tac_toe.\nCall the play_move tool with an action_id drawn from legal_actions.\nLegal actions: [0, 1, 2, 3, 4, 5, 6, 7, 8]\n\nBoard:\n0 | 1 | 2\n---------\n3 | 4 | 5\n---------\n6 | 7 | 8" }, { "role": "assistant", "content": "Playing action_id=0 ...
1
true
[ { "tool_name": "play_move", "arguments": { "action_id": 0 }, "result": { "data": { "action_id": 0, "legal_actions": [ 1, 2, 3, 4, 5, 7, 8 ], "reward": 0, "done": false, "...
{ "turns": 3, "tool_calls": 3 }
{ "step_count": 5 }
{ "final_state": { "episode_id": "232f353b-64c2-46f4-ae55-759e9afe1345", "step_count": 5, "game_name": "tic_tac_toe", "agent_player": 0, "opponent_policy": "random", "game_params": {}, "num_players": 2 }, "transcript_length": 7 }
null
{}
ep-000019
[ { "role": "user", "content": "You are playing tic_tac_toe.\nCall the play_move tool with an action_id drawn from legal_actions.\nLegal actions: [0, 1, 2, 3, 4, 5, 6, 7, 8]\n\nBoard:\n0 | 1 | 2\n---------\n3 | 4 | 5\n---------\n6 | 7 | 8" }, { "role": "assistant", "content": "Playing action_id=0 ...
1
true
[ { "tool_name": "play_move", "arguments": { "action_id": 0 }, "result": { "data": { "action_id": 0, "legal_actions": [ 1, 3, 4, 5, 6, 7, 8 ], "reward": 0, "done": false, "...
{ "turns": 5, "tool_calls": 5 }
{ "step_count": 9 }
{ "final_state": { "episode_id": "6ebde04e-586a-4a43-a332-3632d8c0c4b9", "step_count": 9, "game_name": "tic_tac_toe", "agent_player": 0, "opponent_policy": "random", "game_params": {}, "num_players": 2 }, "transcript_length": 11 }
null
{}
ep-000001
[ { "role": "user", "content": "You are playing tic_tac_toe.\nCall the play_move tool with an action_id drawn from legal_actions.\nLegal actions: [0, 1, 2, 3, 4, 5, 6, 7, 8]\n\nBoard:\n0 | 1 | 2\n---------\n3 | 4 | 5\n---------\n6 | 7 | 8" }, { "role": "assistant", "content": "Playing action_id=0 ...
1
true
[ { "tool_name": "play_move", "arguments": { "action_id": 0 }, "result": { "data": { "action_id": 0, "legal_actions": [ 1, 2, 3, 4, 6, 7, 8 ], "reward": 0, "done": false, "...
{ "turns": 3, "tool_calls": 3 }
{ "step_count": 5 }
{ "final_state": { "episode_id": "92942fef-4aa4-44f9-a9d4-1fc4596826e1", "step_count": 5, "game_name": "tic_tac_toe", "agent_player": 0, "opponent_policy": "random", "game_params": {}, "num_players": 2 }, "transcript_length": 7 }
null
{}
ep-000002
[ { "role": "user", "content": "You are playing tic_tac_toe.\nCall the play_move tool with an action_id drawn from legal_actions.\nLegal actions: [0, 1, 2, 3, 4, 5, 6, 7, 8]\n\nBoard:\n0 | 1 | 2\n---------\n3 | 4 | 5\n---------\n6 | 7 | 8" }, { "role": "assistant", "content": "Playing action_id=0 ...
1
true
[ { "tool_name": "play_move", "arguments": { "action_id": 0 }, "result": { "data": { "action_id": 0, "legal_actions": [ 1, 2, 3, 4, 5, 6, 8 ], "reward": 0, "done": false, "...
{ "turns": 3, "tool_calls": 3 }
{ "step_count": 5 }
{ "final_state": { "episode_id": "3af13450-3f5c-4591-979f-7ab8bdf1c047", "step_count": 5, "game_name": "tic_tac_toe", "agent_player": 0, "opponent_policy": "random", "game_params": {}, "num_players": 2 }, "transcript_length": 7 }
null
{}
ep-000004
[ { "role": "user", "content": "You are playing tic_tac_toe.\nCall the play_move tool with an action_id drawn from legal_actions.\nLegal actions: [0, 1, 2, 3, 4, 5, 6, 7, 8]\n\nBoard:\n0 | 1 | 2\n---------\n3 | 4 | 5\n---------\n6 | 7 | 8" }, { "role": "assistant", "content": "Playing action_id=0 ...
1
true
[ { "tool_name": "play_move", "arguments": { "action_id": 0 }, "result": { "data": { "action_id": 0, "legal_actions": [ 1, 2, 3, 5, 6, 7, 8 ], "reward": 0, "done": false, "...
{ "turns": 3, "tool_calls": 3 }
{ "step_count": 5 }
{ "final_state": { "episode_id": "354cbe7e-9844-49b5-9305-4bc5700163a3", "step_count": 5, "game_name": "tic_tac_toe", "agent_player": 0, "opponent_policy": "random", "game_params": {}, "num_players": 2 }, "transcript_length": 7 }
null
{}
ep-000005
[ { "role": "user", "content": "You are playing tic_tac_toe.\nCall the play_move tool with an action_id drawn from legal_actions.\nLegal actions: [0, 1, 2, 3, 4, 5, 6, 7, 8]\n\nBoard:\n0 | 1 | 2\n---------\n3 | 4 | 5\n---------\n6 | 7 | 8" }, { "role": "assistant", "content": "Playing action_id=0 ...
1
true
[ { "tool_name": "play_move", "arguments": { "action_id": 0 }, "result": { "data": { "action_id": 0, "legal_actions": [ 1, 2, 4, 5, 6, 7, 8 ], "reward": 0, "done": false, "...
{ "turns": 3, "tool_calls": 3 }
{ "step_count": 5 }
{ "final_state": { "episode_id": "1ec9bf9e-8217-463c-8b6c-ebd387a653f6", "step_count": 5, "game_name": "tic_tac_toe", "agent_player": 0, "opponent_policy": "random", "game_params": {}, "num_players": 2 }, "transcript_length": 7 }
null
{}
ep-000006
[ { "role": "user", "content": "You are playing tic_tac_toe.\nCall the play_move tool with an action_id drawn from legal_actions.\nLegal actions: [0, 1, 2, 3, 4, 5, 6, 7, 8]\n\nBoard:\n0 | 1 | 2\n---------\n3 | 4 | 5\n---------\n6 | 7 | 8" }, { "role": "assistant", "content": "Playing action_id=0 ...
1
true
[ { "tool_name": "play_move", "arguments": { "action_id": 0 }, "result": { "data": { "action_id": 0, "legal_actions": [ 1, 2, 3, 4, 6, 7, 8 ], "reward": 0, "done": false, "...
{ "turns": 3, "tool_calls": 3 }
{ "step_count": 5 }
{ "final_state": { "episode_id": "d37ce7b0-660d-4964-b7e2-228ad4b96e79", "step_count": 5, "game_name": "tic_tac_toe", "agent_player": 0, "opponent_policy": "random", "game_params": {}, "num_players": 2 }, "transcript_length": 7 }
null
{}
ep-000010
[ { "role": "user", "content": "You are playing tic_tac_toe.\nCall the play_move tool with an action_id drawn from legal_actions.\nLegal actions: [0, 1, 2, 3, 4, 5, 6, 7, 8]\n\nBoard:\n0 | 1 | 2\n---------\n3 | 4 | 5\n---------\n6 | 7 | 8" }, { "role": "assistant", "content": "Playing action_id=0 ...
1
true
[ { "tool_name": "play_move", "arguments": { "action_id": 0 }, "result": { "data": { "action_id": 0, "legal_actions": [ 1, 2, 3, 4, 6, 7, 8 ], "reward": 0, "done": false, "...
{ "turns": 3, "tool_calls": 3 }
{ "step_count": 5 }
{ "final_state": { "episode_id": "664b364f-e7ef-4bb2-9709-1f931e9fddea", "step_count": 5, "game_name": "tic_tac_toe", "agent_player": 0, "opponent_policy": "random", "game_params": {}, "num_players": 2 }, "transcript_length": 7 }
null
{}
ep-000014
[ { "role": "user", "content": "You are playing tic_tac_toe.\nCall the play_move tool with an action_id drawn from legal_actions.\nLegal actions: [0, 1, 2, 3, 4, 5, 6, 7, 8]\n\nBoard:\n0 | 1 | 2\n---------\n3 | 4 | 5\n---------\n6 | 7 | 8" }, { "role": "assistant", "content": "Playing action_id=0 ...
1
true
[ { "tool_name": "play_move", "arguments": { "action_id": 0 }, "result": { "data": { "action_id": 0, "legal_actions": [ 1, 2, 3, 4, 6, 7, 8 ], "reward": 0, "done": false, "...
{ "turns": 3, "tool_calls": 3 }
{ "step_count": 5 }
{ "final_state": { "episode_id": "61841cce-9a56-4226-bf89-bebb6d76d098", "step_count": 5, "game_name": "tic_tac_toe", "agent_player": 0, "opponent_policy": "random", "game_params": {}, "num_players": 2 }, "transcript_length": 7 }
null
{}

OpenEnv rollouts

Collected with OpenEnv (openenv collect).

  • Episodes: 20

  • Run metadata:

    key value
    env openspiel:tic_tac_toe
    env_base_url https://sergiopaniego-openspiel-ttt-env.hf.space
    provider scripted
    model None
    num_episodes_requested 20
    temperature 0.2
    keep_losses False

Schema

Each line of results.jsonl is one episode:

  • episode_id (string)
  • messages (chat transcript; TRL SFTTrainer-compatible)
  • reward (float)
  • done (bool)
  • tool_trace (list of {tool_name, arguments, result})
  • metrics, verify_metrics, artifacts
  • task, extra (optional call-site annotations)

Load

from datasets import load_dataset
ds = load_dataset("<user>/<repo>", split="train")
Downloads last month
20