Environments
TwisteRL provides a flexible environment system supporting both native Rust environments and Python environment wrappers.
envs Module
Built-in Environments
Puzzle Environment
The Puzzle environment is a sliding tile puzzle implemented in Rust.
Configuration:
{
  "env_cls": "twisterl.envs.Puzzle",
  "env": {
    "difficulty": 1,
    "height": 3,
    "width": 3,
    "depth_slope": 2,
    "max_depth": 256
  }
}
Parameters:
difficulty: Initial difficulty level (controls scramble depth)
height: Grid height (3 for 8-puzzle, 4 for 15-puzzle)
width: Grid width (3 for 8-puzzle, 4 for 15-puzzle)
depth_slope: How quickly difficulty increases scramble depth
max_depth: Maximum scramble depth
8-Puzzle (3x3):
A 3x3 sliding puzzle with 8 numbered tiles and one empty space.
15-Puzzle (4x4):
A 4x4 sliding puzzle with 15 numbered tiles and one empty space.
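To make "scramble depth" concrete, here is a minimal Python sketch of scrambling a sliding puzzle by applying random legal moves to the solved board. It is an illustration only, not the Rust implementation: the board encoding (a flat list with 0 for the empty space) and the helper names legal_moves and scramble are assumptions introduced here. Scrambling by legal moves keeps the board solvable by construction.

```python
import random


def legal_moves(blank, height, width):
    """Return the flat indices the empty space can swap with."""
    row, col = divmod(blank, width)
    moves = []
    if row > 0:
        moves.append(blank - width)  # tile above
    if row < height - 1:
        moves.append(blank + width)  # tile below
    if col > 0:
        moves.append(blank - 1)      # tile to the left
    if col < width - 1:
        moves.append(blank + 1)      # tile to the right
    return moves


def scramble(height, width, depth, seed=None):
    """Apply `depth` random legal moves to the solved board (assumed encoding)."""
    rng = random.Random(seed)
    n = height * width
    board = list(range(1, n)) + [0]  # 0 marks the empty space
    blank = n - 1
    for _ in range(depth):
        target = rng.choice(legal_moves(blank, height, width))
        board[blank], board[target] = board[target], board[blank]
        blank = target
    return board


board = scramble(height=3, width=3, depth=4, seed=0)
```

A depth of 0 returns the solved board, and larger depths generally produce harder instances, although a random walk can undo its own moves.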
Python Environment Wrapper
The PyEnv class wraps Python environments for use with TwisteRL’s Rust training loop.
Configuration:
{
  "env_cls": "twisterl.envs.PyEnv",
  "env": {
    "pyenv_cls": "mymodule.MyEnvironment"
  }
}
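The pyenv_cls value is a dotted path to a class. The sketch below shows one common way such a path can be resolved in Python with importlib. It is not TwisteRL's actual loader; resolve_class is a hypothetical helper, and a standard-library class stands in for mymodule.MyEnvironment so the snippet runs on its own.

```python
import importlib


def resolve_class(dotted_path):
    """Split 'package.module.ClassName' and import the named class."""
    module_name, _, class_name = dotted_path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# Stand-in config: "collections.OrderedDict" replaces "mymodule.MyEnvironment".
config = {
    "env_cls": "twisterl.envs.PyEnv",
    "env": {"pyenv_cls": "collections.OrderedDict"},
}

env_class = resolve_class(config["env"]["pyenv_cls"])
```

For this kind of lookup to work, the module named in the path has to be importable from the process that runs training.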
The Python environment class must implement:
reset(difficulty: int): Reset the environment to initial state with given difficulty
next(action: int): Execute an action (advances the environment state)
observe() -> list[int]: Return the current observation
obs_shape() -> list[int]: Return observation dimensions
num_actions() -> int: Return number of valid actions
is_final() -> bool: Return True if current state is terminal
success() -> bool: Return True if the goal was achieved
value() -> float: Return the reward value for current state
masks() -> list[bool]: Return action mask (True if action is valid)
set_state(state: list[int]): Set environment to specific state
copy(): Return a copy of the environment (for parallel collection)
twists() -> (obs_perms, act_perms): Optional, for symmetry-aware training
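As an illustration of the required methods, here is a minimal sketch of a toy environment. LineWalk is a hypothetical class, not part of TwisteRL. Several details are assumptions: observe() returns the index of the occupied cell (mirroring the "sparse observation" of the Rust trait), the reward is 1.0 only at the goal, and the class can be constructed with no arguments.

```python
import copy


class LineWalk:
    """Hypothetical toy environment: walk along a line to reach cell 0."""

    def __init__(self, size=8, max_steps=32):
        self.size = size
        self.max_steps = max_steps
        self.position = 0
        self.steps_left = max_steps

    def reset(self, difficulty: int):
        # Start `difficulty` cells away from the goal, clamped to the line.
        self.position = max(0, min(difficulty, self.size - 1))
        self.steps_left = self.max_steps

    def next(self, action: int):
        # Action 0 moves left, action 1 moves right; walls block movement.
        delta = -1 if action == 0 else 1
        self.position = max(0, min(self.position + delta, self.size - 1))
        self.steps_left -= 1

    def observe(self):
        # Assumed sparse encoding: index of the single active cell.
        return [self.position]

    def obs_shape(self):
        return [self.size]

    def num_actions(self):
        return 2

    def is_final(self):
        return self.success() or self.steps_left <= 0

    def success(self):
        return self.position == 0

    def value(self):
        return 1.0 if self.success() else 0.0

    def masks(self):
        # A move is valid only if it does not run into a wall.
        return [self.position > 0, self.position < self.size - 1]

    def set_state(self, state):
        self.position = state[0]

    def copy(self):
        # All attributes are plain ints, so a shallow copy is independent.
        return copy.copy(self)


env = LineWalk()
env.reset(difficulty=3)
while not env.is_final():
    env.next(0)  # always move left, toward the goal
```

The loop at the end drives the environment by hand. In training, the calls would come from the Rust loop through PyEnv instead.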
Creating Custom Environments
For best performance, implement environments in Rust. See the examples/grid_world directory for a complete example.
Key steps:
Implement the twisterl::rl::env::Env trait in Rust
Expose to Python using PyBaseEnv
Build with maturin and install
See Examples for detailed instructions.
Environment Interface (Rust Trait)
Rust environments implement the twisterl::rl::env::Env trait. The required methods are:
num_actions() -> usize: Return number of possible actions
obs_shape() -> Vec<usize>: Return observation dimensions
set_state(state: Vec<i64>): Set environment to a specific state
reset(): Reset to a random initial state
step(action: usize): Execute an action (evolve the state)
is_final() -> bool: Return true if current state is terminal
success() -> bool: Return true if the goal was achieved
reward() -> f32: Return the reward value for current state
observe() -> Vec<usize>: Return current state as sparse observation
Optional methods with default implementations:
set_difficulty(difficulty: usize): Set difficulty level (default: no-op)
get_difficulty() -> usize: Get current difficulty (default: 1)
masks() -> Vec<bool>: Return action mask (default: all true)
twists() -> (Vec<Vec<usize>>, Vec<Vec<usize>>): Return permutation symmetries (default: empty)
Permutation Symmetries (Twists)
TwisteRL supports symmetry-aware training through “twists”: permutations of observations and actions that map a state to an equivalent one.
See twists.md for detailed documentation on implementing twists in your environments.
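As a rough illustration of the idea, the sketch below builds one observation permutation and one action permutation for a goal-free walk on a line and checks that they commute with the dynamics. Everything here is an assumption made for the example: the toy step function, the permutation layout (one list per symmetry, indexed by cell or by action), and the is_symmetry helper. The format TwisteRL actually expects is the one described in twists.md. A mirror symmetry is used because it is its own inverse, so the check does not depend on the indexing convention.

```python
SIZE = 5  # number of cells on the line


def step(position, action):
    """Action 0 moves left, action 1 moves right; walls block movement."""
    delta = -1 if action == 0 else 1
    return max(0, min(position + delta, SIZE - 1))


# Mirror symmetry: cell i maps to cell SIZE-1-i, and left/right swap.
mirror_obs = [SIZE - 1 - i for i in range(SIZE)]
mirror_act = [1, 0]


def twists():
    """Return (obs_perms, act_perms) with one entry per symmetry (assumed layout)."""
    return [mirror_obs], [mirror_act]


def is_symmetry(obs_perm, act_perm):
    """Check that mirroring then acting equals acting then mirroring."""
    return all(
        step(obs_perm[p], act_perm[a]) == obs_perm[step(p, a)]
        for p in range(SIZE)
        for a in range(len(act_perm))
    )


obs_perms, act_perms = twists()
valid = is_symmetry(obs_perms[0], act_perms[0])
```

This check covers the transition dynamics only. In an environment with a goal, a valid twist would also have to map goal states to goal states.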