Environments

TwisteRL provides a flexible environment system supporting both native Rust environments and Python environment wrappers.

envs Module

Built-in Environments

Puzzle Environment

The Puzzle environment is a sliding tile puzzle implemented in Rust.

Configuration:

{
    "env_cls": "twisterl.envs.Puzzle",
    "env": {
        "difficulty": 1,
        "height": 3,
        "width": 3,
        "depth_slope": 2,
        "max_depth": 256
    }
}

Parameters:

difficulty: Initial difficulty level (controls scramble depth)
height: Grid height (3 for 8-puzzle, 4 for 15-puzzle)
width: Grid width (3 for 8-puzzle, 4 for 15-puzzle)
depth_slope: How quickly difficulty increases scramble depth
max_depth: Maximum scramble depth

8-Puzzle (3x3):

A 3x3 sliding puzzle with 8 numbered tiles and one empty space.

15-Puzzle (4x4):

A 4x4 sliding puzzle with 15 numbered tiles and one empty space.
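As a rough sketch, the two variants differ only in the height and width values of the configuration shown earlier. The helper below, puzzle_config, is hypothetical and not part of TwisteRL; it only assembles a dictionary with the documented field names and prints it as JSON.

```python
import json


def puzzle_config(size: int, difficulty: int = 1,
                  depth_slope: int = 2, max_depth: int = 256) -> dict:
    """Build a Puzzle config for a size x size grid (hypothetical helper)."""
    if size < 2:
        raise ValueError("size must be at least 2")
    return {
        "env_cls": "twisterl.envs.Puzzle",
        "env": {
            "difficulty": difficulty,
            "height": size,   # 3 for the 8-puzzle, 4 for the 15-puzzle
            "width": size,
            "depth_slope": depth_slope,
            "max_depth": max_depth,
        },
    }


# 15-puzzle: same fields as the 3x3 example, with a 4x4 grid.
config_15 = puzzle_config(4)
print(json.dumps(config_15, indent=4))
```

The defaults mirror the values in the 3x3 configuration above; they are not tuned recommendations for the 4x4 case.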

Python Environment Wrapper

The PyEnv class wraps Python environments for use with TwisteRL’s Rust training loop.

Configuration:

{
    "env_cls": "twisterl.envs.PyEnv",
    "env": {
        "pyenv_cls": "mymodule.MyEnvironment"
    }
}

The Python environment class must implement:

reset(difficulty: int): Reset the environment to its initial state at the given difficulty
next(action: int): Execute an action (advances the environment state)
observe() -> list[int]: Return the current observation
obs_shape() -> list[int]: Return observation dimensions
num_actions() -> int: Return number of valid actions
is_final() -> bool: Return True if current state is terminal
success() -> bool: Return True if the goal was achieved
value() -> float: Return the reward value for current state
masks() -> list[bool]: Return action mask (True if action is valid)
set_state(state: list[int]): Set environment to specific state
copy(): Return a copy of the environment (for parallel collection)
twists() -> (obs_perms, act_perms): Optional. Return observation and action permutations for symmetry-aware training
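The method list above can be illustrated with a toy environment. This is a minimal sketch, not code from TwisteRL: the LineWalk class, its size and max_steps parameters, and the reward scheme are invented for illustration. It also assumes that observe() returns sparse indices of active cells, as described for the Rust trait, which may not match the wrapper's actual convention. The optional twists() method is omitted.

```python
import copy


class LineWalk:
    """Toy environment: walk along a line of `size` cells to reach cell 0."""

    def __init__(self, size: int = 8, max_steps: int = 32):
        self.size = size
        self.max_steps = max_steps
        self.pos = 0
        self.steps = 0

    def reset(self, difficulty: int) -> None:
        # Higher difficulty starts the agent farther from the goal.
        self.pos = max(1, min(difficulty, self.size - 1))
        self.steps = 0

    def next(self, action: int) -> None:
        # Action 0 moves left, action 1 moves right; walls clamp the move.
        delta = -1 if action == 0 else 1
        self.pos = max(0, min(self.pos + delta, self.size - 1))
        self.steps += 1

    def observe(self) -> list[int]:
        # Assumption: sparse encoding, i.e. the index of the occupied cell.
        return [self.pos]

    def obs_shape(self) -> list[int]:
        return [self.size]

    def num_actions(self) -> int:
        return 2

    def is_final(self) -> bool:
        return self.success() or self.steps >= self.max_steps

    def success(self) -> bool:
        return self.pos == 0

    def value(self) -> float:
        return 1.0 if self.success() else 0.0

    def masks(self) -> list[bool]:
        # A move is valid only if it does not run into a wall.
        return [self.pos > 0, self.pos < self.size - 1]

    def set_state(self, state: list[int]) -> None:
        self.pos = state[0]
        self.steps = 0

    def copy(self) -> "LineWalk":
        return copy.deepcopy(self)


env = LineWalk()
env.reset(difficulty=3)
while not env.is_final():
    env.next(0)  # always move left toward the goal
print(env.success(), env.value(), env.steps)
```

If this class lived in a module named mymodule, the pyenv_cls entry would presumably be "mymodule.LineWalk", following the pattern in the configuration above.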

Creating Custom Environments

For best performance, implement environments in Rust. See the examples/grid_world directory for a complete example.

Key steps:

Implement the twisterl::rl::env::Env trait in Rust
Expose to Python using PyBaseEnv
Build with maturin and install

See Examples for detailed instructions.

Environment Interface (Rust Trait)

Rust environments implement the twisterl::rl::env::Env trait. The required methods are:

num_actions() -> usize: Return number of possible actions
obs_shape() -> Vec<usize>: Return observation dimensions
set_state(state: Vec<i64>): Set environment to a specific state
reset(): Reset to a random initial state
step(action: usize): Execute an action (evolve the state)
is_final() -> bool: Return true if current state is terminal
success() -> bool: Return true if the goal was achieved
reward() -> f32: Return the reward value for current state
observe() -> Vec<usize>: Return the current state as a sparse observation

Optional methods with default implementations:

set_difficulty(difficulty: usize): Set difficulty level (default: no-op)
get_difficulty() -> usize: Get current difficulty (default: 1)
masks() -> Vec<bool>: Return action mask (default: all true)
twists() -> (Vec<Vec<usize>>, Vec<Vec<usize>>): Return permutation symmetries (default: empty)

Permutation Symmetries (Twists)

TwisteRL supports symmetry-aware training through “twists”: permutations of observations and actions that represent equivalent states.
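To make the idea concrete, here is a small sketch of a mirror symmetry on a toy ring world. It illustrates the concept only and does not claim to follow TwisteRL's exact permutation convention. The ring, the step function, and the is_symmetry check are all hypothetical. A pair of permutations qualifies as a symmetry here when permuting the state and the action commutes with the dynamics.

```python
N = 4  # number of cells on a ring


def step(pos: int, action: int) -> int:
    """Action 0 moves clockwise, action 1 moves counterclockwise."""
    return (pos + 1) % N if action == 0 else (pos - 1) % N


# Mirror symmetry: cell i maps to cell (-i) mod N, and the two actions swap.
obs_perm = [(-i) % N for i in range(N)]
act_perm = [1, 0]


def is_symmetry(obs_perm: list[int], act_perm: list[int]) -> bool:
    """Check that permuting state and action commutes with the dynamics."""
    return all(
        step(obs_perm[p], act_perm[a]) == obs_perm[step(p, a)]
        for p in range(N)
        for a in range(len(act_perm))
    )


valid = is_symmetry(obs_perm, act_perm)   # mirror cells and swap actions
invalid = is_symmetry(obs_perm, [0, 1])   # mirror cells but keep actions
print(valid, invalid)
```

Mirroring the cells without swapping the actions fails the check, which is why observation and action permutations are supplied as matching pairs.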

See twists.md for detailed documentation on implementing twists in your environments.