TwisteRL Documentation

Welcome to TwisteRL

TwisteRL is a minimalistic, high-performance Reinforcement Learning framework implemented in Rust with Python bindings.

The current version is a proof of concept. Stay tuned for future releases!

High-Performance Core: RL episode loop implemented in Rust for faster training and inference
Inference-Ready: Easy compilation and bundling of models with environments into portable binaries for inference
Modular Design: Support for multiple algorithms (PPO, AlphaZero) with interchangeable training and inference
Language Interoperability: Core in Rust with Python interface
Symmetry-Aware Training: Environments can expose observation/action permutations (“twists”) so policies automatically exploit symmetries for faster learning
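
To make the idea of a "twist" concrete, here is a minimal sketch in plain Python. It does not use the TwisteRL API: the toy environment, `obs_perm`, `act_perm`, and all function names are assumptions made for illustration. It shows a mirror symmetry expressed as one permutation of the observation indices and one permutation of the actions, and checks that applying the twist commutes with the environment dynamics.

```python
# Hypothetical toy environment, not part of TwisteRL: an agent on a line of
# N cells. The observation is a one-hot list marking the agent's cell.
N = 5
LEFT, RIGHT = 0, 1


def step(obs, action):
    """Move the agent one cell left or right, clamping at the walls."""
    pos = obs.index(1)
    pos = max(0, pos - 1) if action == LEFT else min(N - 1, pos + 1)
    return [1 if i == pos else 0 for i in range(N)]


# The twist is a pair of permutations describing a mirror symmetry:
# index i of the twisted observation reads index obs_perm[i] of the original,
# and action a is relabelled to act_perm[a].
obs_perm = [4, 3, 2, 1, 0]  # reverse the line
act_perm = [RIGHT, LEFT]    # swap LEFT and RIGHT


def twist_obs(obs):
    """Apply the observation permutation."""
    return [obs[j] for j in obs_perm]


def twist_action(action):
    """Apply the action permutation."""
    return act_perm[action]


def is_symmetry():
    """Check that twisting commutes with the dynamics for every state and action."""
    for pos in range(N):
        obs = [1 if i == pos else 0 for i in range(N)]
        for action in (LEFT, RIGHT):
            twisted_after = twist_obs(step(obs, action))
            stepped_twisted = step(twist_obs(obs), twist_action(action))
            if twisted_after != stepped_twisted:
                return False
    return True
```

When a pair of permutations passes a check like `is_symmetry()`, every transition seen in the original environment is also a valid transition in its mirrored form. That is the kind of structure a symmetry-aware trainer can exploit.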

pip install twisterl

python -m twisterl.train --config examples/ppo_puzzle8_v1.json

This example trains a model to solve the popular “8 puzzle”, in which numbered tiles on a 3×3 grid are slid into the empty slot, one move at a time, until they are in order.

This model can be trained on a single CPU in under 1 minute (no GPU required!).
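
For readers unfamiliar with the puzzle, the following is a minimal sketch of its rules in plain Python. It is not TwisteRL's environment. The goal layout, the action names, and the choice to treat illegal moves as no-ops are assumptions made for illustration.

```python
# Assumed goal layout, read row by row; 0 marks the empty slot.
SOLVED = (1, 2, 3, 4, 5, 6, 7, 8, 0)

# Each action moves the empty slot by (row delta, column delta).
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def step(board, action):
    """Return the board after moving the empty slot; illegal moves are no-ops."""
    blank = board.index(0)
    row, col = divmod(blank, 3)
    d_row, d_col = MOVES[action]
    new_row, new_col = row + d_row, col + d_col
    if not (0 <= new_row < 3 and 0 <= new_col < 3):
        return board  # the move would leave the grid
    target = new_row * 3 + new_col
    cells = list(board)
    # Swap the empty slot with the neighbouring tile.
    cells[blank], cells[target] = cells[target], cells[blank]
    return tuple(cells)


def is_solved(board):
    """True when every tile is in its goal position."""
    return board == SOLVED


# Two moves away from the goal: the empty slot starts in the bottom-left corner.
board = (1, 2, 3, 4, 5, 6, 0, 7, 8)
for action in ("right", "right"):
    board = step(board, action)
print(is_solved(board))  # prints True
```

Both the observation (nine cells, each holding one of nine values) and the action set (four moves) are discrete, which matches the framework's stated focus on discrete spaces.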

Hybrid Rust-Python implementation:
- Data collection and inference in Rust
- Training in Python (PyTorch)
Supported algorithms:
- PPO (Proximal Policy Optimization)
- AlphaZero
Focus on discrete observation and action spaces
Support for native Rust environments and Python environments through a wrapper

Repository: GitHub

Ready to dive in? Here are the essential links to get you up and running:

📦 Installation Guide - Install TwisteRL and set up your environment

⚡ Quick Start Guide - Your first RL model in minutes

📖 Examples - Interactive examples and use cases

🧠 Algorithms - PPO and AlphaZero algorithm guides
