RewardSmith

RewardSmith is a web app (with an optional desktop agent) that helps teams debug and harden reinforcement-learning environments by automatically stress-testing reward functions and termination conditions. It runs targeted "reward hacking" probes (adversarial policies, randomization sweeps, and counterfactual rollouts) to surface loopholes such as agents exploiting sensor glitches, timeouts, or proxy metrics. The app produces a ranked list of failure modes, each with a minimal reproducible trajectory, suggested reward and constraint patches, and regression tests you can keep in CI.

It's aimed at practical RL work (robotics, operations research, games, and simulation), where most projects stall not because of algorithms, but because the environment and reward are brittle. Expect it to save weeks of iteration by turning vague "the agent learned something weird" moments into concrete, testable fixes.
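To make the idea concrete, here is a minimal, hypothetical sketch of one probe type, a randomization sweep. All names (`step`, `proxy_reward`, `probe`, the toy gridworld) are illustrative assumptions, not RewardSmith's actual API. The probe samples random action sequences in a toy environment and flags "reward hacking" trajectories: rollouts that accumulate high proxy reward without ever reaching the goal.

```python
import random

GOAL = (4, 4)  # target cell in a toy 5x5 gridworld (illustrative)

def step(pos, action):
    """Toy gridworld transition: move within a 5x5 grid."""
    dx, dy = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}[action]
    return (max(0, min(4, pos[0] + dx)), max(0, min(4, pos[1] + dy)))

def proxy_reward(pos, action):
    """Deliberately buggy proxy reward: pays for moving right,
    regardless of whether the agent ever reaches the goal."""
    return 1.0 if action == "R" else 0.0

def probe(n_rollouts=500, horizon=20, seed=0):
    """Randomization sweep: collect trajectories whose cumulative
    proxy reward is high even though the goal was never reached."""
    rng = random.Random(seed)
    failures = []
    for _ in range(n_rollouts):
        pos, total, reached = (0, 0), 0.0, False
        actions = []
        for _ in range(horizon):
            a = rng.choice("UDLR")
            pos = step(pos, a)
            total += proxy_reward(pos, a)
            actions.append(a)
            reached = reached or pos == GOAL
        if total >= 5.0 and not reached:
            # A loophole: the agent is paid without solving the task.
            failures.append((total, actions))
    # Rank failure modes by how much reward the loophole yields.
    failures.sort(key=lambda f: -f[0])
    return failures
```

Each flagged trajectory doubles as a regression test: replay the recorded action sequence after a reward patch and assert the proxy reward no longer exceeds the threshold without goal completion.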
