d3rlpy.dataset.SparseRewardTransitionPicker
- class d3rlpy.dataset.SparseRewardTransitionPicker(failure_return, step_reward=0.0)
Sparse reward transition picker.
This class extends BasicTransitionPicker with a special returns_to_go calculation, mainly used in AntMaze environments.
For failure trajectories, this class sets the return to a constant value (failure_return) so that episodes cut off by a timeout do not produce returns that depend on an inconsistent horizon.
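The behaviour described above can be illustrated with a minimal, library-independent sketch. The helper name sparse_return_to_go, the failure test (every remaining reward equals step_reward), and the example value failure_return=-5 are assumptions made for illustration only; this is not d3rlpy's implementation.

```python
from typing import Sequence


def sparse_return_to_go(
    rewards: Sequence[float],
    index: int,
    failure_return: float,
    step_reward: float = 0.0,
) -> float:
    """Return-to-go from `index`, using a constant for failure trajectories.

    Hypothetical helper for illustration; not part of d3rlpy.
    """
    remaining = rewards[index:]
    # Assumption: the trajectory counts as a failure from this step on when
    # every remaining reward equals the per-step reward, i.e. the sparse
    # goal reward is never received before the episode ends.
    if all(r == step_reward for r in remaining):
        return float(failure_return)
    # Otherwise fall back to the ordinary undiscounted sum of rewards.
    return float(sum(remaining))


success = [0.0, 0.0, 0.0, 1.0]  # goal reached at the last step
timeout = [0.0, 0.0, 0.0, 0.0]  # timed out without reaching the goal

success_rtg = [sparse_return_to_go(success, i, failure_return=-5) for i in range(4)]
timeout_rtg = [sparse_return_to_go(timeout, i, failure_return=-5) for i in range(4)]

print(success_rtg)  # [1.0, 1.0, 1.0, 1.0]
print(timeout_rtg)  # [-5.0, -5.0, -5.0, -5.0]
```

In this sketch the timed-out episode gets the same return at every step regardless of how many steps remain, which is the horizon-independence the description refers to.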
- Parameters:
Methods