d3rlpy.dataset.MultiStepTransitionPicker

class d3rlpy.dataset.MultiStepTransitionPicker(n_steps, gamma)[source]

Multi-step transition picker.

This class implements transition picking for the multi-step TD error. The reward of the returned transition is computed as a multi-step discounted return.

Parameters:
  • n_steps – Number of timesteps between observation and next_observation.

  • gamma – Discount factor to compute a multi-step return.
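To make the roles of n_steps and gamma concrete, the following is a minimal sketch of a multi-step discounted return in plain Python. It is not d3rlpy's implementation: the helper name n_step_return is hypothetical, and truncating the window at the end of the episode is an assumption made for illustration.

```python
def n_step_return(rewards, index, n_steps, gamma):
    """Illustrative helper, not part of d3rlpy.

    Sums up to ``n_steps`` rewards starting at ``index``, weighting the
    k-th reward by ``gamma ** k``. The window is cut short when it would
    run past the end of the episode (an assumption of this sketch).
    """
    if not 0 <= index < len(rewards):
        raise IndexError("index is outside the episode")
    # Last reward index (exclusive) covered by the window.
    end = min(index + n_steps, len(rewards))
    # Number of steps actually covered; may be smaller than n_steps.
    interval = end - index
    ret = sum(gamma**k * rewards[index + k] for k in range(interval))
    return ret, interval


rewards = [1.0, 2.0, 3.0, 4.0]
# Full window: 1.0 + 0.5 * 2.0 + 0.25 * 3.0
print(n_step_return(rewards, index=0, n_steps=3, gamma=0.5))  # (2.75, 3)
# Window truncated at the episode end: 3.0 + 0.5 * 4.0
print(n_step_return(rewards, index=2, n_steps=3, gamma=0.5))  # (5.0, 2)
```

The sketch also returns the number of steps actually covered, since a multi-step TD target needs that count to discount the bootstrapped value correctly when the window is shortened.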

Methods

__call__(episode, index)[source]

Returns the transition specified by index.

Parameters:
  • episode (EpisodeBase) – Episode.

  • index (int) – Index of the target transition.

Returns:

Transition.

Return type:

Transition