d3rlpy.dataset.MultiStepTransitionPicker

class d3rlpy.dataset.MultiStepTransitionPicker(*args, **kwds)

Multi-step transition picker.

This class implements transition picking for the multi-step TD error. The reward of each picked transition is computed as a multi-step discounted return.

Parameters
  • n_steps – Number of timesteps between observation and next_observation.

  • gamma – Discount factor to compute a multi-step return.
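To illustrate how n_steps and gamma interact, the following is a minimal standalone sketch of a multi-step discounted return. It is not the library's implementation: the function name multi_step_return and the plain-list reward layout are assumptions made for illustration, and the window is assumed to be truncated at the end of the episode.

```python
from typing import Sequence


def multi_step_return(
    rewards: Sequence[float], index: int, n_steps: int, gamma: float
) -> float:
    """Hypothetical helper: discounted sum of up to n_steps rewards from index.

    Computes sum_{i=0}^{k-1} gamma**i * rewards[index + i], where k is
    n_steps or fewer if the episode ends first (an assumption of this sketch).
    """
    # Slicing past the end of the list simply yields a shorter window.
    window = rewards[index : index + n_steps]
    return sum((gamma ** i) * r for i, r in enumerate(window))


# Three rewards of 1.0 with gamma = 0.5: 1.0 + 0.5 + 0.25
full_window = multi_step_return([1.0, 1.0, 1.0, 1.0], index=0, n_steps=3, gamma=0.5)

# Only one reward remains at the last index, so the window is truncated.
truncated = multi_step_return([1.0, 1.0, 1.0, 1.0], index=3, n_steps=3, gamma=0.5)
```

With n_steps = 1 the sketch reduces to the ordinary one-step reward, which is one way to see that the multi-step picker generalizes single-step transition picking.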

Methods

__call__(episode, index)

Returns transition specified by index.

Parameters
  • episode (d3rlpy.dataset.components.EpisodeBase) – Episode.

  • index (int) – Index of the target transition.

Returns

Transition.

Return type

d3rlpy.dataset.components.Transition
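As a rough picture of what picking a multi-step transition at a given index involves, here is a self-contained sketch that uses plain Python lists instead of d3rlpy objects. Everything in it is hypothetical: SketchTransition, pick_multi_step, the steps_taken field, and the toy layout in which observations has one more entry than rewards are assumptions for illustration and may differ from how the library stores episodes or handles episode boundaries.

```python
from dataclasses import dataclass
from typing import Any, Sequence


@dataclass
class SketchTransition:
    """Hypothetical stand-in for a transition; not the d3rlpy class."""

    observation: Any
    action: Any
    reward: float  # multi-step discounted return in this sketch
    next_observation: Any
    steps_taken: int  # actual number of steps between the two observations


def pick_multi_step(
    observations: Sequence[Any],
    actions: Sequence[Any],
    rewards: Sequence[float],
    index: int,
    n_steps: int,
    gamma: float,
) -> SketchTransition:
    # Assumed toy layout: observations[t + 1] follows actions[t] and rewards[t],
    # so len(observations) == len(rewards) + 1.
    last_index = len(observations) - 1
    # Clip the look-ahead so it never runs past the final observation.
    next_index = min(index + n_steps, last_index)
    steps_taken = next_index - index
    # Discounted sum of the rewards collected between the two observations.
    ret = sum((gamma ** i) * rewards[index + i] for i in range(steps_taken))
    return SketchTransition(
        observation=observations[index],
        action=actions[index],
        reward=ret,
        next_observation=observations[next_index],
        steps_taken=steps_taken,
    )


obs = [0, 1, 2, 3, 4]
acts = [0, 0, 0, 0]
rews = [1.0, 1.0, 1.0, 1.0]

middle = pick_multi_step(obs, acts, rews, index=1, n_steps=2, gamma=0.5)
near_end = pick_multi_step(obs, acts, rews, index=3, n_steps=2, gamma=0.5)
```

In this sketch, a pick near the end of the episode looks ahead fewer than n_steps steps, which is why the number of steps actually taken is recorded alongside the return.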