d3rlpy.dataset.TransitionMiniBatch

class d3rlpy.dataset.TransitionMiniBatch

mini-batch of Transition objects.

This class is designed to hold d3rlpy.dataset.Transition objects for being passed to algorithms during fitting.

If the observation is image, you can stack arbitrary frames via n_frames.

transition.observation.shape == (3, 84, 84)

batch_size = len(transitions)

# stack 4 frames
batch = TransitionMiniBatch(transitions, n_frames=4)

# 4 frames x 3 channels
batch.observations.shape == (batch_size, 12, 84, 84)

This is implemented by tracing previous transitions through prev_transition property.

Parameters
  • transitions (list(d3rlpy.dataset.Transition)) – mini-batch of transitions.

  • n_frames (int) – the number of frames to stack for image observation.

  • n_steps (int) – length of N-step sampling.

  • gamma (float) – discount factor for N-step calculation.

Methods

__getitem__(key, /)

Return self[key].

__len__()

Return len(self).

__iter__()

Implement iter(self).

add_additional_data(key, value)

Add arbitrary additional data.

Parameters
  • key (str) – key of data.

  • value (any) – value.

get_additional_data(key)

Returns specified additional data.

Parameters

key (str) – key of data.

Returns

value.

Return type

any

size()

Returns size of mini-batch.

Returns

mini-batch size.

Return type

int

Attributes

actions

Returns mini-batch of actions at t.

Returns

actions at t.

Return type

numpy.ndarray

masks

Returns mini-batch of binary masks for bootstrapping.

If any of transitions have an invalid mask, this will return None.

Returns

binary mask.

Return type

numpy.ndarray

n_steps

Returns mini-batch of the number of steps before next observations.

This will always include only ones if n_steps=1. If n_steps is bigger than 1. the values will depend on its episode length.

Returns

the number of steps before next observations.

Return type

numpy.ndarray

next_actions

Returns mini-batch of actions at t+n.

Returns

actions at t+n.

Return type

numpy.ndarray

next_observations

Returns mini-batch of observations at t+n.

Returns

observations at t+n.

Return type

numpy.ndarray or torch.Tensor

next_rewards

Returns mini-batch of rewards at t+n.

Returns

rewards at t+n.

Return type

numpy.ndarray

observations

Returns mini-batch of observations at t.

Returns

observations at t.

Return type

numpy.ndarray or torch.Tensor

rewards

Returns mini-batch of rewards at t.

Returns

rewards at t.

Return type

numpy.ndarray

terminals

Returns mini-batch of terminal flags at t+n.

Returns

terminal flags at t+n.

Return type

numpy.ndarray

transitions

Returns transitions.

Returns

list of transitions.

Return type

d3rlpy.dataset.Transition