d3rlpy.dataset.TransitionMiniBatch¶
-
class d3rlpy.dataset.TransitionMiniBatch¶ Mini-batch of Transition objects.
This class holds d3rlpy.dataset.Transition objects so that they can be passed to algorithms during fitting.
If the observations are images, you can stack an arbitrary number of frames via n_frames.
    transition.observation.shape == (3, 84, 84)
    batch_size = len(transitions)
    # stack 4 frames
    batch = TransitionMiniBatch(transitions, n_frames=4)
    # 4 frames x 3 channels
    batch.observations.shape == (batch_size, 12, 84, 84)
Frame stacking is implemented by tracing previous transitions through the prev_transition property.
- Parameters
transitions (list(d3rlpy.dataset.Transition)) – mini-batch of transitions.
n_frames (int) – the number of frames to stack for image observation.
n_steps (int) – length of N-step sampling.
gamma (float) – discount factor for N-step calculation.
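The idea of stacking frames by walking back through previous transitions can be sketched in plain NumPy. This is an illustration only, not d3rlpy's implementation: ToyTransition and stack_frames are hypothetical names, and padding with zeros when an episode has fewer than n_frames earlier frames is an assumption made for this sketch.

```python
from dataclasses import dataclass
from typing import Optional

import numpy as np


@dataclass
class ToyTransition:
    """Hypothetical stand-in for a transition; not the d3rlpy class."""
    observation: np.ndarray  # image observation with shape (C, H, W)
    prev_transition: Optional["ToyTransition"] = None


def stack_frames(transition: ToyTransition, n_frames: int) -> np.ndarray:
    """Stack the most recent n_frames observations along the channel axis."""
    frames = []
    current = transition
    for _ in range(n_frames):
        if current is not None:
            frames.append(current.observation)
            current = current.prev_transition
        else:
            # Assumption: pad with zeros when no earlier frame exists.
            frames.append(np.zeros_like(transition.observation))
    # Frames were collected newest-first; reverse so the oldest comes first.
    return np.concatenate(frames[::-1], axis=0)


# Build a chain of 3 transitions whose observations are filled with 1, 2, 3.
chain = None
for step in range(3):
    obs = np.full((3, 84, 84), step + 1, dtype=np.float32)
    chain = ToyTransition(obs, chain)

# Request 4 frames from a chain of 3: one zero frame is padded at the front.
stacked = stack_frames(chain, n_frames=4)
```

Here stacked has 4 frames x 3 channels on its first axis, matching the per-sample shape in the example above once a batch dimension is added.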
Methods
-
__getitem__(key, /)¶ Return self[key].
-
__len__()¶ Return len(self).
-
__iter__()¶ Implement iter(self).
-
add_additional_data(key, value)¶ Add arbitrary additional data.
- Parameters
key (str) – key of data.
value (any) – value.
-
get_additional_data(key)¶ Returns specified additional data.
- Parameters
key (str) – key of data.
- Returns
value.
- Return type
any
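The method names above suggest a sequence-like container with a key-value side store. The following is a minimal sketch of that interface under stated assumptions: ToyMiniBatch is a hypothetical class written for illustration, the dictionary-backed storage is an assumption, and the key "importance_weights" is an arbitrary example rather than a key the library defines.

```python
from typing import Any, Dict, Iterator, List


class ToyMiniBatch:
    """Hypothetical container mirroring the documented method names."""

    def __init__(self, transitions: List[Any]) -> None:
        self._transitions = list(transitions)
        # Assumption: additional data is kept in a plain dict.
        self._additional_data: Dict[str, Any] = {}

    def __len__(self) -> int:
        return len(self._transitions)

    def __getitem__(self, key: int) -> Any:
        return self._transitions[key]

    def __iter__(self) -> Iterator[Any]:
        return iter(self._transitions)

    def add_additional_data(self, key: str, value: Any) -> None:
        """Attach arbitrary data to the batch under the given key."""
        self._additional_data[key] = value

    def get_additional_data(self, key: str) -> Any:
        """Return the data previously stored under the given key."""
        return self._additional_data[key]


# Strings stand in for transition objects in this sketch.
batch = ToyMiniBatch(["t0", "t1", "t2"])
batch.add_additional_data("importance_weights", [0.5, 0.3, 0.2])
weights = batch.get_additional_data("importance_weights")
```

Keeping per-batch extras in a side store like this lets a caller attach values such as sampling weights without changing the transitions themselves.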
Attributes
-
actions¶ Returns mini-batch of actions at t.
- Returns
actions at t.
- Return type
-
masks¶ Returns mini-batch of binary masks for bootstrapping.
If any of the transitions has an invalid mask, this will return None.
- Returns
binary mask.
- Return type
-
n_steps¶ Returns mini-batch of the number of steps before next observations.
This always contains only ones if n_steps=1. If n_steps is greater than 1, the values depend on the episode length.
- Returns
the number of steps before next observations.
- Return type
-
next_actions¶ Returns mini-batch of actions at t+n.
- Returns
actions at t+n.
- Return type
-
next_observations¶ Returns mini-batch of observations at t+n.
- Returns
observations at t+n.
- Return type
numpy.ndarray or torch.Tensor
-
next_rewards¶ Returns mini-batch of rewards at t+n.
- Returns
rewards at t+n.
- Return type
-
observations¶ Returns mini-batch of observations at t.
- Returns
observations at t.
- Return type
numpy.ndarray or torch.Tensor
-
rewards¶ Returns mini-batch of rewards at t.
- Returns
rewards at t.
- Return type
-
terminals¶ Returns mini-batch of terminal flags at t+n.
- Returns
terminal flags at t+n.
- Return type
-
transitions¶ Returns transitions.
- Returns
list of transitions.
- Return type
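The t+n attributes above (next_actions, next_observations, next_rewards, terminals, n_steps) all depend on how far the lookahead can go within an episode. The sketch below shows why the step count depends on episode length: the lookahead is clipped at the episode's last transition. The function n_step_lookahead is hypothetical, and accumulating a gamma-discounted reward sum is one common N-step formulation assumed here, not a statement of how the library computes next_rewards.

```python
from typing import List, Tuple


def n_step_lookahead(
    rewards: List[float], t: int, n_steps: int, gamma: float
) -> Tuple[float, int]:
    """Return (discounted reward sum, actual step count) starting at index t."""
    # The lookahead cannot run past the final transition of the episode.
    actual_steps = min(n_steps, len(rewards) - t)
    # Assumption: rewards are accumulated with a gamma**i discount.
    discounted = sum((gamma ** i) * rewards[t + i] for i in range(actual_steps))
    return discounted, actual_steps


# A toy episode with five transitions, each with reward 1.0.
episode_rewards = [1.0, 1.0, 1.0, 1.0, 1.0]

# With n_steps=1 every entry has a step count of 1.
one_step_counts = [
    n_step_lookahead(episode_rewards, t, n_steps=1, gamma=0.5)[1]
    for t in range(len(episode_rewards))
]

# With n_steps=3 the count shrinks near the end of the episode.
three_step_counts = [
    n_step_lookahead(episode_rewards, t, n_steps=3, gamma=0.5)[1]
    for t in range(len(episode_rewards))
]
```

In this toy episode, one_step_counts is all ones, while three_step_counts drops to 2 and then 1 for the last two transitions because fewer steps remain.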