d3rlpy.dataset.TransitionMiniBatch
class d3rlpy.dataset.TransitionMiniBatch
Mini-batch of Transition objects.
This class is designed to hold d3rlpy.dataset.Transition objects so they can be passed to algorithms during fitting.
If the observation is an image, you can stack an arbitrary number of frames via n_frames.

    from d3rlpy.dataset import TransitionMiniBatch

    # each observation is a 3-channel 84x84 image
    transition.observation.shape == (3, 84, 84)

    batch_size = len(transitions)

    # stack 4 frames
    batch = TransitionMiniBatch(transitions, n_frames=4)

    # 4 frames x 3 channels
    batch.observations.shape == (batch_size, 12, 84, 84)

This is implemented by tracing previous transitions through the prev_transition property.
- Parameters
transitions (list(d3rlpy.dataset.Transition)) – mini-batch of transitions.
n_frames (int) – the number of frames to stack for image observation.
n_steps (int) – length of N-step sampling.
gamma (float) – discount factor for N-step calculation.
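
The frame stacking described above can be pictured with a small, library-free sketch. Everything here is hypothetical and for illustration only: `ToyTransition`, `make_frame`, `zero_frame_like` and `stack_frames` are not part of d3rlpy, observations are plain nested lists rather than arrays, and both the zero padding at the start of an episode and the oldest-first channel order are assumptions rather than documented behavior.

```python
from dataclasses import dataclass
from typing import List, Optional

Frame = List[List[List[float]]]  # (channels, height, width) as nested lists


@dataclass
class ToyTransition:
    """Hypothetical stand-in for a transition; NOT the d3rlpy class."""
    observation: Frame
    prev_transition: Optional["ToyTransition"] = None


def make_frame(value: float, channels: int = 3, height: int = 2, width: int = 2) -> Frame:
    """Build a small constant-valued frame for the demo."""
    return [[[value] * width for _ in range(height)] for _ in range(channels)]


def zero_frame_like(frame: Frame) -> Frame:
    """Return an all-zero frame with the same shape as `frame`."""
    return [[[0.0 for _ in row] for row in channel] for channel in frame]


def stack_frames(transition: ToyTransition, n_frames: int) -> Frame:
    """Concatenate the last `n_frames` observations along the channel axis."""
    frames: List[Frame] = []
    current: Optional[ToyTransition] = transition
    for _ in range(n_frames):
        if current is None:
            # Assumption: pad with zeros when no earlier frame exists.
            frames.append(zero_frame_like(transition.observation))
        else:
            frames.append(current.observation)
            # Walk backwards through the episode.
            current = current.prev_transition
    frames.reverse()  # assumption: oldest frame first, newest last
    return [channel for frame in frames for channel in frame]


# Three linked transitions; only two earlier frames exist for t2.
t0 = ToyTransition(make_frame(1.0))
t1 = ToyTransition(make_frame(2.0), prev_transition=t0)
t2 = ToyTransition(make_frame(3.0), prev_transition=t1)

# 4 frames x 3 channels = 12 channels; the oldest slot is zero-padded.
stacked = stack_frames(t2, n_frames=4)
print(len(stacked), len(stacked[0]), len(stacked[0][0]))
```

In this sketch the stacked observation has 12 channels, mirroring the `(batch_size, 12, 84, 84)` shape in the example above for a single sample.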
Methods
__getitem__(key, /)
Return self[key].
__len__()
Return len(self).
__iter__()
Implement iter(self).
add_additional_data(key, value)
Add arbitrary additional data.
- Parameters
key (str) – key of data.
value (any) – value.
get_additional_data(key)
Returns specified additional data.
- Parameters
key (str) – key of data.
- Returns
value.
- Return type
any
Attributes
actions
Returns mini-batch of actions at t.
- Returns
actions at t.
- Return type
masks
Returns mini-batch of binary masks for bootstrapping.
If any of the transitions has an invalid mask, this returns None.
- Returns
binary mask.
- Return type
n_steps
Returns mini-batch of the number of steps before next observations.
This will always contain only ones if n_steps=1. If n_steps is bigger than 1, the values will depend on the episode length.
- Returns
the number of steps before next observations.
- Return type
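
To illustrate why the step counts can differ between samples when n_steps is bigger than 1, here is a minimal sketch of an N-step lookahead that stops at the end of an episode. It is not the library's implementation: `n_step_lookahead` is a hypothetical helper, the discounted sum is the standard N-step return formula, and how d3rlpy itself combines rewards with `gamma` is an assumption.

```python
from typing import List, Tuple


def n_step_lookahead(
    rewards: List[float], start: int, n_steps: int, gamma: float
) -> Tuple[int, float]:
    """Return (steps actually taken, discounted reward sum) from `start`.

    `rewards` holds the per-step rewards of ONE episode. The walk stops at
    the episode end, so fewer than `n_steps` steps may be taken.
    """
    steps = 0
    discounted_sum = 0.0
    for k in range(n_steps):
        idx = start + k
        if idx >= len(rewards):
            break  # the episode ended before n_steps were available
        discounted_sum += (gamma ** k) * rewards[idx]
        steps += 1
    return steps, discounted_sum


episode_rewards = [1.0, 1.0, 1.0, 1.0]

# n_steps=1: every sample looks exactly one step ahead.
one_step = [n_step_lookahead(episode_rewards, t, 1, 0.5)[0] for t in range(4)]

# n_steps=3: samples near the end of the episode get fewer steps.
three_step = [n_step_lookahead(episode_rewards, t, 3, 0.5)[0] for t in range(4)]

print(one_step, three_step)
```

With a four-step episode, the last two samples cannot look three steps ahead, so their counts shrink toward the episode boundary.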
next_actions
Returns mini-batch of actions at t+n.
- Returns
actions at t+n.
- Return type
next_observations
Returns mini-batch of observations at t+n.
- Returns
observations at t+n.
- Return type
numpy.ndarray or torch.Tensor
next_rewards
Returns mini-batch of rewards at t+n.
- Returns
rewards at t+n.
- Return type
observations
Returns mini-batch of observations at t.
- Returns
observations at t.
- Return type
numpy.ndarray or torch.Tensor
rewards
Returns mini-batch of rewards at t.
- Returns
rewards at t.
- Return type
terminals
Returns mini-batch of terminal flags at t+n.
- Returns
terminal flags at t+n.
- Return type
transitions
Returns transitions.
- Returns
list of transitions.
- Return type