d3rlpy.dataset.TransitionMiniBatch
- class d3rlpy.dataset.TransitionMiniBatch
Mini-batch of Transition objects.
This class holds d3rlpy.dataset.Transition objects so that they can be passed to algorithms during fitting.
If the observations are images, you can stack an arbitrary number of frames via n_frames.
    from d3rlpy.dataset import TransitionMiniBatch
    # transitions: list of d3rlpy.dataset.Transition, each holding one 3-channel image
    transition.observation.shape == (3, 84, 84)
    batch_size = len(transitions)
    # stack 4 frames
    batch = TransitionMiniBatch(transitions, n_frames=4)
    # 4 frames x 3 channels
    batch.observations.shape == (batch_size, 12, 84, 84)
This is implemented by tracing previous transitions through the prev_transition property.
- Parameters
transitions (list(d3rlpy.dataset.Transition)) – mini-batch of transitions.
n_frames (int) – the number of frames to stack for image observation.
n_steps (int) – length of N-step sampling.
gamma (float) – discount factor for N-step calculation.
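The frame stacking and N-step options above can be illustrated with a minimal plain-Python sketch. This is not d3rlpy's implementation: the `Step` class, `link`, `stack_frames`, and `n_step_return` are hypothetical names, observations are represented as lists of channel labels instead of arrays, and padding missing frames with blank channels and ordering frames oldest first are both assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(eq=False)
class Step:
    """Hypothetical stand-in for a transition; not d3rlpy's Transition."""
    observation: List[str]  # one label per channel, e.g. ["r0", "g0", "b0"]
    reward: float
    prev_transition: Optional["Step"] = None
    next_transition: Optional["Step"] = None


def link(steps: List[Step]) -> None:
    """Connect consecutive steps of one episode in both directions."""
    for before, after in zip(steps, steps[1:]):
        before.next_transition = after
        after.prev_transition = before


def stack_frames(step: Step, n_frames: int, pad: str = "0") -> List[str]:
    """Concatenate the channels of the last n_frames observations."""
    n_channels = len(step.observation)
    frames = []
    current: Optional[Step] = step
    for _ in range(n_frames):
        if current is None:
            # Assumption: pad with blank frames before the episode start.
            frames.append([pad] * n_channels)
        else:
            frames.append(current.observation)
            current = current.prev_transition
    # Frames were collected newest first; reverse so the oldest comes first.
    return [channel for frame in reversed(frames) for channel in frame]


def n_step_return(step: Step, n_steps: int, gamma: float) -> Tuple[float, int]:
    """Sum discounted rewards over at most n_steps, stopping at the episode end.

    Returns the discounted sum and the number of steps actually taken.
    """
    total = 0.0
    taken = 0
    current: Optional[Step] = step
    while current is not None and taken < n_steps:
        total += (gamma ** taken) * current.reward
        taken += 1
        current = current.next_transition
    return total, taken


# A three-step episode with 3-channel observations and unit rewards.
episode = [Step([f"r{t}", f"g{t}", f"b{t}"], reward=1.0) for t in range(3)]
link(episode)

stacked = stack_frames(episode[1], n_frames=4)  # 4 frames x 3 channels
ret, taken = n_step_return(episode[1], n_steps=3, gamma=0.5)
```

In this sketch the step count returned by `n_step_return` is smaller than the requested `n_steps` near the end of the episode, which mirrors why the number of steps depends on episode length when `n_steps` is bigger than 1.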
Methods
- __getitem__(key, /)
Return self[key].
- __len__()
Return len(self).
- __iter__()
Implement iter(self).
- add_additional_data(key, value)
Add arbitrary additional data.
- Parameters
key (str) – key of data.
value (any) – value.
- get_additional_data(key)
Returns specified additional data.
- Parameters
key (str) – key of data.
- Returns
value.
- Return type
any
Attributes
- actions
Returns mini-batch of actions at t.
- Returns
actions at t.
- Return type
- masks
Returns mini-batch of binary masks for bootstrapping.
If any of the transitions has an invalid mask, this returns None.
- Returns
binary mask.
- Return type
- n_steps
Returns mini-batch of the number of steps before next observations.
This will contain only ones if n_steps=1. If n_steps is bigger than 1, the values depend on the episode length.
- Returns
the number of steps before next observations.
- Return type
- next_actions
Returns mini-batch of actions at t+n.
- Returns
actions at t+n.
- Return type
- next_observations
Returns mini-batch of observations at t+n.
- Returns
observations at t+n.
- Return type
numpy.ndarray or torch.Tensor
- next_rewards
Returns mini-batch of rewards at t+n.
- Returns
rewards at t+n.
- Return type
- observations
Returns mini-batch of observations at t.
- Returns
observations at t.
- Return type
numpy.ndarray or torch.Tensor
- rewards
Returns mini-batch of rewards at t.
- Returns
rewards at t.
- Return type
- terminals
Returns mini-batch of terminal flags at t+n.
- Returns
terminal flags at t+n.
- Return type
- transitions
Returns transitions.
- Returns
list of transitions.
- Return type
list(d3rlpy.dataset.Transition)