d3rlpy.dataset.Transition
- class d3rlpy.dataset.Transition
Transition class.
This class holds the data for one step between two consecutive time steps (t and t+1). Such data is typically used as input to loss calculations in reinforcement learning.
- Parameters
observation_shape (tuple) – observation shape.
action_size (int) – dimension of the action space.
observation (numpy.ndarray) – observation at t.
action (numpy.ndarray or int) – action at t.
reward (float) – reward at t.
next_observation (numpy.ndarray) – observation at t+1.
next_action (numpy.ndarray or int) – action at t+1.
next_reward (float) – reward at t+1.
terminal (int) – terminal flag at t+1.
mask (numpy.ndarray) – binary mask for bootstrapping.
prev_transition (d3rlpy.dataset.Transition) – pointer to the previous transition.
next_transition (d3rlpy.dataset.Transition) – pointer to the next transition.
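As a rough illustration of how fields like these can feed a loss calculation, the sketch below computes a one-step TD target from a transition-like record. `ToyTransition`, `td_target`, and the `q_next` value are hypothetical stand-ins and not part of d3rlpy; plain Python lists replace numpy arrays so the block runs without dependencies, and only a subset of the documented fields is modelled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToyTransition:
    """Hypothetical stand-in mirroring a subset of the documented fields."""

    observation: List[float]       # observation at t
    action: int                    # action at t
    reward: float                  # reward at t
    next_observation: List[float]  # observation at t+1
    terminal: int                  # 1 if the episode ended at t+1, else 0


def td_target(transition: ToyTransition, q_next: float, gamma: float = 0.99) -> float:
    """One-step TD target: r + gamma * (1 - terminal) * Q(s', a').

    q_next is an assumed estimate of Q(s', a') supplied by the caller.
    """
    return transition.reward + gamma * (1 - transition.terminal) * q_next


step = ToyTransition([0.0, 1.0], 1, 1.0, [0.5, 1.0], 0)
last = ToyTransition([0.5, 1.0], 0, 2.0, [1.0, 1.0], 1)

# Non-terminal step: the target bootstraps from q_next.
target_mid = td_target(step, q_next=10.0, gamma=0.9)
# Terminal step: the bootstrap term is zeroed out, leaving only the reward.
target_end = td_target(last, q_next=10.0, gamma=0.9)
```

The terminal flag is what switches bootstrapping off at the end of an episode; how d3rlpy itself combines these fields depends on the algorithm in use.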
Methods
- clear_links()
Clears links to the next and previous transitions.
This method must be called for the instance to be freed by the garbage collector.
- get_action_size()
Returns the dimension of the action space.
- Returns
dimension of the action space.
- Return type
int
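One plausible reading of the `clear_links()` note is that two neighbouring transitions pointing at each other form a reference cycle, which plain reference counting cannot release. The sketch below demonstrates that effect with a hypothetical pure-Python class (`LinkedStep` and `link` are illustrative names, not the d3rlpy implementation), and it relies on CPython's reference-counting behaviour.

```python
import gc
import weakref
from typing import Optional


class LinkedStep:
    """Hypothetical stand-in for a transition with prev/next pointers."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.prev_transition: Optional["LinkedStep"] = None
        self.next_transition: Optional["LinkedStep"] = None

    def clear_links(self) -> None:
        # Drop both pointers so this object no longer keeps its neighbours alive.
        self.prev_transition = None
        self.next_transition = None


def link(a: LinkedStep, b: LinkedStep) -> None:
    """Connect a -> b in both directions."""
    a.next_transition = b
    b.prev_transition = a


gc.disable()  # rely on reference counting only, to make the effect visible
try:
    # Case 1: links left in place, so the cycle keeps both objects alive.
    a, b = LinkedStep("a"), LinkedStep("b")
    link(a, b)
    ref_a = weakref.ref(a)
    del a, b
    alive_without_clear = ref_a() is not None

    # Case 2: links cleared first, so reference counting frees the objects.
    c, d = LinkedStep("c"), LinkedStep("d")
    link(c, d)
    ref_c = weakref.ref(c)
    c.clear_links()
    d.clear_links()
    del c, d
    alive_with_clear = ref_c() is not None
finally:
    gc.enable()
    gc.collect()  # clean up the cycle left over from case 1
```

In this sketch the first pair survives until the cycle collector runs, while the second pair is released as soon as the last name is deleted.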
Attributes
- action
Returns action at t.
- Returns
action at t.
- Return type
(numpy.ndarray or int)
- mask
Returns binary mask for bootstrapping.
- Returns
array of binary mask.
- Return type
numpy.ndarray
- next_action
Returns action at t+1.
- Returns
action at t+1.
- Return type
(numpy.ndarray or int)
- next_observation
Returns observation at t+1.
- Returns
observation at t+1.
- Return type
numpy.ndarray or torch.Tensor
- next_transition
Returns pointer to the next transition.
If this is the last transition, this returns None.
- Returns
next transition.
- Return type
d3rlpy.dataset.Transition
- observation
Returns observation at t.
- Returns
observation at t.
- Return type
numpy.ndarray or torch.Tensor
- prev_transition
Returns pointer to the previous transition.
If this is the first transition, this returns None.
- Returns
previous transition.
- Return type
d3rlpy.dataset.Transition
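The two link attributes make a sequence of transitions behave like a doubly linked list whose ends are marked by None. A minimal sketch of walking such a chain follows; `Node`, `build_chain`, `iter_forward`, and `first_of` are hypothetical helpers written for illustration and are not d3rlpy functions.

```python
from typing import Iterator, List, Optional


class Node:
    """Hypothetical stand-in exposing only a reward and the two link attributes."""

    def __init__(self, reward: float) -> None:
        self.reward = reward
        self.prev_transition: Optional["Node"] = None
        self.next_transition: Optional["Node"] = None


def build_chain(rewards: List[float]) -> List[Node]:
    """Create one node per reward and link neighbours in both directions."""
    nodes = [Node(r) for r in rewards]
    for left, right in zip(nodes, nodes[1:]):
        left.next_transition = right
        right.prev_transition = left
    return nodes


def iter_forward(start: Node) -> Iterator[Node]:
    """Yield nodes from start until next_transition is None."""
    node: Optional[Node] = start
    while node is not None:
        yield node
        node = node.next_transition


def first_of(node: Node) -> Node:
    """Follow prev_transition back to the first node of the chain."""
    while node.prev_transition is not None:
        node = node.prev_transition
    return node


chain = build_chain([1.0, 0.5, 2.0])
forward_rewards = [n.reward for n in iter_forward(chain[0])]
head = first_of(chain[2])
```

Walking forward from any transition in this way visits the remainder of its chain, and walking backward recovers the first one.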