d3rlpy.dataset.Transition
class d3rlpy.dataset.Transition(observation_shape, action_size, observation, action, reward, next_observation, next_action, next_reward, terminal)
Transition class.
This class holds the data spanning two consecutive time steps, t and t+1, which is typically used as the input to loss calculation in reinforcement learning.
Parameters:
- observation_shape (tuple) – observation shape.
- action_size (int) – dimension of the action space.
- observation (numpy.ndarray) – observation at t.
- action (numpy.ndarray or int) – action at t.
- reward (float) – reward at t.
- next_observation (numpy.ndarray) – observation at t+1.
- next_action (numpy.ndarray or int) – action at t+1.
- next_reward (float) – reward at t+1.
- terminal (int) – terminal flag at t+1.
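To make the role of these fields in a loss calculation concrete, here is a minimal sketch of a one-step TD target. `SketchTransition` and `td_target` are hypothetical names introduced only for illustration; the dataclass merely mirrors the documented parameters and is not d3rlpy's implementation. Plain lists stand in for `numpy.ndarray` so the snippet has no dependencies, and the choice of `next_reward` as the reward for the step from t to t+1 is an assumption about the dataset's reward convention.

```python
from dataclasses import dataclass, replace
from typing import Sequence, Tuple, Union

# Discrete actions are ints; continuous actions are vectors.
Action = Union[int, Sequence[float]]


@dataclass(frozen=True)
class SketchTransition:
    """Hypothetical stand-in with the documented fields; not d3rlpy's class."""

    observation_shape: Tuple[int, ...]
    action_size: int
    observation: Sequence[float]       # numpy.ndarray in the documented class
    action: Action
    reward: float
    next_observation: Sequence[float]  # numpy.ndarray in the documented class
    next_action: Action
    next_reward: float
    terminal: int


def td_target(t: SketchTransition, next_q: float, gamma: float = 0.99) -> float:
    """One-step TD target; bootstrapping is masked when t+1 is terminal.

    Assumption: next_reward is the reward for the step from t to t+1.
    Swap in t.reward if your data follows the other convention.
    """
    return t.next_reward + gamma * (1 - t.terminal) * next_q


step = SketchTransition(
    observation_shape=(2,),
    action_size=3,
    observation=[0.0, 1.0],
    action=1,
    reward=0.0,
    next_observation=[1.0, 0.0],
    next_action=2,
    next_reward=1.0,
    terminal=0,
)
# Same step, but flagged as ending the episode at t+1.
last = replace(step, terminal=1)

print(td_target(step, next_q=2.0, gamma=0.5))
print(td_target(last, next_q=2.0, gamma=0.5))
```

The `terminal` flag is what keeps the value estimate of a state after the episode's end from leaking into the target.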
Methods
- get_action_size()
  Returns the dimension of the action space.
  Returns: dimension of the action space.
  Return type: int
- get_observation_shape()
  Returns the observation shape.
  Returns: observation shape.
  Return type: tuple
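One plausible use of the two getters is sizing a model from a sample transition. The sketch below assumes a small fully connected Q-network over flattened observations; `GetterOnlyTransition` and `layer_sizes` are hypothetical helpers written for this example, and the stand-in class implements only the two documented methods.

```python
import math
from typing import List, Tuple


class GetterOnlyTransition:
    """Hypothetical stand-in exposing only the two documented getters."""

    def __init__(self, observation_shape: Tuple[int, ...], action_size: int):
        self._observation_shape = observation_shape
        self._action_size = action_size

    def get_observation_shape(self) -> Tuple[int, ...]:
        return self._observation_shape

    def get_action_size(self) -> int:
        return self._action_size


def layer_sizes(t: GetterOnlyTransition, hidden: int = 64) -> List[int]:
    """Input, hidden, and output widths for a network over flattened observations."""
    n_inputs = math.prod(t.get_observation_shape())
    return [n_inputs, hidden, t.get_action_size()]


# Four stacked 84x84 frames and six actions, chosen only as an example.
sizes = layer_sizes(GetterOnlyTransition((4, 84, 84), 6))
print(sizes)
```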
Attributes
- action
  Returns the action at t.
  Returns: action at t.
  Return type: numpy.ndarray or int
- next_action
  Returns the action at t+1.
  Returns: action at t+1.
  Return type: numpy.ndarray or int
- next_observation
  Returns the observation at t+1.
  Returns: observation at t+1.
  Return type: numpy.ndarray
- observation
  Returns the observation at t.
  Returns: observation at t.
  Return type: numpy.ndarray