d3rlpy.dataset.Transition

class d3rlpy.dataset.Transition(observation_shape, action_size, observation, action, reward, next_observation, next_action, next_reward, terminal, prev_transition=None, next_transition=None)

Transition class.

This class holds the data between two consecutive time steps (t and t+1), which is typically used as input to the loss calculation in reinforcement learning.
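As an illustration of how the fields of a transition can feed a loss calculation, the sketch below computes a one-step temporal-difference target. `SketchTransition` is a hypothetical stand-in that only mirrors the attribute names documented on this page, so the example runs without d3rlpy installed; `td_target`, `next_value`, and `gamma` are likewise illustrative names, not part of the library.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class SketchTransition:
    """Hypothetical stand-in, NOT the d3rlpy class; it mirrors the documented field names only."""

    observation: Sequence[float]
    action: int
    reward: float
    next_observation: Sequence[float]
    next_action: int
    next_reward: float
    terminal: int


def td_target(transition, next_value, gamma=0.99):
    """One-step TD target: reward at t plus the discounted value of the state at t+1.

    The bootstrap term is dropped when the terminal flag is set.
    """
    return transition.reward + gamma * (1 - transition.terminal) * next_value


step = SketchTransition(
    observation=[0.0, 1.0], action=1, reward=1.0,
    next_observation=[0.5, 1.0], next_action=0, next_reward=0.0, terminal=0,
)
last = SketchTransition(
    observation=[0.5, 1.0], action=0, reward=2.0,
    next_observation=[0.0, 0.0], next_action=0, next_reward=0.0, terminal=1,
)

target_mid = td_target(step, next_value=10.0, gamma=0.5)  # reward + 0.5 * 10.0
target_end = td_target(last, next_value=10.0, gamma=0.5)  # reward only, episode ended
```

With real data, the same function would presumably be applied to transitions taken from a dataset rather than to hand-built objects.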

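Parameters:

observation_shape (tuple): observation shape.
action_size (int): dimension of action-space.
observation (numpy.ndarray or torch.Tensor): observation at t.
action (numpy.ndarray or int): action at t.
reward (float): reward at t.
next_observation (numpy.ndarray or torch.Tensor): observation at t+1.
next_action (numpy.ndarray or int): action at t+1.
next_reward (float): reward at t+1.
terminal (int): terminal flag at t+1.
prev_transition (d3rlpy.dataset.Transition): previous transition, or None for the first transition.
next_transition (d3rlpy.dataset.Transition): next transition, or None for the last transition.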

Methods

get_action_size()

Returns dimension of action-space.

Returns: dimension of action-space.
Return type: int

get_observation_shape()

Returns observation shape.

Returns: observation shape.
Return type: tuple

Attributes

action

Returns action at t.

Returns: action at t.
Return type: numpy.ndarray or int

next_action

Returns action at t+1.

Returns: action at t+1.
Return type: numpy.ndarray or int

next_observation

Returns observation at t+1.

Returns: observation at t+1.
Return type: numpy.ndarray or torch.Tensor

next_reward

Returns reward at t+1.

Returns: reward at t+1.
Return type: float

next_transition

Returns pointer to the next transition.

If this is the last transition, this attribute is None.

Returns: next transition.
Return type: d3rlpy.dataset.Transition or None

observation

Returns observation at t.

Returns: observation at t.
Return type: numpy.ndarray or torch.Tensor

prev_transition

Returns pointer to the previous transition.

If this is the first transition, this attribute is None.

Returns: previous transition.
Return type: d3rlpy.dataset.Transition or None

reward

Returns reward at t.

Returns: reward at t.
Return type: float

terminal

Returns terminal flag at t+1.

Returns: terminal flag at t+1.
Return type: int
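
The prev_transition and next_transition attributes link the transitions of an episode into a chain that ends in None on both sides. The sketch below shows how such a chain can be walked in either direction. `SketchTransition`, `link`, `rewards_from`, and `first_of` are hypothetical names used only for illustration; the stand-in class mirrors the documented attribute names so that the example runs without d3rlpy installed.

```python
class SketchTransition:
    """Hypothetical stand-in, NOT the d3rlpy class; it mirrors the documented attribute names only."""

    def __init__(self, reward, terminal):
        self.reward = reward
        self.terminal = terminal
        self.prev_transition = None  # None marks the first transition
        self.next_transition = None  # None marks the last transition


def link(transitions):
    """Wire up prev/next pointers for transitions given in time order."""
    for earlier, later in zip(transitions, transitions[1:]):
        earlier.next_transition = later
        later.prev_transition = earlier
    return transitions


def rewards_from(transition):
    """Collect rewards from `transition` through the end of the chain."""
    rewards = []
    while transition is not None:
        rewards.append(transition.reward)
        transition = transition.next_transition
    return rewards


def first_of(transition):
    """Walk backward until prev_transition is None, i.e. the first transition."""
    while transition.prev_transition is not None:
        transition = transition.prev_transition
    return transition


chain = link([
    SketchTransition(reward=1.0, terminal=0),
    SketchTransition(reward=0.5, terminal=0),
    SketchTransition(reward=2.0, terminal=1),
])
forward_rewards = rewards_from(chain[0])
head = first_of(chain[2])
```

Checking for None rather than relying on the terminal flag keeps the walk correct even if a chain ends without a terminal state, for example when an episode is cut off by a time limit.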