d3rlpy.dataset.Transition

class d3rlpy.dataset.Transition

Transition class.

This class holds the data between two consecutive time steps, which is typically used as input to loss calculations in reinforcement learning.

Parameters

Methods

Clears links to the next and previous transitions.

This method must be called before the instance can be freed by the garbage collector (GC).
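To illustrate why the links matter: neighbouring transitions refer to each other, so each one keeps its neighbours reachable until the references are dropped. The sketch below uses a hypothetical stand-in class, `LinkedStep`, which is not part of d3rlpy. Its `clear_links` method and the `link` helper are assumptions made for the example and only mirror the behaviour described above.

```python
from typing import List, Optional


class LinkedStep:
    """Hypothetical stand-in for a transition with neighbour links."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.prev_transition: Optional["LinkedStep"] = None
        self.next_transition: Optional["LinkedStep"] = None

    def clear_links(self) -> None:
        # Drop both references so this object no longer keeps
        # its neighbours reachable.
        self.prev_transition = None
        self.next_transition = None


def link(steps: List[LinkedStep]) -> None:
    """Connect consecutive steps in both directions (hypothetical helper)."""
    for left, right in zip(steps, steps[1:]):
        left.next_transition = right
        right.prev_transition = left


steps = [LinkedStep("t0"), LinkedStep("t1"), LinkedStep("t2")]
link(steps)

# Before discarding the steps, break every link.
for step in steps:
    step.clear_links()
```

After the loop, no step refers to another, so each can be released independently.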

get_action_size()

Returns dimension of action-space.

Returns

dimension of action-space.

Return type

int

get_observation_shape()

Returns observation shape.

Returns

observation shape.

Return type

tuple
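One possible use of these two methods is sizing the input and output layers of a model. The sketch below assumes only that the object exposes `get_observation_shape()` and `get_action_size()` as documented above. The `SimpleNamespace` stub and the helper `layer_sizes` are hypothetical and stand in for a real transition.

```python
import math
from types import SimpleNamespace


def layer_sizes(transition):
    """Return (flattened observation size, action size) for a transition-like object."""
    observation_shape = transition.get_observation_shape()
    # Multiply all dimensions to get the size of the flattened observation.
    flat_size = math.prod(observation_shape)
    return flat_size, transition.get_action_size()


# Hypothetical stub standing in for a real Transition instance.
stub = SimpleNamespace(
    get_observation_shape=lambda: (3, 84, 84),
    get_action_size=lambda: 4,
)

input_size, output_size = layer_sizes(stub)
```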

Attributes

action

Returns action at t.

Returns

action at t.

Return type

numpy.ndarray or int

mask

Returns binary mask for bootstrapping.

Returns

binary mask array.

Return type

numpy.ndarray

next_action

Returns action at t+1.

Returns

action at t+1.

Return type

numpy.ndarray or int

next_observation

Returns observation at t+1.

Returns

observation at t+1.

Return type

numpy.ndarray or torch.Tensor

next_reward

Returns reward at t+1.

Returns

reward at t+1.

Return type

float

next_transition

Returns pointer to the next transition.

If this is the last transition, this attribute is None.

Returns

next transition.

Return type

d3rlpy.dataset.Transition
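Because the last transition has no successor, the rest of a sequence can be walked by following `next_transition` until it is None. The sketch below relies only on the `next_transition` and `reward` attributes documented on this page. The helper `iter_forward` and the `SimpleNamespace` stubs are hypothetical stand-ins for real transitions.

```python
from types import SimpleNamespace


def iter_forward(transition):
    """Yield `transition` and every later one until the link is None."""
    current = transition
    while current is not None:
        yield current
        current = current.next_transition


# Hypothetical three-step chain built from stubs.
third = SimpleNamespace(reward=1.0, next_transition=None)
second = SimpleNamespace(reward=0.5, next_transition=third)
first = SimpleNamespace(reward=0.0, next_transition=second)

rewards = [t.reward for t in iter_forward(first)]
```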

observation

Returns observation at t.

Returns

observation at t.

Return type

numpy.ndarray or torch.Tensor

prev_transition

Returns pointer to the previous transition.

If this is the first transition, this attribute is None.

Returns

previous transition.

Return type

d3rlpy.dataset.Transition
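Following `prev_transition` until it is None leads back to the first transition of a sequence. The sketch below assumes only the `prev_transition` attribute described above. The helper `first_transition` and the `SimpleNamespace` stubs are hypothetical.

```python
from types import SimpleNamespace


def first_transition(transition):
    """Walk backwards until a transition without a predecessor is reached."""
    current = transition
    while current.prev_transition is not None:
        current = current.prev_transition
    return current


# Hypothetical three-step chain built from stubs.
start = SimpleNamespace(name="t0", prev_transition=None)
middle = SimpleNamespace(name="t1", prev_transition=start)
end = SimpleNamespace(name="t2", prev_transition=middle)

head = first_transition(end)
```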

reward

Returns reward at t.

Returns

reward at t.

Return type

float

terminal

Returns terminal flag at t+1.

Returns

terminal flag at t+1.

Return type

int
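A terminal flag is commonly used to switch off bootstrapping when a loss target is computed. The sketch below shows the textbook one-step temporal-difference target, which is not necessarily how d3rlpy computes it. The function `td_target` and its `next_value` and `gamma` inputs are hypothetical, and which reward attribute to pass depends on the reward convention of the algorithm in use.

```python
def td_target(reward: float, terminal: int, next_value: float, gamma: float = 0.99) -> float:
    """One-step TD target: reward plus the discounted next value.

    The bootstrap term is dropped when `terminal` is 1.
    """
    return reward + gamma * (1.0 - terminal) * next_value


# Terminal step: the next value is ignored.
terminal_target = td_target(reward=1.0, terminal=1, next_value=5.0)

# Non-terminal step: the next value is discounted and added.
nonterminal_target = td_target(reward=1.0, terminal=0, next_value=5.0, gamma=0.5)
```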