d3rlpy.dataset.Episode

class d3rlpy.dataset.Episode(observation_shape, action_size, observations, actions, rewards, terminal=True, create_mask=False, mask_size=1)

Episode class.

This class is designed to hold data collected in a single episode.

An Episode object automatically splits its data into a list of d3rlpy.dataset.Transition objects. It also behaves like a list, which makes the transitions easy to access.

# return the number of transitions
len(episode)

# access the first transition
transition = episode[0]

# iterate through all transitions
for transition in episode:
    pass

Parameters

observation_shape (tuple) – observation shape.
action_size (int) – dimension of action-space.
observations (numpy.ndarray) – observations.
actions (numpy.ndarray) – actions.
rewards (numpy.ndarray) – scalar rewards.
terminal (bool) – binary terminal flag. If False, the episode was not terminated by the environment (e.g. it ended because of a timeout).
create_mask (bool) – flag to create binary masks for bootstrapping.
mask_size (int) – ensemble size for mask. If create_mask is False, this will be ignored.
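
The parameter descriptions above imply a few consistency constraints between the arrays. The following is a minimal sketch, using only NumPy, of how inputs might be prepared and checked before they are handed to the constructor. The helper check_episode_inputs is hypothetical and not part of d3rlpy, and the layout of continuous actions as (number of steps, action_size) is an assumption of this example.

```python
import numpy as np


def check_episode_inputs(observation_shape, action_size, observations, actions, rewards):
    """Hypothetical helper (not part of d3rlpy): sanity-check episode arrays."""
    n_steps = observations.shape[0]
    # every observation must match the declared per-step shape
    assert observations.shape[1:] == tuple(observation_shape)
    # assumption: continuous actions are stored as (n_steps, action_size)
    assert actions.shape == (n_steps, action_size)
    # one scalar reward per step
    assert rewards.shape == (n_steps,)
    return n_steps


rng = np.random.default_rng(0)
observation_shape = (4,)
action_size = 2
observations = rng.random((10, *observation_shape)).astype(np.float32)
actions = rng.random((10, action_size)).astype(np.float32)
rewards = rng.random(10).astype(np.float32)

n_steps = check_episode_inputs(observation_shape, action_size, observations, actions, rewards)
```

Arrays that pass these checks would then be given to the constructor in the order shown in the signature, with terminal=False for an episode that was cut off by a timeout.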

Methods

__getitem__(index)

__len__()

__iter__()

build_transitions()

Builds transition objects.

This method is called internally the first time the transitions property is accessed.
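
The lazy-construction pattern described here, combined with the list-like access provided by __getitem__, __len__ and __iter__, can be sketched as follows. LazyEpisode is a stand-in written for illustration only; it is not d3rlpy's implementation, and the plain tuples it builds merely take the place of Transition objects.

```python
class LazyEpisode:
    """Stand-in for illustration only; this is not d3rlpy's Episode."""

    def __init__(self, observations, actions, rewards):
        self.observations = observations
        self.actions = actions
        self.rewards = rewards
        self._transitions = None  # nothing is built until first access
        self.build_count = 0  # counts how often the build step runs

    def build_transitions(self):
        # assumption of this sketch: one tuple per step stands in for a Transition
        self.build_count += 1
        self._transitions = list(zip(self.observations, self.actions, self.rewards))

    @property
    def transitions(self):
        # build on first access, then reuse the cached list
        if self._transitions is None:
            self.build_transitions()
        return self._transitions

    def __len__(self):
        return len(self.transitions)

    def __getitem__(self, index):
        return self.transitions[index]

    def __iter__(self):
        return iter(self.transitions)


episode = LazyEpisode([0.0, 0.1, 0.2], [1, 0, 1], [0.0, 1.0, 0.5])
first = episode[0]
count = len(episode)
steps = [t for t in episode]
```

Because every list-style access goes through the transitions property, the build step runs once no matter how many times the episode is indexed or iterated.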

compute_return()

Computes the sum of rewards.

\[R = \sum_{i=1} r_i\]

Returns: episode return.
Return type: float
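
As a sketch of the formula above, the undiscounted return is the plain sum of the per-step rewards. The helper episode_return is hypothetical; which reward indices the library itself includes is not stated here, so the example simply sums every entry it is given.

```python
import numpy as np


def episode_return(rewards):
    """Hypothetical helper: undiscounted sum of the given rewards."""
    # convert the NumPy scalar to a Python float, matching the documented return type
    return float(np.sum(rewards))


rewards = np.array([0.0, 1.0, 0.5, 2.0])
total = episode_return(rewards)
```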

get_action_size()

Returns dimension of action-space.

Returns: dimension of action-space.
Return type: int

get_observation_shape()

Returns observation shape.

Returns: observation shape.
Return type: tuple

size()

Returns the number of transitions.

Returns: the number of transitions.
Return type: int

Attributes

actions

Returns the actions.

Returns: array of actions.
Return type: numpy.ndarray

observations

Returns the observations.

Returns: array of observations.
Return type: numpy.ndarray

rewards

Returns the rewards.

Returns: array of rewards.
Return type: numpy.ndarray

terminal

Returns the terminal flag.

Returns: the terminal flag.
Return type: bool

transitions

Returns the transitions.

Returns: list of d3rlpy.dataset.Transition objects.
Return type: list(d3rlpy.dataset.Transition)