d3rlpy.dataset.Episode

class d3rlpy.dataset.Episode(observation_shape, action_size, observations, actions, rewards, terminal=True, create_mask=False, mask_size=1)

Episode class.

This class is designed to hold data collected in a single episode.

An Episode object automatically splits its data into a list of d3rlpy.dataset.Transition objects. It also behaves like a list, which makes the transitions easy to access.

# return the number of transitions
len(episode)

# access the first transition
transition = episode[0]

# iterate through all transitions
for transition in episode:
    pass
Parameters
  • observation_shape (tuple) – observation shape.

  • action_size (int) – dimension of action-space.

  • observations (numpy.ndarray) – observations.

  • actions (numpy.ndarray) – actions.

  • rewards (numpy.ndarray) – scalar rewards.

  • terminal (bool) – binary terminal flag. If False, the episode ended for a reason other than termination by the environment (e.g. a timeout).

  • create_mask (bool) – flag to create binary masks for bootstrapping.

  • mask_size (int) – ensemble size for mask. If create_mask is False, this will be ignored.
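The create_mask and mask_size parameters are described only briefly here. As a rough illustration of the general idea behind bootstrapping masks, and not of how d3rlpy actually generates them, the sketch below draws one binary vector of length mask_size per transition, so that each member of an ensemble trains on its own subset of the data. The helper name make_bootstrap_masks, the seed argument, and the 50% sampling probability are all assumptions made for this example.

```python
import random


def make_bootstrap_masks(num_transitions, mask_size, seed=0):
    """Hypothetical helper: one 0/1 vector of length mask_size per transition."""
    rng = random.Random(seed)
    # Each bit says whether the corresponding ensemble member uses the transition.
    return [
        [rng.randint(0, 1) for _ in range(mask_size)]
        for _ in range(num_transitions)
    ]


masks = make_bootstrap_masks(num_transitions=4, mask_size=3)
for index, mask in enumerate(masks):
    print(index, mask)
```

With mask_size=1, the default in the signature above, each transition would carry a single bit under this scheme.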

Methods

__getitem__(index)
__len__()
__iter__()
build_transitions()

Builds transition objects.

This method is called internally the first time the transitions property is accessed.
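The lazy-building behaviour described above, together with the list-like access the class offers, can be sketched with a small stand-in class. ToyEpisode is hypothetical and is not d3rlpy's implementation. In particular, the tuple layout used for a transition (observation, action, reward, next observation) and the build_count counter are assumptions made only to keep the example self-contained.

```python
class ToyEpisode:
    """Hypothetical stand-in that builds its transitions on first access."""

    def __init__(self, observations, actions, rewards):
        self.observations = observations
        self.actions = actions
        self.rewards = rewards
        self._transitions = None  # nothing is built until first access
        self.build_count = 0      # only here to show how often building runs

    def build_transitions(self):
        # Assumed layout: pair each step with the observation that follows it.
        self._transitions = [
            (
                self.observations[i],
                self.actions[i],
                self.rewards[i],
                self.observations[i + 1],
            )
            for i in range(len(self.observations) - 1)
        ]
        self.build_count += 1

    @property
    def transitions(self):
        # Build once, then reuse the cached list.
        if self._transitions is None:
            self.build_transitions()
        return self._transitions

    def __len__(self):
        return len(self.transitions)

    def __getitem__(self, index):
        return self.transitions[index]

    def __iter__(self):
        return iter(self.transitions)


episode = ToyEpisode(
    observations=[0.0, 1.0, 2.0, 3.0],
    actions=[1, 0, 1, 0],
    rewards=[0.5, 0.0, 1.0, 0.0],
)
first = episode[0]    # triggers the one-time build
count = len(episode)  # reuses the cached list
```

Caching the list behind a property keeps construction cheap for episodes whose transitions are never inspected.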

compute_return()

Computes the sum of rewards.

\[R = \sum_{i=1} r_i\]
Returns

episode return.

Return type

float
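As a minimal sketch of the undiscounted sum in the formula above, the function below adds up a plain list of rewards. It is a standalone illustration, not the library's code; whether d3rlpy excludes any step from the sum is not stated here, so this version simply assumes every reward counts.

```python
def compute_return(rewards):
    """Sum the per-step rewards of one episode (undiscounted)."""
    # Assumption: every reward in the sequence contributes to the return.
    return float(sum(rewards))


rewards = [1.0, 0.0, -0.5, 2.0]
episode_return = compute_return(rewards)
print(episode_return)
```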

get_action_size()

Returns dimension of action-space.

Returns

dimension of action-space.

Return type

int

get_observation_shape()

Returns observation shape.

Returns

observation shape.

Return type

tuple

size()

Returns the number of transitions.

Returns

the number of transitions.

Return type

int

Attributes

actions

Returns the actions.

Returns

array of actions.

Return type

numpy.ndarray

observations

Returns the observations.

Returns

array of observations.

Return type

numpy.ndarray

rewards

Returns the rewards.

Returns

array of rewards.

Return type

numpy.ndarray

terminal

Returns the terminal flag.

Returns

the terminal flag.

Return type

bool

transitions

Returns the transitions.

Returns

list of d3rlpy.dataset.Transition objects.

Return type

list(d3rlpy.dataset.Transition)