d3rlpy.dataset.Episode

class d3rlpy.dataset.Episode(observation_shape, action_size, observations, actions, rewards)

Episode class.

This class is designed to hold data collected in a single episode.

An Episode object automatically splits its data into a list of d3rlpy.dataset.Transition objects. It also behaves like a list, which makes the transitions easy to access.

# return the number of transitions
len(episode)

# access the first transition
transition = episode[0]

# iterate through all transitions
for transition in episode:
    pass

Parameters:
    observation_shape (tuple) – observation shape.
    action_size (int) – dimension of action-space.
    observations (numpy.ndarray) – observations.
    actions (numpy.ndarray) – actions.
    rewards (numpy.ndarray) – scalar rewards.
    terminals (numpy.ndarray) – binary terminal flags.
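
The list-like behaviour described above can be pictured with a small stand-in class. This is a minimal sketch, not d3rlpy's implementation: ToyEpisode and ToyTransition are hypothetical names, plain Python lists stand in for the NumPy arrays, and building one transition per recorded step is an assumption made only for illustration.

```python
from collections import namedtuple

# Hypothetical stand-in for d3rlpy.dataset.Transition (illustration only).
ToyTransition = namedtuple("ToyTransition", ["observation", "action", "reward"])


class ToyEpisode:
    """Hypothetical list-like container; not the d3rlpy implementation."""

    def __init__(self, observations, actions, rewards):
        # Assumption: one transition per recorded step.
        self._transitions = [
            ToyTransition(o, a, r)
            for o, a, r in zip(observations, actions, rewards)
        ]

    def __len__(self):
        # Enables len(episode).
        return len(self._transitions)

    def __getitem__(self, index):
        # Enables episode[0], negative indices, and slices.
        return self._transitions[index]

    def __iter__(self):
        # Enables "for transition in episode".
        return iter(self._transitions)


toy_episode = ToyEpisode(
    observations=[[0.0, 0.1], [0.2, 0.3], [0.4, 0.5]],
    actions=[0, 1, 0],
    rewards=[0.0, 1.0, 0.5],
)
first = toy_episode[0]
total_steps = len(toy_episode)
seen_actions = [t.action for t in toy_episode]
```

Delegating the three special methods to an internal list is what lets the same object work with len(), indexing, and for loops.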

Methods

__getitem__(index)

__len__()

__iter__()

build_transitions()

Builds transition objects.

This method is called internally the first time the transitions property is accessed.
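
The build-on-first-access pattern can be sketched as follows. This is an illustration under stated assumptions rather than the library's code: LazyEpisode is a hypothetical class, transitions are represented as plain tuples, and build_count exists only to make the caching visible.

```python
class LazyEpisode:
    """Hypothetical sketch of build-on-first-access; not d3rlpy's code."""

    def __init__(self, observations, actions, rewards):
        self._observations = observations
        self._actions = actions
        self._rewards = rewards
        self._transitions = None  # nothing is built yet
        self.build_count = 0  # illustration only: counts how often we build

    def build_transitions(self):
        # Assumption: a transition is an (observation, action, reward) tuple.
        self._transitions = list(
            zip(self._observations, self._actions, self._rewards)
        )
        self.build_count += 1

    @property
    def transitions(self):
        # Build once on first access, then reuse the cached list.
        if self._transitions is None:
            self.build_transitions()
        return self._transitions


lazy = LazyEpisode([[0.0], [1.0]], [0, 1], [0.5, 1.5])
builds_before = lazy.build_count
first_access = lazy.transitions
second_access = lazy.transitions
```

Deferring the build means an episode that is only inspected through its raw arrays never pays the cost of creating transition objects.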

compute_return()

Computes the sum of rewards.

\[R = \sum_{i=1} r_i\]

Returns:	episode return.
Return type:	float
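
As a worked illustration of the formula, the return is the sum of the scalar rewards. The sketch below uses a plain Python list with made-up values; toy_compute_return is a hypothetical helper, not the library method, and which reward entries d3rlpy includes in the sum is not assumed here.

```python
import math


def toy_compute_return(rewards):
    """Sum scalar rewards, R = r_1 + r_2 + ... (hypothetical helper)."""
    # math.fsum limits accumulated floating-point rounding error.
    return math.fsum(rewards)


# Made-up reward sequence for illustration.
rewards = [0.0, 1.0, 0.5, -0.25]
episode_return = toy_compute_return(rewards)
```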

get_action_size()

Returns the dimension of action-space.

Returns:	dimension of action-space.
Return type:	int

get_observation_shape()

Returns observation shape.

Returns:	observation shape.
Return type:	tuple

size()

Returns the number of transitions.

Returns:	the number of transitions.
Return type:	int

Attributes

actions

Returns the actions.

Returns:	array of actions.
Return type:	numpy.ndarray

observations

Returns the observations.

Returns:	array of observations.
Return type:	numpy.ndarray

rewards

Returns the rewards.

Returns:	array of rewards.
Return type:	numpy.ndarray

transitions

Returns the transitions.

Returns:	list of d3rlpy.dataset.Transition objects.
Return type:	list(d3rlpy.dataset.Transition)