d3rlpy.online.buffers.ReplayBuffer

class d3rlpy.online.buffers.ReplayBuffer(maxlen, env)

Standard Replay Buffer.

Parameters:
  • maxlen (int) – the maximum number of elements the buffer can hold.
  • env (gym.Env) – gym-like environment to extract shape information.
maxlen

the maximum number of elements the buffer can hold.

Type: int
observations

list of observations.

Type: list(numpy.ndarray)
actions

list of actions.

Type: list(numpy.ndarray) or list(int)
rewards

list of rewards.

Type: list(float)
terminals

list of terminal flags.

Type: list(float)
cursor

index of the list position where the next element will be inserted.

Type: int
observation_shape

observation shape.

Type: tuple
action_size

action size.

Type: int
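
The interplay of cursor and maxlen can be pictured with a small circular list. The sketch below is not d3rlpy's implementation. It assumes that the cursor wraps back to 0 once maxlen elements are stored, so that the oldest entry is overwritten next; the helper name insert_at_cursor is hypothetical.

```python
def insert_at_cursor(storage, cursor, maxlen, item):
    """Insert item at cursor and return the next cursor position.

    Assumption: once the buffer is full, the oldest entry is overwritten.
    """
    if cursor == len(storage):
        # The buffer is still growing, so extend the list.
        storage.append(item)
    else:
        # The buffer is full, so overwrite the slot under the cursor.
        storage[cursor] = item
    # Advance the cursor and wrap around at maxlen.
    return (cursor + 1) % maxlen


rewards, cursor = [], 0
for r in [0.0, 1.0, 2.0, 3.0]:
    cursor = insert_at_cursor(rewards, cursor, maxlen=3, item=r)
```

With maxlen=3, the fourth reward replaces the first one and the cursor ends up pointing at the second slot.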

Methods

__len__()
append(observation, action, reward, terminal)

Append observation, action, reward and terminal flag to buffer.

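Parameters:
  • observation (numpy.ndarray) – observation.
  • action (numpy.ndarray or int) – action.
  • reward (float) – reward.
  • terminal (float) – terminal flag.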
sample(batch_size)

Returns sampled mini-batch of transitions.

Parameters: batch_size (int) – mini-batch size.
Returns: mini-batch.
Return type: d3rlpy.dataset.TransitionMiniBatch
size()

Returns the number of appended elements in buffer.

Returns: the number of elements in buffer.
Return type: int
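
A typical interaction with the methods above is to append one transition per environment step and to sample only once enough elements are stored. The following is a minimal sketch using a hypothetical stand-in class, SketchReplayBuffer, so that it runs without d3rlpy or gym. It is built on collections.deque rather than on the library's own storage, it returns a plain list of tuples instead of a d3rlpy.dataset.TransitionMiniBatch, and it assumes uniform sampling without replacement.

```python
import random
from collections import deque


class SketchReplayBuffer:
    """Hypothetical stand-in that mirrors the documented method names."""

    def __init__(self, maxlen):
        self.maxlen = maxlen
        # A deque with maxlen drops its oldest item once it is full.
        self._transitions = deque(maxlen=maxlen)

    def append(self, observation, action, reward, terminal):
        self._transitions.append((observation, action, reward, terminal))

    def sample(self, batch_size):
        # Assumption: uniform sampling without replacement.
        return random.sample(list(self._transitions), batch_size)

    def size(self):
        return len(self._transitions)

    def __len__(self):
        return self.size()


buffer = SketchReplayBuffer(maxlen=100)
for step in range(150):
    observation = [float(step), 0.0]  # placeholder observation
    buffer.append(observation, action=step % 2, reward=1.0, terminal=0.0)

# Sample only when at least one full mini-batch is available.
batch = buffer.sample(batch_size=32) if buffer.size() >= 32 else []
```

Checking size() before calling sample avoids requesting more elements than the buffer currently holds, which random.sample would reject with a ValueError in this sketch.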