d3rlpy.online.buffers.ReplayBuffer
class d3rlpy.online.buffers.ReplayBuffer(maxlen, env)
Standard replay buffer.
Parameters:
- maxlen (int) – maximum number of entries the buffer can hold.
- env (gym.Env) – gym-like environment from which shape information is extracted.
observations
List of observations.
Type: list(numpy.ndarray)
actions
List of actions.
Type: list(numpy.ndarray) or list(int)
Methods
append(observation, action, reward, terminal)
Append an observation, action, reward and terminal flag to the buffer.
Parameters:
- observation (numpy.ndarray) – observation.
- action (numpy.ndarray or int) – action.
- reward (float) – reward.
- terminal (bool or float) – terminal flag.
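To illustrate the call shape of append, the following is a minimal stand-in, not d3rlpy's implementation. The class name SketchBuffer, the use of collections.deque, the rewards and terminals fields, and first-in-first-out eviction once maxlen is reached are all assumptions made for this example; the reference above specifies only the argument list.

```python
from collections import deque


class SketchBuffer:
    """Hypothetical stand-in that mirrors the append() argument list."""

    def __init__(self, maxlen):
        # Assumption: one bounded deque per field, oldest entry dropped first.
        self.observations = deque(maxlen=maxlen)
        self.actions = deque(maxlen=maxlen)
        self.rewards = deque(maxlen=maxlen)
        self.terminals = deque(maxlen=maxlen)

    def append(self, observation, action, reward, terminal):
        # Store the four fields at the same index so they stay aligned.
        self.observations.append(observation)
        self.actions.append(action)
        self.rewards.append(float(reward))
        self.terminals.append(float(terminal))

    def __len__(self):
        return len(self.observations)


buffer = SketchBuffer(maxlen=3)
for step in range(5):
    # Toy data: a 2-element observation, a discrete action, a reward, a flag.
    buffer.append([float(step), 0.0], step % 2, 1.0, step == 4)
```

With maxlen=3 and five appends, this sketch keeps only the entries from the last three steps. Whether the real buffer evicts in the same order is not stated on this page.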
sample(batch_size)
Returns a sampled mini-batch of transitions.
Parameters: batch_size (int) – mini-batch size.
Returns: mini-batch.
Return type: d3rlpy.dataset.TransitionMiniBatch
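A rough sketch of what drawing a mini-batch could look like. Uniform sampling without replacement is an assumption, as is the plain-dict batch layout; the actual method returns a d3rlpy.dataset.TransitionMiniBatch, whose construction is not described on this page. The helper name sample_batch is hypothetical.

```python
import random


def sample_batch(transitions, batch_size, rng=random):
    """Hypothetical helper: draw batch_size stored transitions uniformly."""
    # Assumption: sampling without replacement, so enough data must be stored.
    if batch_size > len(transitions):
        raise ValueError("batch_size exceeds the number of stored transitions")
    picked = rng.sample(transitions, batch_size)
    # Regroup the per-transition tuples into per-field lists.
    observations, actions, rewards, terminals = zip(*picked)
    return {
        "observations": list(observations),
        "actions": list(actions),
        "rewards": list(rewards),
        "terminals": list(terminals),
    }


# Toy transitions in the order (observation, action, reward, terminal).
stored = [([float(i)], i % 2, 1.0, 0.0) for i in range(10)]
batch = sample_batch(stored, batch_size=4, rng=random.Random(0))
```

Passing a seeded random.Random instance keeps the draw reproducible in tests; the default falls back to the module-level generator.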