d3rlpy.dataset.MixedReplayBuffer
- class d3rlpy.dataset.MixedReplayBuffer(primary_replay_buffer, secondary_replay_buffer, secondary_mix_ratio)
A class combining two replay buffer instances.
This replay buffer implementation combines two replay buffers (e.g. an offline buffer and an online buffer). The primary replay buffer is exposed to methods such as append. Mini-batches are sampled from each replay buffer based on secondary_mix_ratio.

    import d3rlpy

    # offline dataset
    dataset, env = d3rlpy.datasets.get_cartpole()

    # online replay buffer
    online_buffer = d3rlpy.dataset.create_fifo_replay_buffer(
        limit=100000,
        env=env,
    )

    # combine two replay buffers
    replay_buffer = d3rlpy.dataset.MixedReplayBuffer(
        primary_replay_buffer=online_buffer,
        secondary_replay_buffer=dataset,
        secondary_mix_ratio=0.5,
    )
- Parameters
primary_replay_buffer (d3rlpy.dataset.ReplayBufferBase) – Primary replay buffer.
secondary_replay_buffer (d3rlpy.dataset.ReplayBufferBase) – Secondary replay buffer.
secondary_mix_ratio (float) – Ratio to sample mini-batches from the secondary replay buffer.
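To make the effect of secondary_mix_ratio concrete, the sketch below splits a mini-batch size between the two buffers. It is a standalone illustration, not d3rlpy code: the helper name split_batch_size and the rounding rule (truncate the primary share, give the remainder to the secondary buffer) are assumptions that this page does not confirm.

```python
from typing import Tuple


def split_batch_size(batch_size: int, secondary_mix_ratio: float) -> Tuple[int, int]:
    """Hypothetical helper: return (primary_size, secondary_size) for one mini-batch."""
    if not 0.0 <= secondary_mix_ratio <= 1.0:
        raise ValueError("secondary_mix_ratio must be in [0, 1]")
    # Assumed rounding rule: truncate the primary share ...
    primary_size = int((1.0 - secondary_mix_ratio) * batch_size)
    # ... and let the secondary buffer fill the rest, so the sizes always sum to batch_size.
    secondary_size = batch_size - primary_size
    return primary_size, secondary_size


primary_size, secondary_size = split_batch_size(32, 0.5)
print(primary_size, secondary_size)
```

Under this assumed rule, a ratio of 0.0 draws every sample from the primary buffer and a ratio of 1.0 draws every sample from the secondary buffer.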
Methods
- append(observation, action, reward)
Appends observation, action and reward to buffer.
- Parameters
observation (Union[numpy.ndarray[Any, numpy.dtype[Any]], Sequence[numpy.ndarray[Any, numpy.dtype[Any]]]]) – Observation.
action (Union[int, numpy.ndarray[Any, numpy.dtype[Any]]]) – Action.
reward (Union[float, numpy.ndarray[Any, numpy.dtype[Any]]]) – Reward.
- Return type
None
- append_episode(episode)
Appends episode to buffer.
- Parameters
episode (d3rlpy.dataset.components.EpisodeBase) – Episode.
- Return type
None
- dump(f)
Dumps buffer data.

    with open('dataset.h5', 'w+b') as f:
        replay_buffer.dump(f)

- Parameters
f (BinaryIO) – IO object to write to.
- Return type
None
- classmethod from_episode_generator(episode_generator, buffer, transition_picker=None, trajectory_slicer=None, writer_preprocessor=None)
Builds ReplayBuffer from episode generator.
- Parameters
episode_generator (d3rlpy.dataset.episode_generator.EpisodeGeneratorProtocol) – Episode generator implementation.
buffer (d3rlpy.dataset.buffers.BufferProtocol) – Buffer implementation.
transition_picker (Optional[d3rlpy.dataset.transition_pickers.TransitionPickerProtocol]) – Transition picker implementation for Q-learning-based algorithms.
trajectory_slicer (Optional[d3rlpy.dataset.trajectory_slicers.TrajectorySlicerProtocol]) – Trajectory slicer implementation for Transformer-based algorithms.
writer_preprocessor (Optional[d3rlpy.dataset.writers.WriterPreprocessProtocol]) – Writer preprocessor implementation.
- Returns
Replay buffer.
- Return type
d3rlpy.dataset.replay_buffer.ReplayBuffer
- classmethod load(f, buffer, episode_cls=<class 'd3rlpy.dataset.components.Episode'>, transition_picker=None, trajectory_slicer=None, writer_preprocessor=None)
Builds ReplayBuffer from dumped data.
This method reconstructs a replay buffer that was dumped by the dump method.

    # `buffer` is a BufferProtocol implementation created beforehand
    with open('dataset.h5', 'rb') as f:
        replay_buffer = ReplayBuffer.load(f, buffer)

- Parameters
f (BinaryIO) – IO object to read from.
buffer (d3rlpy.dataset.buffers.BufferProtocol) – Buffer implementation.
episode_cls (Type[d3rlpy.dataset.components.EpisodeBase]) – Episode class used to reconstruct data.
transition_picker (Optional[d3rlpy.dataset.transition_pickers.TransitionPickerProtocol]) – Transition picker implementation for Q-learning-based algorithms.
trajectory_slicer (Optional[d3rlpy.dataset.trajectory_slicers.TrajectorySlicerProtocol]) – Trajectory slicer implementation for Transformer-based algorithms.
writer_preprocessor (Optional[d3rlpy.dataset.writers.WriterPreprocessProtocol]) – Writer preprocessor implementation.
- Returns
Replay buffer.
- Return type
d3rlpy.dataset.replay_buffer.ReplayBuffer
- sample_trajectory(length)
Samples a partial trajectory.
- Parameters
length (int) – Length of partial trajectory.
- Returns
Partial trajectory.
- Return type
d3rlpy.dataset.components.PartialTrajectory
- sample_transition()
Samples a transition.
- Returns
Transition.
- Return type
d3rlpy.dataset.components.Transition
- sample_transition_batch(batch_size)
Samples a mini-batch of transitions.
- Parameters
batch_size (int) – Mini-batch size.
- Returns
Mini-batch.
- Return type
d3rlpy.dataset.mini_batch.TransitionMiniBatch
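The behaviour described for this class (writes go to the primary buffer, mini-batches draw from both buffers) can be modelled with a toy class. The sketch below uses plain Python lists and is not the d3rlpy implementation: ToyMixedBuffer, its constructor arguments, the split rule, and sampling with replacement are all assumptions made for illustration.

```python
import random
from typing import Any, List


class ToyMixedBuffer:
    """Toy stand-in for a mixed replay buffer (hypothetical, not d3rlpy code)."""

    def __init__(
        self,
        primary: List[Any],
        secondary: List[Any],
        secondary_mix_ratio: float,
        seed: int = 0,
    ) -> None:
        self.primary = primary
        self.secondary = secondary
        self.secondary_mix_ratio = secondary_mix_ratio
        self._rng = random.Random(seed)

    def append(self, transition: Any) -> None:
        # New data is written to the primary buffer only.
        self.primary.append(transition)

    def sample_transition_batch(self, batch_size: int) -> List[Any]:
        # Assumed split rule: truncate the primary share, remainder goes to secondary.
        primary_size = int((1.0 - self.secondary_mix_ratio) * batch_size)
        secondary_size = batch_size - primary_size
        # Sample with replacement from each buffer, then concatenate.
        batch = self._rng.choices(self.primary, k=primary_size)
        batch += self._rng.choices(self.secondary, k=secondary_size)
        return batch


offline = [("offline", i) for i in range(100)]  # plays the role of the secondary buffer
online = [("online", i) for i in range(10)]     # plays the role of the primary buffer

toy = ToyMixedBuffer(primary=online, secondary=offline, secondary_mix_ratio=0.25)
toy.append(("online", 10))  # only the primary list grows
batch = toy.sample_transition_batch(8)
num_offline = sum(1 for source, _ in batch if source == "offline")
print(len(batch), num_offline)
```

In this toy model the offline list is never modified, which mirrors the typical use case of fine-tuning online while still replaying a fixed offline dataset.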
Attributes
- buffer
- dataset_info
- episodes
- primary_replay_buffer
- secondary_replay_buffer
- trajectory_slicer
- transition_count
- transition_picker