d3rlpy.dataset.MDPDataset
- class d3rlpy.dataset.MDPDataset(observations, actions, rewards, terminals, timeouts=None, transition_picker=None, trajectory_slicer=None)
Backward-compatibility class of MDPDataset.
This is a wrapper class that has a backward-compatible constructor interface.
- Parameters
observations (ObservationSequence) – Observations.
actions (np.ndarray) – Actions.
rewards (np.ndarray) – Rewards.
terminals (np.ndarray) – Environmental terminal flags.
timeouts (np.ndarray) – Timeouts.
transition_picker (Optional[TransitionPickerProtocol]) – Transition picker implementation for Q-learning-based algorithms. If None is given, BasicTransitionPicker is used by default.
trajectory_slicer (Optional[TrajectorySlicerProtocol]) – Trajectory slicer implementation for Transformer-based algorithms. If None is given, BasicTrajectorySlicer is used by default.
Methods
- append(observation, action, reward)
Appends observation, action and reward to buffer.
- Parameters
observation (Union[numpy.ndarray, Sequence[numpy.ndarray]]) – Observation.
action (Union[int, numpy.ndarray]) – Action.
reward (Union[float, numpy.ndarray]) – Reward.
- Return type
None
- append_episode(episode)
Appends episode to buffer.
- Parameters
episode (d3rlpy.dataset.components.EpisodeBase) – Episode.
- Return type
None
- clip_episode(terminated)
Clips current episode.
- dump(f)
Dumps buffer data.
with open('dataset.h5', 'w+b') as f:
    replay_buffer.dump(f)
- Parameters
f (BinaryIO) – IO object to write to.
- Return type
None
- classmethod from_episode_generator(episode_generator, buffer, transition_picker=None, trajectory_slicer=None, writer_preprocessor=None)
Builds ReplayBuffer from episode generator.
- Parameters
episode_generator (d3rlpy.dataset.episode_generator.EpisodeGeneratorProtocol) – Episode generator implementation.
buffer (d3rlpy.dataset.buffers.BufferProtocol) – Buffer implementation.
transition_picker (Optional[d3rlpy.dataset.transition_pickers.TransitionPickerProtocol]) – Transition picker implementation for Q-learning-based algorithms.
trajectory_slicer (Optional[d3rlpy.dataset.trajectory_slicers.TrajectorySlicerProtocol]) – Trajectory slicer implementation for Transformer-based algorithms.
writer_preprocessor (Optional[d3rlpy.dataset.writers.WriterPreprocessProtocol]) – Writer preprocessor implementation.
- Returns
Replay buffer.
- Return type
d3rlpy.dataset.replay_buffer.ReplayBuffer
- classmethod load(f, buffer, episode_cls=<class 'd3rlpy.dataset.components.Episode'>, transition_picker=None, trajectory_slicer=None, writer_preprocessor=None)
Builds ReplayBuffer from dumped data.
This method reconstructs a replay buffer dumped by the dump method.
with open('dataset.h5', 'rb') as f:
    replay_buffer = ReplayBuffer.load(f, buffer)
- Parameters
f (BinaryIO) – IO object to read from.
buffer (d3rlpy.dataset.buffers.BufferProtocol) – Buffer implementation.
episode_cls (Type[d3rlpy.dataset.components.EpisodeBase]) – Episode class used to reconstruct data.
transition_picker (Optional[d3rlpy.dataset.transition_pickers.TransitionPickerProtocol]) – Transition picker implementation for Q-learning-based algorithms.
trajectory_slicer (Optional[d3rlpy.dataset.trajectory_slicers.TrajectorySlicerProtocol]) – Trajectory slicer implementation for Transformer-based algorithms.
writer_preprocessor (Optional[d3rlpy.dataset.writers.WriterPreprocessProtocol]) – Writer preprocessor implementation.
- Returns
Replay buffer.
- Return type
d3rlpy.dataset.replay_buffer.ReplayBuffer
- sample_trajectory(length)
Samples a partial trajectory.
- Parameters
length (int) – Length of partial trajectory.
- Returns
Partial trajectory.
- Return type
d3rlpy.dataset.components.PartialTrajectory
- sample_trajectory_batch(batch_size, length)
Samples a mini-batch of partial trajectories.
- sample_transition()
Samples a transition.
- Returns
Transition.
- Return type
d3rlpy.dataset.components.Transition
- sample_transition_batch(batch_size)
Samples a mini-batch of transitions.
- Parameters
batch_size (int) – Mini-batch size.
- Returns
Mini-batch.
- Return type
d3rlpy.dataset.mini_batch.TransitionMiniBatch
Attributes
- buffer
Returns buffer.
- Returns
Buffer.
- episodes
Returns sequence of episodes.
- Returns
Sequence of episodes.
- trajectory_slicer
Returns trajectory slicer.
- Returns
Trajectory slicer.
- transition_count
Returns number of transitions.
- Returns
Number of transitions.
- transition_picker
Returns transition picker.
- Returns
Transition picker.