Algorithms
d3rlpy provides state-of-the-art data-driven deep reinforcement learning algorithms, as well as the online algorithms that serve as their base implementations.
Continuous control algorithms
Behavior Cloning algorithm.
Deep Deterministic Policy Gradients algorithm.
Twin Delayed Deep Deterministic Policy Gradients algorithm.
Soft Actor-Critic algorithm.
Batch-Constrained Q-learning algorithm.
Bootstrapping Error Accumulation Reduction algorithm.
Conservative Q-Learning algorithm.
Advantage-Weighted Regression algorithm.
Advantage Weighted Actor-Critic algorithm.
Policy in Latent Action Space algorithm.
Policy in Latent Action Space algorithm with perturbation layer.
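Several of the continuous control algorithms listed above differ mainly in how they build the bootstrapped critic target. The sketch below contrasts the Deep Deterministic Policy Gradients target with the Twin Delayed variant, which takes the minimum of two target critics and smooths the target action with clipped noise. It is a minimal pure-Python illustration of the published update rules, not d3rlpy's implementation; all function names, parameters, and default values here are hypothetical.

```python
def smoothed_target_action(action, noise, noise_clip=0.5, low=-1.0, high=1.0):
    """Target policy smoothing: clip the noise, add it, then clip to the action bounds."""
    clipped_noise = max(-noise_clip, min(noise_clip, noise))
    return max(low, min(high, action + clipped_noise))


def ddpg_target(reward, done, next_q, gamma=0.99):
    """DDPG-style target: bootstrap from a single target critic value."""
    return reward + gamma * (1.0 - done) * next_q


def td3_target(reward, done, next_q1, next_q2, gamma=0.99):
    """TD3-style target: bootstrap from the smaller of two target critic values."""
    return reward + gamma * (1.0 - done) * min(next_q1, next_q2)
```

Here `next_q`, `next_q1`, and `next_q2` are assumed to be target critic outputs already evaluated at the next state and target action, and `done` is 1.0 for terminal transitions and 0.0 otherwise.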
Discrete control algorithms
Behavior Cloning algorithm for discrete control.
Deep Q-Network algorithm.
Double Deep Q-Network algorithm.
Soft Actor-Critic algorithm for discrete action spaces.
Discrete version of Batch-Constrained Q-learning algorithm.
Discrete version of Conservative Q-Learning algorithm.
Discrete version of Advantage-Weighted Regression algorithm.
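To make the difference between the Deep Q-Network and Double Deep Q-Network entries concrete, the following sketch computes both targets for a single transition. It is a minimal pure-Python illustration of the published update rules rather than d3rlpy's code; the function names and arguments are hypothetical, and the Q-values are passed in as plain lists indexed by action.

```python
def dqn_target(reward, done, next_q_target, gamma=0.99):
    """DQN: the target network both selects and evaluates the next action."""
    return reward + gamma * (1.0 - done) * max(next_q_target)


def double_dqn_target(reward, done, next_q_online, next_q_target, gamma=0.99):
    """Double DQN: the online network selects the action, the target network evaluates it."""
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * (1.0 - done) * next_q_target[best_action]


# Example transition with three actions (values chosen for illustration only).
next_q_online = [1.0, 2.0, 0.5]
next_q_target = [3.0, 1.5, 4.0]
dqn_value = dqn_target(1.0, 0.0, next_q_target, gamma=0.5)
double_dqn_value = double_dqn_target(1.0, 0.0, next_q_online, next_q_target, gamma=0.5)
```

Decoupling action selection from evaluation is what lets the Double DQN target come out lower than the DQN target here, which is the mechanism it uses to reduce overestimation.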