Algorithms

d3rlpy provides state-of-the-art data-driven deep reinforcement learning algorithms, as well as the online algorithms that serve as their base implementations.

Continuous control algorithms

d3rlpy.algos.BC Behavior Cloning algorithm.
d3rlpy.algos.DDPG Deep Deterministic Policy Gradients algorithm.
d3rlpy.algos.TD3 Twin Delayed Deep Deterministic Policy Gradients algorithm.
d3rlpy.algos.SAC Soft Actor-Critic algorithm.
d3rlpy.algos.BCQ Batch-Constrained Q-learning algorithm.
d3rlpy.algos.BEAR Bootstrapping Error Accumulation Reduction algorithm.
d3rlpy.algos.CQL Conservative Q-Learning algorithm.
d3rlpy.algos.AWR Advantage-Weighted Regression algorithm.
d3rlpy.algos.AWAC Advantage Weighted Actor-Critic algorithm.
d3rlpy.algos.PLAS Policy in Latent Action Space algorithm.
d3rlpy.algos.PLASWithPerturbation Policy in Latent Action Space algorithm with perturbation layer.

Discrete control algorithms

d3rlpy.algos.DiscreteBC Behavior Cloning algorithm for discrete control.
d3rlpy.algos.DQN Deep Q-Network algorithm.
d3rlpy.algos.DoubleDQN Double Deep Q-Network algorithm.
d3rlpy.algos.DiscreteSAC Soft Actor-Critic algorithm for discrete action spaces.
d3rlpy.algos.DiscreteBCQ Discrete version of Batch-Constrained Q-learning algorithm.
d3rlpy.algos.DiscreteCQL Discrete version of Conservative Q-Learning algorithm.
d3rlpy.algos.DiscreteAWR Discrete version of Advantage-Weighted Regression algorithm.
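
Because every class above lives in d3rlpy.algos and belongs to exactly one of the two groups, picking an algorithm by name can be sketched as a small lookup. The snippet below is a minimal illustration, not part of d3rlpy: the helper names action_space_of and load_algo_class are hypothetical, and load_algo_class assumes that d3rlpy is installed in a version that exposes the class names listed on this page.

```python
import importlib

# Class names copied from the two lists above.
CONTINUOUS_ALGOS = (
    "BC", "DDPG", "TD3", "SAC", "BCQ", "BEAR", "CQL",
    "AWR", "AWAC", "PLAS", "PLASWithPerturbation",
)
DISCRETE_ALGOS = (
    "DiscreteBC", "DQN", "DoubleDQN", "DiscreteSAC",
    "DiscreteBCQ", "DiscreteCQL", "DiscreteAWR",
)


def action_space_of(name: str) -> str:
    """Return "continuous" or "discrete" for a class name listed on this page."""
    if name in CONTINUOUS_ALGOS:
        return "continuous"
    if name in DISCRETE_ALGOS:
        return "discrete"
    raise KeyError(f"{name!r} is not listed on this page")


def load_algo_class(name: str):
    """Hypothetical helper: resolve d3rlpy.algos.<name> lazily.

    Assumes d3rlpy is installed and still exports this class name;
    the import happens only when this function is called.
    """
    action_space_of(name)  # reject names that are not listed above
    module = importlib.import_module("d3rlpy.algos")
    return getattr(module, name)


if __name__ == "__main__":
    for name in ("CQL", "DiscreteCQL"):
        print(f"d3rlpy.algos.{name}: {action_space_of(name)} control")
```

The lookup itself needs only the standard library, so the grouping can be checked before d3rlpy is imported. Constructor arguments and training calls differ between d3rlpy releases and are deliberately left out.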