Algorithms

d3rlpy provides state-of-the-art data-driven deep reinforcement learning algorithms, as well as the online algorithms that serve as their base implementations.

Continuous control algorithms

d3rlpy.algos.BC

Behavior Cloning algorithm.

d3rlpy.algos.DDPG

Deep Deterministic Policy Gradients algorithm.

d3rlpy.algos.TD3

Twin Delayed Deep Deterministic Policy Gradients algorithm.

d3rlpy.algos.SAC

Soft Actor-Critic algorithm.

d3rlpy.algos.BCQ

Batch-Constrained Q-learning algorithm.

d3rlpy.algos.BEAR

Bootstrapping Error Accumulation Reduction algorithm.

d3rlpy.algos.CQL

Conservative Q-Learning algorithm.

d3rlpy.algos.AWR

Advantage-Weighted Regression algorithm.

d3rlpy.algos.AWAC

Advantage-Weighted Actor-Critic algorithm.

d3rlpy.algos.PLAS

Policy in Latent Action Space algorithm.

d3rlpy.algos.PLASWithPerturbation

Policy in Latent Action Space algorithm with perturbation layer.

Discrete control algorithms

d3rlpy.algos.DiscreteBC

Behavior Cloning algorithm for discrete control.

d3rlpy.algos.DQN

Deep Q-Network algorithm.

d3rlpy.algos.DoubleDQN

Double Deep Q-Network algorithm.

d3rlpy.algos.DiscreteSAC

Soft Actor-Critic algorithm for discrete action spaces.

d3rlpy.algos.DiscreteBCQ

Discrete version of Batch-Constrained Q-learning algorithm.

d3rlpy.algos.DiscreteCQL

Discrete version of Conservative Q-Learning algorithm.

d3rlpy.algos.DiscreteAWR

Discrete version of Advantage-Weighted Regression algorithm.
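
The two lists above split the classes by action space, so a small lookup table can guard against picking a class that does not match the environment. The following is a minimal sketch, not part of d3rlpy: the names `ALGOS_BY_ACTION_SPACE`, `action_space_for`, and `load_algo` are hypothetical helpers, and the class names are copied from this page. Whether an installed d3rlpy version exposes every listed class under `d3rlpy.algos` is an assumption, so the import is deferred until `load_algo` is called.

```python
import importlib

# Class names copied from the two lists on this page (hypothetical helper table).
ALGOS_BY_ACTION_SPACE = {
    "continuous": (
        "BC", "DDPG", "TD3", "SAC", "BCQ", "BEAR", "CQL",
        "AWR", "AWAC", "PLAS", "PLASWithPerturbation",
    ),
    "discrete": (
        "DiscreteBC", "DQN", "DoubleDQN", "DiscreteSAC",
        "DiscreteBCQ", "DiscreteCQL", "DiscreteAWR",
    ),
}


def action_space_for(name):
    """Return "continuous" or "discrete" for a class name listed on this page."""
    for space, names in ALGOS_BY_ACTION_SPACE.items():
        if name in names:
            return space
    raise KeyError(f"{name!r} is not listed on this page")


def load_algo(name):
    """Resolve d3rlpy.algos.<name> lazily.

    Assumes d3rlpy is installed and that the installed version still
    exposes the class under this name; raises ImportError or
    AttributeError otherwise.
    """
    action_space_for(name)  # reject names that are not listed here
    module = importlib.import_module("d3rlpy.algos")
    return getattr(module, name)
```

For example, `action_space_for("DiscreteCQL")` returns `"discrete"`, while `load_algo("CQL")` only touches d3rlpy at call time, so the table itself can be used without the library installed.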