d3rlpy.online.iterators.train
d3rlpy.online.iterators.train(env, algo, buffer, explorer=None, n_steps_per_epoch=4000, n_updates_per_epoch=100, eval_env=None, eval_epsilon=0.05, experiment_name=None, with_timestamp=True, logdir='d3rlpy_logs', verbose=True, show_progress=True, tensorboard=True, save_interval=1)

Start the training loop of online deep reinforcement learning.
Parameters:
- env (gym.Env) – gym-like environment.
- algo (d3rlpy.algos.base.AlgoBase) – algorithm.
- buffer (d3rlpy.online.buffers.Buffer) – replay buffer.
- explorer (d3rlpy.online.explorers.Explorer) – action explorer.
- n_steps_per_epoch (int) – the number of steps per epoch.
- n_updates_per_epoch (int) – the number of updates per epoch.
- eval_env (gym.Env) – gym-like environment. If None, evaluation is skipped.
- eval_epsilon (float) – \(\epsilon\)-greedy factor during evaluation.
- experiment_name (str) – experiment name for logging. If not passed, the directory name will be {class name}_online_{timestamp}.
- with_timestamp (bool) – flag to append a timestamp string to the end of the directory name.
- logdir (str) – root directory name to save logs.
- verbose (bool) – flag to show logged information on stdout.
- show_progress (bool) – flag to show progress bar for iterations.
- tensorboard (bool) – flag to save logged information to TensorBoard (in addition to the CSV data).
- save_interval (int) – interval to save parameters.
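The parameters above imply the following epoch structure: each epoch collects n_steps_per_epoch environment transitions into the replay buffer (using the explorer to pick actions), then runs n_updates_per_epoch gradient updates, and saves parameters every save_interval epochs. A minimal pure-Python sketch of that control flow is shown below; it is an illustration of the loop structure only, not d3rlpy's actual implementation, and the n_epochs argument and the stub callbacks (collect_step, update, save) are hypothetical names introduced here.

```python
def train_loop(collect_step, update, save, n_epochs,
               n_steps_per_epoch=4000, n_updates_per_epoch=100,
               save_interval=1):
    """Hypothetical sketch of the online training loop structure."""
    for epoch in range(1, n_epochs + 1):
        # collect transitions: env.step with the explorer's action,
        # appending each transition to the replay buffer
        for _ in range(n_steps_per_epoch):
            collect_step()
        # gradient updates: algo.update on batches sampled from the buffer
        for _ in range(n_updates_per_epoch):
            update()
        # periodically save model parameters
        if epoch % save_interval == 0:
            save(epoch)


# usage with counting stubs in place of real environment/algorithm calls
counters = {"steps": 0, "updates": 0, "saves": []}
train_loop(
    collect_step=lambda: counters.__setitem__("steps", counters["steps"] + 1),
    update=lambda: counters.__setitem__("updates", counters["updates"] + 1),
    save=counters["saves"].append,
    n_epochs=3, n_steps_per_epoch=10, n_updates_per_epoch=5, save_interval=2,
)
print(counters)  # {'steps': 30, 'updates': 15, 'saves': [2]}
```

With real objects, collect_step would wrap env.step plus buffer.append, and update would wrap the algorithm's gradient step on a sampled batch.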