d3rlpy.algos.LinearDecayEpsilonGreedy

class d3rlpy.algos.LinearDecayEpsilonGreedy(start_epsilon=1.0, end_epsilon=0.1, duration=1000000)

\(\epsilon\)-greedy explorer with linear decay schedule.

Parameters
  • start_epsilon (float) – Initial \(\epsilon\).

  • end_epsilon (float) – Final \(\epsilon\).

  • duration (int) – Number of steps over which \(\epsilon\) decays linearly from start_epsilon to end_epsilon.

Methods

compute_epsilon(step)

Returns the decayed \(\epsilon\) for the given step.

Returns

\(\epsilon\).

Parameters

step (int) – Current environment step.

Return type

float
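
This page does not spell out the schedule itself. The sketch below shows one plausible reading of a linear decay: \(\epsilon\) is interpolated from start_epsilon to end_epsilon over duration steps and then held at end_epsilon. The function name linear_decay_epsilon and the hold-after-duration behavior are assumptions made for illustration, not the library's code.

```python
def linear_decay_epsilon(step, start_epsilon=1.0, end_epsilon=0.1, duration=1000000):
    """Hypothetical stand-in for compute_epsilon (assumed behavior)."""
    # Assumption: once the schedule has finished, epsilon stays at its final value.
    if step >= duration:
        return end_epsilon
    # Linear interpolation between the initial and final values.
    fraction = step / duration
    return start_epsilon + (end_epsilon - start_epsilon) * fraction


# Epsilon at the start, midway through, and after the schedule has ended.
for step in (0, 500000, 2000000):
    print(step, linear_decay_epsilon(step))
```

With the default arguments, the value starts at start_epsilon, lies halfway between the two bounds at step 500000, and equals end_epsilon for any step at or beyond duration.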

sample(algo, x, step)

Returns \(\epsilon\)-greedy action.

Parameters
  • algo (d3rlpy.algos.qlearning.explorers._ActionProtocol) – Algorithm.

  • x (numpy.ndarray) – Observation.

  • step (int) – Current environment step.

Returns

\(\epsilon\)-greedy action.

Return type

numpy.ndarray
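
As a rough illustration of what \(\epsilon\)-greedy sampling means, the sketch below replaces each greedy action with a uniformly random one with probability \(\epsilon\). It is a simplification under stated assumptions: it works on plain Python lists rather than numpy.ndarray, takes precomputed greedy actions instead of an algo object, and assumes a discrete action space of size action_size. The name epsilon_greedy_actions and all of its parameters are hypothetical.

```python
import random


def epsilon_greedy_actions(greedy_actions, action_size, epsilon, rng=random):
    """Hypothetical sketch of epsilon-greedy selection over a batch of actions."""
    actions = []
    for greedy in greedy_actions:
        # With probability epsilon, explore by picking a uniformly random action.
        if rng.random() < epsilon:
            actions.append(rng.randrange(action_size))
        # Otherwise, exploit by keeping the greedy action.
        else:
            actions.append(greedy)
    return actions


# Assumed greedy actions for a batch of four observations.
greedy = [2, 0, 1, 2]
print(epsilon_greedy_actions(greedy, action_size=3, epsilon=0.0))
print(epsilon_greedy_actions(greedy, action_size=3, epsilon=0.3))
```

With epsilon set to 0.0 the greedy actions come back unchanged. Larger values replace more of them with random draws, so a decaying \(\epsilon\) moves from mostly exploring to mostly exploiting.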