d3rlpy.online.explorers.LinearDecayEpsilonGreedy

class d3rlpy.online.explorers.LinearDecayEpsilonGreedy(start_epsilon=1.0, end_epsilon=0.1, duration=1000000)

\(\epsilon\)-greedy explorer whose \(\epsilon\) decays linearly from start_epsilon to end_epsilon over duration steps.

Parameters:
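  • start_epsilon (float) – initial value of \(\epsilon\), used at the start of the schedule.
  • end_epsilon (float) – final value of \(\epsilon\), reached at the end of the schedule.
  • duration (int) – length of the decay schedule, in environment steps.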
start_epsilon

initial value of \(\epsilon\).

Type: float
end_epsilon

final value of \(\epsilon\).

Type: float
duration

length of the decay schedule, in environment steps.

Type: int

Methods

compute_epsilon(step)

Returns the decayed \(\epsilon\) for the given step.

Parameters:
  • step (int) – current environment step.
Returns: \(\epsilon\) at the given step.
Return type: float
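
The linear schedule can be sketched in a few lines. The following is a minimal stand-alone illustration, not the library's implementation: the function name linear_decay_epsilon is hypothetical, and holding \(\epsilon\) at end_epsilon once step reaches duration is an assumption about how the schedule behaves past its end.

```python
def linear_decay_epsilon(step, start_epsilon=1.0, end_epsilon=0.1, duration=1000000):
    """Hypothetical sketch of a linear epsilon schedule (not d3rlpy code)."""
    # Assumption: once the schedule is over, epsilon stays at end_epsilon.
    if step >= duration:
        return end_epsilon
    # Fraction of the schedule completed so far, in [0, 1).
    fraction = step / duration
    # Interpolate linearly between the start and end values.
    return start_epsilon + (end_epsilon - start_epsilon) * fraction


for step in (0, 500000, 1000000, 2000000):
    print(step, linear_decay_epsilon(step))
```

With the default arguments, this sketch gives an \(\epsilon\) of 1.0 at step 0, roughly 0.55 halfway through the schedule, and 0.1 from step 1000000 onward.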
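sample(algo, x, step)

Returns an \(\epsilon\)-greedy action: a random action with probability \(\epsilon\), otherwise the greedy action predicted by algo.

Parameters:
  • algo (d3rlpy.algos.base.AlgoBase) – algorithm used to predict the greedy action.
  • x (numpy.ndarray) – observation.
  • step (int) – current environment step, used to compute \(\epsilon\).
Returns: \(\epsilon\)-greedy action.
Return type: int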
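
The \(\epsilon\)-greedy sampling rule can be illustrated without the library. This is a sketch under stated assumptions: epsilon_greedy_action and its arguments are hypothetical names, and the greedy action is passed in directly rather than predicted by an algorithm object.

```python
import random


def epsilon_greedy_action(greedy_action, n_actions, epsilon, rng=random):
    """Return a random action with probability epsilon, else greedy_action.

    Hypothetical helper for illustration; not part of d3rlpy.
    """
    # rng.random() is uniform on [0, 1), so this branch is taken with
    # probability epsilon (never for 0.0, always for 1.0).
    if rng.random() < epsilon:
        return rng.randrange(n_actions)  # uniform over 0 .. n_actions - 1
    return greedy_action


rng = random.Random(0)  # seeded so repeated runs draw the same sequence
actions = [
    epsilon_greedy_action(2, n_actions=4, epsilon=0.1, rng=rng)
    for _ in range(1000)
]
greedy_fraction = actions.count(2) / len(actions)
print(greedy_fraction)
```

With epsilon=0.1 and four actions, most sampled actions equal the greedy one. The rest are drawn uniformly from all actions, so the random branch occasionally returns the greedy action as well.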