d3rlpy.metrics.comparer.compare_discrete_action_match

d3rlpy.metrics.comparer.compare_discrete_action_match(base_algo, window_size=1024)

Returns a scorer function that measures how often two algorithms choose the same action.

This metric indicates how closely the two algorithms agree in a discrete action space. If the algorithm being compared against is near-optimal, a higher match rate (that is, a smaller action difference) is better.

\[\mathbb{E}_{s_t \sim D} [\mathbb{1}\{\text{argmax}_a Q_{\theta_1}(s_t, a) = \text{argmax}_a Q_{\theta_2}(s_t, a)\}]\]
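from d3rlpy.algos import DQN
from d3rlpy.metrics.comparer import compare_discrete_action_match

dqn1 = DQN()
dqn2 = DQN()

# build a scorer that uses dqn1 as the baseline to compare against
scorer = compare_discrete_action_match(dqn1)

# fraction of states where dqn2 picks the same greedy action as dqn1
percentage_of_identical_actions = scorer(dqn2, ...)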
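To illustrate what the formula above measures, here is a minimal NumPy sketch that computes the match rate directly from two arrays of Q-values. It is not the library's implementation: the helper name `action_match_rate` and the example arrays are hypothetical, and it assumes the Q-values for every state are already available as `(n_states, n_actions)` arrays.

```python
import numpy as np


def action_match_rate(q_values_1, q_values_2):
    """Fraction of states where both Q-functions select the same greedy action.

    Hypothetical helper for illustration only; both inputs are assumed to be
    arrays of shape (n_states, n_actions).
    """
    greedy_1 = np.argmax(q_values_1, axis=1)  # argmax_a Q_theta1(s_t, a)
    greedy_2 = np.argmax(q_values_2, axis=1)  # argmax_a Q_theta2(s_t, a)
    # The mean of the indicator approximates the expectation over states.
    return float(np.mean(greedy_1 == greedy_2))


# Made-up Q-values for four states and two actions.
q1 = np.array([[1.0, 2.0], [0.5, 0.1], [3.0, 4.0], [0.0, 1.0]])
q2 = np.array([[0.0, 5.0], [0.2, 0.9], [1.0, 2.0], [2.0, 1.0]])

rate = action_match_rate(q1, q2)  # greedy actions agree in 2 of 4 states
```

A value of 1.0 means the two Q-functions always choose the same action, and 0.0 means they never do.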
Parameters:
  • base_algo (d3rlpy.algos.base.AlgoBase) – algorithm to compare with.
  • window_size (int) – mini-batch size used when computing the metric.
Returns:

scorer function.

Return type:

callable