d3rlpy.metrics.comparer.compare_discrete_action_match
- d3rlpy.metrics.comparer.compare_discrete_action_match(base_algo)
Returns a scorer function that measures how often two algorithms select the same action.
This metric indicates how different the two algorithms are in a discrete action space. If the algorithm being compared against is near-optimal, a smaller action difference (that is, a higher match rate) is better.
\[\mathbb{E}_{s_t \sim D} [\mathbb{1} \{\text{argmax}_a Q_{\theta_1}(s_t, a) = \text{argmax}_a Q_{\theta_2}(s_t, a)\}]\]

    from d3rlpy.algos import DQN
    from d3rlpy.metrics.comparer import compare_discrete_action_match

    dqn1 = DQN()
    dqn2 = DQN()

    # dqn1 is the base algorithm that dqn2 is compared against
    scorer = compare_discrete_action_match(dqn1)

    percentage_of_identical_actions = scorer(dqn2, ...)
- Parameters
base_algo (d3rlpy.metrics.scorer.AlgoProtocol) – algorithm to compare with.
- Returns
scorer function.
- Return type
Callable[[d3rlpy.metrics.scorer.AlgoProtocol, List[d3rlpy.dataset.Episode]], float]
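The expectation above reduces to the fraction of states on which both algorithms pick the same greedy action. The following is a minimal, library-independent sketch of that idea and of the "function that returns a scorer" pattern. It is not d3rlpy's implementation: the names `make_action_match_scorer`, `greedy_action`, `q_a`, and `q_b` are hypothetical, and Q-functions are assumed to be plain callables that map a state to a list of per-action values.

```python
from typing import Callable, List, Sequence

# Assumption for this sketch: a Q-function maps a state to one value per action.
QFunction = Callable[[Sequence[float]], List[float]]


def greedy_action(q_values: Sequence[float]) -> int:
    """Return the index of the largest Q-value (first index on ties)."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


def make_action_match_scorer(
    base_q: QFunction,
) -> Callable[[QFunction, List[Sequence[float]]], float]:
    """Hypothetical factory: fixes the base Q-function and returns a scorer."""

    def scorer(other_q: QFunction, states: List[Sequence[float]]) -> float:
        if not states:
            raise ValueError("states must not be empty")
        # Count states where both greedy actions coincide.
        matches = sum(
            greedy_action(base_q(s)) == greedy_action(other_q(s))
            for s in states
        )
        return matches / len(states)

    return scorer


# Toy Q-functions over one-dimensional states with two actions.
def q_a(state: Sequence[float]) -> List[float]:
    return [state[0], 1.0 - state[0]]


def q_b(state: Sequence[float]) -> List[float]:
    return [state[0], 0.25]


states = [[0.0], [0.2], [0.4], [0.9]]
scorer = make_action_match_scorer(q_a)
match_rate = scorer(q_b, states)  # greedy actions agree on 3 of the 4 states
```

A value of 1.0 means the two greedy policies agree on every sampled state; a value near 0.0 means they almost never agree.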