d3rlpy.metrics.CompareDiscreteActionMatchEvaluator

class d3rlpy.metrics.CompareDiscreteActionMatchEvaluator(*args, **kwds)

Action matches between algorithms.

This metric indicates how much two algorithms differ in a discrete action space. If the algorithm being compared against is near-optimal, a smaller action difference (that is, a higher match rate) is better.

\[\mathbb{E}_{s_t \sim D} [\mathbb{1}\{\text{argmax}_a Q_{\theta_1}(s_t, a) = \text{argmax}_a Q_{\theta_2}(s_t, a)\}]\]
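The expectation above can be sketched in NumPy as the fraction of states on which two Q-functions select the same greedy action. This is a minimal illustration, not d3rlpy's implementation: the function name action_match_rate and the toy Q-value arrays are assumptions made for this example.

```python
import numpy as np


def action_match_rate(q_values_1, q_values_2):
    """Fraction of states where two Q-functions pick the same greedy action.

    Both inputs are arrays of shape (n_states, n_actions). This helper is
    hypothetical and only mirrors the formula; it is not part of d3rlpy.
    """
    q1 = np.asarray(q_values_1)
    q2 = np.asarray(q_values_2)
    greedy_1 = q1.argmax(axis=1)  # argmax_a Q_theta1(s_t, a) for every state
    greedy_2 = q2.argmax(axis=1)  # argmax_a Q_theta2(s_t, a) for every state
    # The mean of the indicator values estimates the expectation over states.
    return float(np.mean(greedy_1 == greedy_2))


# Toy data: four states, three actions (made up for illustration).
q_a = np.array([[1.0, 0.2, 0.1],
                [0.0, 0.5, 0.3],
                [0.2, 0.1, 0.9],
                [0.4, 0.6, 0.1]])
q_b = np.array([[0.9, 0.1, 0.3],
                [0.1, 0.2, 0.8],
                [0.0, 0.3, 0.7],
                [0.1, 0.5, 0.2]])

rate = action_match_rate(q_a, q_b)
print(rate)  # the greedy actions agree on 3 of the 4 states
```

A value of 1.0 means the two Q-functions always choose the same action on the sampled states; a value near 0.0 means they almost never do.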
Parameters
  • base_algo – Target algorithm to compare with.

  • episodes – Optional evaluation episodes. If not given, the dataset used in training is used.

Methods

__call__(algo, dataset)

Computes metrics.

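Parameters
  • algo – Algorithm to evaluate against base_algo.

  • dataset – Dataset to evaluate on; used when episodes was not given.
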
Returns

Computed metrics.

Return type

float
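The call protocol described above can be illustrated with a small self-contained sketch. It assumes that the evaluator compares the greedy actions returned by each algorithm's predict method and averages the matches over all observations. TablePolicy, ActionMatchSketch, and the list-of-episodes dataset are hypothetical stand-ins written for this example; they are not d3rlpy classes or data structures.

```python
import numpy as np


class TablePolicy:
    """Hypothetical stand-in for a trained algorithm exposing predict()."""

    def __init__(self, q_table):
        self.q_table = np.asarray(q_table)  # shape: (n_states, n_actions)

    def predict(self, observations):
        # In this toy setup, observations are integer state indices.
        return self.q_table[np.asarray(observations)].argmax(axis=1)


class ActionMatchSketch:
    """Illustrative evaluator: holds base_algo, is called with (algo, dataset)."""

    def __init__(self, base_algo, episodes=None):
        self.base_algo = base_algo
        self.episodes = episodes

    def __call__(self, algo, dataset):
        # Fall back to the given dataset when no episodes were supplied.
        episodes = self.episodes if self.episodes is not None else dataset
        matches = []
        for observations in episodes:
            base_actions = self.base_algo.predict(observations)
            actions = algo.predict(observations)
            matches.extend((base_actions == actions).tolist())
        return float(np.mean(matches))


# Toy policies over three states and two actions (made up for illustration).
base = TablePolicy([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
other = TablePolicy([[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]])

# Assumption: a dataset is modeled here as a list of episodes, each a list
# of state indices.
toy_dataset = [[0, 1], [2, 2, 1]]

score = ActionMatchSketch(base)(other, toy_dataset)
print(score)  # the policies agree on 3 of the 5 observations
```

Keeping base_algo in the constructor and passing the evaluated algorithm at call time lets one evaluator instance be reused across several candidate algorithms.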