d3rlpy.metrics.scorer.discrete_action_match_scorer
d3rlpy.metrics.scorer.discrete_action_match_scorer(algo, episodes)
Returns the percentage of actions that are identical between the algorithm and the dataset.
This metric indicates how much the greedy policy differs from the given episodes in a discrete action space. If the given episodes are near-optimal, a larger percentage is better.
\[\frac{1}{N} \sum_{t=1}^{N} \mathbb{1}\{a_t = \text{argmax}_a Q_\theta (s_t, a)\}\]
- Parameters
algo (d3rlpy.metrics.scorer.AlgoProtocol) – algorithm.
episodes (List[d3rlpy.dataset.Episode]) – list of episodes.
- Returns
percentage of identical actions.
- Return type
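
The quantity this scorer reports can be illustrated with a minimal, library-independent sketch. It assumes the Q-values are available as a plain table with one row per step; the helper names `greedy_action` and `action_match_rate` are hypothetical and are not part of d3rlpy.

```python
from typing import List, Sequence


def greedy_action(q_row: Sequence[float]) -> int:
    """Return the index of the largest Q-value (first index on ties)."""
    return max(range(len(q_row)), key=lambda a: q_row[a])


def action_match_rate(
    q_values: Sequence[Sequence[float]],
    dataset_actions: Sequence[int],
) -> float:
    """Fraction of steps where the greedy action equals the dataset action.

    Hypothetical helper for illustration only, not the d3rlpy implementation.
    """
    if len(q_values) != len(dataset_actions):
        raise ValueError("q_values and dataset_actions must have equal length")
    if len(dataset_actions) == 0:
        raise ValueError("at least one step is required")
    # One boolean per step: True when argmax_a Q(s_t, a) equals a_t.
    matches: List[bool] = [
        greedy_action(q_row) == action
        for q_row, action in zip(q_values, dataset_actions)
    ]
    return sum(matches) / len(matches)


# Toy data: four steps, two discrete actions.
q_values = [
    [0.1, 0.9],  # greedy action 1
    [0.8, 0.2],  # greedy action 0
    [0.3, 0.7],  # greedy action 1
    [0.6, 0.4],  # greedy action 0
]
dataset_actions = [1, 0, 0, 0]

rate = action_match_rate(q_values, dataset_actions)
print(rate)  # 0.75: three of the four greedy actions match the dataset
```

In the library, the greedy actions come from `algo` rather than from an explicit Q-table; that part is not reproduced here.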